Abstract
Objectives
There is continuing pressure to improve the cost effectiveness of quality control (QC) for clinical laboratory testing. Risk-based approaches are promising, but recent research has uncovered problems in some common methods. There is a need for improvements in risk-based methods for quality control.
Methods
We provide an overview of a dynamic model for assay behavior. We demonstrate the practical application of the model using simulation and compare the performance of simple Shewhart QC monitoring against Westgard rules. We also demonstrate the utility of trade-off curves for analysis of QC performance.
Results
Westgard rules outperform simple Shewhart control over a narrow range of the trade-off curve of false-positive and false-negative risk. The trade-off can be visualized in terms of risk alone, cost vs. risk, or cost alone. Risk trade-off curves can be “smoothed” by log transformation.
Conclusions
Dynamic risk models may provide advantages relative to static models for risk-based QC analysis.
Introduction
Clinical laboratories perform over 7 billion tests per year, the results of which affect patient care decisions, so it is imperative that the results of these tests are as accurate as possible [1]. Unfortunately, all test results suffer some error (systematic error, measurement error, or both), and clinical laboratories devote considerable resources to quality control (QC) programs to minimize the impact of errors. Although the cost of quality (COQ) is a widely used concept, estimates of the COQ are rarely published; a survey of COQ in business settings found that it ranged from 2 to 35% of sales revenue [2]. The components of the COQ in the clinical laboratory setting have been described in detail, and one study estimated that the COQ represented 32% of clinical laboratory costs [3, 4]. Research has also demonstrated that changes in QC monitoring policies can lead to substantial cost savings [5]. As a result, there is an incentive to find ways to improve the cost effectiveness of QC monitoring.
QC costs result from two types of events: (1) false positive (FP) and (2) false negative (FN) events. QC systems are designed to detect the state of the system with respect to the presence of systematic errors. The system is said to be in control (IC) when there is no systematic error and out of control (OOC) when a systematic error is present. A QC monitoring system defines rules for classifying the system as IC or OOC, depending on the observed QC results. An FP event occurs when the system misclassifies the system as OOC (i.e., the rules raise a flag) when the system is, in fact, IC. An FN event occurs when the system is OOC but the monitoring system fails to detect the systematic error. FP events incur costs associated with unnecessary troubleshooting, repeat testing, and recalibration. FN events incur costs associated with the impact of suboptimal medical care. Recent approaches to QC have sought to balance these two costs using risk-based approaches [6], [7], [8], [9].
The Parvin model is one of the most commonly used risk-based approaches [10]. This model is based on the expected number of unacceptable final results, E[Nuf]. The model assumes that once the system makes a transition from the IC state to an OOC state, it remains in that state until the error is detected; we refer to this as the “no OOC transitions allowed” (NOOCTA) assumption.
The NOOCTA assumption is questionable, as it requires one to believe that the IC state is somehow special and distinct from the OOC states with respect to transition behavior. If the assay can make transitions away from the IC state to an OOC state, why is the system incapable of making further transitions once it has arrived at an OOC state? We investigated the impact of this assumption and found that the NOOCTA assumption led to unexpected behavior [11]. We would expect that the risk of FN events would increase with the width of the control limits and that the risk of FP events would increase as the width of the control limits decreases. Overall, one would expect a monotonically decreasing trade-off curve between FP risk and FN risk, parameterized by the width of the control limits. We found that the NOOCTA assumption led to non-monotonic trade-off curves that had a maximum at specific control limits. In contrast, the trade-off curves had the expected monotonically decreasing shape when the NOOCTA assumption was relaxed to allow for transitions between OOC states (OOCTA) [11]. These findings led us to explore alternative approaches to the risk-based analysis of QC. To that end, we developed a new approach based on a dynamic model of assay behavior [12, 13]. We refer to this model as Precision QC (PQC) because, similar to precision medicine, it is flexible and can be adapted to a wide range of QC monitoring methods, QC behaviors, and clinical scenarios.
The objective of this paper is to provide an overview of the PQC model and demonstrate its practical applicability.
Materials and methods
Overview of the precision method
Definition of risk
We begin with a formal definition of risk that will be applied to the case of a clinical assay. Risk is defined as the expected loss over a set of potential events:

Risk = Σ_E P(E) · L(E)

where E designates an event, P(E) the probability of the event, and L(E) the loss associated with the event.
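As a minimal numerical illustration of this definition, risk is the probability-weighted sum of losses over the possible events. The probabilities and dollar losses below are hypothetical, chosen only to show the arithmetic:

```python
# Risk = sum over events E of P(E) * L(E).
# The probabilities and losses are hypothetical, for illustration only.
events = {
    "false_positive": {"prob": 0.02, "loss": 50.0},    # e.g., unnecessary troubleshooting
    "false_negative": {"prob": 0.001, "loss": 500.0},  # e.g., harm from an inaccurate result
}

risk = sum(e["prob"] * e["loss"] for e in events.values())
print(risk)  # 0.02*50 + 0.001*500 = 1.5
```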
Mathematical model for probabilities
We view an assay as a dynamic system that evolves from state to state over time. The state of the system corresponds to the mean of the QC observations μ. We assume that samples are produced in batches of size b. We further assume that the mean remains constant during any batch but may change between batches. The observed QC values are given by the following:
x_n = μ_n + ε_n

where the index n designates the batch number, μ_n is the mean during batch n, and ε_n is normally distributed random error. The mean μ_n evolves over time as shifts occur.

Shifts are described by a probability distribution over the shift size, and a shift occurs with probability p in any given batch.

The system can be envisioned as a random walk in which the system takes steps of a size drawn from the shift distribution, at times governed by the shift probability p.
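The random walk described above can be sketched in a few lines of code. The parameterization below is hypothetical (shift probability p = 0.01 per batch, shift sizes uniform on [−2, 2] SD, batch size b = 10), not the model fitted to any real assay:

```python
import random

def simulate_means(n_batches, p_shift=0.01, max_shift=2.0, seed=1):
    """Random walk of the assay mean: with probability p_shift per batch,
    the mean takes a step of size drawn uniformly from [-max_shift, max_shift]."""
    random.seed(seed)
    mu, means = 0.0, []
    for _ in range(n_batches):
        if random.random() < p_shift:
            mu += random.uniform(-max_shift, max_shift)
        means.append(mu)
    return means

def observe_batch(mu, b=10, sd=1.0):
    """Observed QC values for one batch: x_n = mu_n + normal random error."""
    return [mu + random.gauss(0.0, sd) for _ in range(b)]

means = simulate_means(1000)
qc_values = observe_batch(means[-1])
```

The mean stays constant within each batch; only the observed QC values carry random error.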
The shift frequency and the shift size distribution characterize the stability or transition behavior of the assay as it evolves from state to state. The transition behavior can be summarized using a transition matrix. The rows represent the starting state, and the columns represent the destination state. Each entry of the matrix gives the conditional probability of moving to each destination, given that the system is in a particular starting state:
P_ij = Pr(state j at batch n+1 | state i at batch n)

The transition matrix provides a complete description of the batch-to-batch dynamics of the system.
The system is monitored by a QC monitoring plan described by a power curve (a power curve gives the probability of detecting an OOC state). The model is flexible and can accept any QC monitoring plan, such as a simple Shewhart control chart, Westgard rules, cumulative sum (CUSUM), or exponentially weighted moving average (EWMA). The probabilities in the transition matrix are determined by two factors: the intrinsic stability of the assay (shift frequency and shift distribution) and the influence of the QC monitoring plan [12]. The probability of moving from the IC state to an OOC state is determined by assay stability. Given a shift, the assay stability (shift distribution) also determines the likelihood of each OOC state. Once in an OOC state, the system can move to a new state either by detection or by making another shift. If the probability of detection is low (e.g., a state with a small systematic error), the system is likely to remain in the state for many batches. However, if the probability of detection is high (e.g., a state with a large systematic error), the system will only occupy such a state for a short period before the systematic error is detected.
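For the simplest case, a Shewhart chart with a single QC observation per batch and control limits at ±k SD, the power curve has a closed form under normal error. The sketch below assumes this setting; k and the shift size are expressed in SD units:

```python
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def shewhart_power(shift, k):
    """Probability that one QC observation falls outside +/- k SD control
    limits when the assay mean has shifted by `shift` SD units.
    At shift = 0 this is the per-batch false-positive rate of the rule."""
    return 1.0 - (norm_cdf(k - shift) - norm_cdf(-k - shift))

print(shewhart_power(0.0, 3.0))  # ~0.0027: the usual 3-SD false-alarm rate
print(shewhart_power(2.0, 3.0))  # ~0.16: probability of detecting a 2-SD shift
```

Large shifts are detected quickly (high power), so the system occupies those states briefly; small shifts have low power and can persist for many batches, as described above.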
Loss functions
Each QC monitoring plan has criteria for classifying QC results as positive or negative, and each QC observation is evaluated and classified accordingly. Losses are associated with incorrect classifications. A TN result occurs when the system is in the IC state and the result is classified as negative. A TP result occurs when the observation is classified as positive and the system is in an OOC state. Conversely, an FP result occurs when the system is IC but the observation is classified as positive, and an FN result occurs when the system is OOC but the observation is classified as negative.
FP results create costs for the laboratory. For example, each QC flag leads to costs associated with troubleshooting, reagent use, downtime, and repeat analyses. FN results create costs for patients due to the harm associated with inaccurate results. In principle, there should be a willingness to pay (WTP) to avoid erroneous results. There is a trade-off between the rate of FP and FN results. FP results increase when the QC monitoring policy is stringent (e.g., narrow control limits on a Shewhart control chart) and decrease when the policy is less stringent. FN results decrease when QC monitoring is stringent but increase when QC monitoring is less stringent. Given these trade-offs, the selection of parameters for a QC monitoring process can be viewed as an optimization problem that seeks to find the optimum balance between costs to the laboratory (FP risk) and costs to patients (FN risk).
We used an incremental costing approach, as is commonly used in cost-effectiveness analysis [16]. In such an approach, costs are assigned relative to a baseline: in our model, FP costs represent the incremental cost relative to TP results, and FN costs represent the incremental cost relative to TN results.
Solution methods
There are two approaches to studying the behavior of a dynamic system: (1) simulations and (2) analytic methods. Simulations are straightforward, but depending on the required accuracy, they can require significant computing time. Simulation studies can take days to obtain precise estimates of low-probability events. Analytic methods are more complex but provide exact results relatively quickly. We used both approaches in previous studies [11], [12], [13]. In this study, we used simulations to compare the performance of a simple Shewhart control chart and Westgard rules.
Simulation study design
We obtained trade-off curves for simple Shewhart control plans and compared these to the trade-off points obtained using Westgard rules [17, 18]. Trade-off curves for Shewhart control were obtained by varying the control limits k ∈ {1.5, 2.0, …, 5}. Trade-off points for the Westgard method were obtained by running simulations for the following rule sets: 1_3s; 1_3s/2_2s; 1_3s/2_2s/4_1s; 1_3s/2_2s/4_1s/R_4s; and 1_3s/2_2s/4_1s/R_4s/10_x [19]. The total allowable error (TEA) was varied from 2 to 5, TEA ∈ {2.0, 2.5, 3, …, 5}. We assumed a uniform shift distribution.
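The design above can be illustrated with a stripped-down simulation of the Shewhart case. The sketch uses one QC observation per batch, a hypothetical shift probability of 0.01, a uniform shift distribution on [−4, 4] SD, and assumes any flag restores the IC state; it is a simplified stand-in for the study's simulation, not a reproduction of it:

```python
import random

def shewhart_tradeoff(k, p_shift=0.01, n_batches=200_000, seed=42):
    """Estimate per-batch FP and FN risk for a Shewhart rule with +/- k SD
    limits under a simplified dynamic model: the mean shifts with probability
    p_shift per batch (shift size uniform on [-4, 4] SD), one QC observation
    is taken per batch, and any flag restores the in-control state.
    All parameter values here are illustrative."""
    random.seed(seed)
    mu, fp, fn = 0.0, 0, 0
    for _ in range(n_batches):
        if random.random() < p_shift:
            mu += random.uniform(-4.0, 4.0)
        x = mu + random.gauss(0.0, 1.0)
        flagged = abs(x) > k
        if flagged and mu == 0.0:
            fp += 1          # flag raised while in control
        elif not flagged and mu != 0.0:
            fn += 1          # systematic error present but not detected
        if flagged:
            mu = 0.0         # troubleshooting restores the IC state
    return fp / n_batches, fn / n_batches

for k in (1.5, 2.0, 2.5, 3.0):
    fp_risk, fn_risk = shewhart_tradeoff(k)
    print(f"k={k}: FP risk={fp_risk:.4f}, FN risk={fn_risk:.4f}")
```

Sweeping k traces out one point of the trade-off curve per control limit: narrow limits raise FP risk and lower FN risk, and vice versa.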
Trade-off visualizations
Trade-off curves can be visualized in several ways. We used three methods: (1) FP risk vs. FN risk, (2) FP cost vs. FN risk, and (3) total cost vs. control limit. We converted FP risk to FP cost by assuming a cost of $50 per FP event. We converted FN risk to FN cost by assuming a willingness to pay $5 to avoid an unacceptable result.
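Putting the conversion assumptions above together ($50 per FP event, $5 willingness to pay per unacceptable result), a trade-off point can be expressed in dollars. The batch size b and the risk values in the example are illustrative assumptions:

```python
FP_COST = 50.0  # dollars per false-positive QC event (assumed in the text)
WTP = 5.0       # dollars of willingness to pay to avoid one unacceptable result

def total_cost_per_batch(fp_risk, e_nuf, b=50):
    """Total expected QC cost per batch in dollars.
    fp_risk: probability of an FP event per batch.
    e_nuf:   expected number of unacceptable final results per patient sample.
    b:       batch size (an illustrative assumption)."""
    fp_cost = fp_risk * FP_COST
    fn_cost = e_nuf * b * WTP  # E[Nuf] * b * WTP
    return fp_cost + fn_cost

# An FP risk of 0.02/batch costs 0.02 * $50 = $1 per batch; a hypothetical
# E[Nuf] of 0.001 with b = 50 contributes 0.001 * 50 * $5 = $0.25.
print(total_cost_per_batch(0.02, 0.001))  # 1.25
```

Evaluating this total along the trade-off curve and selecting the control limit that minimizes it corresponds to the third visualization (total cost vs. control limit).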
Results
We generated curves showing the trade-off between FP risk and FN risk. As expected, the curve shows a monotonically decreasing relationship between FP risk and FN risk (Figure 2). Wide control limits (e.g., k=4) were found to be associated with low FP risk and high FN risk. Narrow control limits (e.g., k=1.5) were associated with high FP risk and low FN risk.
Trade-off curves can be visualized differently (Figure 3). If one can determine the cost of an FP event, one can plot the FP cost as a function of FN risk. Similarly, if one can estimate the cost of an FN event, one can plot the total cost as a function of the control limits to determine the optimum (i.e., cost-minimizing) control limit.
Trade-off curves are easier to visualize and compare when they are log transformed. For example, the log-transformed curves show the impact of TEA on risk (Figure 4) or the impact of shift probability (Figure 5). TEA has no impact on FP risk, and given a fixed control limit, FN risk decreases as TEA increases. Similarly, shift probability has little impact on FP risk; however, FN risk increases as shift probability increases.
We used log-transformed trade-off curves to compare the performance of simple Shewhart control rules against Westgard control rules (Figure 6). Westgard rules incurred a slightly lower FP risk than Shewhart control rules for a given level of FN risk when TEA was low (i.e., TEA=2). This advantage disappeared as the TEA increased to 4.
Discussion
This paper provides an overview of the PQC model. The PQC model is a risk-based model that views an assay as a dynamic system that evolves through various states over time. The system starts in the IC state and, at some point, moves to an OOC state. The system remains in an OOC state until the QC monitoring system raises a detection signal, after which the system is restored to the IC state and the cycle repeats itself. The overall behavior of the system is determined by the proportion of time spent in each state (a state corresponds to a level of systematic error), which determines the risk. One of the key features of the PQC model is that it explicitly includes the shift probability p, which links the rates of FP and FN events. This feature makes it possible to construct trade-off curves between FP risk and FN risk, which characterize the performance of the QC monitoring system. Previous methods, such as the Parvin method, focused on a single OOC event, whereas the system actually evolves through repetitive cycles of moving from IC to OOC and back to IC.
We showed three different ways of visualizing the trade-off curves generated by the PQC method: (1) FP risk vs. FN risk, (2) FP cost vs. FN risk, and (3) total cost vs. control limits. Using Method 1, one would follow the trade-off curve, decreasing the FN risk and increasing the FP risk until a point is reached at which the FP risk is unacceptable. One would select the control limit corresponding to this point (the trade-off curve is parameterized by the control limit k). This method has the advantage of requiring no additional transformation of risk, but we suspect that most people would have difficulty making trade-offs in terms of risk. In Method 2, FP risk is transformed into the expected cost by multiplying the FP risk by the cost of an FP event. For example, if the FP risk is 0.02 per batch and the cost of an FP event is $50, the expected cost is $1 per batch. One can then use the trade-off curve to find a point at which the FN risk corresponds to an acceptable FP cost. This method has the advantage that the cost of an FP event is relatively easy to estimate, and it is easier to make a trade-off between cost and FN risk. In the last approach, both FP and FN risks are converted to dollar terms. To convert FN risk to dollar terms, one needs to estimate the WTP to avoid an FN event. Then, the expected cost of FNs is E[Nuf]*b*WTP, where b is the batch size. This approach results in a curve that shows the optimum (i.e., cost-minimizing) control limit. The problem is that it is difficult to estimate WTP. WTP is commonly applied in health economics and cost-effectiveness studies, but to our knowledge, it has not been applied in clinical laboratory testing. We believe that the concept has utility, even if it is not possible to obtain precise estimates of WTP.
Even without precise estimates, the general approach indicates that control limits should vary by WTP, and while one might not be able to specify an absolute WTP, one might be able to specify a relative WTP, or at least an ordinal WTP, which could help in setting control limits. For example, one might suppose that the WTP for troponin is greater than the WTP for creatinine; all else being equal, the control limits for troponin should then be narrower than the control limits for creatinine. Overall, we suspect that Method 2 might be the easiest to apply in practice.
We showed that log transformation of the trade-off curves yields curves that can be compared more easily than curves presented in native units. This allows one to show the impact of various parameters, such as the TEA, on the performance of a QC monitoring system. We also showed how the transformed trade-off curves can be used to compare the performance of different QC monitoring methods. In this study, we compared the performance of simple Shewhart control to Westgard control rules and found that Westgard rules were superior to Shewhart rules in the limited region (FP risk vs. FN risk) where Westgard rules are applicable. This result was somewhat surprising because some texts have advised against the use of “sensitizing rules” on the grounds that such rules increase FP risk [18]. We found that Westgard rules fell below the trade-off curve for simple Shewhart control (i.e., for any level of FN risk, the FP risk of Westgard rules was less than the FP risk of Shewhart control) when TEA was less than 3. Our study was based on assumptions such as a threshold loss function and a uniform shift distribution, so these conclusions may not hold more generally.
We limited our analysis to shifts in the systematic error; however, our model can easily accommodate shifts in both systematic error and random error [12]. In our model, states are represented by probability distributions, which allows for complete generality. We limited the analysis to shifts in systematic error in order to limit the complexity and because, to our knowledge, most analyses of QC performance focus on systematic error. Finally, very little is known about shifts in random error, so it is difficult to specify a model for them; for that reason, we assumed that random error was normally distributed and stationary.
In practice, the selection of multirules is based on assay performance [19]. Our analysis shows the performance of a QC policy (e.g., QC limit for a Shewhart control chart, set of Westgard rules) conditioned on the assay performance (TEA, transition matrix). The analysis is independent of the method used to select the rules.
The PQC method requires four inputs: (1) shift probability, (2) shift distribution, (3) power curve, and (4) loss function [12]. Based on these inputs, the PQC method determines the trade-off between FP risk and FN risk, and by evaluating this trade-off, one can theoretically obtain optimal control limits. Our analysis suggests that setting control limits implies assumptions regarding these inputs. The inputs may be difficult to estimate, but the PQC method makes the assumptions explicit. A recent study showed that most laboratories use 2 SD QC limits across all assays [20]. Risk-based analysis has shown that this one-size-fits-all approach to QC leads to suboptimal QC monitoring. Risk-based approaches, such as PQC, can provide guidance when selecting and optimizing QC monitoring methods.
We recognize that it may be difficult to estimate some of the parameters of our model. For example, it may be challenging to estimate the cost of a false negative. However, it is known that FN results incur costs, and unlike other models, our model makes these costs explicit. That way, researchers can test whether FN costs are important and assess the implications of various assumptions. In contrast, other approaches assign FN costs implicitly through the choice of a QC policy.
We used simulation to obtain the results in this paper; however, we have also developed methods that produce analytical solutions for our model [12]. The analytical approach has the advantage that it produces exact solutions and eliminates the noise associated with simulation estimates. This is particularly important for simulations of low-probability events (e.g., QC for an assay with six-sigma capability). In addition, mathematical analysis enables one to prove that results are correct. We used simulation in this analysis to simplify the paper: the methods for exact results are complicated, and simulation methods are commonly used in the laboratory medicine literature.
Our model is not intended to be used directly. The mathematical calculations are complex and would most likely be carried out by a decision support system similar to Bio-Rad’s Mission Control. Thus, much like using Google, users would be shielded from the complexity of the underlying algorithms. Also, we should state that this paper is not directed toward practitioners. Rather, it is directed toward researchers with expertise in the mathematical analysis of QC methods who can critically evaluate our approach and, hopefully, improve upon it.
We believe the proposed PQC method provides a promising approach for the risk-based analysis of QC; however, there is a need for further development. For example, we are currently working on a method to estimate shift probability and shift size distributions and to determine the sensitivity of the model to errors in these estimates. There is also a need to explore other loss functions. Risk analysis has focused on E[Nuf]; however, there are many possible loss functions, and these could be tailored to specific assays. We are currently working to determine the sensitivity of risk analysis to assumptions regarding loss functions. Finally, we hope to develop a clinical decision support system that would suggest control limits based on the four inputs and enable lab personnel to explore the impact of their assumptions on the suggested control limits.
Research funding: None declared.

Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.

Competing interests: Authors state no conflict of interest.

Informed consent: Not applicable.

Ethical approval: Not applicable.
References
1. Wolcott, J, Schwartz, A, Goodman, C. Laboratory medicine: a national status report. Virginia, United States: The Lewin Group; 2008.
2. Schiffauerova, A, Thomson, V. A review of research on cost of quality models and best practices. Int J Qual Reliab Manag 2006;23:647–69. https://doi.org/10.1108/02656710610672470.
3. CLSI. Understanding the cost of quality in the laboratory: a report. Wayne, PA: Clinical and Laboratory Standards Institute; 2014.
4. Elbireer, A, Gable, AR, Jackson, JB. Cost of quality at a clinical laboratory in a resource-limited country. Lab Med 2010;41:429–33. https://doi.org/10.1309/lmcz0zfr80qwibem.
5. Katayev, A, Fleming, JK. Past, present, and future of laboratory quality control: patient-based real-time quality control or when getting more quality at less cost is not wishful thinking. J Lab Precis Med 2020;5:10–21037. https://doi.org/10.21037/jlpm-2019-qc-03.
6. Bayat, H. Selecting multi-rule quality control procedures based on patient risk. Clin Chem Lab Med 2017;55:1702–8. https://doi.org/10.1515/cclm-2016-1077.
7. Parvin, CA. Planning statistical quality control to minimize patient risk: it’s about time. Clin Chem 2018;64:249–50. https://doi.org/10.1373/clinchem.2017.282038.
8. Parvin, CA, Baumann, NA. Assessing quality control strategies for HbA1c measurements from a patient risk perspective. J Diabetes Sci Technol 2018;12:786–91. https://doi.org/10.1177/1932296818758768.
9. Westgard, JO, Bayat, H, Westgard, SA. Planning risk-based SQC schedules for bracketed operation of continuous production analyzers. Clin Chem 2018;64:289–96. https://doi.org/10.1373/clinchem.2017.278291.
10. Parvin, CA, Gronowski, AM. Effect of analytical run length on quality-control (QC) performance and the QC planning process. Clin Chem 1997;43:2149–54. https://doi.org/10.1093/clinchem/43.11.2149.
11. Schmidt, R, Moore, R, Walker, B, Rudolf, J. Risk analysis for quality control part 1: the impact of transition assumptions in the Parvin model. J Appl Lab Med 2023 (in press). https://doi.org/10.1093/jalm/jfac117.
12. Moore, R, Rudolf, J, Schmidt, R. Risk analysis for quality control part 2: theoretical foundations for risk analysis. J Appl Lab Med 2023 (in press). https://doi.org/10.1093/jalm/jfac106.
13. Schmidt, R, Moore, R, Walker, B, Rudolf, J. Risk analysis for quality control, part 3: practical application of the precision quality control model. J Appl Lab Med 2023 (in press). https://doi.org/10.1093/jalm/jfac116.
14. Yoe, C. Principles of risk analysis: decision making under uncertainty. New York, NY: CRC Press; 2019. https://doi.org/10.1201/9780429021121.
15. Haimes, YY. Risk modeling, assessment, and management. Hoboken, NJ: John Wiley & Sons; 2005. https://doi.org/10.1002/0471723908.
16. Detsky, AS, Naglie, IG. A clinician’s guide to cost-effectiveness analysis. Ann Intern Med 1990;113:147–54. https://doi.org/10.7326/0003-4819-113-2-147.
17. Westgard, JO, Barry, PL. Basic QC practices: training in statistical quality control for medical laboratories. Madison, WI: Westgard QC; 2010.
18. Montgomery, DC. Introduction to statistical quality control. Hoboken, NJ: John Wiley & Sons; 2019.
19. Westgard, JO. Statistical quality control procedures. Clin Lab Med 2013;33:111–24. https://doi.org/10.1016/j.cll.2012.10.004.
20. Rosenbaum, MW, Flood, JG, Melanson, SE, Baumann, NA, Marzinke, MA, Rai, AJ, et al. Quality control practices for chemistry and immunochemistry in a cohort of 21 large academic medical centers. Am J Clin Pathol 2018;150:96–104. https://doi.org/10.1093/ajcp/aqy033.
© 2022 Walter de Gruyter GmbH, Berlin/Boston