Statistical Process Control

Mohammed Amin Mohammed

doi:10.1017/9781009326834

1 Introduction

This Element begins with an intuitive illustration of the two types of variation that underlie statistical process control methodology: ‘common cause’ variation, inherent in the process, and ‘special cause’ variation, which operates outside of that process. It then briefly describes the history, theory, and rationale of statistical process control methodology, before examining its use to monitor and improve the quality of healthcare through a series of case studies. The Element concludes by considering critiques of the methodology in healthcare and reflecting on its future role.

The statistical details for constructing the scores of charts found in statistical process control methodology are beyond the scope of this Element, but technical guides are signposted in Further Reading (see Section 6) and listed in the References section.

1.1 Understanding Variation by Handwriting the Letter ‘a’

In this section, we use the process of handwriting to demonstrate the ideas that underpin statistical process control methodology. Imagine writing the letter or signature ‘a’ by hand using pen and paper. Figure 1a shows seven ‘a’s written by the author. While the seven letters appear unremarkable, what is perhaps remarkable is that even though they were produced under the same conditions (same hand, date, time, place, pen, paper, temperature, light, blood pressure, heart rate, and so on) by the same process, they are not identical – rather, they show controlled variation. In other words, even a stable process produces variation or ‘noise’.

(a)

The figure shows a row of eight handwritten lower-case letters ‘a’ on lined writing paper. None of the letters look identical, but the first seven have an overall visual consistency although they vary in formation and positioning between the faintly ruled lines. The eighth letter looks obviously different from the others in shape, formation, and positioning.

(b)

Figure 1 Handwritten letter ‘a’

In seeking to understand this controlled variation, it might be tempting to separate the ‘a’s into better and worse and try to learn from the best and eliminate the worst. This would be a fundamental mistake, since the conditions that produced them were the same, and so no ‘a’ is better or worse than its peers. The total variation seen in the seven ‘a’s has a common cause, which is inherent in the underlying process. Efforts to improve the quality of the letters need to focus on changing that process, not on trying to learn from the differences between the letters.

What changes could we make to the underlying process to reduce the variation and improve the quality of the ‘a’s? We could change the pen, paper, or surface, or we could use a computer instead. Of these suggestions, we might guess that using a computer will result in marked improvements to our ‘a’s. Why? We can draw useful insight from the theory of constraints, which compares processes to a chain with multiple links.Reference Cox and Schleier¹ The strength of a chain is governed or constrained by its weakest link. Strengthen the weakest link and the chain improves. Strengthening other links simply uses up resources with no benefit. In the handwriting process, the weakest link (constraint) is the use of the hand to write the letter. The pen, paper, light, and so on are non-constraints; if we change one of them, we will not make a material difference to the quality of our ‘a’s. Switching to a computer to produce our ‘a’s, however, will see a marked improvement in performance because we would have overcome the weakest link or process constraint (handwriting). So, a stable process produces results characterised by controlled variation that has a common cause, which can only be reduced by successfully changing a major portion of the underlying process.

Now consider the ‘a’ in Figure 1b. It is obviously different from the others. A casual look suggests that there must be a special cause. In this case, the author produced the letter using his non-dominant (left) hand. When we see special cause variation, we need to find the underlying special cause and then decide how to act. Special cause variation requires detective work, and, if the special cause is having an adverse impact on our process, we must work towards eliminating it from the process. But if the special cause is having a favourable impact on our process, we can work towards learning from it and making it part of our process (see the Elements on positive devianceReference Baxter, Lawton, Dixon-Woods, Brown and Marjanovic² and the Institute for Healthcare Improvement approachReference Boaden, Furnival, Sharp, Dixon-Woods, Brown and Marjanovic³).

In summary, the handwritten ‘a’s demonstrate two types of variation – common cause and special cause – and the action required to address each type of cause is fundamentally different. The origins of this profound understanding of variation are described in the next section.

1.2 A Brief History of Statistical Process Control Methodology

This understanding of variation – which underpins statistical process control methodology – comes from the physicist and engineer Walter Shewhart.Reference Shewhart⁴ His pioneering work in the 1920s at Bell Laboratories in Murray Hill, New Jersey, successfully brought together the disciplines of statistics, engineering, and economics and led to him becoming known as the ‘father of modern quality control’.Reference Shewhart⁵

Shewhart noted that the quality of a product is characterised by the extent to which the product meets the target specification, but with minimum variation. A key insight was his identification of two causes of variation:

common cause variation, which is the ‘noise’ intrinsic to the underlying process
special cause variation, which ‘signals’ an external cause.

This distinction is crucial: reduction of common cause variation needs action to change the process, whereas special cause variation needs identification of the external cause before it can be addressed.

Shewhart developed a theory of variation which classified variation according to the action required to address it, turning his abstract concept into something that can be measured in the form of statistical process control methodology. The methodology has proven to be very useful in efforts to improve the quality of manufactured products. Its migration to healthcare appears to have happened initially via applications to quality control in laboratory medicine in the 1950s.Reference Karkalousos and Evangelopoulos⁶ Since the 1980s, the use of these methods has continued to expand, especially in monitoring individual patients,Reference Tennant, Mohammed, Coleman and Martin⁷ for example following kidney transplantation,Reference Piccoli, Rizzoni and Tessarin⁸ for asthmatic patients,Reference Gibson, Wlodarczyk and Hensley⁹ and for patients with high blood pressure.Reference Solodky, Chen, Jones, Katcher and Neuhauser¹⁰ Statistical process control is now used across a wide range of areas in healthcare, including the monitoring and improvement of performance in hospitals and primary care, monitoring surgical outcomes, public health surveillance, and the learning curve of trainees undertaking medical or surgical procedures.Reference Suman and Prajapati¹¹–Reference Bolsin and Colson¹⁴

2 What Is Statistical Process Control Methodology?

Statistical process control methodology offers a philosophy and framework for learning from variation in data for analytical purposes where the aim is to act on the underlying causes of variation to maintain or improve the future performance of a process. It is used in two main ways:

to monitor the behaviour or performance of an existing process (e.g. complications following surgery), or
to support efforts to improve an existing process (e.g. redesigning the pathway for patients with fractured hips).

By adopting this methodology, the user is going through the hypothesis-generation and testing cycle of the scientific method, as illustrated by the plan-do-study-act (PDSA) cycle (see the Element on the Institute for Healthcare Improvement approachReference Boaden, Furnival, Sharp, Dixon-Woods, Brown and Marjanovic³), supported by statistical thinking to distinguish between common and special cause variation.

Box 1 highlights various descriptions and features of common versus special cause variation. In practice, a graphical device – known as a statistical process control chart – is used to distinguish between common and special cause variation. In the next section, we look at the three main types of statistical process control charts commonly used in healthcare.

Box 1 Features of common versus special cause variation

Common Cause Variation

Is caused by a stable process (like writing a signature)
Is sometimes referred to as random variation, chance variation, or noise
Depicts the behaviour of a stable process and affects all those who are part of the process
Can only be reduced (but not eliminated) by changing the underlying process
Can be predicted, within limits, with the aid of a statistical process control chart
The variation between individual data points from a stable process has no assignable cause extrinsic to the underlying process

Special Cause Variation

Is variation which is extrinsic to a stable process arising from an assignable cause
Can be favourable or unfavourable
Does not affect all those who are part of the process
Is a distinct signal which differs from the usual noise of the process
Is sometimes referred to as non-random variation or a signal of systematic variation
Signals of special cause variation can be seen on a control chart but need further detective work to identify the assignable cause

2.1 The Statistical Process Control Chart

Statistical process control methodology typically involves the production of a statistical process control chart (also known as a process behaviour chart) that depicts the behaviour of a process over time and acts as a decision aid to determine the extent to which the process is showing common or special cause variation. Scores of control charts exist,Reference Provost and Murray¹⁵ but three main types have been used successfully in healthcare:

run chartsReference Perla, Provost and Murray¹⁶
Shewhart control chartsReference Mohammed, Worthington and Woodall¹⁷
cumulative sum (CUSUM) charts.Reference Noyez¹⁸

This section introduces the three main types of charts by using systolic blood pressure data from a patient with high blood pressure (taken over 26 consecutive days at home before starting any corrective medication). Figure 2 shows the blood pressure data over time using a run chart, a Shewhart control chart, and a CUSUM chart.

The run chart (top panel) shows the behaviour of the blood pressure data over time around a central horizontal line.
The Shewhart chart (middle panel) shows the same data around a central line along with upper and lower control limits.
The CUSUM chart (bottom panel) doesn’t show the raw data, but instead shows the differences between the raw blood pressure data and a selected target, accumulated over time.

The blood pressure varies up and down, between 142 at the lowest and 194 at the highest. Under the readings, three types of control charts show the blood pressure data over time. The top panel is a run chart showing the blood pressure data over time around a central horizontal line. The blood pressure dips down and rises up over time, so that the data is plotted as a zigzag with the central horizontal line at its midpoint. The middle panel is a Shewhart control chart, which shows the same data around a central line along with upper and lower control limits. The bottom panel is a cumulative sum chart. Instead of showing the raw data, it shows cumulative deviations of the blood pressure above and below a central zone.

Figure 2 Three types of control charts based on the blood pressure readings of a hypertensive patient. In part b, the top panel is a run chart, the middle panel is a Shewhart control chart, and the bottom panel is a cumulative sum chart

As Figure 2 demonstrates, several charts can usually be used to examine the variation in a given data set. In general, run charts are the simplest to construct and CUSUM charts are the most complex. This highlights an important point: although several (appropriate) chart options are usually available to choose from, there is usually no single best chart for a given data set. The ideal is to consider multiple charts, but in practice people may lack the time, skill, or inclination to do so – and may opt for a single chart that suits their circumstances.

We will now consider each of the three charts in Figure 2 in more detail.

2.1.1 The Run Chart

The first chart (Figure 2, top panel) is known as a run chart.Reference Perla, Provost and Murray¹⁶ The simplest form of a chart, the run chart plots the data over time with a single central line that represents the median value.

The median is a midpoint value (=174) that separates the blood pressure data into an upper and lower half. This is useful because, in the long run, the output of a stable process should appear above the median half the time and below the median the other half of the time. For example, tossing a coin would, in the long term, show heads (or tails) half the time. On a run chart, the output of a stable process will appear to bounce around the central line without unusual, non-random patterns.

The appearance of unusual (non-random) patterns would signal the presence of special cause variation. A run of six or more consecutive points above (or below) the median constitutes an unusual run, because the probability of this happening by random chance alone is less than 2% (=0.5^6) – for example the equivalent of tossing a coin and getting six heads in a row.
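The reasoning above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any cited guide: the function names and the readings are hypothetical, and we assume that a point lying exactly on the median neither extends nor breaks a run.

```python
from statistics import median


def run_probability(length):
    """Chance that `length` consecutive points from a stable process all fall
    on one given side of the median, using the coin-toss model from the text."""
    return 0.5 ** length


def find_shifts(values, min_run=6):
    """Return the median and a list of (start, end) index pairs for runs of at
    least `min_run` consecutive points on the same side of the median."""
    centre = median(values)
    shifts = []
    run_start, run_side, run_len, last_idx = None, 0, 0, None
    for i, v in enumerate(values):
        side = (v > centre) - (v < centre)  # +1 above, -1 below, 0 on the median
        if side == 0:
            continue  # assumption: points on the median are skipped
        if side == run_side:
            run_len += 1
        else:
            if run_len >= min_run:
                shifts.append((run_start, last_idx))
            run_start, run_side, run_len = i, side, 1
        last_idx = i
    if run_len >= min_run:
        shifts.append((run_start, last_idx))
    return centre, shifts


# Hypothetical systolic readings (not the patient data shown in Figure 2).
readings = [170, 176, 172, 178, 171, 177, 180, 182, 181, 183, 179, 184, 169, 168]
centre, shifts = find_shifts(readings)
print(run_probability(6))  # 0.015625, i.e. less than 2%
print(centre, shifts)      # 177.5 [(6, 11)]
```

In this made-up series, the six consecutive readings at positions 6 to 11 all sit above the median of 177.5, so they would be flagged as an unusual run.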

As illustrated by Figure 3, four commonly used rulesReference Perla, Provost and Murray¹⁶ may detect special causes of variation with run charts (although other rules have been suggestedReference Anhøj and Olesen¹⁹, Reference Anhøj²⁰).

Rule 1: A shift.
Rule 2: A trend.
Rule 3: Too few or too many runs above or below the median.
Rule 4: A data point judged by eye to be unusually large or small.Reference Perla, Provost and Murray¹⁶

The figure shows four run charts reflecting four rules for detecting special cause variation. The first chart shows Rule 1: a shift. A measure or characteristic on the y-axis is shown over time on the x-axis. The data zigzags with a run of six points circled highlighting the shift. The second chart shows Rule 2: a trend. Data is plotted over time. In this graph, two runs of six to eight plot points are highlighted where the data in both instances shows a trend. The third chart shows Rule 3: too few or too many runs above or below the median. The data is plotted across fewer points of measurement and shows a total of only two runs. The fourth chart shows Rule 4: a data point judged by eye to be unusually large or small. The data forms a zigzag with one point standing out as unusually high.

Figure 3 Four rules for detecting special cause variation on a run chart

Adapted from Perla et al.¹⁶

The run chart can be especially useful in the early stages of efforts to monitor or improve a process where there is not enough data available to reliably calculate the control limits.

When there is enough data (typically we need 20–25 data points), we can plot a Shewhart control chart (middle panel in Figure 2).Reference Provost and Murray¹⁵

2.1.2 Shewhart Control Charts

Like the run chart, the Shewhart control chart shows the blood pressure data over time, but now with three additional lines – an average central line, and lower and upper control limits – to help identify common and special cause variation.

Data points that appear within the control limits (without any unusual patterns) are deemed to be consistent with common cause variation. Signals of special cause variation are data points that appear outside the limits or unusual patterns within the limits.

Five rules are commonly used for detecting special cause variation in a Shewhart control chart (also shown in Figure 4, enclosed by an oval shape).Reference Provost and Murray¹⁵

Rule 1 identifies sudden changes in a process.
Rule 2 signals smaller but sustained changes in a process.
Rule 3 detects drift in a process.
Rule 4 identifies more subtle runs not picked up by the other rules.
Rule 5 identifies a process which has too little variation.

The figure shows a series of five Shewhart control charts, each demonstrating a commonly used rule for detecting special cause variation. The first chart shows two single points outside the control limits. The second chart highlights two runs of eight or more points in a row, one above and one below the centre line. The third chart shows six consecutive points decreasing (trend down) and six consecutive points increasing (trend up). The fourth chart has two lines delineating the outer one-third of the chart. Data points in ellipses show where two out of three consecutive points fall near a control limit. In the fifth chart, two additional lines delineate the inner one-third of the chart. The data points highlight 15 consecutive points close to the centre line, within the inner one-third of the chart.

Figure 4 Rules for detecting signals of special causes of variation on a Shewhart control chart. Signals of special cause variation are enclosed by an oval

Adapted from Provost et al.¹⁵
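Two of the patterns in Figure 4 can be checked mechanically. The sketch below covers only Rule 2 (eight or more points in a row on one side of the centre line) and Rule 3 (six consecutive points increasing or decreasing), with the thresholds taken from the figure. The function names and data are hypothetical, and we assume that a point on the centre line, or a pair of equal values, breaks the pattern.

```python
def longest_run_one_side(values, centre):
    """Length of the longest run of consecutive points strictly above,
    or strictly below, the centre line."""
    best = run = 0
    prev_side = 0
    for v in values:
        side = (v > centre) - (v < centre)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == prev_side:
            run += 1
        elif side != 0:
            run = 1
        else:
            run = 0  # assumption: a point on the centre line breaks the run
        prev_side = side
        best = max(best, run)
    return best


def longest_trend(values):
    """Number of points in the longest sequence of consecutive points that
    is strictly increasing or strictly decreasing."""
    if not values:
        return 0
    best = run = 1
    prev_dir = 0
    for a, b in zip(values, values[1:]):
        direction = (b > a) - (b < a)
        if direction != 0 and direction == prev_dir:
            run += 1
        elif direction != 0:
            run = 2  # a new trend starts with two points
        else:
            run = 1  # assumption: equal neighbours break the trend
        prev_dir = direction
        best = max(best, run)
    return best


def rule2_signal(values, centre, run_length=8):
    return longest_run_one_side(values, centre) >= run_length


def rule3_signal(values, points=6):
    return longest_trend(values) >= points


# Hypothetical data with a centre line of 10.
sustained = [12, 11, 13, 12, 14, 11, 12, 13, 9]
drifting = [10, 11, 12, 13, 14, 15, 9, 8]
print(rule2_signal(sustained, centre=10))  # True: eight points above the line
print(rule3_signal(drifting))              # True: six points rising in a row
```

Rules 1, 4, and 5 depend on the control limits, so they are not attempted here.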

In a Shewhart control chart, the central line is usually the mean/average value. The upper and lower control limits indicate how far the data from a process can deviate from the central line based on a statistical measure of spread known as the standard deviation. Typically, about

60%–70% of data from a stable process will lie within ± one standard deviation of the mean.
90%–98% of data points lie within ± two standard deviations of the mean.
99%–100% of data points lie within ± three standard deviations of the mean.

Upper and lower control limits are usually set at ± three standard deviations from the mean. Setting the control limits at ± three standard deviations from the mean will capture almost all the common cause variability from a stable process. In practice, it is not uncommon to see control charts with two and three standard deviation limits shown – usually as an aid to visualisation, but also as a reminder that a judgement has to be made about where to set the limits. That judgement needs to balance the cost of looking for special cause variation when it doesn’t exist against the cost of overlooking it when it does.Reference Shewhart⁴, Reference Provost and Murray¹⁵
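As a rough illustration of how such limits might be computed, the sketch below takes the mean of a baseline period as the central line and places the limits three sample standard deviations either side. The function names and numbers are hypothetical. Technical guides often estimate the spread in other ways (for example, from moving ranges), so this should be read as a simplification rather than a recommended recipe.

```python
from statistics import mean, stdev


def control_limits(baseline, n_sigma=3):
    """Return (lower limit, central line, upper limit) for a baseline period,
    using the mean and the sample standard deviation."""
    centre = mean(baseline)
    spread = stdev(baseline)
    return centre - n_sigma * spread, centre, centre + n_sigma * spread


def points_outside(values, lower, upper):
    """Indices of points falling outside the control limits."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical baseline from a stable process, then four new observations.
baseline = [10, 12, 11, 9, 10, 8, 11, 9, 10, 10]
lower, centre, upper = control_limits(baseline)
new_points = [11, 9, 15, 10]

print(f"centre={centre}, lower={lower:.2f}, upper={upper:.2f}")
print(points_outside(new_points, lower, upper))  # [2]: the value 15 is above the upper limit
```

Calling `control_limits(baseline, n_sigma=2)` gives the narrower two standard deviation limits mentioned above.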

It is important to understand that the variability in the data is what determines the width between the lower and upper control limits. For example, Figure 5 illustrates the impact of variability on the control limits. We see two randomly generated data sets (y1 and y2) with 100 numbers having the same mean (10) but different standard deviations (1 and 2, respectively). These data are shown with control limits set at ± three standard deviations from the mean. The increased variability in y2 is associated with wider control limits. Both processes are stable in that they show random variation, but the process on the right has greater variation and hence wider control limits.

The figure shows two control charts with randomly generated data sets with the same mean (10) but different standard deviations. Each data set has 100 points. The first chart is labelled y1 and the second y2. Each chart features 0 to 100 on the x-axis and 0 to 15 on the y-axis. Each chart has a horizontal line reflecting the same mean and control limits are set at plus or minus three standard deviations from the mean. Y1 control limits are shown by dotted horizontal lines at 7.5 and 12.5. Y2 control limits are shown by dotted horizontal lines at 4 and 16. The data points on both charts zigzag above and below the centre line, but y2 shows greater variability and hence wider control limits than y1.

Figure 5 Control charts for two simulated random processes with identical means (10) but the process on the right has twice the variability
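A simulation along the lines of Figure 5 can be sketched as follows. The seeds, function names, and the use of normally distributed numbers are our assumptions, so the values will not match the figure. The point is only that the series with the larger standard deviation ends up with wider limits.

```python
import random
from statistics import mean, stdev


def simulate(mu, sigma, n, seed):
    """Draw n values from a normal distribution (assumed for illustration)."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n)]


def three_sigma_limits(values):
    """Lower and upper limits at three sample standard deviations from the mean."""
    centre = mean(values)
    spread = stdev(values)
    return centre - 3 * spread, centre + 3 * spread


y1 = simulate(mu=10, sigma=1, n=100, seed=1)  # arbitrary seeds
y2 = simulate(mu=10, sigma=2, n=100, seed=2)

for name, data in (("y1", y1), ("y2", y2)):
    lower, upper = three_sigma_limits(data)
    print(f"{name}: lower={lower:.1f}, upper={upper:.1f}, width={upper - lower:.1f}")
```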

We next consider another approach to charting based on accumulating differences in the data set using CUSUM charts.

2.1.3 Cumulative Sum Charts

The bottom panel in Figure 2 shows a CUSUM chart.Reference Noyez¹⁸ Unlike the other two charts, it doesn’t show the raw blood pressure measurements. Instead, it shows the differences between the raw data and a selected target (the mean in this case) accumulated over time. For a stable process, the cumulative sums will hover around zero (the central line is zero on a CUSUM chart), indicating common cause variation. If the CUSUM line breaches the upper or lower control limit, this is a sign of special cause variation, indicating that the process has drifted away from its target.
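The accumulation step can be illustrated with a short sketch. The readings are hypothetical, the target is taken to be the mean of the first five of them, and no control limits are computed, because their construction is left to the technical guides.

```python
from itertools import accumulate
from statistics import mean


def cusum(values, target):
    """Running total of the differences between each value and the target."""
    return list(accumulate(v - target for v in values))


# Hypothetical readings from a stable period; the target is their mean.
stable = [174, 180, 168, 176, 172]
target = mean(stable)
print(cusum(stable, target))  # [0, 6, 0, 2, 0]: hovers around zero

# The same readings followed by four higher ones, against the same target.
shifted = stable + [182, 184, 181, 183]
print(cusum(shifted, target))  # [0, 6, 0, 2, 0, 8, 18, 25, 34]: climbs steadily
```

Each of the later readings is only a few units above the target, yet the running total moves steadily away from zero, which is why a CUSUM chart is sensitive to small sustained shifts.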

CUSUM charts are more complex to construct and less intuitive than run charts or Shewhart charts, but they are effective in detecting signals of special cause variation – especially from smaller shifts in the behaviour of a process. The CUSUM chart in Figure 2 is a two-sided CUSUM plot because it tracks deviations above and below the target. But in practice, a one-sided CUSUM plot is often used because the primary aim is to spot an increase or decrease in performance. For example, when monitoring complication rates after surgery, the focus is on detecting any deterioration in performance – for which a one-sided CUSUM plot is appropriate.Reference Rogers, Reeves and Caputo²¹

An important use for CUSUM charts in healthcare is to monitor binary outcomes on a case-by-case basis, such as post-operative outcomes (e.g. alive or died) following surgery. As an illustration, we can indicate patients who survived or died with 0 and 1, respectively. Let’s say we have a sequence for 10 consecutive patients, as 0,1,0,1,1,0,1,0,0,0. Although we can plot such a sequence of 0s and 1s on a run chart or Shewhart chart, this proves to be of little use because the data steps up or down on the chart constrained at 0 or 1 (Figure 6, top panel). However, the CUSUM chart (Figure 6, bottom panel) uses these data more effectively by accumulating the sequence of 0s and 1s over time. An upward step in the line indicates a death, and a flat (horizontal) segment indicates survival.

The figure shows two charts. The chart above shows the outcome from 0 to 1 on the y-axis and patient number from 1 to 10 on the x-axis. The sequence of outcomes for 10 consecutive patients is plotted on the chart as 0,1,0,1,1,0,1,0,0,0 and displayed as simple binary steps. The chart below shows cumulative outcome from 0 to 4 on the y-axis and patient number from 1 to 10 on the x-axis. The same sequence of outcomes is plotted cumulatively over time. The data displays as a diagonal line climbing gradually up to the top right of the chart, where a change in slope indicates a death, and a horizontal shift indicates survival.

Figure 6 Plots showing the outcomes (alive = 0, died = 1) and cumulative outcomes for 10 patients following surgery
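The cumulative outcomes plotted in Figure 6 can be reproduced from the sequence given in the text. The function name below is ours, and the sketch simply keeps a running total of the deaths.

```python
from itertools import accumulate


def cumulative_outcomes(outcomes):
    """Running count of deaths, where 0 = alive and 1 = died."""
    return list(accumulate(outcomes))


outcomes = [0, 1, 0, 1, 1, 0, 1, 0, 0, 0]  # the 10 consecutive patients from the text
print(cumulative_outcomes(outcomes))  # [0, 1, 1, 2, 3, 3, 4, 4, 4, 4]
```

The running total rises by one at each death and stays level for each survivor, ending at four deaths among the 10 patients.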

3 Statistical Process Control in Action

In this section, we look at how statistical process control charts are used in practice in healthcare, where they generally serve two broad purposes: (1) to monitor an existing process, or (2) as an integral part of efforts to improve a process. The two are not mutually exclusive: for example, we might begin to monitor a process and then decide that its performance is unsatisfactory and needs to be improved; once improved, we can go back to monitoring it. The following case studies show how statistical process control charts have been used in healthcare for either purpose. We begin with the run chart.

3.1 Improving Patient Flow in an Acute Hospital Using Run Charts

Run charts offer simple and intuitive ways of seeing how a process is behaving over time and assessing the impact of interventions on that process. Run charts are easy to construct (as they mainly involve plotting the data over time) and can be useful for both simple and complex interventions. This section discusses how run charts were used to support efforts to address patient flow issues in an acute hospital in England.Reference Silvester, Mohammed, Harriman, Girolami and Downes²²

Patients who arrive at a hospital can experience unnecessary delays because of poor patient flow, which often happens because of a mismatch between capacity and demand. No one wins from poor patient flow: it can threaten the quality and safety of care, undermine patient satisfaction and staff morale, and increase costs. Enhancing patient flow requires healthcare teams and departments across the hospital to align and synchronise to the needs of patients in a timely manner. But this is a complex challenge because it involves many stakeholders across multiple teams and departments.

A multidisciplinary team undertook a patient flow analysis focusing on older emergency patients admitted to the Geriatric Medicine Directorate of Sheffield Teaching Hospitals NHS Foundation Trust (around 920 beds).Reference Silvester, Mohammed, Harriman, Girolami and Downes²² The team found a mismatch between demand and capacity: 60% of older patients (aged 75+ years) were arriving in the emergency department during office hours, but two-thirds of subsequent admissions to general medical wards took place outside office hours. This highlighted a major delay between arriving at the emergency department and admission to a ward.

The team was clear that more beds was not the answer, saying that an operational strategy that seeks to increase bed stock to keep up with demand was not only financially unworkable but also ‘diverts us from uncovering the shortcomings in our current systems and patterns of work’.Reference Silvester, Mohammed, Harriman, Girolami and Downes²²

The team used a combination of the Institute for Healthcare Improvement’s Model for Improvement (which incorporates PDSA cycles – see the Element on the Institute for Healthcare Improvement approachReference Boaden, Furnival, Sharp, Dixon-Woods, Brown and Marjanovic³), lean methodology (a set of operating philosophies and methods that help create maximum value for patients by reducing waste and waits), and statistical process control methodology to develop and test three key changes: a discharge to assess policy, seven-day working, and the establishment of a frailty unit. Overall progress was tracked using a daily bed occupancy run chart (Figure 7) as the key analytical tool. The team annotated the chart with improvement efforts as well as other possible reasons for special cause variation, such as public holidays. This synthesis of process knowledge and patterns on the chart enabled the team to assess, in real time, the extent to which their efforts were impacting on bed occupancy. Since daily bed occupancy data are not serially independent – unlike the tossing of a coin – the team did not use the run tests associated with run charts and so based the central line on the mean, not the median.

A run chart shows bed occupancy from 200 to 350 on the y-axis and regular weekly dates from 5 January to 16 September 2012 on the x-axis. The data points on the chart plot occupancy over time, in a zigzag line with a downward trend overall. The chart is annotated with moments relating to improvement efforts, such as change to consultant rota and opening of a frailty unit, as well as other indications of special cause variation, such as public holidays. It is clear from the run chart that bed occupancy has fallen over time. The mean occupancy is listed at regular intervals across the time period and falls from 311.6 in January to 244.3 by September.

Figure 7 Daily bed occupancy run chart for geriatric medicine with annotations identifying system changes and unusual patterns

Adapted from Silvester et al.²²

The run chart enabled the team to see the impact of their process changes and share this with other staff. It is clear from the run chart that bed occupancy has fallen over time (Figure 7).

The team also used run charts to concurrently monitor a suite of measures (shown over four panels in Figure 8) to assess the wider impact of the changes:Reference Silvester, Mohammed, Harriman, Girolami and Downes²² bed occupancy, in-hospital mortality, re-admission rates, and the number of admissions over time before and after the intervention (vertical dotted line). Bed occupancy was the key indicator of flow. In-hospital mortality was an outcome measure, while admission and re-admission to hospital were balancing measures. Balancing measures are important because they track potential unintended consequences of healthcare improvement efforts (see the Element on the Institute for Healthcare Improvement approachReference Boaden, Furnival, Sharp, Dixon-Woods, Brown and Marjanovic³). Plotting this bundle as run charts alongside each other enabled visual inspection of the alignment between the measures and changes made by the team. The charts in Figure 8 show:

a fall in bed occupancy after the intervention
a drop in mortality after the intervention
no change in re-admission rates
a slight increase in the number of admissions (116.2 (standard deviation 15.7) per week before the intervention versus 122.8 (standard deviation 20.2) after).

While introducing major changes to a complex adaptive system, the team was able to use simple run charts showing a suite of related measures to inform their progress. They demonstrated how improving patient flow resulted in higher quality, lower costs, and improved working for staff: ‘As a consequence of these changes, we were able to close one ward and transfer the nursing, therapy, and clerical staff to fill staff vacancies elsewhere and so reduce agency staff costs.’Reference Silvester, Mohammed, Harriman, Girolami and Downes²² As Perla et al. note: ‘The run chart allows us to learn a great deal about the performance of our process with minimal mathematical complexity.’Reference Perla, Provost and Murray¹⁶

A run chart with four panels shows bed occupancy, in-hospital mortality, admission rates, and re-admission rates over time before and after an intervention. The point of intervention is marked on the charts with a vertical dotted line. The mean is marked on each chart by a horizontal line, and is recalculated after the intervention, making it easy to see an overall increase or decrease in the data. Alongside each other, the charts show a fall in bed occupancy after the intervention; a drop in mortality after the intervention; no change in re-admission rates; and a slight increase in the number of admissions. The number of admissions was 116.2 with a standard deviation of 15.7 per week before the intervention versus 122.8 with a standard deviation of 20.2 after it.

Figure 8 Run charts for bed occupancy, mortality, readmission rate, and number of admissions over time in weeks (69 weeks from 16 May 2011 to 3 September 2012) with horizontal lines indicating the mean before and after the intervention (indicated by a vertical dotted line in week 51, 30 April 2012)

Adapted from Silvester et al.²²

The next example shows the use of control charts for managing individual patients with high blood pressure.

3.2 Managing Individual Patients with Hypertension through Use of Statistical Process Control

Chronic disease represents a major challenge to healthcare providers across the world. A crucial issue is finding ways for healthcare professionals to work in partnership with patients to better manage it. In this section, we look at a case study that shows how this was achieved using statistical process control methodology.

Hebert and Neuhauser describe a case study of a 71-year-old man with uncontrolled high blood pressure and type 2 diabetes.Reference Hebert and Neuhauser²³ Managing high blood pressure presents difficulties for both physicians and patients. A key challenge is obtaining meaningful measures of the level of blood pressure control and of changes in blood pressure after an intervention. In this case, the patient’s mean office systolic blood pressure was 169 mmHg over a three-year period (spanning 13 visits to general medical clinics) compared with the target of 130 mmHg. The patient was then referred to a blood pressure clinic.

At the initial clinic visit in April 2003, the first pharmacologic intervention was offered: an increase in the dose of hydrochlorothiazide from 25 mg to 50 mg daily, along with advice to increase dietary potassium intake. The physician also ordered a home blood pressure monitor and gave the patient graph paper to record his blood pressure readings from home in the form of a run chart. On his second visit, the patient brought his run chart of 30 home blood pressure readings. The physician later plotted these data on a Shewhart control chart (Figure 9, left panel). The mean systolic blood pressure fell to 131.1mmHg (target 130mmHg), with upper and lower control limits of 146mmHg and 116mmHg, respectively. The patient agreed to continue recording his blood pressure and returned for a third visit in September 2003. Figure 9 (right panel) shows these blood pressure observations with a reduced mean value of 126.1mmHg, which is below the target value with no obvious special cause variation.

Two blood pressure control charts side by side show a patient’s self-recorded results between consecutive clinic visits. The first chart shows the first 30 home blood pressure readings. The mean systolic blood pressure is 131.1mmHg, where the target was 130mmHg, with upper and lower control limits of 146mmHg and 116mmHg, respectively. The second chart shows the patient’s next set of numbered observations from 30 to 75. These blood pressure observations have a reduced mean value of 126.1mmHg, which is below the target value with no obvious special cause variation.

Figure 9 Blood pressure control charts between two consecutive clinic visits

Adapted from Hebert and Neuhauser²³
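The source does not say how the control limits in Figure 9 were calculated. As a minimal sketch, assuming an individuals (XmR) chart in which the spread is estimated from the average moving range, limits of this kind could be computed as follows. The function name and the readings are hypothetical and are not the patient's data.

```python
from statistics import mean

def individuals_chart_limits(readings):
    """Return (lower, centre, upper) for an individuals (XmR) chart.

    Assumption: sigma is estimated as the mean moving range divided by
    1.128 (the d2 constant for ranges of two consecutive points).
    """
    centre = mean(readings)
    moving_ranges = [abs(b - a) for a, b in zip(readings, readings[1:])]
    sigma_hat = mean(moving_ranges) / 1.128
    return centre - 3 * sigma_hat, centre, centre + 3 * sigma_hat

# Hypothetical home systolic readings in mmHg (illustrative only).
readings = [130, 134, 128, 136, 131, 127, 133, 129]
lower, centre, upper = individuals_chart_limits(readings)

# Readings outside the limits would be candidate special cause signals.
outside = [r for r in readings if r < lower or r > upper]
```

With the limits reported for Figure 9 (116mmHg and 146mmHg), a single reading of 142 lies inside the limits, which is consistent with the patient's decision not to react to an isolated high value.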

The perspectives of both patient and physician are recorded in Box 2, which highlights how partnership working was enhanced by the use of control charts.

Box 2 A patient’s and a physician’s perspectives on using control chartsReference Hebert and Neuhauser²³

The Patient’s Perspective

I enjoyed plotting my readings and being able to clearly see that I was making progress. The activity takes about 10 minutes out of my day, which is only a minor inconvenience. After several weeks of recording daily readings, I settled on readings approximately three times a week. After five months, I think this is an activity that I will be able to continue indefinitely. I feel that my target blood pressure has been met because the systolic blood pressure is generally below 130mmHg. Since I began this activity I have a good idea of the status of my blood pressure, whereas prior to starting, I had only a vague idea, which bothered me. Occasionally, a reading would be unusually high, for example 142. In such cases, I worried that the device may not be working, and I would check my wife’s blood pressure. She too has high blood pressure, and if her reading was close to her typical pressure, I would say that my own pressure really was high that day. I would not change what I do because of a single high reading and I would not be alarmed. If my pressure was more than 130mmHg for a week or so then I’d probably call the doctor.

The Physician’s Perspective

My job was made easier by the presence of a continuous stream of data. I was able to learn with a fair degree of certainty that the intervention was effective at lowering blood pressure, and if the level of elevated blood pressure persists, then the intervention should lower cardiovascular risk. … I have preliminary data on the 33 patients in our clinic with high blood pressure, who have follow-up data to compare baseline and current blood pressure. This group consists exclusively of patients with a history of poorly controlled blood pressure. Of the 33 patients, 31 have lowered their blood pressure, by a mean of 20 points.
Of these patients, 22 are presently keeping run charts and periodically bringing them back to the office, whereas the others are more comfortable with recording the values in tabular form. A few were initially uncomfortable with graphing, but then began after seeing copies of run charts created by their peers in the programme. Among the patients using run charts, a consistent message is that it is not a burden, and furthermore many have expressed the opinion that it is an enlightening activity.

A systematic review of statistical process control methods in monitoring clinical variables in individual patients reports that they are used across a range of conditions – high blood pressure, asthma, renal function post-transplant, and diabetes.Reference Tennant, Mohammed, Coleman and Martin⁷ The review concludes that statistical process control charts appear to have a promising but largely under-researched role in monitoring clinical variables in individual patients; the review calls for more rigorous evaluation of their use.

The next example shows the use of statistical process control methods to monitor the performance of individual surgeons.

3.3 Monitoring the Performance of Individual Surgeons through CUSUM Charts

The Scottish Arthroplasty Project aims for continual improvement in the quality of care provided to patients undergoing a joint replacement in Scotland.Reference Macpherson, Brenkel, Smith and Howie²⁴ Supported by the Chief Medical Officer for Scotland and wholly funded by the Scottish government, the project is led by orthopaedic surgeons and reports to the Scottish Committee for Orthopaedics and Trauma. Its steering committee includes orthopaedic surgeons, an anaesthetist, patient representatives, and community medicine representatives. Scotland has a population of 5.2 million and is served by 24 orthopaedic National Health Service (NHS) provider units with about 300 surgeons.

The project analyses the performance of individual consultant surgeons based on five routinely collected outcome measures: death, dislocation, wound infection, revision arthroplasty, and venous thromboembolism. Every three months, each surgeon is provided with a personalised report detailing the outcomes of all their operations.

Outcomes are monitored using CUSUM charts, which are well suited to monitoring adverse events per operation for individual surgeons while also accounting for the differences in risk between patients. Figure 10 shows three examples of CUSUM charts.

The left panel is a CUSUM chart for a surgeon who operated from 2004 to 2010. Each successful operation is shown as a grey dot; each operation with a complication is shown as a black dot. The CUSUM rises if there is a complication and falls if there is not. The CUSUM for this surgeon remains stable, indicating common cause variation.
The middle panel shows a surgeon with a rising CUSUM mostly above zero – indicating a consistently higher-than-average complication rate. In 2010, the upper control limit is breached, triggering a signal of special cause variation that merits investigation.
The right panel shows a CUSUM chart that is unremarkable until 2009 but suggests a possible change to the underlying process thereafter, such as a new technique or a new implant.

Because complications are rare events, they cause a large rise in the CUSUM, whereas multiple operations that have no complication will each cause a small decrease in the CUSUM. The two will therefore tend to cancel each other out, and if a surgeon’s complication rate is close to or below average, their CUSUM will hover not far from zero. On the other hand, a surgeon who has an unusually high number of complications will have a CUSUM that exceeds the horizontal control limit. Such a surgeon is labelled an ‘outlier’ in the Scottish Arthroplasty Project.

The figure shows three cumulative sum charts, with cumulative sum from 0 to 2.5 on the y-axis and a range of years between 2004 and 2010 on the x-axis. A line shows the upper control limit. One chart shows operations by a surgeon between 2004 and 2010. Each successful operation is shown as a small grey dot; each operation with a complication is shown as a black dot. The cumulative sum rises if there is a complication and falls if there is not. The cumulative sum for this surgeon remains stable. The next chart shows operations by a surgeon with a rising cumulative sum mostly above zero. In 2010, the upper control limit is breached. The third chart shows a series of operations that are nearly all successful from 2004 until 2009, when several consecutive operations with complications cause the cumulative sum to rise sharply to the upper control limit.

Figure 10 Example CUSUM charts for three surgeons

Adapted from Macpherson et al.²⁴

The value of the horizontal control limit line (in this case 2) is a management decision based on a judgement that balances the risks of false alerts (occurring by chance when the surgeon’s complication rate is in control), and the risk of not detecting an unacceptable change in complication rate. The project team chose a control limit of 2 because it allows detection of special cause variation for as few as four complications in quick succession.
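The source does not give the formula behind these charts. The sketch below assumes one common choice, a log-likelihood ratio CUSUM for binary outcomes that tests for a doubling of the odds of a complication, floored at zero, with each patient's expected risk supplied by a risk model. The function names, the 5% expected risk, and the odds ratio of 2 are illustrative assumptions, not the project's actual settings.

```python
from math import log

def llr_weights(expected_risk, odds_ratio=2.0):
    """Weights added to the CUSUM for a complication and for no complication.

    Assumption: log-likelihood ratio scoring against a shift in the odds
    of a complication by `odds_ratio`.
    """
    denom = 1 - expected_risk + odds_ratio * expected_risk
    return log(odds_ratio / denom), log(1 / denom)

def cusum_path(outcomes, expected_risks, odds_ratio=2.0, floor=0.0):
    """Running CUSUM; outcomes are 1 (complication) or 0 (none)."""
    total, path = 0.0, []
    for outcome, risk in zip(outcomes, expected_risks):
        w_complication, w_none = llr_weights(risk, odds_ratio)
        # Large rise for a complication, small fall otherwise, never below the floor.
        total = max(floor, total + (w_complication if outcome else w_none))
        path.append(total)
    return path

def first_signal(path, limit=2.0):
    """Index of the first operation at which the CUSUM exceeds the limit, else None."""
    for index, value in enumerate(path):
        if value > limit:
            return index
    return None

# Hypothetical series: ten uneventful operations, then four complications in a row.
outcomes = [0] * 10 + [1] * 4
risks = [0.05] * len(outcomes)
path = cusum_path(outcomes, risks)
signal_at = first_signal(path, limit=2.0)
```

Under these assumed settings, three consecutive complications leave the CUSUM just below 2 and the fourth takes it over the limit. Raising the limit would reduce false alerts at the cost of slower detection, which is the trade-off described above.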

This CUSUM-based monitoring scheme is part of a comprehensive data collection, analysis, and feedback system focusing on individual surgeons (see Figure 11). If a CUSUM plot for an individual surgeon exceeds the horizontal dotted line (Figure 10), the surgeon will be alerted, asked to review their complications and to complete and return an action plan to the project steering committee (see Figure 11).

A flowchart illustrates the procedure for using data collection, analysis, and feedback to handle consultant outliers. The left-hand side shows a linear process in which consultants are identified as outliers through monthly data. A letter is then sent to the consultant outlier requesting reasons for the anomaly and any action taken. The anomalies are investigated and the action plan produced. Actions undertaken are assessed and graded, and monthly data checks continue to monitor for further outlying data. The right-hand side of the flowchart shows the procedures when an element in the linear section does not proceed according to plan. For example, if the consultant outlier does not respond to the letter from the committee requesting reasons for the anomaly, a second letter may be sent, and if there is still no response, the relevant Trust Chief Executive is informed that no response, or a less than satisfactory response, has been received.

Figure 11 Flowchart showing the process of data collection and feedback

Adapted from Macpherson et al.²⁴

A major advantage of this CUSUM scheme is that it identifies signals quickly because the analysis shows the outcome of each operation. This allows for the rapid identification of failing implants or poor practices and allows implants to be withdrawn or practices to be changed in a timely manner.

The CUSUM chart is reset to zero once the project steering committee receives an explanation from the surgeon involved. A comprehensive case note review reflecting a difficult casemix can also form the basis of a constructive response. Responses are graded into one of four categories, as shown in the table in Figure 12. The chart in Figure 12 shows how responses have changed over time. The authors note: ‘As surgeons have become more aware of the feedback system, particularly with the introduction of CUSUM, their responses have become more rapid and more comprehensive.’Reference Macpherson, Brenkel, Smith and Howie²⁴

A table shows grades given to consultant outlier responses and action plan outcomes by the Scottish Arthroplasty Project Steering Committee. The grades range from Exemplary, meaning a constructive response with evidence of progress, to Excellent, meaning a constructive response, to Satisfactory, meaning meeting the minimum requirement, to Less than satisfactory, meaning unacceptable. A bar graph then illustrates the grades awarded to responses between 2003 and 2009. The number of outliers varies from year to year, between 6 and 18. The graph shows that in 2003 and 2004 a high number of Satisfactory and a low number of Less than satisfactory grades were given, in 2005 and 2006 only Exemplary, Excellent, and Satisfactory grades were given, and that in 2007, 2008, and 2009 Exemplary, Excellent, and Satisfactory grades dominated but with the introduction of a fifth category – Late or no response.

Figure 12 Table and accompanying graph showing how action plans were graded

Adapted from Macpherson et al.²⁴

The authors report that

[w]ithin the Scottish orthopaedic community, there has been a general acceptance of the role of Scottish Arthroplasty Project as an independent clinical governance process. From surgeons’ feedback, we know that notification of an outlying position presents a good opportunity for self-review even if no obvious problems are identified. When local management has questioned individual practice, Scottish Arthroplasty Project data are made available to the surgeon to support the surgeon’s practice. This type of data has also been valuable in appraisal processes that will feed into the future professional revalidation system. Data also can be useful to the surgeon in medical negligence cases. Although there were initially concerns about lack of engagement from the orthopaedic surgeons, our methodology has resulted in enthusiasm from the surgeons and 100% compliance. We have found that the process has nurtured innovation, education, and appropriate risk aversion.Reference Macpherson, Brenkel, Smith and Howie²⁴

The next example is a landmark study that showed how statistical process control supported reductions in complications following surgery in France.

3.4 Reducing Complications after Surgery Using Statistical Process Control

Healthcare-related adverse events are a major cause of illness and death. Around 1 in 10 patients who undergo surgery are estimated to experience a preventable complication.Reference Duclos, Chollet and Pascal²⁵ In a landmark randomised controlled trial, a multidisciplinary team from France investigated the extent to which major adverse patient events were reduced by using statistical process control charts to monitor post-surgery outcomes and feed the data back to surgical teams.Reference Duclos, Chollet and Pascal²⁵

Duclos et al. randomised 40 hospitals to either usual care (control hospitals) or to quarterly control charts (intervention hospitals) monitoring four patient-focused outcomes following digestive surgery: inpatient death, unplanned admission to intensive care, reoperation, and a combination of severe complications (cardiac arrest, pulmonary embolism, sepsis, or surgical site infection).Reference Duclos, Chollet and Pascal²⁵ Our focus is primarily on how the team used statistical process control methods in the intervention hospitals.

P-charts (where p stands for proportion or percentage) are useful for monitoring binary outcomes (e.g. alive, died) as a percentage over time (e.g. percentage of patients who died following surgery).Reference Duclos and Voirin²⁶ The 20 intervention hospitals used a p-control chart to monitor the four outcomes (example in Figure 13). The charts included three and two standard deviation control limits set around the central line. A signal of special cause variation was defined as a single point outside the three standard deviation control limit or two of three successive points outside the two standard deviation limits.

A set of control charts, known as p-charts, illustrate the monitoring of four surgical outcomes, including death, intensive care, reoperation, and complications, as a percentage over time. Each chart shows how a given outcome varies over time on a control chart with lower and upper control limits and a mean central line. Another chart combines the data from the four surgical outcomes charts into a single major adverse event data set and shows that on a control chart. In all charts the outcomes zigzag around the mean with no obvious indication of special cause variation because no points appear outside the control limits.

Figure 13 Example statistical process control charts used in a study to reduce adverse events following surgery

Adapted from Duclos et al.²⁵
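The p-chart limits and the two signal rules described above can be sketched as follows. This is an illustrative outline rather than the trial's implementation: the function names and quarterly counts are hypothetical, the limits use the usual binomial standard deviation around the pooled proportion, and the requirement that the two points fall on the same side of the central line is an assumption.

```python
from math import sqrt

def p_chart_z_scores(events, totals):
    """Pooled proportion and each period's distance from it in standard deviations.

    A z-score beyond +/-3 (or +/-2) is equivalent to the period's proportion
    lying outside the three (or two) standard deviation limit for its sample size.
    """
    p_bar = sum(events) / sum(totals)
    z_scores = []
    for e, n in zip(events, totals):
        sd = sqrt(p_bar * (1 - p_bar) / n)
        z_scores.append((e / n - p_bar) / sd)
    return p_bar, z_scores

def special_cause_signals(z_scores):
    """Indices of periods that trigger either rule."""
    flagged = []
    for i, z in enumerate(z_scores):
        window = z_scores[max(0, i - 2): i + 1]  # current point and up to two before it
        beyond_three = abs(z) > 3
        two_of_three = (sum(v > 2 for v in window) >= 2
                        or sum(v < -2 for v in window) >= 2)
        if beyond_three or two_of_three:
            flagged.append(i)
    return flagged

# Hypothetical quarterly counts: adverse events and operations per quarter.
events = [10, 12, 9, 11, 10, 25, 10, 11]
totals = [200] * 8
p_bar, z = p_chart_z_scores(events, totals)
signals = special_cause_signals(z)
```

In this made-up series only the sixth quarter (25 events in 200 operations) lies beyond the three standard deviation limit, so it is the only period flagged.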

The authors recognised that successful implementation of control charts in healthcare required a leadership culture that allowed staff to learn from variation by investigating special causes of variation and trying out and evaluating quality improvement initiatives.Reference Duclos, Chollet and Pascal²⁵ To enable successful implementation of the control chart, ‘champion partnerships’ were established at each site, comprising a surgeon and another member of the surgical team (surgeon, anaesthetist, or nurse).Reference Duclos, Chollet and Pascal²⁵ Each duo was responsible for conducting meetings to review the control chart and keeping a logbook in which changes in care processes were recorded. Champion partners from each hospital met at three one-day training sessions held at eight-month intervals. Simulated role-play at these sessions aimed to provide the skills needed to use the control charts appropriately, lead review meetings for effective cooperation and decision-making, identify special causes of variation, and devise plans for improvement.

Over two years post-intervention, the control charts were analysed at perioperative team meetings.Reference Duclos, Chollet and Pascal²⁵ Unfavourable signals of special cause variation triggered examination of potential causes, which led to an average of 20 changes for each intervention hospital (Figure 14). Compared with the control hospitals, the intervention hospitals recorded significant reductions in rates of major adverse events (a composite of all outcome indicators). The absolute risk of a major adverse event was reduced by 0.9% in intervention compared with control hospitals – this equates to one major adverse event prevented for every 114 patients treated in hospitals using the quarterly control charts.Reference Duclos, Chollet and Pascal²⁵ Among the intervention hospitals, the size of the effect was proportional to the degree of control chart implementation. Duclos et al. conclude: ‘The value of control charts and sharing ideas within surgical teams designed to eliminate patient harm has been mostly underappreciated.’Reference Duclos, Chollet and Pascal²⁵
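The step from the absolute risk reduction (ARR) to the figure of 114 patients uses the number needed to treat (NNT), which is the reciprocal of the ARR. As a rough check, and noting that the 0.88% below is back-calculated here rather than reported in the source:

```latex
\mathrm{NNT} = \frac{1}{\mathrm{ARR}}, \qquad
\frac{1}{0.009} \approx 111, \qquad
\frac{1}{114} \approx 0.0088 = 0.88\%
```

A reduction of exactly 0.9% would give about 111, so the reported 114 presumably reflects an unrounded risk difference of roughly 0.88%.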

A table shows the compliance of 20 hospitals with implementation of an improvement intervention over two years. Compliance is measured based on a six-item scoring: duo formed with a surgeon, participation in all three training sessions, logbook updated until the end, posters displayed in operating room every quarter, team meetings held for interpreting control charts every quarter, and at least one concrete action tested for care improvement. Each hospital’s score against these items is added to give a total implementation score out of 6. Hospitals scoring 5 or 6 are deemed to have a high degree of compliance; those scoring 3 or 4 a moderate degree, and those scoring only 2 a poor degree of compliance.

Figure 14 Compliance of hospitals in the intervention arm using control charts

Adapted from Duclos et al.²⁵

The next example shows the use of statistical process control to compare the performance of healthcare organisations.

3.5 Comparing the Performance of Healthcare Organisations Using Funnel Plots

Monitoring of healthcare organisations is now ubiquitous.Reference Spiegelhalter²⁷ Comparing organisations has often taken the form of performance league tables (also known as caterpillar plots – see Figure 15a), which rank providers according to a performance metric such as mortality. Such tables have been criticised for focusing on spurious rankings that fail to distinguish between common and special causes of variation.Reference Mohammed, Cheng, Rouse and Bristol²⁸ Despite these concerns, they were widely used to compare the performance of provider units until the introduction of statistical process control-based funnel plotsReference Spiegelhalter²⁷ (see Figure 15b: here, the funnel plot has two sets of control limits corresponding to two and three standard deviations).

Two charts compare a caterpillar plot with a funnel plot. Both show 30-day age and sex-standardised mortality rates following treatment for fractured hip of over-65s in 51 medium acute and multi-service hospitals in England in 2000–2001. In the caterpillar plot, the 51 hospitals are listed down the y-axis and the percentage of deaths within 30 days are shown on the x-axis. The overall proportion of deaths within 30 days is 9.3%. The hospitals’ individual percentages are plotted with dots, with 95% confidence intervals. In the funnel plot, the percentage of deaths within 30 days is shown on the y-axis and the volume of cases, between 0 and 600, on the x-axis. The mean, shown by a horizontal centre line, is the overall proportion of 9.3%. Dotted lines show upper and lower control limits at three standard deviations from the mean.

Figure 15 Comparison of a ranked performance league table plot with 95% confidence intervals (part a) versus a funnel plot with 3 sigma control limits (part b)

Adapted from Spiegelhalter²⁷

The funnel plot is a scatter graph of the metric of interest (post-operative mortality in Figure 15) on the y-axis versus the number of cases (sample size) on the x-axis across a group of healthcare organisations. Such data are cross-sectional (not over time), so there is no time dimension to the funnel plot. The funnel plot takes a process or systems perspective by showing upper and lower control limits around the overall mean instead of individual limits around each hospital (as shown in the caterpillar plot).

An attractive feature of the funnel plot is that the control limits get narrower as sample sizes increase. This produces the funnel shape that shows how common cause variability reduces with respect to the number of cases (the so-called outcome-volume effect). It makes it very clear that smaller units show greater common cause variation compared to larger units.
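The narrowing of the limits follows from the standard deviation of a proportion shrinking with the square root of the number of cases. The sketch below is a minimal illustration, assuming a normal approximation to the binomial around the overall rate of 9.3% from Figure 15; published funnel plots often use exact binomial limits instead. The function names and the two example hospitals are hypothetical.

```python
from math import sqrt

def funnel_limits(overall_rate, n_cases, z=3.0):
    """Lower and upper control limits for a unit with n_cases cases.

    Assumption: normal approximation to the binomial, clipped to [0, 1].
    """
    half_width = z * sqrt(overall_rate * (1 - overall_rate) / n_cases)
    return max(0.0, overall_rate - half_width), min(1.0, overall_rate + half_width)

def classify(events, n_cases, overall_rate, z=3.0):
    """Place a unit's observed rate relative to the funnel limits."""
    lower, upper = funnel_limits(overall_rate, n_cases, z)
    rate = events / n_cases
    if rate > upper:
        return "above upper limit"
    if rate < lower:
        return "below lower limit"
    return "within limits"

OVERALL = 0.093  # overall 30-day mortality shown in Figure 15

# Two hypothetical hospitals with the same observed rate (17%) but different volumes.
small_unit = classify(17, 100, OVERALL)
large_unit = classify(102, 600, OVERALL)
```

Under these assumptions the same 17% rate sits within the limits for the unit with 100 cases but above the upper limit for the unit with 600 cases, because the larger unit has much less room for common cause variation.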

Funnel plots are now widely used for comparing performance between healthcare organisations.Reference Verburg, Holman, Peek, Abu-Hanna and de Keizer²⁹–Reference Mayer, Bottle, Rao, Darzi and Athanasiou³¹ Spiegelhalter gives a comprehensive explanation of funnel plots for institutional comparisons,Reference Spiegelhalter²⁷ and Verburg et al. provide step-by-step guidelines on the use of funnel plots in practice (based on the Dutch National Intensive Care Evaluation registry).Reference Verburg, Holman, Peek, Abu-Hanna and de Keizer²⁹ Steps include selection of the quality metric of interest, examining whether the number of observations per hospital is sufficient, and specifying how the funnel plot should be constructed.

Guthrie et al.Reference Guthrie, Love, Fahey, Morris and Sullivan³⁰ show how funnel plots can be used to compare the performance of general practices across a range of performance indicators. In Figure 16, the left panel shows a funnel plot for one performance indicator over all the practices in Tayside, Scotland. The right panel shows a statistical process control dashboard for 13 performance indicators across 14 practices. The use of red-white-green categories should not be confused with the usual red-amber-green (RAG) reporting seen in hospital performance reports;Reference Riley, Burhouse and Nicholas³² the former is based on statistical process control methodology, and the latter is not.

The left panel shows a funnel plot illustrating a single performance indicator. The percentage of patients with type 2 diabetes with HBA1c less than or equal to 7.4% is measured between 0 and 100 against the overall number of patients with diabetes, measured between 0 and 340, in each of 69 general practices in Tayside, Scotland. The Tayside average is marked at 55.7%. The upper control limit shows 0.999 probability, while the lower control limit shows 0.001 probability. The lower warning limit is at 0.025 probability, while the upper warning limit is marked at 0.975 probability. The right panel shows a statistical process control dashboard for 13 performance indicators across the 14 practices. Each of the 14 practices is marked against each indicator according to three grades: much better than, consistent with, or much worse than the Tayside average.

Figure 16 The left panel shows a funnel plot for percentage of patients with type 2 diabetes with HBA1c ≤ 7.4% in 69 Tayside practices. The right panel summarises the signals from 13 other performance indicator funnel plots across 14 general practices

Adapted from Guthrie et al.³⁰

Although funnel plots do not show the behaviour of a process over time, they can still be used to compare performance across time periods through a sequence of funnel plots. This can be illustrated using data from a public inquiry established in 1998 to probe high death rates following paediatric cardiac surgery at Bristol Royal Infirmary.Reference Kennedy³³ The data included a comparison of death rates of children under 1 year of age with data from 11 other hospitals where paediatric cardiac surgery took place. Comparisons were presented over three time periods: 1984–87, 1988–90, and 1991–March 95. Figure 17 shows this data as side-by-side funnel plots.

A sequence of three funnel plots side by side illustrates mortality data across three epochs for children aged under 1 year following cardiac surgery at Bristol Royal Infirmary compared with 11 other hospitals. The total number of cases is shown on the x-axis, while the proportion of children who died is shown on the y-axis. The comparison is presented over three epochs: 1984 to 87, 1988 to 90, and 1991 to March 1995. Each funnel plot has a horizontal line showing the mean for that epoch, and lines showing three-sigma upper and lower control limits. Each plot also has 12 circled numbers representing the 12 hospitals. Bristol Royal Infirmary is number 1 and appears between the mean and the upper control limit on the first two plots, but as an outlier above the upper control limit on the third plot.

Figure 17 The Bristol data, showing mortality following cardiac surgery in children under 1 year of age. Each panel shows a control chart for one of the three epochs (panels, left to right: 1984–87, 1988–90, and 1991–March 95). The numbers in the panel indicate centres (1–12), the horizontal line is the mean for that epoch, and the solid lines represent three-sigma upper and lower control limits. Bristol (centre 1) clearly shows special cause variation in the third time period (1991–95) as it appears above the upper control limit

Bristol (centre 1) exhibits a signal of special cause variation in the third epoch (time period) only. The factors that contributed to the high death rates at Bristol were subject to a lengthy inquiry (1998–2001), which identified a range of issues.Reference Kennedy³³ A closer look at all three panels suggests Bristol’s death rate stood still, whereas all other centres experienced reduced mortality. Although external action to address concerns about paediatric cardiac surgery at Bristol Royal Infirmary took place in 1998, monitoring using control charts might have provoked action earlier, in 1987. The control chart usefully guides attention to high-mortality centres (above the upper control limit), but it also identifies opportunities for improvement by learning from centres with particularly low death rates (those below the lower control limit). For example, centre 11 appears to have made remarkable reductions in mortality over the three epochs. This clearly merits investigation and, if appropriate, dissemination of practices to other hospitals.

Statistical process control methodology, then, offers an approach to learning from both favourable (see the Element on positive devianceReference Baxter, Lawton, Dixon-Woods, Brown and Marjanovic²) and unfavourable signals of special causes of variation. So, how might we systematically investigate signals of special cause variation?

3.6 Investigating Special Cause Variation in Healthcare Using the Pyramid Model

The key aim of using statistical process control charts to monitor healthcare processes is to ensure that quality and safety of care are adequate and not deteriorating. When a signal of special cause variation is seen on a control chart monitoring a given outcome (e.g. mortality rates following surgery), investigation is necessary. However, the chosen method must recognise that the link between recorded outcomes and quality of care is complex, ambiguous, and subject to multiple explanations.Reference Lilford, Mohammed, Spiegelhalter and Thomson³⁴ Failure to do so may inadvertently contribute to premature conclusions and a blame culture that undermines the engagement of clinical staff and the credibility of statistical process control. As Rogers et al. note: “If monitoring schemes are to be accepted by those whose outcomes are being assessed, an atmosphere of constructive evaluation, not ‘blaming’ or ‘naming and shaming’, is essential as apparent poor performance could arise for a number of reasons that should be explored systematically.”Reference Rogers, Reeves and Caputo²¹

To address this need, Mohammed et al. propose the Pyramid Model for Investigating Special Cause Variation in HealthcareReference Mohammed, Rathbone and Myers³⁵ (Figure 18)Reference Smith, Garlick and Gardner³⁶ – a systematic approach of hypothesis generation and testing based on five theoretical candidate explanations for special cause variation: data, patient casemix, structure or resources, process of care, and carer(s).

A pyramid shape is divided into five layers from the tip to the base. From base to tip the layers are labelled Data, Casemix, Structure or Resources, Process of care, and Carer. The bottom two layers are grouped under a further label of Validation Investigation, while the top three layers are grouped as Root Cause Investigation. An arrow points upwards from base to tip. Further information is provided about each of the pyramid layers – for Data read data accuracy, reliability, and completeness. For Casemix read appropriateness of risk adjustment. For Structure or Resources read equipment, facilities, or organisational processes. For Process of Care read generic treatment and clinical pathways. For Carer read group or individual practice and treatment methods.

Figure 18 The Pyramid Model for investigating special cause variation in healthcare

Adapted from Mohammed et al.³⁵ and Smith et al.³⁶

These broad categories of candidate explanations are arranged from most likely (data) to least likely (carers), so offering a road map for the investigation that begins at the base of the pyramid and stops at the level that provides a credible, evidence-based explanation for the special cause. The first two layers of the model (data and casemix factors) provide a check on the validity of the data and casemix-adjusted analyses, whereas the remaining upper layers focus more on quality of care-related issues.

A proper investigation requires a team of people with expertise in each of the layers. Such a team is also likely to include those staff whose outcomes or data are being investigated, so that their insights and expertise can inform the investigation while also ensuring their buy-in to the investigation process. Basic steps for using the model are shown in Box 3.

Box 3 The three basic steps for using the Pyramid Model to investigate special cause variation in healthcare

1. Form a multidisciplinary team that has expertise in each layer of the pyramid, with a decision-making process that allows them to judge the extent to which a credible cause or explanation has been found, based on hypothesis generation and testing.
2. Generate and test candidate hypotheses, starting from the lowest level of the Pyramid Model and proceeding to upper levels only if the preceding levels provide no adequate explanation for the special cause.
3. Use quantitative and qualitative evidence to test hypotheses and reach closure; a credible cause requires both. If no credible explanation can be found, then the most likely explanation is that the signal itself was a false signal.

Mohammed et al. first demonstrated the use of the Pyramid Model to identify a credible explanation for the high mortality associated with two general practitioners (GPs) flagged by the Shipman Inquiry. Their mortality data showed evidence of special cause variation on risk-adjusted CUSUM charts (see Box 4).³⁵

Box 4 The use of the Pyramid Model to investigate high-mortality general practitioners flagged up by the Shipman Inquiry

Harold Shipman (1946–2004) was an English GP who is believed to be the most prolific serial killer in history. In January 2000, a jury found Shipman guilty of the murder of 15 patients under his care, with his total number of victims estimated to be around 250. A subsequent high-profile public inquiry included an analysis of mortality data involving a sample of 1,009 GPs. Using CUSUM plots, the analysis highlighted 12 GPs as having high (special cause variation) patient mortality that merited investigation. One was Shipman.

Mohammed et al.³⁴ used the Pyramid Model to investigate the reasons behind the findings in relation to two of the GPs. They assembled a multidisciplinary team which began by checking the data. Once the data were considered to be accurate, the team had preliminary discussions with the two GPs to generate candidate hypotheses. This process highlighted deaths in nursing homes as a possible explanatory factor.

This hypothesis was tested quantitatively and qualitatively. The magnitude and shape of the curves of a CUSUM plot for excess number of deaths in each year were closely mirrored by the magnitude and shape of the curves of the number of patients dying in nursing homes; and this was reflected in the high correlations between excess mortality and the number of deaths in nursing homes in each year for the GPs. These findings were supported by administrative data. Furthermore, it was known that the casemix adjustment scheme used for the CUSUM plots did not include the place of death.

The investigation concluded: “The excessively high mortality associated with two general practitioners was credibly explained by a nursing home effect. General practitioners associated with high patient mortality, albeit after sophisticated statistical analysis, should not be labelled as having poor performance but instead should be considered as a signal meriting scientific investigation.”³⁴

The Pyramid Model has been incorporated into statistical process control-based monitoring schemes in Northern Ireland³⁷ and Queensland, Australia.³⁶,³⁸ In Queensland, clinical governance arrangements now include the use of CUSUM-type statistical process control charts (known as variable life-adjusted display plots) to monitor care outcomes in 87 hospitals using 31 clinical indicators (e.g. stroke, colorectal cancer surgery, depression) derived from routinely collected data.³⁸ Crucially, monitoring is tied in with an approach to investigation, learning, and action that incorporates the Pyramid Model as shown in Table 1.

Table 1 Use of the Pyramid Model to investigate special cause variation in hospitals in Queensland, Australia

Level	Scope	Typical Questions
Data	Data quality issues (e.g. coding accuracy, reliability of charts, definitions, and completeness)	Are the data coded correctly?
		Has there been a change in data coding practices (e.g. are there less experienced coders)?
		Is clinical documentation clear, complete, and consistent?
Casemix	Although differences in casemix are accounted for in the calculation, it is possible that some residual confounding may remain	Are there factors peculiar to this hospital not considered in the risk adjustment?
		Has the pattern of referrals to this hospital changed (in a way not considered in risk adjustment)?
Structure or resource	Availability of beds, staff, and medical equipment; institutional processes	Has there been a change in the distribution of patients in the hospital, with more patients in this specialty spread throughout the hospital rather than concentrated in a particular unit?
Process of care	Medical treatments of patients, clinical pathways, patient admission and discharge hospital policies	Has there been a change in the care being provided?
		Have new treatment guidelines been introduced?
Professional staff/carers	Practice and treatment methods, and so on	Has there been a change in staffing for treatment of patients?
		Has a key staff member gained additional training and introduced a new method that has led to improved outcomes?

Adapted from Duckett et al.³⁸

The next example shows how statistical process control methods were used to modify performance data in hospital board reports.

3.7 Control Charts in Hospital Board Reports

Hospital board members have to deal with large amounts of data related to quality and safety, usually in the form of hospital board reports.³⁹ Board members need to look at reports in detail to help identify problems with care and assure quality. However, the task is not straightforward because members need to understand the role of chance (or common cause variation) and be able to distinguish signals from noise.

In 2016, Schmidtke et al.³⁹ reviewed 163 board reports from 30 randomly selected English NHS trusts and found that only 6% of the 1,488 charts they contained illustrated the role of chance. The left panel in Figure 19 shows an example chart which displays the number of unplanned re-admissions within 48 hours of discharge but provides no indication that chance played a role. The right panel shows a control chart of the same data but also indicates the role of chance with the aid of control limits around a central line.

Two charts side by side both display the number of unplanned re-admissions within 48 hours of discharge over the period from April 2012 to July 2013. The first chart displays the data simply as a series of points joined by a line plotting the rise and fall of unplanned re-admissions between 0 and 70 across each month during the period. The second chart is a control chart displaying the same data but with the addition of dotted lines showing upper and lower control limits around a solid central line reflecting the mean.

Figure 19 Example chart from a hospital board report (left) represented as a control chart (right)

Adapted from Schmidtke et al.³⁹

Schmidtke et al. conclude: ‘Control charts can help board members distinguish signals from noise, but often boards are not using them.’³⁹ They assumed that members might not be requesting control charts because they were unaware of statistical process control methodology, so they suggested an active training programme for board members. And, since hospital data analysts might not have the necessary skills to produce control charts, they also proposed training for analysts. As a realistic default recommendation, they suggested using a single chart that has proven robust for most time-series data – the individuals or XmR chart. This is a useful chart, but there is controversy about its use across the different types of data found in hospital board reports, such as percentage data, for which a p-chart is recommended (as shown in Section 3.4).¹⁵,²⁶,⁴⁰
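To make the idea of control limits around a central line concrete, the following is a minimal sketch of how XmR limits are commonly calculated: the central line is the mean of the values, and the limits sit 2.66 times the average moving range either side of it. The function names and the monthly counts are hypothetical and are not taken from Schmidtke et al. or from any board report.

```python
from statistics import mean


def xmr_limits(values):
    """Return (centre, lower, upper) for an individuals (XmR) chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    centre = mean(values)
    # Moving range: absolute difference between consecutive observations
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    # 2.66 is 3 / 1.128, where 1.128 is the d2 constant for ranges of two points
    half_width = 2.66 * mean(moving_ranges)
    return centre, centre - half_width, centre + half_width


def points_outside_limits(values):
    """Return the positions (0-based) of points beyond either control limit."""
    _, lower, upper = xmr_limits(values)
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical monthly re-admission counts, for illustration only
readmissions = [30, 34, 28, 31, 36, 29, 33, 27, 32, 35, 30, 62]
centre, lower, upper = xmr_limits(readmissions)
print(f"centre={centre:.1f}, limits=({lower:.1f}, {upper:.1f})")
print("points outside limits:", points_outside_limits(readmissions))
```

In this made-up series only the final month falls outside the limits; the earlier rises and falls are consistent with common cause variation. A fuller implementation would also apply run-based rules and would consider whether a different chart suits the data type.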

In response, Riley et al.³² devised a training programme – called Making Data Count – for board members in English NHS trusts and developed a spreadsheet tool to allow analysts to readily produce control charts. A 90-minute board training session on the use of statistical process control was delivered to 583 participants from 61 NHS trust boards between November 2017 and July 2019. Feedback from participants was that 99% of respondents felt the training session had been a good use of their time, and 97% agreed that it would enhance their ability to make good decisions. A key feature of the whole-board training programme was using hospitals’ own performance data (from their board reports) to demonstrate the advantages of statistical process control. Participants highlighted this in evaluation interviews (see Box 5).

Box 5 Three board members’ perspectives on control charts

The most powerful intervention was to use our own data and play it back to us. It helped us to see what’s missing.

By exposing the full board to Statistical Process Control (SPC) as a way to look at measures, the training helped us all learn together and have the same level of knowledge. We all gained new insights, and it has helped us to think about where and how to begin experimenting with presenting metrics in SPC format. I thought it was terrific. In fact, I wrote a note to the chairperson to share my observation that it was the best board session I had attended in 4 years. The reason was that the training was accessible, not too basic but not too advanced; it was not too short or too long in length; and it was directly relevant and applicable to the organisation as a whole and also useful for my role.

We are already seeing changes. We have completely overhauled the board report. The contrast from July is that by September, we can see SPC in every individual section. It’s made it easier to go through the board paper, and it’s now significantly clearer about what we should focus on. We also chose to bring in the performance team. We wanted to get a collective understanding of what was needed. So, it was not an isolationist session; it was leaders and people who knew about the data. We wanted everyone to leave the session knowing what we were aiming for, what to do and how. There have been no additional costs. All the changes have been possible within current resources. This is about doing things in a different way. We were lucky that we had staff with good analytical skills and they have been able to do this work quickly and effectively.

Reproduced from Riley et al.³²

Figure 20 shows an extract from a board report with an overview performance summary (upper panel) based on a multiple statistical process control chart (lower panel).

An excerpt from a hospital board report across two panels. The upper panel shows an overview performance summary and the lower panel shows part of a multiple statistical process control chart on which the overview is based. The control chart highlights areas of special cause variation and when it is of potential concern. Examples shown include people with substance misuse problems and children with complex mental health needs. Examples of performance indicators include reporting improvements in quality of life on discharge and the number of service users presenting in crisis.

Figure 20 Example chart from a hospital board report (upper panel) which is underpinned by multiple statistical process control charts (lower panel)

Adapted from East London NHS Foundation Trust. Board of Directors Meeting in Public. Thursday 30 March 2023.⁴¹

Our next example shows the use of statistical process control during the COVID-19 pandemic.

3.8 Tracking Deaths during the COVID-19 Pandemic through Shewhart Control Charts

The COVID-19 pandemic, which was declared in March 2020, has posed unprecedented challenges to healthcare systems worldwide. The daily number of deaths was a key metric of interest. Perla et al. developed a novel hybrid Shewhart chart to visualise and learn from daily variations in reported deaths. They note: “We know that the number of reported deaths each day – as with anything we measure – will fluctuate. Without a method to understand if these ‘ups and downs’ simply reflect natural variability, we will struggle to recognize signals of meaningful improvement … in epidemic conditions.”⁴²

Figure 21 shows a chart of daily deaths annotated with sample media headlines. It highlights how headline writers struggled to separate meaningful signals from noise in the context of a pandemic and the risk that the data might provoke ‘hyperreactive responses from policy-makers and public citizens alike’.⁴²

A run chart shows the number of daily reported deaths from COVID-19, numbered from 0 to 1,000 on the y-axis, against a time period from 6 March 2020 to 15 April 2020. As the death toll gradually rises over this period, the line of the graph is annotated with sample media headlines. The headlines provide sensational commentary on each rise and fall in the death rate, however small or large, but do not draw out any meaningful conclusions.

Figure 21 Headlines associated with daily reported deaths in the United Kingdom during the COVID-19 pandemic

Reproduced from Perla et al.⁴²

Using the hybrid chart, the researchers identified four phases (or ‘epochs’) of the classical infectious disease curve.⁴²,⁴³ The four epochs are shown in Figure 22 and described in Box 6. The researchers used a combination of Shewhart control charts to track the pandemic and help separate signals (of change) from background noise in each phase.

A chart shows a hypothetical disease curve for events during an epidemic. The number of events between 0 and 140 is shown on the y-axis and the number of days since the first reported event, from day 1 to day 55, is shown on the x-axis. The time period is separated into four phases or epochs. Epoch 1 lasts from day 1 to around day 14 and is labelled Pre-exponential growth. Epoch 2 lasts from around day 14 to day 26 and is labelled Exponential growth. It falls away into Epoch 3 labelled Plateau or decline. Epoch 3 lasts from around day 25 to day 45, during which the disease curve follows a regular descent. Epoch 4 is labelled Stability after descent. Lasting from around day 46 to day 55, this epoch shows the disease curve return to a nearly flat line with only a handful of daily events.

Figure 22 A hypothetical epidemiological curve for events in four epochs

Adapted from Parry et al.⁴³

Box 6 The four epochs of an epidemic curve⁴³

Epoch 1 ‘pre-exponential growth’ begins with the first reported daily event. Daily counts usually remain relatively low and stable with no evidence of exponential growth. Epoch 1 ends when rapid growth in events starts to occur and the chart moves into Epoch 2.
Epoch 2 ‘exponential growth’ is when daily events begin to grow rapidly. This can be alarming for those reading the chart or experiencing the pandemic. Epoch 2 ends when events start to level off (plateau) or decline.
Epoch 3 ‘plateau or descent’ is when daily events stop increasing exponentially. Instead, they start to ‘plateau or descend’. Epoch 3 can end when daily values start to return to pre-exponential growth values. More troublingly, it can also end with a return to exponential growth (Epoch 2) – a sign that the pandemic is taking a turn for the worse again.
Epoch 4 ‘stability after descent’ is similar to Epoch 1 (pre-exponential growth), when a descent in daily events has occurred and daily counts are again low and stable. Epoch 4 can end if further signs of trouble are detected and there is a return to exponential growth (Epoch 2).

Figure 23 shows these epochs using data from different countries; Figure 24 shows the hybrid Shewhart chart for the United Kingdom.

Four Shewhart charts illustrate the four epochs of daily reported COVID-19 deaths from February 2020 to August 2020 in different countries. The first chart illustrates Epoch 1 Pre-exponential growth from February to June 2020 in South Korea. The disease curve is close to being a flat line. The second chart illustrates Epoch 2 Exponential growth from April to May 2020 in Peru. The disease curve moves from a flat line to a steep curve. The third chart illustrates Epoch 3 Plateau or descent from April to May 2020 in the United Kingdom. A special cause signals a reduction in exponential growth. A plateau in deaths follows, and then a second special cause signals a further reduction leading to a decline in deaths. The fourth chart illustrates Epoch 4 Stability after descent from July to August 2020 in Italy. The lower limit drops below 2, signalling the start of the flatlining curve.

Figure 23 Shewhart charts for the four epochs of daily reported COVID-19 deaths in different countries

Adapted from Parry et al.⁴³

A hybrid Shewhart control chart shows the number of COVID-19-reported deaths in the United Kingdom daily for the period September to November 2020. COVID-19-reported deaths are shown from 0 to 900 on the y-axis and the time period is shown across the x-axis.

Figure 24 Hybrid Shewhart control chart for monitoring daily COVID-19 deaths in the United Kingdom

Adapted from Parry et al.⁴³

Parry et al. state:

“Shewhart charts should be a standard tool to learn from variation in data during an epidemic. Medical professionals, improvement leaders, health officials and the public could use this chart with reported epidemic measures such as cases, testing rates, hospitalizations, intubations, and deaths to rapidly detect meaningful changes over time.”⁴³

The previous case studies have demonstrated the use of statistical process control methods in healthcare across a wide range of applications. In the next section, we offer a more critical examination of the methodology to identify and address the barriers to successful use in practice.

4 Critiques of Statistical Process Control

Statistical process control methodology is now widely used to monitor and improve the quality and safety of healthcare. In this section, we consider its strengths, limitations, and future role in healthcare.

4.1 The Statistical Process Control Paradox: It’s Easy Yet Not So Easy

As the case studies discussed in Section 3 show, statistical process control is not simply a graphical tool. Rather, it is a way of thinking scientifically about monitoring and improving the quality and safety of care. But while the idea of common versus special cause variation is intuitive, the successful application of statistical process control is not as easy as it might first appear, especially in complex adaptive systems like healthcare.¹² Successfully using statistical process control in healthcare usually depends on several factors, which include engaging the stakeholders; forming a team; defining the aim; selecting the process of interest; defining the metrics of interest; ensuring that data can be reliably measured, collected, fed back, and understood; and establishing baseline performance – all in a culture of continual learning and improvement. Several systematic reviews of the use of statistical process control in healthcare provide critical insights into the benefits, limitations, barriers, and facilitators to successful application.¹¹,¹²,⁴⁴ Some key lessons are shown in Table 2.

Table 2 Some key lessons from systematic reviews of statistical process control in healthcare

Benefits	Statistical process control is a simple, relatively low-cost approach that facilitates process improvement and can be applied to a wide range of processes. It is useful for the management of healthcare, for assessment of the learning curve, and management of individual patients. It can enhance engagement of different stakeholders, including patients.
Limitations	Presenting data as a statistical process control chart does not automatically lead to improvements. A process that is in statistical control is not necessarily clinically acceptable or adequate. The correct application of statistical process control requires technical skills.
Barriers	Statistical process control can sometimes meet resistance because it may imply a change of thinking and approach. Lack of access to reliable data and adequate IT infrastructure to support the use of statistical process control can hinder application in practice. Data collection and analysis for statistical process control can be time-consuming.
Facilitators	Training users in statistical process control methodology and ensuring expert technical support is available can facilitate successful application. Development of easy-to-use IT tools for data management and statistical process control charting can also help. Focusing statistical process control on clinical topics can capture the interest of clinicians.

A further challenge is that statistical process control charts are not necessarily easy to build or interpret. Even with a run chart, for example, practitioners face differing advice on interpretation. Three sets of run chart rules – the Anhøj, Perla, and Carey rules – have been published, but they differ significantly in their sensitivity and specificity for detecting special causes of variation,¹⁹,²⁰ and there is little practical guidance on how to proceed. So perhaps it is not surprising that the literature features multiple examples of technical errors. After examining 64 statistical process control charts, Koetsier et al.⁴⁴ report that almost half (48.4%) used insufficient data points, 43.7% did not transform skewed data, and 14% did not report the rules for identifying special causes of variation. The authors conclude that many published studies did not follow all methodological criteria and so increased the risk of drawing incorrect conclusions. They call for greater clarity in reporting statistical process control charts along with greater adherence to methodological criteria. All this suggests a need for more training for those constructing charts and greater involvement of statistical process control experts.
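As an illustration of what one such rule set involves, here is a minimal sketch of the Anhøj rules as they are commonly formulated. The text above does not state these thresholds, so they are an assumption here: a shift is signalled when the longest run of points on one side of the median exceeds round(log2(n) + 3), and a crossings signal is raised when the line crosses the median fewer times than the lower 5th percentile of a Binomial(n − 1, 0.5) distribution, where n counts the points not lying on the median. The function name and test series are hypothetical.

```python
from math import comb, log2
from statistics import median


def anhoej_run_chart(values):
    """Apply the two Anhøj run chart tests (as assumed in the lead-in)."""
    centre = median(values)
    # Points lying exactly on the median are ignored
    sides = [v > centre for v in values if v != centre]
    n = len(sides)
    if n < 2:
        raise ValueError("need at least two points off the median")

    longest_run = current_run = 1
    crossings = 0
    for previous, current in zip(sides, sides[1:]):
        if current == previous:
            current_run += 1
            longest_run = max(longest_run, current_run)
        else:
            crossings += 1
            current_run = 1

    # Upper limit for the longest run
    run_limit = round(log2(n) + 3)

    # Lower 5th percentile of Binomial(n - 1, 0.5), via the exact cumulative sum
    cumulative, crossings_limit = 0, 0
    for k in range(n):
        cumulative += comb(n - 1, k)
        if cumulative / 2 ** (n - 1) >= 0.05:
            crossings_limit = k
            break

    return {
        "longest_run": longest_run,
        "crossings": crossings,
        "run_limit": run_limit,
        "crossings_limit": crossings_limit,
        "shift_signal": longest_run > run_limit,
        "crossings_signal": crossings < crossings_limit,
    }


# Hypothetical series: the level steps up halfway through
print(anhoej_run_chart([1] * 12 + [3] * 12))
```

The Perla and Carey rules use different fixed thresholds, which is one reason the three sets can disagree on the same data.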

4.2 Two Types of Errors When Using Statistical Process Control

Classifying variation into common cause or special cause is the primary focus of statistical process control methodology. In practice, this classification is subject to two types of error⁴,¹³,¹⁵,²⁸,⁴⁵ (see Box 7), which can be compared to an imperfect screening test that sometimes shows that a patient has disease when in fact the patient is free from disease (false positive), or that the patient is free from disease when in fact the patient has disease (false negative).

Box 7 Two types of error when using statistical process control

Error 1: Treating an outcome resulting from a common cause as if it were a special cause and (wrongly) seeking to find a special cause, when in fact the cause is the underlying process.
Error 2: Treating an outcome resulting from a special cause as if it were a common cause and so (wrongly) overlooking the special cause.

Either error can cause losses. If all outcomes were treated as special cause variation, this maximises the losses from error 1. And if all outcomes were treated as common cause variation, this maximises the losses from error 2. Unfortunately, in practice, it is impossible to reduce both errors to zero and so a choice must be made to set the control limit. Shewhart concluded that it was best to make either error rarely and that this mainly depended upon how much it might cost to look for trouble in a stable process unnecessarily.⁴,⁴⁵ Using mathematical theory, empirical evidence, and pragmatism, he argued that setting control limits to ± three standard deviations from the mean provides a reasonable balance between making either type of error.

The choice of three standard deviations ensures there is a relatively small chance that an investigation of special cause variation will be unfounded because the chances of a false alarm are relatively low. The sensitivity (to special causes) could be increased by narrowing the control limits to, say, two standard deviations. Although this will increase sensitivity, it will also increase the chances of false alarms. The extent to which this is acceptable requires decision-makers to balance the total costs (e.g. time, money, human resources, quality, safety, reputation) of investigating (true or false) signals versus the costs of overlooking these signals (and so not investigating). In practice, this is a matter of judgement which varies with context. Nevertheless, in the era of ‘big data’ in healthcare (see Section 4.6) the issue of false alarms needs greater appreciation and attention.
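The trade-off between limits at two and three standard deviations can be illustrated with a short calculation. The sketch below assumes that points from a stable process are independent and normally distributed, which is a simplifying assumption made here for illustration only; Shewhart's argument for three standard deviations also rested on empirical evidence and pragmatism. The function names are hypothetical.

```python
from math import erfc, sqrt


def false_alarm_probability(k):
    """Two-sided tail probability P(|Z| > k) for a standard normal Z."""
    return erfc(k / sqrt(2))


def chance_of_any_false_alarm(k, n_points):
    """Chance of at least one false alarm among n independent stable points."""
    return 1 - (1 - false_alarm_probability(k)) ** n_points


for k in (2, 3):
    p = false_alarm_probability(k)
    print(f"limits at {k} SD: {p:.4f} per point; "
          f"{chance_of_any_false_alarm(k, 100):.2f} over 100 points")
```

Under these assumptions the per-point false alarm probability is about 0.0455 with limits at two standard deviations and about 0.0027 with limits at three. The second function shows why false alarms matter more as the number of monitored points or indicators grows.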

4.3 Using More Than One Statistical Process Control Chart

Although earlier sections have shown some examples of plotting more than one statistical process control chart for the same data, the literature tends to encourage people to identify the most appropriate single control chart. This offers a useful starting point, especially for beginners, but recognition is growing that use of two (or more) charts of the same data can offer useful insights that might not otherwise be noticed.⁴⁶,⁴⁷

For example, Figure 25 shows inspection data for the proportion of defective manufactured goods described by Deming.⁴⁵ The data are charted using two types of statistical process control chart: the p-chart (left panel) and the XmR chart (right panel). Each chart shows a central line and control limits at three standard deviations from the mean. While each chart appears to show common cause variation, marked differences in the width of the control limits across the two charts are evident. This suggests something unusual about these data. As Deming explains, these inspection figures were falsified (a special cause) because the inspector feared the plant would be closed if the proportion of defective goods went beyond 10%.⁴⁵ So, a systematic special cause has affected all the data (not just a few data points), which is why the limits differ between the two charts. This means that relying only on one chart risks overlooking the existence of this underlying special cause, whereas using two charts side by side provides additional insight.

Two statistical process control charts show daily inspection data for the proportion of defective manufactured products. The left panel is a p-chart and the right panel is an XmR-chart. Each chart has a central horizontal line as well as lines showing the upper and lower control limits at three standard deviations from the mean. The width of the control limits differs widely across the two charts, with the control limits at just over 0.03 and just under 0.15 on the p-chart but at just under 0.12 and just over 0.06 on the XmR-chart, while the central line is at 0.09 on both charts.

Figure 25 Two side-by-side statistical process control charts showing daily proportion of defective products. The left panel is a p-chart and the right panel is an XmR-chart. The difference in control limits indicates an underlying special cause even though each chart appears to be consistent with common cause variation when viewed alone
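The comparison of p-chart and XmR limits can be sketched in a few lines. The p-chart limits come from the binomial model (p̄ ± 3√(p̄(1 − p̄)/n)), whereas the XmR limits come from the point-to-point variation actually observed (mean ± 2.66 × average moving range). The daily counts and the sample size of 200 below are hypothetical and are not Deming's data; they are chosen only to mimic figures that vary less than the binomial model would predict.

```python
from math import sqrt
from statistics import mean


def p_chart_limits(defectives, sample_size):
    """Return (centre, lower, upper) for a p-chart with a constant sample size."""
    p_bar = sum(defectives) / (len(defectives) * sample_size)
    half_width = 3 * sqrt(p_bar * (1 - p_bar) / sample_size)
    return p_bar, max(0.0, p_bar - half_width), min(1.0, p_bar + half_width)


def xmr_limits(values):
    """Return (centre, lower, upper) for an XmR chart of the same values."""
    centre = mean(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    half_width = 2.66 * mean(moving_ranges)
    return centre, centre - half_width, centre + half_width


# Hypothetical inspection results: defectives found in 200 items each day
sample_size = 200
defectives = [18, 17, 19, 18, 20, 16, 18, 19, 17, 18, 19, 17]
proportions = [d / sample_size for d in defectives]

p_centre, p_lower, p_upper = p_chart_limits(defectives, sample_size)
x_centre, x_lower, x_upper = xmr_limits(proportions)
print(f"p-chart: {p_lower:.3f} to {p_upper:.3f} around {p_centre:.3f}")
print(f"XmR:     {x_lower:.3f} to {x_upper:.3f} around {x_centre:.3f}")
```

When the XmR limits are much narrower than the p-chart limits, as in this made-up series, the data are varying less than chance alone would suggest, which is a prompt to ask how the figures were produced.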

Although decision-makers may not routinely have time and space to review multiple types of statistical control charts, analysts working with data might well seek to consider and explore the use of more than one chart. The additional insight gained could prove useful and requires little extra effort, especially if using software to produce the charts. Henderson et al.⁴⁷ suggest the combined use of run chart and CUSUM plots, and Rogers et al.²¹ and Sherlaw-Johnson et al.⁴⁸ suggest the use of combined CUSUM-type charts.

4.4 The Risks of Risk-Adjusted Statistical Process Control in Healthcare

A distinctive feature of applying statistical process control in healthcare versus industry is the use of risk adjustment to reflect differences between patients.¹³ Typically, this type of control chart relies on a statistical model to estimate the risk of death for a given patient and then compare this with the observed outcome (died or survived). When using risk-adjusted charts, the explanation for a signal of special cause variation might be thought to lie beyond the risk profile of the patient. But this approach is flawed: it fails to recognise that risk adjustment, although widely used in healthcare, is not a panacea and poses its own risks.³⁴,⁴⁹–⁵⁵
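The comparison of predicted risk with observed outcome can be sketched using the variable life-adjusted display, one of the CUSUM-type charts mentioned earlier. In its usual form, the plotted value after each patient is the running total of predicted risk minus observed outcome (1 for died, 0 for survived), so the line rises when patients survive and falls when they die. The function name, predicted risks, and outcomes below are hypothetical, and the sketch simply takes the model's risks as given, which is exactly the assumption this section questions.

```python
from itertools import accumulate


def vlad(predicted_risks, died):
    """Cumulative expected-minus-observed deaths after each consecutive patient."""
    if len(predicted_risks) != len(died):
        raise ValueError("one outcome is needed for every predicted risk")
    # Each patient contributes (risk - 1) if they died, or (risk - 0) if they survived
    return list(accumulate(risk - int(outcome)
                           for risk, outcome in zip(predicted_risks, died)))


# Hypothetical consecutive patients: model-predicted risk of death and outcome
risks = [0.10, 0.20, 0.05, 0.50]
outcomes = [False, True, False, False]
print([round(v, 2) for v in vlad(risks, outcomes)])
```

If the model understates risk for some patients, each of their deaths pulls the line down by more than it should, so a falling line may reflect the model rather than the care provided.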

For example, a systematic review of studies that examined the relationship between quality of care and risk-adjusted outcomes found counter-intuitive results: an ‘intuitive’ relationship (better care was associated with lower risk-adjusted death rates) was found in around half of the 52 relationships; the remainder showed either no correlation (there was no correlation between quality of care and risk-adjusted death rates) or a ‘paradoxical’ correlation (higher quality of care was associated with higher risk-adjusted death rates). The authors conclude that ‘the link between quality of care and risk-adjusted mortality remains largely unreliable’.⁵¹

The consequences of prematurely inferring problems with the quality of care on the basis of casemix-adjusted statistical control charts can be serious (see Box 8).

Box 8 Wrongly suggesting that the special cause variation after risk adjustment implies problems with quality of care

A renowned specialist hospital received a letter from the Care Quality Commission informing them that a risk-adjusted hospital mortality monitoring scheme had signalled an unacceptably high death rate: 27.8 deaths were expected, but 46 had been observed. In 2017, senior hospital staff wrote in The Lancet:

One might ask, however, what harm is done? After all, it is better to monitor than not and a hospital falsely accused of being a negative outlier can defend itself with robust data and performance monitoring. That is true but, because of this spurious alert, our hospital morale was shaken; management and trust board members were preoccupied with this issue for weeks; and our already stretched audit department expended over 50 person-hours of work reviewing data and formulating a response to satisfy the Care Quality Commission that we are most certainly not a negative outlier, but a unit with cardiac results among the best in the country.⁵²

Another example is the use of the Partial Risk Adjustment in Surgery model, which fails to adjust for certain comorbid conditions and underestimates the risk for the highest-risk patients. This reportedly led to a negative impression of performance in one UK centre that was involved in real-time monitoring of risk-adjusted paediatric cardiac surgery outcomes (for procedures carried out during 2010 and 2011) using variable life-adjusted display plots.⁵³

Another crucial issue with risk-adjustment schemes is that they operate under the assumption of a constant relationship between patient risk factors and the outcome (e.g. between age and mortality). But if this relationship is not constant, then risk adjustment may increase rather than decrease the very bias it was designed to overcome.Reference Nicholl⁵⁰, Reference Mohammed, Deeks and Girling⁵⁴
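A small worked example may help to show how this can happen. The sketch below uses entirely hypothetical counts for two hospitals with the same crude death rate (8%). It assumes that hospital B records a comorbidity flag more liberally than hospital A, so that the flag does not carry the same level of risk in both places. Expected deaths are then derived from stratum-specific rates pooled across both hospitals, which is one simple form of risk adjustment and not the method of any particular scheme.

```python
# Hypothetical counts: (patients, deaths) per stratum in each hospital.
# 'flagged' means a comorbidity was recorded; hospital B records it more
# liberally, so the flag carries a different level of risk there.
hospitals = {
    "A": {"flagged": (200, 40), "unflagged": (800, 40)},
    "B": {"flagged": (400, 50), "unflagged": (600, 30)},
}


def pooled_rates(data):
    """Death rate per stratum, pooled over all hospitals."""
    rates = {}
    for stratum in ("flagged", "unflagged"):
        patients = sum(h[stratum][0] for h in data.values())
        deaths = sum(h[stratum][1] for h in data.values())
        rates[stratum] = deaths / patients
    return rates


def crude_rate(hospital):
    """Unadjusted death rate for one hospital."""
    patients = sum(n for n, _ in hospital.values())
    deaths = sum(d for _, d in hospital.values())
    return deaths / patients


def observed_expected_ratio(hospital, rates):
    """Observed deaths divided by deaths expected from the pooled rates."""
    observed = sum(d for _, d in hospital.values())
    expected = sum(n * rates[s] for s, (n, _) in hospital.items())
    return observed / expected


rates = pooled_rates(hospitals)
crude = {name: crude_rate(h) for name, h in hospitals.items()}
ratios = {name: observed_expected_ratio(h, rates) for name, h in hospitals.items()}

for name in hospitals:
    print(f"Hospital {name}: crude rate = {crude[name]:.3f}, "
          f"observed/expected = {ratios[name]:.3f}")
```

In this constructed case the crude rates are identical, yet the adjusted ratios place hospital A above 1 and hospital B below 1: the adjustment has created a difference that the unadjusted figures do not show.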

This misconception – that casemix-adjusted outcomes can be reliably used to blame quality of care – is termed the ‘casemix adjustment fallacy’.Reference Lilford, Mohammed, Spiegelhalter and Thomson³⁴ This bear trap can be avoided by adopting the Pyramid Model of investigation, described in Section 3.6, which underscores the point that casemix adjustment has its own risks, and that care needs to be taken when interpreting casemix-adjusted analyses.Reference Lilford, Mohammed, Spiegelhalter and Thomson³⁴

4.5 Methodological Controversies in Statistical Process Control

Debates between methodologists on the correct way to think about the design and use of statistical process control charts have been a long-standing feature of the technical literature.Reference Woodall⁴⁰ Our purpose here is not to review these issues, but simply to highlight that these controversies have existed for decades. For example, in 1999 Nelson wrote a note addressing five misconceptions relating to Shewhart control charts (set out in Box 9).Reference Nelson⁵⁵

Box 9 Five misconceptions that have led to methodological controversy in respect of Shewhart control chartsReference Nelson⁵⁵

1. Shewhart charts are a graphical way of applying a sequential statistical significance test for an out-of-control condition.
2. Control limits are confidence limits on the true process mean.
3. Shewhart charts are based on probabilistic models, subject to or involving chance variation.
4. Normality is required for the correct application of a mean (or x bar) chart.
5. The theoretical basis for the Shewhart control chart has some obscurities that are difficult to teach.

Contrary to what is found in many articles and books, all five of these statements are incorrect.

In a similar vein, Blackstone’s 2004 analysis provided a surgeon’s critique of the methodological issues of using risk-adjusted CUSUM charts to monitor surgical performance.Reference Blackstone⁵⁶ One feature of CUSUM-based schemes is that they appear to place considerably more emphasis on statistical significance testing than Shewhart control charts. Blackstone notes that while most ‘discussants’ agree that continual testing is ‘in some sense’ subject to the multiple comparisons problem, and one’s interpretation must be affected by how often the data are evaluated, some statisticians maintain that the multiple comparison problem ‘is not applicable to the quality control setting’. Blackstone goes on to say, ‘I am not sure what to believe, frankly, nor do I think this issue will be soon resolved.’ Happily, practitioners can profit from the use of statistical process control methodology without having to address these controversies.Reference Woodall⁴⁰

4.6 The Future of Statistical Process Control in Healthcare

The use of statistical process control to support efforts to monitor and improve the quality of healthcare is well established, with calls to extend its use.Reference O’Brien, Viney, Doherty and Thomas⁵⁷–Reference O’Sullivan, Chang, Baker and Shah⁶⁰ Although it ‘cannot solve all problems and must be applied wisely’,Reference Thor, Lundberg and Ask¹² the future for statistical process control in healthcare looks promising, including wider use across clinical and managerial processes. However, the use of statistical process control methodology at scale presents some additional unique challenges.Reference Jensen, Szarka and White⁶¹–Reference Suter-Crazzolara⁶³

As an example, consider a hospital with five divisions, each with five wards: a single measure (such as staff absence) plotted on a control chart leads to 25 charts across the wards plus five charts across the divisions and one chart across the organisation (31 charts in total). Rolling out control charts across an entire organisation would require practical ways for staff to easily produce and collate charts (see Box 10).
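The arithmetic in this example generalises readily. The following sketch counts the charts needed for a three-level hierarchy (wards, divisions, organisation). The function name and the `measures` parameter are hypothetical, and it assumes that every division has the same number of wards.

```python
def charts_needed(divisions: int, wards_per_division: int, measures: int = 1) -> int:
    """Charts for every ward, every division, and the whole organisation."""
    ward_charts = divisions * wards_per_division
    division_charts = divisions
    organisation_charts = 1
    return measures * (ward_charts + division_charts + organisation_charts)


# The example in the text: 25 ward charts + 5 division charts + 1 overall.
print(charts_needed(divisions=5, wards_per_division=5))

# The same hospital tracking 20 measures (a hypothetical figure).
print(charts_needed(divisions=5, wards_per_division=5, measures=20))
```

Because the total grows in proportion to the number of measures, even a modest set of indicators quickly produces hundreds of charts.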

Box 10 Scaling up control charts across East London NHS Foundation TrustReference Jensen, Szarka and White⁶¹

East London NHS Foundation Trust (ELFT), established in 2000, provides mental health and community health services to a culturally diverse and socio-economically deprived catchment area of approximately 1.5 million people.

In 2014, ELFT launched its trust-wide quality improvement programme, which has adopted the Institute for Healthcare Improvement’s Model for Improvement using tools such as PDSA cycles, driver diagrams, and statistical process control charts. This commitment developed from a desire to shift power in the organisation so that service users, carers, and staff were better able to understand and improve the quality of care being provided.

An important challenge was to capture the learning at team level. Teams recorded their PDSA tests of change locally using paper or local IT systems. This was not reliable, so the IT team at ELFT developed an online quality improvement platform to make it much easier for teams to log their PDSAs, create driver diagrams, and input and view their data as control charts. The IT system supports the production of statistical process control charts, which usually require baselines to be fixed, limits to be recalculated following a successful change, and annotations to be added that highlight the changes. Given the scores of charts to choose from, the automation of charts overcomes an important barrier, especially for new users. The Inpatient Mental Health Analytics app has 34,650 statistical process control charts, with over 100,000 charts across the organisation.
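The fixing of a baseline and the recalculation of limits after a change can be illustrated with a minimal sketch. It is not a description of the ELFT platform. It assumes an individuals (XmR) chart, whose limits are conventionally set at the mean plus or minus 2.66 times the average moving range. The data, the function name `xmr_limits`, and the timing of the change are all hypothetical.

```python
from statistics import mean


def xmr_limits(values):
    """Lower limit, centre line, and upper limit for an individuals (XmR) chart."""
    if len(values) < 2:
        raise ValueError("need at least two points to form a moving range")
    # Moving range: absolute difference between consecutive points.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    centre = mean(values)
    half_width = 2.66 * mean(moving_ranges)  # conventional XmR constant
    return centre - half_width, centre, centre + half_width


# Hypothetical weekly counts; a change is introduced after week 8.
weekly = [12, 14, 11, 13, 15, 12, 14, 13, 8, 7, 9, 8, 7, 9, 8, 7]
change_at = 8

baseline = xmr_limits(weekly[:change_at])      # limits fixed from the baseline
after_change = xmr_limits(weekly[change_at:])  # limits recalculated afterwards

for label, (lower, centre, upper) in [("baseline", baseline),
                                      ("after change", after_change)]:
    print(f"{label}: lower={lower:.2f}, centre={centre:.2f}, upper={upper:.2f}")
```

Fixing the baseline limits means that later points are judged against the process as it was before the change. Recalculating them afterwards describes the new process, if the change has been sustained.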

Where there are thousands of control charts, users also need an effective way to collate them so they can still see the wood for the trees. This is an active area of research, involving proposals based on summary measures shown on a single graph to visualise the many control charts and spot the ones of most concern.Reference Jensen, Szarka and White⁶¹–Reference Suter-Crazzolara⁶³

A related issue is the massive proliferation of automatically collected digital data in healthcare.Reference Gopal, Suter-Crazzolara, Toldo and Eberhardt⁶⁴, Reference Qiu⁶⁵ It has been estimated that up to 30% of the entire world’s stored data is health related.Reference Qiu⁶⁵ This so-called big data is characterised by high volume (e.g. a single patient generates up to 80 megabytes of data annually, which is about 40,000 pages), high velocity (e.g. patient movement can be automatically collected every 30 seconds), and high variety (with multiple sources of data, which include test results, images, text, movement, etc.). Although several researchersReference Megahed, Jones-Farmer, In: Knoth and Schmid⁶⁶–Reference Woodall and Faltin⁶⁸ have suggested that statistical process control charting may be useful to monitor big data over time, a number of methodological challenges need to be addressed, including the careful choice of the sampling and collection interval.Reference Zwetsloot and Woodall⁶⁹ For example, if data are available every second, should they be charted every second, minute, hour, and so on? Also, as more variables are monitored more often, it becomes increasingly important to keep the number of false alarms at a manageable level. A false alarm is a signal of special cause variation when no special cause is present; if there are too many false alarms, then the monitoring scheme becomes ineffective and discredited. The successful use of control charts in the era of big data will require low false-alarm rates.
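The effect of monitoring many charts on the number of false alarms can be sketched with some simple arithmetic. The calculation below is an illustration under strong assumptions that are not made in the text: each plotted point is treated as an independent draw from a normal distribution, limits are set at three standard deviations, and the only rule applied is 'a point outside the limits'. The chart counts are those mentioned earlier in this section; the function names are hypothetical.

```python
import math


def false_alarm_probability(sigma_limit: float = 3.0) -> float:
    """P(|Z| > sigma_limit) for a standard normal Z (an assumed model)."""
    return math.erfc(sigma_limit / math.sqrt(2))


def expected_false_alarms(charts: int, p: float) -> float:
    """Expected number of charts signalling falsely at a single time point."""
    return charts * p


def probability_any_false_alarm(charts: int, p: float) -> float:
    """P(at least one false signal) across independent charts."""
    return 1.0 - (1.0 - p) ** charts


p = false_alarm_probability()
for charts in (1, 31, 34_650):
    print(f"{charts:>6} charts: "
          f"expected false alarms = {expected_false_alarms(charts, p):.2f}, "
          f"P(at least one) = {probability_any_false_alarm(charts, p):.4f}")
```

Under these assumptions, a per-chart false-alarm probability that looks negligible for one chart translates into false signals at almost every time point once tens of thousands of charts are monitored.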

So, while the future of statistical process control methodology appears promising, paradoxically, its use at scale needs to address some unique challenges.

5 Conclusions

Statistical process control methodology is based on a fundamental intuitive insight: that processes are subject to two sources of variation, common cause and special cause. As the case studies show, this profound insight enables us to understand, monitor, and improve a wide range of processes, such as a person’s handwritten signatures, a person’s blood pressure, the results from surgery, the performance of hospitals, and the progress of a pandemic. The methodology offers a useful, robust, versatile, statistical, practical, and evidence-based approach, but its successful application requires overcoming technical and non-technical barriers. Numerous studies now demonstrate that such barriers are surmountable. This highlights the remarkable progress of statistical process control methodology from manufacturing industry in the 1920s to present-day healthcare.

6 Further Reading

Constructing Statistical Process Control Charts

Mohammed et al.Reference Mohammed, Worthington and Woodall¹⁷ – a step-by-step tutorial paper to show practitioners how to produce commonly used Shewhart control charts.
Provost and MurrayReference Provost and Murray¹⁵ – a comprehensive book that focuses on the use of statistical process control in healthcare with worked examples on how to produce a wide range of control charts.
Rogers et al.Reference Rogers, Reeves and Caputo²¹ – an overview of the use of CUSUM-type plots that are commonly used to monitor outcomes in surgery.

Statistical Process Control Methodology in Healthcare

Thor et al.Reference Thor, Lundberg and Ask¹² – a systematic review of the application of statistical process control in healthcare improvement that also highlights the barriers and enablers to successful use of these methods.
Tennant et al.Reference Tennant, Mohammed, Coleman and Martin⁷ – a systematic review of the use of control charts to monitor individual patients.

Methodological Challenges and Controversies

WoodallReference Woodall⁴⁰ – discusses some of the key methodological controversies in statistical process control.
Woodall and FaltinReference Woodall and Faltin⁶⁸ – highlight some of the key challenges of using statistical process control at scale and how they might be overcome.

Conflicts of interest

None.

Acknowledgements

I thank the peer reviewers for their insightful comments and recommendations to improve the Element. A list of peer reviewers is published at www.cambridge.org/IQ-peer-reviewers. I would also like to thank Steve Flood and Claire Dipple for their help in preparing this Element.

Funding

This Element was funded by THIS Institute (The Healthcare Improvement Studies Institute, www.thisinstitute.cam.ac.uk). THIS Institute is strengthening the evidence base for improving the quality and safety of healthcare. THIS Institute is supported by a grant to the University of Cambridge from the Health Foundation – an independent charity committed to bringing about better health and healthcare for people in the United Kingdom.

About the Author

Mohammed Amin Mohammed is Emeritus Professor of Healthcare Quality and Effectiveness at the Faculty of Health Studies, University of Bradford, and Principal Consultant at the Strategy Unit. His main areas of interest are healthcare quality, performance measurement and monitoring, and more generally health services research in primary and secondary care.

Creative Commons License

The online version of this work is published under a Creative Commons licence called CC-BY-NC-ND 4.0 (https://creativecommons.org/licenses/by-nc-nd/4.0). It means that you’re free to reuse this work. In fact, we encourage it. We just ask that you acknowledge THIS Institute as the creator, you don’t distribute a modified version without our permission, and you don’t sell it or use it for any activity that generates revenue without our permission. Ultimately, we want our work to have impact. So if you’ve got a use in mind but you’re not sure it’s allowed, just ask us at [email protected].

The printed version is subject to statutory exceptions and to the provisions of relevant licensing agreements, so you will need written permission from Cambridge University Press to reproduce any part of it.

All versions of this work may contain content reproduced under licence from third parties. You must obtain permission to reproduce this content from these third parties directly.

Editors-in-Chief

Mary Dixon-Woods
THIS Institute (The Healthcare Improvement Studies Institute)
Mary is Director of THIS Institute and is the Health Foundation Professor of Healthcare Improvement Studies in the Department of Public Health and Primary Care at the University of Cambridge. Mary leads a programme of research focused on healthcare improvement, healthcare ethics, and methodological innovation in studying healthcare.

Graham Martin
THIS Institute (The Healthcare Improvement Studies Institute)
Graham is Director of Research at THIS Institute, leading applied research programmes and contributing to the institute’s strategy and development. His research interests are in the organisation and delivery of healthcare, and particularly the role of professionals, managers, and patients and the public in efforts at organisational change.

Executive Editor

Katrina Brown
THIS Institute (The Healthcare Improvement Studies Institute)
Katrina was Communications Manager at THIS Institute, providing editorial expertise to maximise the impact of THIS Institute’s research findings. She managed the project to produce the series until 2023.

Editorial Team

Sonja Marjanovic
RAND Europe
Sonja is Director of RAND Europe’s healthcare innovation, industry, and policy research. Her work provides decision-makers with evidence and insights to support innovation and improvement in healthcare systems, and to support the translation of innovation into societal benefits for healthcare services and population health.

Tom Ling
RAND Europe
Tom is Head of Evaluation at RAND Europe and President of the European Evaluation Society, leading evaluations and applied research focused on the key challenges facing health services. His current health portfolio includes evaluations of the innovation landscape, quality improvement, communities of practice, patient flow, and service transformation.

Ellen Perry
THIS Institute (The Healthcare Improvement Studies Institute)
Ellen supported the production of the series during 2020–21.

Gemma Petley
THIS Institute (The Healthcare Improvement Studies Institute)
Gemma is Senior Communications and Editorial Manager at THIS Institute, responsible for overseeing the production and maximising the impact of the series.

Claire Dipple
THIS Institute (The Healthcare Improvement Studies Institute)
Claire is Editorial Project Manager at THIS Institute, responsible for editing and project managing the series.

About the Series

The past decade has seen enormous growth in both activity and research on improvement in healthcare. This series offers a comprehensive and authoritative set of overviews of the different improvement approaches available, exploring the thinking behind them, examining evidence for each approach, and identifying areas of debate.
