Can quality improvement improve the quality of care? A systematic review of reported effects and methodological rigor in plan-do-study-act projects

The Plan-Do-Study-Act (PDSA) method is widely used in quality improvement (QI) strategies. However, previous studies have indicated that methodological problems are frequent in PDSA-based QI projects. Furthermore, it has been difficult to establish an association between the use of PDSA and improvements in clinical practices and patient outcomes. The aim of this systematic review was to examine the self-reported effects of recently published PDSA-based QI projects and whether the projects were conducted according to key features of the method.

Methods

A systematic literature search was performed in the PubMed, Embase and CINAHL databases. QI projects using PDSA published in peer-reviewed journals in 2015 and 2016 were included. Projects were assessed to determine the reported effects and the use of the following key methodological features: iterative cyclic method, continuous data collection, small-scale testing and use of a theoretical rationale.

Results

Of the 120 QI projects included, almost all (98%) reported improvement. However, only 32 (27%) described a specific, quantitative aim and reached it. A total of 72 projects (60%) documented PDSA cycles sufficiently for inclusion in a full analysis of key features. Of these, only three (4%) adhered to all four key methodological features.

Conclusion

Even though a majority of the QI projects reported improvements, the widespread lack of adherence to key methodological features in the individual projects poses a challenge to the legitimacy of PDSA-based QI. This review indicates a continued need for improvement in quality improvement methodology.

Background

Plan-Do-Study-Act (PDSA) cycles are widely used for quality improvement (QI) in most healthcare systems, where tools and models inspired by industrial management have become influential [1]. The essence of the PDSA cycle is to structure the process of improvement in accordance with the scientific method of experimental learning [2,3,4,5]. Consecutive iterations of the cycle constitute a framework for continuous learning through the testing of changes [6,7,8,9,10].
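To make the logic of the cycle concrete, the following is a minimal sketch of consecutive PDSA iterations expressed as a loop. The hand-hygiene scenario, thresholds and helper names are hypothetical; the sketch illustrates only the plan-predict-test-study-adapt structure described above, not any published implementation.

```python
"""Minimal sketch of consecutive PDSA cycles as a learning loop.

The 'hand-hygiene compliance' scenario, numbers and names are
hypothetical; they only illustrate the iterative structure."""
import random

random.seed(1)  # reproducible toy output

def run_pdsa(aim: float, baseline: float, max_cycles: int = 5) -> None:
    measure = baseline
    for cycle in range(1, max_cycles + 1):
        # Plan: state an explicit, quantitative prediction for this cycle.
        prediction = measure + 5.0
        # Do: carry out the change on a small scale and measure the result
        # (here simulated with a random effect).
        measure += random.uniform(2.0, 8.0)
        # Study: compare the measured result with the prediction.
        print(f"Cycle {cycle}: measured {measure:.1f}% "
              f"(predicted {prediction:.1f}%)")
        # Act: adopt the change if the aim is met; otherwise adapt the
        # plan and let this cycle inform the next.
        if measure >= aim:
            print("Aim reached; change adopted and scaled up.")
            return
    print("Aim not reached; change adapted or abandoned.")

run_pdsa(aim=90.0, baseline=70.0)
```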

The concept of improvement through iterative cycles has formed the basis for numerous structured QI approaches including Total Quality Management, Continuous Quality Improvement, Lean, Six Sigma and the Model for Improvement [4, 6, 10]. These “PDSA models” differ in approach but essentially combine improvement cycles, as the cornerstone, with a bundle of features from the management literature. Within healthcare especially, several PDSA models have been proposed for QI, adding further methodological features to the basic principle of iterative PDSA cycles. Key methodological features include the use of continuous data collection [2, 6, 8,9,10,11,12,13], small-scale testing [6, 8, 10, 11, 14,15,16] and use of a theoretical rationale [5, 9, 17,18,19,20,21,22]. Most projects are initiated in the complex social context of daily clinical work [12, 23]. In these settings, a focus on these key methodological features ensures quality and consistency by supporting adaptation of the project to its specific context and minimizing the risk of introducing harmful or wasteful unintended consequences [10]. Thus, the PDSA cycle is not sufficient as a standalone method [4], and the integration of the full bundle of key features is often simply referred to as the PDSA method (Fig. 1).

Fig. 1 The PDSA method: iterative PDSA cycles integrated with the bundle of key methodological features

Since its introduction to healthcare in the 1990s, numerous QI projects have been based on the PDSA method [10, 24]. However, the scientific literature indicates that the evidence for its effect is limited [10, 25,26,27,28,29,30]. The majority of published PDSA projects have been hampered by severe design limitations, insufficient data analysis and incomplete reporting [12, 31]. A 2013 systematic review revealed that only 2 of 73 projects reporting use of the PDSA cycle applied the PDSA method in accordance with the methodological recommendations [10]. These methodological limitations have led to an increased awareness of the need for more methodological rigor when conducting and reporting PDSA-based projects [4, 10]. This challenge is addressed by the emergent field of Improvement Science (IS), which attempts to systematically examine the methods and factors that best facilitate QI by drawing on a range of academic disciplines and encouraging rigorous use of scientific methods [5, 12, 32, 33]. It is important to distinguish between local QI projects, where the primary goal is to secure a change, and IS, where the primary goal is evaluation and scientific advancement [12].

In order to improve local QI projects, the Standards for Quality Improvement Reporting Excellence (SQUIRE) guidelines have been developed to provide a framework for reporting QI projects [18, 34]. Still, it remains unclear to what extent this increasing methodological awareness is reflected in PDSA-based QI projects published in recent years. We therefore performed a systematic review of recent peer-reviewed publications reporting QI projects using the PDSA method in healthcare, focusing on the use of key features in the design and on the reported effects of the projects.

Methods

The key features of PDSA-based QI projects were identified, and a simple but comprehensive framework was constructed. The review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [35].

The framework

Informed by recommendations on the use and support of PDSA in the literature specific to QI in healthcare, the following key features were identified: an iterative cyclic method, continuous data collection, small-scale testing and use of a theoretical rationale.

Aiming for conceptual simplicity, we established basic minimum requirements for the presence of the key features, operationalizing them into binary (yes/no) variables. General characteristics and supplementary data that elaborated on the use of the key features were operationalized and registered as categorical variables. See Table 1 for an overview of the framework and Additional file 1 for a more in-depth elaboration of the definitions used for the key features. Since a theoretical rationale can take multiple forms, the definition of this feature was taken from the most recent version of the SQUIRE guidelines [18].
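As an illustration of how such a binary operationalization can be handled in practice, the sketch below encodes the four key features as yes/no fields with an overall adherence check. The field names and inline criteria paraphrase the text; the exact operational definitions are those given in Additional file 1, so this structure is an illustrative assumption rather than the actual assessment instrument.

```python
"""Illustrative sketch of the binary (yes/no) framework for assessing
key features. Field names and criteria comments paraphrase the text;
the exact operational definitions are in Additional file 1."""
from dataclasses import dataclass, fields

@dataclass
class KeyFeatureAssessment:
    iterative_cycles: bool       # consecutive cycles, each informing the next
    continuous_data: bool        # >= 3 data points at regular intervals
    small_scale_testing: bool    # changes first tested on a small scale
    theoretical_rationale: bool  # explicit rationale, as defined by SQUIRE

    def adheres_to_all(self) -> bool:
        """True only if all four key features are present."""
        return all(getattr(self, f.name) for f in fields(self))

# Example: a project with iterative cycles and continuous data,
# but full-scale testing and no stated rationale.
project = KeyFeatureAssessment(
    iterative_cycles=True,
    continuous_data=True,
    small_scale_testing=False,
    theoretical_rationale=False,
)
print(project.adheres_to_all())  # -> False
```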


An overview of the general characteristics, supplementary features and self-reported effects of the included projects is presented in Table 2.


Iterative cycles

Fifty-seven projects (79%) had a sequence of cycles in which one informed the actions of the next. A single iterative chain of cycles was used in 41 projects (57%), while four (5%) had multiple isolated iterative chains and 12 (17%) had a mix of iterative chains and isolated cycles. Of the 15 projects using non-iterative cycles, two reported a single cycle while 13 used multiple isolated cycles. The majority (55/72; 76%) tested one change per cycle.

Small-scale testing

Testing of changes on a small scale was carried out by 10 projects (14%); seven of these did so at an increasing scale, while two kept testing at the same scale. It was unclear which type of scaling was used in the remaining project. Sixty-two projects (86%) carried out testing on an entire department or engaged in full-scale implementation without first testing the improvement intervention.

Continuous data collection

Continuous measurements over time, with three or more data points at regular intervals, were used by 48 (67%) of the 72 projects. Of these 48, half used run charts, while the other half used control charts. Among the remaining projects, 18 (25%) measured before and after or per PDSA cycle, and five (7%) had a single data point as outcome after the cycle(s). One project did not report its data. Sixty-five projects (90%) used a baseline measurement for comparison.
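For readers unfamiliar with run charts, the following sketch plots a series of regular measurements against their median, the basic display used by half of the projects with continuous data collection. The monthly values are invented purely for illustration.

```python
"""Illustrative run chart: regular measurements over time plotted
against the median. The monthly values are invented toy data."""
import statistics
import matplotlib.pyplot as plt

months = list(range(1, 13))
values = [68, 71, 70, 74, 73, 77, 76, 80, 79, 82, 84, 83]
median = statistics.median(values)

plt.plot(months, values, marker="o", label="Measured value (%)")
plt.axhline(median, linestyle="--", label=f"Median = {median:.1f}%")
plt.xlabel("Month")
plt.ylabel("Outcome measure (%)")
plt.title("Run chart (illustrative data)")
plt.legend()
plt.show()
```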

Theoretical rationale

Twenty-six (36%) of the 72 projects explicitly stated the theoretical rationale of the project, describing why it was predicted to lead to improvement in the specific clinical context. Regarding inspiration for the need for improvement, 68 projects (94%) referred to scientific literature. For the QI interventions used in the projects, 26 (36%) found inspiration in externally existing knowledge in the form of scientific literature, previous QI projects or benchmarking. Twenty-one (29%) developed the interventions themselves, 10 (14%) used existing knowledge in combination with their own ideas, while 15 (21%) did not state the source.

Discussion

In this systematic review, nearly all PDSA-based QI projects reported improvements. However, only approximately one in four projects had defined a specific quantitative aim and reached it. In addition, only a small minority of the projects adhered to all four key features recommended in the literature to ensure the quality and adaptability of a QI project.

The claim that PDSA leads to improvement should therefore be interpreted with caution. The methodological limitations in many of the projects make it difficult to draw firm conclusions about the size and causality of the reported improvements in quality of care, and they call into question the legitimacy of PDSA as an effective improvement method in healthcare. The widespread lack of a theoretical rationale and of continuous data collection makes it difficult to track and correct the process, as well as to relate an improvement to the use of the method [10, 11]. The apparently limited use of the iterative approach and of small-scale testing constitutes an additional methodological limitation. Without these tools for testing and adapting, one risks introducing unintended consequences [1, 36]. Hence, QI initiatives may tamper with the system in unforeseen ways, creating more harm and waste than improvement. The low use of small-scale testing could perhaps originate in a widespread misunderstanding that one must test at large scale to obtain proper statistical power. However, this is not necessarily the case with PDSA [15].

There is no simple explanation for this lack of adherence to the key methodological features. Some scholars note that even though the concept of PDSA is relatively simple, it is difficult to master in practice [4]. Several explanations have been offered, including an urge to favour action over evidence [36], an inherent messiness in the actual use of the method [11], its inability to address “big and hairy” problems [37], an oversimplification of the method, and an underestimation of the resources and support needed to conduct a PDSA-based project [4].

In some cases, the lack of adherence to the methodological recommendations may reflect a problem with documentation rather than with methodological rigor; for example, the frequent absence of small-scale pilot testing may be due to the authors considering the information irrelevant to report, even though it was performed in the projects.

Regarding our framework, one could argue that it has too many or too few key features to encompass the PDSA method. The same can be said of the supplementary features, where additional features could also have been assessed, e.g. the use of Specific, Measurable, Attainable, Relevant and Timebound (SMART) goals [14]. It was important for us to operationalize the key features so that their presence could be identified easily and accurately. Simplification carries a risk of loss of information, but this can be outweighed by the benefits of a clear and applicable framework.

This review has some limitations. We only included PDSA projects reported in peer-reviewed journals, which represent just a fraction of all QI projects conducted around the globe. Further, it may be difficult to publish projects that do not document improvements, which introduces potential publication bias. Future studies could use the framework to examine the grey literature of evaluation reports and similar documents to see whether the pattern of methodological limitations is consistent. The fact that a majority of the projects reported positive change could also indicate such a bias: for busy QI practitioners, the effort of translating a clinical project into a publication may well be motivated by a positive finding, leaving projects with negative effects unreported. However, we should not forget that a negative outcome of a PDSA project may still contribute valuable learning and competence building [4, 6].

The field of IS, and collaboration between practitioners and scholars, has the potential to deliver crucial insight into the complex process of QI, including the difficulties of replicating projects with promising effects [5, 12, 20, 32]. Rigorous methodological adherence may be experienced as a restriction by practitioners, which could discourage engagement in QI initiatives. However, by strengthening the use of the key features and improving documentation, PDSA projects will be more likely to contribute to IS, including reliable meta-analyses and systematic reviews [10]. This could in turn provide QI practitioners with evidence-based knowledge [5, 38]. In this way, rigor in performing and documenting QI projects benefits the whole QI community in the long run. It is important that new knowledge becomes readily available and application oriented in order for practitioners to be motivated to use it. An inherent part of using the PDSA method is acknowledging the complexity of creating lasting improvement. Here, the scientific ideals of rigorous planning, execution, hypothesizing, data management and documentation should serve as inspiration.

Our framework could be taken to imply that the presence of all four features will inevitably result in the success of an improvement project. This is clearly not the case; no “magic bullets” exist in QI [39]. QI is about implementing complex projects in complex social contexts. Here, adherence to the key methodological recommendations and rigorous documentation can help ensure better quality and reproducibility. This review can serve as a reminder of these features and of how rigor in individual QI projects can assist the work of IS, which in turn can offer new insight for the benefit of practitioners.

Conclusion

This systematic review documents that substantial methodological challenges remain in the conduct and reporting of PDSA projects. These challenges pose a problem for the legitimacy of the method. Individual improvement projects should strive to contribute to a scientific foundation for QI by being conducted and documented with greater rigor. There remains a need for methodological improvement when conducting and reporting on QI initiatives.

Availability of data and materials

All data generated or analysed during this review are included in this published article and its supplementary information files.