Abstract
Background Clinical practice guidelines (CPGs) are not readily implemented in clinical practice. One of the impeding factors is that physical therapists do not hold realistic perceptions of their adherence to CPGs. Peer assessment (PA) is an implementation strategy that aims at improving guideline adherence by enhancing reflective practice, awareness of professional performance, and attainment of personal goals.
Objective The purpose of this study was to compare the effectiveness of PA with the usual case discussion (CD) strategy on adherence to CPGs for physical therapist management of upper extremity complaints.
Design A single-masked, cluster-randomized controlled trial with pretest-posttest design was conducted.
Intervention Twenty communities of practice (n=149 physical therapists) were randomly assigned to groups receiving PA or CD, with both interventions consisting of 4 sessions over 6 months. Both PA and CD groups worked on identical clinical cases relevant to the guidelines. Peer assessment focused on individual performance observed and evaluated by peers; CD focused on group discussion of the cases.
Outcomes Guideline adherence was measured with clinical vignettes, reflective practice was measured with the Self-Reflection and Insight Scale (SRIS), awareness of performance was measured via the correlation between perceived and assessed improvement, and attainment of personal goals was measured with written commitments to change.
Results The PA groups improved more on guideline adherence compared with the CD groups (effect=22.52; 95% confidence interval [95% CI]=2.38, 42.66; P=.03). The SRIS scores did not differ between PA and CD groups. Awareness of performance was greater for the PA groups (r=.36) than for the CD groups (r=.08) (effect=14.73; 95% CI=2.78, 26.68; P=.01). The PA strategy was more effective than the CD strategy in attaining personal goals (effect=0.50; 95% CI=0.04, 0.96; P=.03).
Limitations Limited validity of clinical vignettes as a proxy measure of clinical practice was a limitation of the study.
Conclusions Peer assessment was more effective than CD in improving adherence to CPGs. Personal feedback may have contributed to its effectiveness. Future research should address the role of the group coach.
Clinical practice guidelines (CPGs) are designed to facilitate evidence-based practice and to improve the quality of health care.1 The purpose of guidelines is to enhance transparency of care, to reduce unwarranted variability in practice, and to increase accountability to external stakeholders.2 Despite a multitude of implementation strategies, research has demonstrated unambiguously that CPGs are not readily implemented in everyday clinical practice.3,4 The main bottlenecks for practitioners are attributable to knowledge, attitudes, and factors concerning social, organizational, and societal support.5 Because education is assumed to be the first step to behavioral change in clinical practice, a variety of educational interventions have been designed to address knowledge, skills, and attitudes.6 Systematic reviews studying the effectiveness of educational strategies, however, have shown small to moderate effects in improving evidence-based practice.7
Rutten et al8 assessed the effectiveness of a quality improvement program aimed at professional and organizational behavioral change in physical therapist practice. Guideline adherence was assessed by clinical vignettes in a one-group pretest-posttest design. They found a 3.1% increase in adherence. Wensing et al6 reported a mean effect of 5% on different aspects of clinical practice, regardless of the type of educational intervention. Research showed that the effectiveness of educational strategies might improve when the intervention addresses small groups and allows for active participation and social interaction.9 In addition, change may be more likely if strategies are specifically chosen to address identified barriers to change.10 Bekkering et al11 showed moderate improvement of adherence to CPGs by physical therapists in the Netherlands through active educational strategies (discussion, role playing) compared with standard passive methods of guideline dissemination in physical therapy. Guideline adherence of physical therapists depends on levels of awareness of guideline-consistent behavior. Rutten et al12 used clinical vignettes to compare self-reported and externally assessed adherence. Realistic perceptions of adherence to CPGs were found in 38.5% of the participants. Differences in levels of awareness interfered with other determinants of guideline adherence, such as motivation to change. Research showed that health care professionals have a limited ability to accurately assess their own level of competence,13,14 which they systematically overestimate or underestimate.15,16
The development of adequate self-perception requires both internal and external information about one's professional performance as well as knowledge of appropriate performance standards.17 This finding is supported by studies showing that the effect of educational strategies on evidence-based practice increases when they are combined with other strategies, such as audit and feedback.3,18 Yet, audit and feedback have not consistently been found effective to change practice. A systematic review by Ivers et al19 showed mean improvements of adherence to desired practice of 4.3% for dichotomous outcomes and 1.3% for continuous outcomes. Whether feedback is accepted and used to change professional practice depends on a multitude of variables.20,21 Clinicians struggle with accepting feedback when it is incongruent with their self-assessment or threatens their self-confidence.17,22 Feedback appears to be more acceptable20 when it is provided in an environment of trust and mutual respect, and it is likely to be rejected when the provider is not perceived to be a credible and trustworthy source of information17,21 or when it conflicts with personal or group norms and values.23 Acceptance may be enhanced when feedback is tailored to the stages of change as described by Prochaska et al,24 and when it closely connects to the context of daily practice.5,25
Situated learning theory, based on studies by Lave and Wenger26 and Li et al,27 shows that professional knowledge acquired in a certain situation transfers only to similar situations. Their studies support the assumption that feedback provided within communities of practice (CoPs) has greater impact on the improvement of clinical practice than feedback provided by “outsiders.” Moreover, the involvement of CoP participants in each other's professional development process may facilitate acceptance of feedback and alignment with personal learning needs and goals.28–30
Drawing on these considerations, we introduced peer assessment (PA) as a new implementation strategy for clinical guidelines within existing CoPs. Peer assessment is the process whereby professionals evaluate, or are evaluated by, their peers and provide each other with performance feedback. The positive impact of PA on learning and change has been well researched in higher education31–33 and health care professional education.34–37 However, Topping38 argued that generalizations to professional practice should be made with caution because successful PA implementation depends on variables such as the context of peers, the nature of the PA intervention, and the outcomes assessed. A lack of specific knowledge about PA practices impedes the transfer of results.38
During the implementation of the Dutch guideline for physical therapist management in patients with nonspecific low back pain,39 PA showed promising results. In a randomized controlled trial conducted by van Dulmen and colleagues,40 PA was significantly more effective in improving guideline adherence (measured using clinical vignettes) than the usual implementation strategy of case discussion (CD). We redesigned this PA program for the implementation of a newly developed guideline for complaints of the arm, neck, and shoulder41 and a new evidence statement for subacromial complaints.42 We also included the appraisal of patient records as a new element. Record keeping is an important quality indicator for physical therapy care, and patient records offer authentic assessment material that reflects clinical practice.43,44
Peer assessment and CD are implementation strategies informed by several sometimes overlapping theoretical constructs concerning learning and behavior change: principles of social-constructivist learning theory,45 such as contextual learning, collaborative learning, and active knowledge construction, and principles of self-regulated learning theory, such as conscious goal setting and reflection.29,46 In addition, the PA approach builds on principles of social-cognitive learning theory (concrete experience with and performance of desired behavior)47 and stages of change theory (tailored feedback).29,30 Moreover, PA targets the development of a mutually accepted quality standard of performance by introducing peers to an “assessor” perspective.48,49
The objective of this study was to compare the effectiveness of PA with the usual CD strategy on adherence to CPGs for physical therapist management of upper extremity conditions.
Following social-cognitive theory, our hypothesis was that the performance-based approach of PA, combined with giving and receiving personal performance feedback, would be a more powerful tool than the CD approach for uncovering areas in personal clinical practice that need improvement. Based on self-directed learning theory and stages of change theory, we also posited that PA would provide a stronger trigger for reflective practice, would develop greater awareness of guideline-consistent behavior in daily practice, and would be more effective in guiding self-directed change toward personal learning goals than CD. The effectiveness of PA and CD was tested on 4 outcome measures: (1) guideline adherence, (2) reflective practice, (3) awareness of performance, and (4) attainment of personal goals.
Method
Design
This study was a single-masked, cluster-randomized controlled trial with a pretest-posttest design comparing the effectiveness of 2 implementation strategies.
Setting and Participants
Participants were physical therapists organized into CoPs, which are small groups of 5 to 15 professionals who share the same setting or the same interests and who work together on the improvement of the quality of care in postgraduate training programs provided yearly by the Royal Dutch Society for Physical Therapy (KNGF). Communities of practice can register with the KNGF to participate in such a program. The aim of the program under study was to implement 2 newly developed guidelines for physical therapist management in patients with upper extremity complaints. In November 2011, formal contact people of CoPs were invited by an electronic newsletter to a joint introduction meeting on the training program. Communities of practice that showed interest in participating received an information letter containing details of the training program, randomization procedure, time investments, risks, and advantages. Participation was rewarded with continuing education credits for the Dutch quality register. All CoPs that showed interest were eligible for inclusion. We conducted a sample size calculation based on an estimated difference between the 2 interventions of 5% (power=80%, P=.05), with an anticipated intraclass correlation coefficient (ICC) of .10 and 10% loss to follow-up. This calculation resulted in the required inclusion of 110 physical therapists in 22 clusters with at least 5 physical therapists per cluster.50
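The inflation for clustering described above can be sketched in code. This is an illustrative sketch only: the standard deviation below is an assumption chosen for demonstration (the text reports only the inputs listed and the resulting requirement of 110 therapists in 22 clusters), so the printed number will not reproduce the study's exact figure. The sketch shows how the design effect, 1 + (m − 1) × ICC, and the anticipated dropout inflate a conventional two-arm sample size.

```python
import math

def cluster_rct_sample_size(delta, sd, cluster_size, icc,
                            dropout=0.10, z_alpha=1.96, z_beta=0.84):
    """Approximate total sample size for a 2-arm cluster RCT.

    Normal-approximation formula for comparing two means, inflated by
    the design effect 1 + (m - 1) * ICC and by anticipated loss to
    follow-up (default z values correspond to alpha=.05, power=80%).
    """
    # n per arm for an individually randomized trial
    n_per_arm = 2 * (z_alpha + z_beta) ** 2 * (sd / delta) ** 2
    deff = 1 + (cluster_size - 1) * icc      # design effect for clustering
    n_total = 2 * n_per_arm * deff           # both arms, clustered
    n_total /= (1 - dropout)                 # inflate for dropout
    return math.ceil(n_total)

# The SD of 10 is an assumption for illustration only; the study reports
# just the resulting requirement (110 therapists in 22 clusters of >=5).
print(cluster_rct_sample_size(delta=5.0, sd=10.0, cluster_size=5, icc=0.10))
```

Note how sensitive the requirement is to the ICC: with these inputs, setting the ICC to zero removes the design effect entirely, while doubling it adds dozens of participants.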
Randomization
In December 2011, 22 CoPs showed interest in our study. Before randomization (January 2012), 2 CoPs withdrew because they felt the program would take too much time. A flowchart of the study sample is presented in the Figure. Because we expected that the size of the group would affect its learning,31,35,38 we aimed at a balanced distribution of large and small CoPs between CD and PA groups. The 20 CoPs were stratified by the number of participants into 2 blocks of groups with 5 to 10 and 11 to 15 participants and were randomly assigned to the intervention group or the control group using randomization software.51 This procedure resulted in 10 PA groups (n=73 physical therapists) and 10 CD groups (n=76 physical therapists). The CoPs were masked for the intervention because PA and CD were presented as alternative interventions. The primary researcher (M.M.) was not masked for the allocation of CoPs because she participated in conducting the intervention program. To reduce the risk of bias, she was masked for the outcomes until the data sampling was completed and the pretest-posttest differences were calculated.
Sample flowchart of participants in the study. CoP=community of practice, PT=physical therapist, PA=peer assessment, CD=case discussion.
Interventions
Before the start of the program, both PA and CD groups received a link to the KNGF guidelines and a link to the pretest questionnaires. All participants received by e-mail a program guide tailored to the intervention providing detailed information about learning objectives, learning content, training schedule, didactic format, and procedure. The program for both groups consisted of four 3-hour sessions and was launched in February 2012. Table 1 shows a detailed program overview and time schedule. In sessions 1, 2, and 4, the participants worked on written cases that fully covered the patient profiles described in the guidelines. Session 3 consisted of a review of patient records using a set of quality indicators derived from the KNGF guidelines on record keeping.52
Intervention Programs in Both Groups
The main difference between the 2 interventions was that in the PA approach the tasks were structured, with a focus on performance rather than discussion, and roles were predefined. Each participant performed 3 roles: physical therapist, assessor, and simulated patient. Because the therapists were complete novices in the PA method, the process was supervised by a group coach. In the CD approach, tasks were less structured, with ample opportunity for in-depth elaboration and discussion, and participant roles were not defined. In both PA and CD groups, participants worked on identical cases concerning problem content, but for the PA group, these cases were adjusted to allow for performance of participants in different roles. In the PA group, written cases were not known in advance but were presented by a coach on the spot, simulating daily practice. Participants were provided with ground rules for providing and receiving constructive feedback and for creating a safe learning environment. In the role of physical therapist, they analyzed the case by reasoning aloud and demonstrated (hands-on) diagnostic and treatment skills. Peer performance was assessed using a global scoring sheet designed to support peer assessors in giving constructive feedback. It contained 3 performance categories (planning, performance, and evaluation) that were scored on a 5-point Likert scale (from 1=much improvement needed to 5=no improvement needed). Oral qualitative feedback aimed at improvement was then given. The complete PA program guide, including assessment criteria, is accessible online.53
Three group coaches (H.E., H.N., and V.V.) were trained by the primary researcher in the PA procedure, supported by a coaching manual. They were experienced tutors in problem-based learning, and they were instructed to encourage the group in providing tailored performance feedback and not to serve as an information source themselves. To reduce the risk of bias, the group coaches were not involved in the development of clinical vignettes. For CD groups, written cases were included in the program guide to allow for proper preparation, along with instructions and written questions to guide the discussion process. After completion of the program in July 2012, and before the posttest, all participants received an e-mail with model answers to all of the cases that were discussed during the program to control for unintended differences in knowledge development between and within groups due to the influence of the group coach.
Outcome Measures
Guideline adherence.
Participants completed an online test based on 4 clinical vignettes 1 week before the start of the program and within 2 weeks after completion of the program. A previous study by Rutten et al54 showed that vignettes have acceptable validity to measure physical therapists' adherence to CPGs, and these results were consistent with studies by Peabody and colleagues.55–57
Clinical vignettes require factual knowledge of CPGs as well as clinical reasoning consistent with CPGs in the context of a clinical problem. Four clinical vignettes were based on upper extremity disorders in the context of direct physical therapy access.58 Three vignettes adequately covered the patient profiles described in the guidelines, and the fourth vignette did not because of “red flags.” The vignettes and test items were constructed by a team containing 2 physical therapy scientists involved with guideline development, 5 physical therapy practitioners specializing in upper extremity conditions, and 1 physical therapy education scientist specializing in assessment development. Each vignette was accompanied by 11 response categories derived from the guidelines: (1) clinical pattern, (2) impairments and disabilities, (3) onset risk factors, (4) impeding recovery factors, (5) patient profile according to guidelines, (6) measurement instruments, (7) diagnostic clinical tests, (8) main treatment goals, (9) treatment approach, (10) information and advice, and (11) expected recovery time. Each response category contained a set of test items in the form of statements. Vignettes 1, 2, and 3 each contained 119 items; vignette 4 consisted of fewer items (n=31) because quality indicators addressing additional diagnosis and treatment were not applicable.
The statements were scored on a 3-point scale: D=disagree, D/A=neither disagree nor agree, and A=agree. Because clinical evidence is limited and guidelines cannot inform all clinical decisions, the option D/A was offered to reflect the way information is processed in the context of uncertainty.59 The group of 8 experts evaluated and adjusted the vignettes and test items. All experts completed the final test informed by the guidelines. The scoring method took variability of reasoning among experts into account as long as differences were limited to 2 alternatives (D and D/A, or D/A and A). Items with contradictory answers (D and A) were reviewed. The alternative that was chosen by the majority (>4) was assigned 2 points, and equal distribution was assigned 1 point for each alternative. A majority opting for the alternative D/A did not occur. The final scoring key was discussed among 4 experts until consensus was reached. The maximum score was 737 points (some answers received 1 point). The Appendix shows an example of a test item and its scoring key. The scores for each vignette were added, and mean total scores on the 4 clinical vignettes were perceived as a measure of guideline adherence.
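The scoring-key logic described above can be expressed as an illustrative sketch. The function name is hypothetical; note that with 8 experts answering on two adjacent alternatives, any unequal split produces a majority of more than 4 votes, so only the equal 4-4 split triggers the 1-point-per-alternative rule.

```python
from collections import Counter

# Hypothetical helper illustrating the scoring-key logic. Expert answers
# per item: "D" (disagree), "D/A" (neither disagree nor agree), "A" (agree).

def scoring_key(expert_answers):
    """Return the scoring key for one test item, or "review".

    Adjacent alternatives may both be credited: a majority (>4 of the
    8 experts) earns 2 points for its alternative; an equal 4-4 split
    earns 1 point per alternative. Items where experts chose both
    extremes (D and A) are flagged for consensus review.
    """
    counts = Counter(expert_answers)
    if counts["D"] > 0 and counts["A"] > 0:
        return "review"                      # contradictory answers
    chosen = [alt for alt in ("D", "D/A", "A") if counts[alt] > 0]
    if len(chosen) == 1:
        return {chosen[0]: 2}                # unanimous
    a, b = chosen
    if counts[a] > 4:
        return {a: 2}
    if counts[b] > 4:
        return {b: 2}
    return {a: 1, b: 1}                      # equal split

print(scoring_key(["A"] * 6 + ["D/A"] * 2))   # -> {'A': 2}
print(scoring_key(["A"] * 4 + ["D/A"] * 4))   # -> {'D/A': 1, 'A': 1}
```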
Reflective practice.
Participants completed the Self-Reflection and Insight Scale (SRIS), a validated questionnaire developed by Grant et al.60 It aims to measure readiness for purposeful behavior change and has been shown to be responsive to change in the context of continuing professional education.61 The SRIS was validated by Roberts and Stark62 and modified for the medical education context. It contains 3 subscales: engagement with reflection, need for reflection, and insight obtained by reflection. Engagement and need refer to the practice of inspecting and evaluating one's own thoughts, feelings, and behavior; insight refers to understanding them. Sum scores for each subscale were computed, and mean total scores were conceived of as a measure of reflective practice.
Awareness of performance.
Awareness was conceived of as the association between perceived improvement and assessed improvement. At posttest, participants were asked to indicate how much guideline knowledge they had at pretest and how much at posttest on a scale from 1 (no knowledge) to 5 (much knowledge). The pretest-posttest difference was conceived of as a measure of perceived improvement. Assessed improvement was the difference between pretest and posttest scores on clinical vignettes.
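As an illustrative sketch with entirely hypothetical data, the awareness measure amounts to correlating each therapist's perceived improvement with the assessed improvement on the vignettes:

```python
import math

def pearson_r(x, y):
    """Plain Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical data: perceived improvement is the posttest minus pretest
# self-rating (each on a 1-5 scale); assessed improvement is the posttest
# minus pretest vignette score.
perceived = [1, 0, 2, 1, 2, 0]
assessed = [20, -5, 45, 10, 38, 2]
print(round(pearson_r(perceived, assessed), 2))
```

A correlation near 1 would indicate that therapists who felt they improved most were indeed the ones whose vignette scores improved most, which is the sense of "awareness" used in the study.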
Attainment of personal goals.
At pretest, all participants were asked to formulate 3 learning goals, ordered by personal importance according to the concept of Commitment to Change Statements (CTCS).29,63 Conscious goal setting was part of the intervention strategy to enhance self-directed learning and progression through the stages of change.30 The goals also served as an outcome measure.63 Before the posttest, all participants were e-mailed a reminder of the personal goals they had formulated at pretest. At posttest, they were asked to indicate the extent to which their goals were achieved on a 3-point scale (1=not achieved, 2=partly achieved, and 3=achieved). Achievement scores for each personal goal were added, and mean total scores were conceived of as a measure of goal attainment.
Data Analysis
IBM SPSS, version 20 (IBM Corp, Armonk, New York) was used for statistical analysis. Baseline characteristics of the physical therapists (age, sex, clinical setting, and specialization) and pretest scores on the clinical vignettes and the SRIS were described and tested for differences between the PA and CD groups using chi-square tests and unpaired t tests. Internal consistency of the clinical vignettes was tested with the Cronbach alpha. Outcome differences between the PA and CD groups were described and tested by multilevel linear regression to adjust for clustering within CoPs. For each outcome measure, the ICC was calculated to test clustering at the CoP level. Baseline characteristics were included as covariates when differences between groups were statistically significant.
Pretest and posttest sum scores and mean total scores were calculated for each vignette. The intervention effect for guideline adherence was estimated with posttest scores on vignettes as the dependent variable and intervention and pretest scores as covariates. In the same way, mean pretest and posttest SRIS scores were calculated. The intervention effect for reflective practice was tested with posttest scores as the dependent variable and intervention and pretest scores as covariates. Mean posttest sum scores were calculated for each personal objective and total scores. Differences in attainment of personal goals were tested with scores on personal goals as the dependent variable and intervention as covariate. Mean assessed improvement scores and mean perceived improvement scores on clinical vignettes were calculated, and correlations were computed with assessed improvement as the dependent variable and perceived improvement as the independent variable. Differences in awareness were estimated with assessed improvement as the dependent variable and the interaction between the variables intervention and perceived improvement as covariate.
Role of Funding Source
This was a study initiated by researchers and funded by the Royal Dutch Society for Physical Therapy (KNGF). The KNGF had no role in the conduct of this study, analysis or interpretation of data, or preparation of the manuscript.
Results
The pretest response was 100%. The posttest response was 93.2% (n=68) for PA and 100% (n=76) for CD. Baseline characteristics of the participating physical therapists are presented in Table 2. We found differences between the PA and CD groups for sex (P=.028), so we controlled for this confounder in the multilevel linear regression. Internal consistency between scores across clinical vignettes (n=4) was good (pretest α=.82, posttest α=.86).
Physical Therapists' Characteristicsa
Table 3 presents the results of the outcome measures of guideline adherence, reflective practice, and attainment of personal goals. Results of awareness of performance are presented separately.
Multilevel Analyses for Guideline Adherence, Reflective Practice, and Attaining Personal Goalsa
Concerning guideline adherence, Table 3 shows that mean pretest scores on vignettes were comparable between PA and CD groups. At posttest, the PA and CD groups showed significant improvement: PA groups=29.82 (SD=63.97), P<.001, and CD groups=9.49 (SD=40.52), P<.001. Percent improvement was 5.8% for the PA groups and 2.0% for the CD groups. Multilevel linear regression analysis, controlling for sex, showed that the difference between the PA and CD groups was statistically significant in favor of the PA groups (estimated effect=22.52 points; 95% CI=2.38, 42.66; P=.031).
Mean pretest scores on the SRIS showed no difference between the PA and CD groups. At posttest, scores were significantly improved in both PA and CD groups: PA groups=2.34 (SD=8.69), P<.001, and CD groups=1.85 (SD=7.05), P<.001. Percent improvement was 2.8% for the PA groups and 2.2% for the CD groups. The difference between the PA and CD groups was not statistically significant (estimated effect=−0.06 points; 95% CI=−2.79, 2.65; P=.96). The results related to attainment of personal goals showed that scores were significantly higher for the PA groups than for the CD groups (estimated effect=0.50; 95% CI=0.04, 0.96; P=.03).
At posttest, participants in the PA groups showed greater awareness of their professional performance. The correlation between perceived improvement and assessed improvement was r=.36 (P=.002) for the PA groups and r=.08 (P=.50) for the CD groups. The difference was statistically significant (estimated effect=14.73; 95% CI=2.78, 26.68; P=.01).
Discussion
This study evaluated the effects of 2 strategies for the implementation of Dutch physical therapy guidelines. It showed that PA was more effective than CD in improving guideline adherence as measured by clinical vignettes. Moreover, the PA groups were more effective in attaining personal goals and showed higher levels of awareness of performance. The strength of this study is that we offered the PA and CD groups high-quality programs. Program evaluation showed that the perceived instructional value of PA and CD was comparable between PA and CD groups (results not presented). The outcome measures were equally facilitated by both interventions. First, the PA and CD groups had equal access to the guidelines, worked on solving identical clinical problems, and had equal access to the model answers of each problem. Second, neither of the interventions included tasks such as writing reflection reports and improvement plans that explicitly aimed to facilitate the outcomes of reflective practice, awareness of performance, and attainment of personal goals. Any pretest effect of the SRIS or the CTCS would have applied to both interventions.
We showed that a tailored, multifaceted intervention that addresses specific barriers to change,10 such as “awareness of performance” as identified by Rutten et al,12 is effective, and these findings are in line with the existing research evidence on implementation strategies.4,10,19,64 We observed high baseline scores and moderate, but statistically significant, improvement scores for continuous outcomes of clinical vignettes (PA groups=5.8%, CD groups=2.0%). High baseline scores can be attributed to the fact that participants received the guidelines before the pretest and were allowed to study them beforehand. Studies have shown that the intervention effect on desired practice increases when baseline performance is low.19,65
Rutten et al8 used clinical vignettes to assess the effectiveness of their program and observed a 3.1% increase in adherence to the low back pain guideline. This program, however, involved interventions addressing professional as well as organizational determinants of guideline adherence, so the results cannot be compared directly. We did not find studies that assessed comparable content and constructs concerning the improvement of the uptake of CPGs except for the study by van Dulmen et al,40 which showed that PA was more effective than CD in the implementation of the low back pain guideline, a result in line with our findings.
Given the notion that intervention programs aimed at enhancing the transfer of research evidence to clinical practice are very heterogeneous and the generalizability of the effects is limited,18,66 we explored the key differences between PA and CD informed by theory, which may contribute to the generalizability of the results. First, the PA task is highly structured and necessitates strong involvement of each participant. Individual contributions in learning groups may vary widely when conditions such as shared responsibility, interdependency, mutual trust, and psychological safety are not met.32,67 Discussion tasks do aim at active participation, but the task structure does not control for individual contributions to group learning.
Second, in contrast to CD, PA focuses on performance that can be observed and evaluated. The PA group participants performed in predefined roles that forced the transfer of knowledge and skills in order to fulfill each role convincingly. In the role of physical therapist, participants needed to make the transfer from implicit reasoning to explicit reasoning and from intentional behavior to observable behavior. The transferred knowledge and skills became transparent, and this new information became accessible for group review.68 The variety of feedback that PA group participants obtained about their performance may have helped them become aware of areas in professional practice that need improvement and may have supported them in attaining personal goals. In the assessor role, participants needed to make a transfer from implicit appraisal to explicit appraisal. Supported by predefined performance criteria, peer assessors revealed their personal norms about the quality of the observed behavior.69 Personal standards could be compared with group standards. Research has revealed that the development of correct self-perceptions (awareness) is conditional on the availability of both internal and external data about an individual's performance,49,50 which may explain why PA groups outperformed CD groups in this respect. A different perspective on why PA groups showed more improvement on guideline adherence is the testing effect. Recent insights in cognitive psychology show that tested information is better stored in and retrieved from memory than information that is not tested.70,71 Because PA is based on assessment (unlike CD), PA group participants were repeatedly challenged to reproduce and apply newly acquired knowledge of CPGs. That may have strengthened awareness of deficiencies and facilitated retrieval of information from memory at posttest.
Although PA was more effective in 3 outcome measures, we could not explain these results by differences in reflective practice. Both the PA and CD groups showed comparable improvement scores on the SRIS. These scores reflect perceptions of conscious reflective practice,60,62 and conscious reflective practice was apparently enhanced by both interventions. Professional behavioral change, however, does not necessarily depend on conscious reflection but also might occur spontaneously through informal learning, such as concrete experience, role modeling,72 and action observation.73 Peer assessment involved concrete experience with guideline recommendations, including hands-on clinical skills. This approach might have prompted spontaneous (unintended) learning experiences more than the cognitive directed approach of CD. A study by Bandura and Locke74 showed that experience is the strongest source of information for the development of self-efficacy beliefs and that self-efficacy beliefs contribute significantly to motivation for behavioral change.
A third difference between the PA and CD groups was the presence of the group coach. Peer groups contained experienced health care practitioners, but these practitioners were absolute novices in the peer assessment method. Research has revealed that the acceptability of peer feedback depends highly on its perceived reliability32,68 and that the reliability and validity of peer feedback improve with training and experience.31 It is possible that therapists used the group coach as a means to justify feedback because they did not fully rely on their peers' judgment. We assume that the effect of PA may increase when groups have more training in giving and receiving peer feedback and when standards for the quality of physical therapy care are internalized and mutually shared.48,49 We also assume that successful PA practices depend on the commitment of physical therapists to the PA procedure. The role of the group coach might be important in this respect. On the other hand, it should be noted that the CD groups might have performed better when guided by a coach.
Finally, it should be noted that research has shown that improved guideline adherence is associated with improved process of care but not always with improved patient outcomes.5,11,75
Limitations
First, clinical vignettes remain a proxy measure of clinical practice. Direct observation or audio or video recording might be measures that better reflect authentic practice, but a systematic review by Hrisos et al76 suggests that such measures may also lack reliability and validity because the behavior of interest cannot be standardized beforehand and generalizations of the inferences are hard to make. Standardized (simulated) patients are generally considered to be an acceptable substitute, but these measures are costly and were not feasible given the sample size. Moreover, standardized patients do not provide a sufficiently broad case mix compared with clinical vignettes. Based on these considerations and the existing validity evidence,55–57 we opted for clinical vignettes.
A second limitation is the involvement of the primary researcher (M.M.) in conducting the intervention program. To reduce risk of bias, the primary researcher was masked for the outcomes until pretest-posttest scores had been described and between-group differences were calculated. The primary researcher was involved in additional multilevel analyses, supervised by another researcher (J.K.) who was masked for the intervention.
Third, the involvement of the group coaches should be addressed. We controlled for differences in knowledge development between and within groups by e-mailing to each participant, before the posttest, the model answers for all the clinical cases. Scores on all outcome measures did not differ significantly between group coaches (M.M., H.E., H.N., or V.V.) (results not presented). However, we could not control for implicit effects of the group coaches on motivation to change, such as role modeling effects, increased self-efficacy beliefs, improved attitudes toward guidelines,24,32 and shared quality standards of performance.49
Fourth, the reliability of the test scores should be considered. The test contained a considerable number of test items (N=388). Although each participant fully completed the test within time limits (2 hours) at pretest and posttest, cognitive overload caused by time on task may have biased test results. The effect, however, applied to both the PA and CD groups, so it does not affect the validity of the inferences made about between-group differences.
Finally, we address the generalizability of our results. Studies have demonstrated cultural differences in attitudes toward PA, such as reluctance of peers in giving face-to-face feedback.28,32 External validity might be limited because the sample contained only Dutch physical therapists.
In conclusion, PA is more effective in guideline implementation than CD. The PA group participants showed higher improvement scores on clinical vignettes, showed more awareness of guideline-consistent behavior, and were more successful in attaining personal goals. The focus on individual performance, the concrete experience with the guidelines, and the personal performance feedback obtained probably contributed to the effectiveness of PA. Moreover, performance in the assessor role necessitates critical appraisal of the observed behavior as well as critical self-appraisal.
We recommend PA for guideline implementation within CoPs. Further research should address the role of the group coach in the intervention effect and should explore the feasibility of replacing the group coaches with trained CoP members. They could play an important role in future bottom-up quality improvement initiatives addressing evidence-based practice and unwarranted variability in physical therapy care.
Appendix.
Example of a Clinical Vignette With Exemplary Test Items
Footnotes
The authors thank all participating physical therapists for their commitment to this research project. The authors thank Femke Atsma for verification of data analyses; Henk van Enck (H.E.), Henk Nieuwenhuijzen (H.N.), and Volcmar Visser (V.V.) for their contribution as group coaches; and all physical therapists and experts who contributed to the development of clinical vignettes. They also thank Moira Jackson and John Gabbay for reviewing for English language usage.
Ethical approval for the study was given by the Medical Ethical Committee on Research Involving Human Subjects, Arnhem-Nijmegen, the Netherlands (CMO registration number: 2013/036).
This study was funded by the Royal Dutch Society for Physical Therapy (KNGF) (registration number: 8203).
This trial is registered at ISRCTN (registration number: ISRCTN69003553).
- Received October 9, 2013.
- Accepted August 28, 2014.
- © 2015 American Physical Therapy Association