In response to Maher and colleagues' letter,1 we would like to point out that our report2 made a general statement regarding the psychometric properties of the quality assessment tools used in physical therapy; we did not specifically state that the Physiotherapy Evidence Database (PEDro) scale has not been appropriately tested for validity or reliability. This statement was based on our previous review, which investigated the psychometric properties of tools used to evaluate the quality of randomized controlled trials (RCTs) in health research, especially in physical therapy.3 That review highlighted that most of the existing scales for determining trial quality, including the PEDro scale, were not adequately developed. According to its developers, the PEDro scale was based on the Delphi list.4 Based on our results,3 the Delphi list lacked internal consistency and construct validity. These psychometric properties are important in any scale because they indicate that the construct, in this case “methodological quality” (or “risk of bias,” the more current term for this concept), is fully represented by the items of the scale (internal consistency) and that the scores of the scale behave according to predefined hypotheses (construct validity). In other words, the PEDro scale was developed from another tool that is not completely validated. In addition, the PEDro scale added 2 new items (items 8 and 10), which, to our knowledge, were not obtained from a thorough validation process. Therefore, the set of items included in the PEDro scale might not be entirely valid to represent the construct of “risk of bias.”
In addition, in a recent study,5 we recognized that some items used by quality assessment tools were related more to reporting quality than to trial conduct or potential bias (eg, in the PEDro scale, reporting confidence intervals or other measures of variability). Moreover, we noticed a lack of agreement regarding the relevance of items to trial quality or risk of bias. These results called for an in-depth analysis of the items used to determine trial quality and risk of bias in clinical trials, to provide evidence of validity for these items. This also applies to the PEDro scale. Although it has been claimed that the PEDro scale is a valid tool for determining study quality using the overall score, the validation studies mentioned failed to demonstrate that the items used by this tool are supported by empirical evidence and are related to bias in clinical trials. This step is of crucial importance in determining the validity of the tool.
Furthermore, previous and recent evidence has questioned the use of summary scores to determine the risk of bias of RCTs.6–10 The effects of relevant criteria, such as concealment of allocation, may be diluted or confounded by a summary quality score because the score includes other items that are not related to the risk of bias or are not important in a given context. Indeed, items that are important in some situations may not be relevant in others, yet they receive the same weight in a quality scale.6,7 The Cochrane Bias Methods Group and the Cochrane Statistical Methods Group, therefore, recommend that summary scores obtained from quality scales not be used.11 Rather, the relevant biases should be assessed one by one, including selection bias, performance bias, detection bias, attrition bias, reporting bias, and other context-specific biases.9 Thus, the use of the total score of the PEDro scale has been criticized.
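The dilution problem can be illustrated with a minimal sketch. The two trials, the item names, and the cutoff of 6 used below are hypothetical; they are not taken from the PEDro scale documentation or from any of the cited studies, and serve only to show how equally weighted items can mask differences in key bias domains.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """A hypothetical trial with yes/no ratings on equally weighted items."""
    name: str
    items: dict  # item name -> True if the criterion is satisfied

    def summary_score(self) -> int:
        # Summary-score approach: every satisfied item counts as 1 point.
        return sum(self.items.values())

    def meets_domains(self, domains) -> bool:
        # Domain approach: each named domain must be satisfied on its own.
        return all(self.items[d] for d in domains)


# Hypothetical ratings (assumed for illustration only).
trial_a = Trial("Trial A", {
    "random_allocation": True,
    "concealed_allocation": False,   # key bias domain not met
    "blinded_assessor": False,       # key bias domain not met
    "baseline_comparability": True,
    "adequate_follow_up": True,
    "intention_to_treat": True,
    "between_group_comparison": True,
    "variability_reported": True,    # a reporting item, not a bias domain
})
trial_b = Trial("Trial B", {
    "random_allocation": True,
    "concealed_allocation": True,
    "blinded_assessor": True,
    "baseline_comparability": True,
    "adequate_follow_up": False,
    "intention_to_treat": False,
    "between_group_comparison": True,
    "variability_reported": True,
})

CUTOFF = 6  # assumed threshold for "low risk of bias" on the summary score
KEY_DOMAINS = ["concealed_allocation", "blinded_assessor"]

trials = [trial_a, trial_b]
by_score = [t.name for t in trials if t.summary_score() >= CUTOFF]
by_domain = [t.name for t in trials if t.meets_domains(KEY_DOMAINS)]

print("Selected by summary score:", by_score)
print("Selected by key domains:  ", by_domain)
```

In this sketch, both hypothetical trials reach the same summary score and pass the cutoff, but only one of them satisfies allocation concealment and assessor blinding. A review that selects trials by summary score would therefore pool the two, whereas a domain-based assessment would separate them.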
A case study was presented8 as an example to highlight that biases are introduced into systematic reviews and meta-analyses when summary quality scores are used as an eligibility criterion or otherwise incorporated into the analysis,8 as proposed by others.6,7,10 A Cochrane review of a physical therapist intervention for knee osteoarthritis was revisited for this exercise.8 The PEDro scale summary scores were compared with the Cochrane domain approach, in which only trials with appropriate allocation concealment and blinding were used for analysis. The results indicated that although “low risk of bias” trials according to the PEDro score showed highly clinically and statistically significant results, trials with proper blinding and concealment of allocation based on the Cochrane domain analysis yielded a treatment effect estimate at least 3 times lower, with confidence intervals that included the null effect. This example showed that, depending on which scale is used to assess quality and which score of that scale is used as a threshold to define “low risk of bias,” point estimates of treatment effects may be completely different. This finding, in turn, may have serious implications for conclusions drawn by investigators who incorporate summary quality scores into their reviews.
Finally, we acknowledge the efforts made by the developers of the PEDro scale to validate it; however, we feel that the roots of this scale and some of its items, like most methodological quality assessment tools, are not completely valid for assessing the risk of bias of physical therapy trials. Also, based on the existing evidence, the use of the summary score to determine “risk of bias” or “trial quality” might not be appropriate.
Footnotes
This letter was posted as a Rapid Response on September 22, 2014, at ptjournal.apta.org.
- © 2014 American Physical Therapy Association