The thoughtful commentaries by Mizner1 and Pua and Bennell2 raise important points concerning the interpretation of our findings3 and the extent to which they may be generalizable.
Mizner provides 2 essential caveats to the interpretation of our results. The first addresses the extent to which our findings are generalizable, and the second considers the application of our findings to the reinterpretation of studies that reported self-report measure values only. Mizner offers examples from the literature that clearly show recovery is sample specific. What these investigations do not report is the relationship between self-report and performance-based measures. It may be that within a specific postarthroplasty interval, the relationship between self-report and performance-based measures is reasonably consistent, even though different samples display different recovery profiles. Conversely, it is equally plausible, as Mizner suggests, that the relationship between self-report and performance-based measures is sample and situation dependent. For example, building on Mizner's examples, there is evidence suggesting that the extent to which pain influences self-reported functional status changes over time for some self-report measures. In a previous study, we found that self-reported pain specific to performance tasks demonstrated a much higher correlation with Lower Extremity Functional Scale (LEFS) scores prearthroplasty than approximately 9 weeks postarthroplasty (prearthroplasty: r=.48; postarthroplasty: r=.36).4 In contrast, the time (distance) to complete the performance tasks showed a lower correlation with LEFS scores prearthroplasty than approximately 9 weeks postarthroplasty (prearthroplasty: r=.23; postarthroplasty: r=.45).4 This observation supports Mizner's concern that a systematic difference between self-report and performance-based measures may be extremely context specific, and further investigation is essential.
Mizner's second point addresses the extent to which our systematic difference estimates may be useful in reinterpreting studies reporting self-report measures only. We fully agree with Mizner's caution and endorse his suggested phrasing that our estimates may provide “a sense of what the difference might be in performance-based measures if they were administered in addition to the Western Ontario and McMaster Universities Osteoarthritis Index or Lower Extremity Functional Scale.”
Pua and Bennell also raise 2 important points. The first is that our estimates for a systematic difference between prearthroplasty and postarthroplasty assessments are specific to the Six-Minute Walk Test (6MWT) and Timed “Up & Go” Test (TUG) and may differ for other performance-based measures. We fully agree with this caveat.
The second point addresses the conservative nature of our choice to equate a clinically important difference to a within-patient change, rather than a within- or between-group difference. We agree that the magnitude of a clinically important difference based on a within-patient change will be substantially larger than a clinically important between-group difference. Indeed, Goldsmith et al5 have estimated an important within-patient change to be at least twice that of an important between-group difference. There were 2 reasons for our choice. First, we wanted to determine whether our findings were of practical importance in the clinical setting, where decisions apply to individual patients. Second, by demonstrating that the bias was important under the most conservative interpretation, we believed that readers would infer that the magnitude of the bias would have a greater impact when applied at the group level. Unfortunately, we did not make this point obvious to readers, and we thank Pua and Bennell for identifying this omission.
Lastly, Pua and Bennell inquired about “the proportion of patients whose systematic biases exceeded the minimal important difference.” We cannot answer this question directly because it would require each patient to walk (6MWT) the same distance or have the same time (TUG) prearthroplasty and postarthroplasty. However, in an attempt to provide further information to assist in the interpretation of our results, we provide Tables 1 and 2. These tables provide the cross-classification of patients having undergone a true improvement or not. Consistent with the interpretation offered in our article, we consider an improvement of 9 points to represent a true improvement for the LEFS and the Western Ontario and McMaster Universities Osteoarthritis Index physical function subscale. True improvement values for the 6MWT and TUG are based on the minimal detectable change (MDC90) estimates provided by Kennedy et al: 62 m for the 6MWT and 2.5 seconds for the TUG.6 Referring to the 6MWT and LEFS cross-classification results shown on the left side of Table 1, the interpretation is as follows: 44 patients reported an improvement of 9 or more LEFS points, and 16 patients demonstrated an increase of 62 m or more in their 6MWT distance, of whom 13 also reported a true improvement (ie, 9 or more points) in their LEFS scores.
Table 1. Comparison of Patients Identified as Having Improved by the Six-Minute Walk Test and the Self-Report Measures
Table 2. Comparison of Patients Identified as Having Improved by the Timed “Up & Go” Test and the Self-Report Measures
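The cross-classification described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the change scores below are hypothetical and are not the study data, the function name `cross_classify` is ours, and changes are assumed to be signed so that a positive value means improvement (for the TUG, that would be the prearthroplasty time minus the postarthroplasty time). Only the thresholds (9 points, 62 m, 2.5 seconds) come from the text.

```python
from collections import Counter

# Thresholds stated in the text.
SELF_REPORT_THRESHOLD = 9   # points (LEFS or WOMAC physical function)
SIXMWT_MDC90 = 62           # metres
TUG_MDC90 = 2.5             # seconds (assumed: change = pre minus post)


def cross_classify(self_report_changes, performance_changes,
                   self_report_threshold, performance_threshold):
    """Count patients in each cell of a 2x2 improvement table.

    Keys are (performance_improved, self_report_improved) booleans.
    Both change lists must be signed so that positive means improvement.
    """
    if len(self_report_changes) != len(performance_changes):
        raise ValueError("Each patient needs one change score per measure")

    counts = Counter()
    for sr_change, perf_change in zip(self_report_changes, performance_changes):
        perf_improved = perf_change >= performance_threshold
        sr_improved = sr_change >= self_report_threshold
        counts[(perf_improved, sr_improved)] += 1

    # Return all four cells, including any that are empty.
    cells = [(True, True), (True, False), (False, True), (False, False)]
    return {cell: counts[cell] for cell in cells}


# Hypothetical change scores for five made-up patients (NOT study data).
lefs_change = [12, 4, 15, 9, 2]        # points
sixmwt_change = [70, 10, 30, 62, 65]   # metres

table = cross_classify(lefs_change, sixmwt_change,
                       SELF_REPORT_THRESHOLD, SIXMWT_MDC90)

for (perf_improved, sr_improved), n in table.items():
    print(f"6MWT improved={perf_improved}, LEFS improved={sr_improved}: {n}")
```

Each cell counts patients, so the row and column totals correspond to the marginal counts quoted in the text (for example, the number of patients meeting the LEFS threshold regardless of their 6MWT result).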
We thank Mizner and Pua and Bennell for their insightful observations and caveats concerning the interpretation of our results. Only with replication and further investigation will we know the extent to which our findings are generalizable and useful in the interpretation of studies that applied only self-reports of lower-extremity functional status.
© 2010 American Physical Therapy Association