Abstract
Background Self-reports of function may systematically overestimate the ability of patients to move around postarthroplasty.
Objective The purpose of this study was to estimate the magnitude of systematic differences in Lower Extremity Functional Scale (LEFS) and Western Ontario and McMaster Universities Osteoarthritis Index physical function subscale (WOMAC-PF) scores before and after primary total knee arthroplasty (TKA) or total hip arthroplasty (THA) by referencing the values to Six-Minute Walk Test (6MWT) distances and Timed “Up & Go” Test (TUG) times.
Design This study was a secondary analysis of data from a prospective cohort study.
Methods The LEFS, WOMAC, 6MWT, and TUG were administered to 85 patients prearthroplasty and once at 9 to 13 weeks postarthroplasty. Regression analysis was applied using a robust error term for clustered data. With the self-report measures as dependent variables and performance measures, occasion (prearthroplasty or postarthroplasty), and performance measure-by-occasion as independent variables, 3 propositions were examined: (1) the relationship between self-report and performance measures is identical prearthroplasty and postarthroplasty (ie, regression lines are coincident); (2) the relationship differs between occasions, but is consistent (ie, regression lines are parallel); (3) the relationship is not consistent (ie, the regression lines are not parallel).
Results For all analyses, the results supported the second proposition (ie, the relationship differed between occasions, but was consistent). The systematic differences varied by location of arthroplasty, but were similar for both performance tests. For the LEFS, the difference was approximately 11 points for patients who received TKA and 13 points for patients who received THA. For the WOMAC-PF, the difference was approximately 12 points for patients who received TKA and 19 points for patients who received THA. These differences exceed the minimal clinically important change for an individual patient.
Limitations The findings are specific to 9 to 13 weeks postarthroplasty.
Conclusion Dependence on scores of self-report measures alone, without knowledge of the magnitude of the identified systematic differences, will result in overestimating the ability of patients to move around postarthroplasty.
Like therapeutic interventions and the methods used to evaluate them, patient outcome assessment has evolved over time. Historically, state-of-the-art outcome measures combined multiple patient attributes such as pain, range of motion, and function, into a single category.1,2 Scaling of these measures was arbitrary, and psychometric evaluation was not yet commonplace. Subsequently, there has been a proliferation of outcome measures targeting the functional status of patients with osteoarthritis (OA) of the hip or knee and those undergoing total joint arthroplasty (TJA).3–5 These measures include self-report and performance-based assessments, with self-report measures being endorsed and applied more often.6 A relatively recent and consistent finding is that self-report measures' scores of lower-extremity functional status before and after arthroplasty convey more than the ability of patients to move around.7–9 For example, pain has been shown to play a prominent role in patients' self-reports of their ability to move around.8,9 A reduction in pain after arthroplasty has been associated with patients' self-reported improvements in their ability to move around, even though their time to complete performance tasks had doubled.9 Because many studies commenting on the functional status of patients after hip or knee arthroplasty have been restricted to self-reports, interpretation of their findings is unclear.
One method of enhancing a measure's interpretability is to reference its value to that of another measure for which interpretability has been established. Often, this method involves the correlation of one measure's scores with those of another measure. For patients with OA of the hip or knee and those progressing to TJA, the Six-Minute Walk Test (6MWT)10 and the Timed “Up & Go” Test (TUG)11 are frequently reported performance measures.12–17 Interpretive value for these measures is evident at 2 levels. First, the units of measurement, meters and seconds, are well understood. Second, reference values are available against which an individual's score can be compared.11,18,19 For example, Enright and Sherrill18 generated predictive equations for the 6MWT that applied height, weight, and age as independent variables. Separate equations were generated for men and women. Podsiadlo and Richardson11 and Bohannon20 have reported customary age- and sex-specific values for the TUG. Kennedy and colleagues19,21 have reported estimates of measurement error, minimal detectable change,10 and recovery curves for the 6MWT and TUG postarthroplasty.
There is limited information concerning the relationship between self-report and performance measures prearthroplasty and postarthroplasty, and the available information typically is provided as correlation coefficients.8,22,23 The extent to which correlation coefficients clarify the interpretation of one measure by referencing its values to those of another measure is limited because correlation coefficients describe the degree to which 2 variables are related but not the relationship itself. It is possible to obtain nearly identical correlation coefficients between measures at 2 time points that conceal different relationships or score interpretations. This limitation can be overcome by examining regression coefficients.
In order to implement evidence-based practice, clinicians must be able to interpret the literature. Unfortunately, much of the literature pertaining to the recovery of patients after knee or hip arthroplasty has provided self-reports of lower-extremity functional status only, often in the form of Western Ontario and McMaster Universities Osteoarthritis Index physical function subscale (WOMAC-PF) scores. Given that previous inquiry has shown self-report assessments of function overestimate the ability of patients to move around postarthroplasty,7,9 our goal was to examine whether a systematic difference in scores could be detected and, if so, to estimate its magnitude. If it could be shown that a systematic difference exists between selected self-report and performance measures' scores prearthroplasty and postarthroplasty, this information could be used to augment the interpretation of postarthroplasty scores in studies that applied self-report measures only.
The purpose of this study was to estimate the magnitude of systematic differences in Lower Extremity Functional Scale (LEFS) and WOMAC-PF scores before and after primary total knee arthroplasty (TKA) or total hip arthroplasty (THA) by referencing their values to the 6MWT distances and TUG times. For the purpose of these investigations, and consistent with the definition applied to the WOMAC-PF, we define lower-extremity function as the ability to move around.24
Method
Study Design
This study represents a secondary analysis of data collected prospectively as part of a larger investigation that was conducted to describe the time course of lower-extremity functional status recovery after TJA.10,19,22 All patients were treated postoperatively using a standardized inpatient protocol, following either a primary total hip or knee care pathway. Patients undergoing TKA were permitted to be full weight bearing and participated in a progressive program of range of motion and strengthening exercises and functional training. Following THA, the patients were educated about postoperative movement restrictions and mobility, and strengthening exercises were initiated. Patients contributing data to the current investigation were assessed prearthroplasty and once between 9 and 13 weeks postarthroplasty. We chose this follow-up period for 2 reasons: (1) clinical decisions concerning the need for ongoing rehabilitation frequently are made during this period, and (2) investigators often have reported patient outcomes over this interval.7,14,25–28
Study Setting
The study took place at the Sunnybrook Holland Orthopaedic and Arthritic Centre, a tertiary care orthopedic facility in Toronto, Canada.
Participants
Patients were eligible for recruitment if they were scheduled to receive a primary hip or knee TJA due to OA, had sufficient language skills to communicate in written and spoken English, and were assessed prearthroplasty and between 9 and 13 weeks postarthroplasty. Patients were excluded if they had neurological, respiratory, cardiac, or other conditions that would significantly compromise their lower-extremity functional status. Patients with OA secondary to inflammatory disease (eg, rheumatoid arthritis) also were excluded from this study. Patients were recruited either during their consultation with the orthopedic surgeon or at the preadmission clinic visit prior to arthroplasty. Patients included in the current analysis also required complete WOMAC-PF, LEFS, 6MWT, and TUG data prearthroplasty and at 9 to 13 weeks postarthroplasty. Patients contributing data to these investigations provided written informed consent.
Measures
6MWT.
Originally conceived as an outcome measure for patients with respiratory problems, the 6MWT quantifies functional status as the distance in meters walked in 6 minutes. Test-retest reliability estimates of .94 (type 2,1 intraclass correlation coefficient [ICC]) and 26.3 m (standard error of measurement [SEM]) have been reported for patients with OA of the hip or knee awaiting arthroplasty.10 Minimal detectable change at a 90% confidence level (MDC90) has been estimated to be 61 m.10 For the current study, the 6MWT was performed on a measured 46-m, uncarpeted, rectangular indoor circuit. Standardized encouragement (eg, “You are doing well, keep up the good work”) was provided at 1-minute intervals during the test.
TUG.
The TUG quantifies functional status as the time taken in seconds to perform this activity. Individuals were instructed to rise from a standard armchair, walk at a safe and comfortable pace to a tape mark 3 m away, and return to a sitting position with their backs against the chair.11 Patients were allowed to use their arms when rising from and returning to a seated position. A stopwatch was used to measure the time to complete this activity to the nearest one tenth of a second. Test-retest reliability estimates of .75 (type 2,1 ICC) and 1.07 seconds (SEM) have been reported for patients with OA of the hip or knee awaiting arthroplasty.10 The MDC90 has been estimated to be 2.49 seconds.10
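The MDC90 values quoted for the 6MWT and TUG are consistent with the conventional relationship MDC90 = z × √2 × SEM, where z = 1.645 for 90% confidence. The sketch below assumes that relationship; the text does not state how the cited source computed its estimates, and the function name is illustrative only.

```python
import math

Z_90 = 1.645  # standard normal value for a two-sided 90% confidence level


def mdc90(sem: float) -> float:
    """Minimal detectable change at 90% confidence, assuming MDC90 = z * sqrt(2) * SEM."""
    return Z_90 * math.sqrt(2) * sem


# SEM values as reported in the text for patients awaiting arthroplasty.
print(round(mdc90(26.3)))     # 6MWT, meters  -> 61
print(round(mdc90(1.07), 2))  # TUG, seconds  -> 2.49
```

The √2 term reflects that a change score involves measurement error on 2 occasions.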
LEFS.
The LEFS is a 20-item, self-report, unidimensional, region-specific measure that inquires about perceived difficulty with a variety of activities. Each item is scored on a 5-point scale (0–4), and item scores are summed to yield a total score. Total LEFS scores can vary from 0 to 80, with higher scores representing better functional status. Previous investigations5,29,30 have examined the extent to which the LEFS is reliable and valid when applied to patients with OA of the hip or knee who have received a TJA. Test-retest reliability estimates (type 2,1 ICC) have consistently exceeded .85.5,29,30 Typical SEM values have clustered around 3.4 LEFS points. The MDC90 has been estimated to be 9 LEFS points.5,30
WOMAC.
The WOMAC consists of 3 subscales: pain (5 items), stiffness (2 items), and physical function (17 items). Items on the Likert 3.1 version of the measure, which was applied in this study, are scored on a 5-point scale (0–4), with higher scores representing more pain, greater stiffness, and lower levels of functional status. Although the participants completed all sections of this measure, only the physical function subscale scores, which can vary from 0 to 68, were of interest in the current study. Type 2,1 ICCs for the WOMAC-PF have exceeded .85,29 and a SEM of approximately 3.3 points has been reported for the WOMAC-PF.29 The MDC90 has been estimated to be approximately 9 points for the WOMAC-PF.29
Sample Size
The sample for this study was one of convenience and included all patients in our database who fulfilled the eligibility criteria.
Statistical Methods
First, we calculated descriptive statistics and correlations between self-report and performance measures prearthroplasty and postarthroplasty. Next, we compared the similarity of correlation coefficients between self-report and performance measures (eg, LEFS and TUG) obtained prearthroplasty with those obtained postarthroplasty. Because the prearthroplasty and postarthroplasty correlations for a given pair of measures performed on the same patients are not independent, the variance of the difference in correlations cannot be computed directly. To overcome this limitation, we applied a bootstrap procedure to estimate the 95% confidence intervals on the difference in correlations prearthroplasty and postarthroplasty. Specifically, 1,000 samples of size n—where n equaled the number of observations for the specific analysis of interest (47 for the knee analysis and 38 for the hip analysis)—were randomly selected with replacement. The 25th and 975th rank-ordered values from the bootstrap samples represent the 95% confidence limits.
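The bootstrap procedure described above can be sketched as follows. This is a minimal illustration, not the authors' STATA code: the data are synthetic, the function names are hypothetical, and it assumes that patients (not individual observations) are resampled so that each patient's prearthroplasty and postarthroplasty values stay paired.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_corr_diff_ci(pre_self, pre_perf, post_self, post_perf,
                           n_boot=1000, seed=0):
    """95% percentile interval for r(post) - r(pre), resampling patients."""
    rng = random.Random(seed)
    n = len(pre_self)
    diffs = []
    while len(diffs) < n_boot:
        # Draw n patients with replacement; reuse the same draw for both
        # occasions so the dependence between the 2 correlations is kept.
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            r_pre = pearson([pre_self[i] for i in idx], [pre_perf[i] for i in idx])
            r_post = pearson([post_self[i] for i in idx], [post_perf[i] for i in idx])
        except ZeroDivisionError:
            continue  # degenerate resample with zero variance; draw again
        diffs.append(r_post - r_pre)
    diffs.sort()
    # For 1,000 samples these are the 25th and 975th rank-ordered values.
    lo_rank = round(0.025 * n_boot)
    hi_rank = round(0.975 * n_boot)
    return diffs[lo_rank - 1], diffs[hi_rank - 1]


# Synthetic data for 47 hypothetical patients (not study data).
data_rng = random.Random(1)
n_patients = 47
pre_perf = [data_rng.gauss(400, 90) for _ in range(n_patients)]
pre_self = [0.05 * d + data_rng.gauss(0, 8) for d in pre_perf]
post_perf = [d + data_rng.gauss(0, 40) for d in pre_perf]
post_self = [0.05 * d + 12 + data_rng.gauss(0, 8) for d in post_perf]

low, high = bootstrap_corr_diff_ci(pre_self, pre_perf, post_self, post_perf)
print(low, high)
```

An interval that includes zero would be read, as in the Results, as no significant difference between the prearthroplasty and postarthroplasty correlations.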
Before proceeding to the main analysis, it was necessary to determine whether the relationship between the self-report measures and the performance measures was consistent over the postarthroplasty interval of 9 to 13 weeks. If the relationship was consistent, the data for this interval could be grouped to represent a single postarthroplasty occasion. With the LEFS or WOMAC-PF specified as the dependent variable, the analysis proceeded as follows for the 6MWT. The initial independent variables for each self-report measure were the 6MWT, week postarthroplasty, and 6MWT × week interaction term. For example, the LEFS regression equation was: LEFS=β0 + β1(6MWT) + β2(week) + β3(6MWT × week).
We evaluated the hypothesis that the relationship between the LEFS and the 6MWT was identical (ie, the regression lines were coincident or identical) from 9 to 13 weeks postarthroplasty by testing β2 = β3 = 0. When this condition holds true, the regression equation can be rewritten as: LEFS=β0 + β1(6MWT). This analysis was repeated substituting the TUG for the 6MWT, and again with the WOMAC-PF substituted for the LEFS.
The next step addressed the main purpose of our study: to examine the relationship between the self-report and performance measures prearthroplasty and postarthroplasty. With the LEFS or WOMAC-PF specified as the dependent variable, the initial independent variables for each self-report measure were the performance measures, a dummy variable coding occasion (0=prearthroplasty, 1=postarthroplasty), and a performance measure × occasion interaction term.31 For example, the WOMAC-PF regression equation was: WOMAC-PF=β0 + β1(6MWT) + β2(occasion) + β3(6MWT × occasion).
We evaluated the hypothesis that the relationship between the WOMAC-PF and the 6MWT was identical prearthroplasty and postarthroplasty (ie, the regression lines were coincident, and no systematic difference existed) by testing whether β2 = β3 = 0. We evaluated the hypothesis that a difference in relationship between time points was consistent over the range of possible scores for the measures (ie, the regression lines were parallel but not coincident, and a systematic difference existed) by testing whether β3 = 0, but β2 ≠ 0. We evaluated the hypothesis that a difference in relationship between time points was not consistent over the range of possible scores for the measures (ie, the lines were not parallel, and the difference between time points was score dependent) by testing whether β3 ≠ 0. These 3 hypotheses and statistical tests were repeated with the LEFS as the dependent variable and the performance measures (6MWT or TUG), occasion, and interaction term as independent variables. As a preliminary step, we assessed whether sex, site of arthroplasty, and their interaction contributed significantly to the models. Neither sex nor any of its interactions contributed significantly to the model. However, the site of arthroplasty × occasion interaction was significant. Accordingly, separate analyses were performed for site of arthroplasty. Because each patient contributed 2 data points for a given measure (eg, one WOMAC-PF measure prearthroplasty and a second WOMAC-PF measure postarthroplasty) in each regression analysis, we applied a robust error term for clustered data.
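To make the structure of the model concrete, the sketch below builds the design matrix (intercept, performance measure, occasion dummy, and interaction) and fits it by ordinary least squares. Everything in it is an assumption for illustration: the coefficients and walk distances are hypothetical, the data are noise-free, and the function names are invented. It returns point estimates only and does not compute the robust error term for clustered data, which changes the standard errors but not the coefficient estimates.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def design_row(perf, occasion):
    """Columns: intercept, performance measure, occasion (0=pre, 1=post), interaction."""
    return [1.0, perf, float(occasion), perf * occasion]


# Hypothetical coefficients (b0, b1, b2, b3): parallel lines (b3 = 0) separated
# by a 12-point occasion effect. These are NOT the study's estimates.
HYPOTHETICAL_BETAS = [60.0, -0.06, -12.0, 0.0]

# Hypothetical 6MWT distances in meters; each patient contributes 2 rows.
walk_pre = [250.0, 300.0, 350.0, 400.0, 450.0, 500.0]
walk_post = [280.0, 330.0, 380.0, 430.0, 480.0, 530.0]

rows, y = [], []
for pre_m, post_m in zip(walk_pre, walk_post):
    for perf, occasion in ((pre_m, 0), (post_m, 1)):
        row = design_row(perf, occasion)
        rows.append(row)
        y.append(sum(b * v for b, v in zip(HYPOTHETICAL_BETAS, row)))

betas = fit_ols(rows, y)
print([round(b, 4) for b in betas])
```

In this layout, β3 near zero with β2 different from zero corresponds to the second proposition: at any given 6MWT distance, the predicted self-report score differs between occasions by β2 points.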
All statistical tests were 2-tailed, and an effect was considered statistically significant if P<.05 (ie, 95% confidence intervals on the beta coefficients excluded zero). We used STATA version 10.1* for all statistical analyses.
Results
Participants
Eighty-five of the 162 patients assessed preoperatively also were assessed between 9 and 13 weeks postarthroplasty. There was no difference (χ²=0, df=1, P=.99) in the distributions of joint involvement (ie, hip, knee) by sex between those patients who were assessed and those who were not assessed 9 to 13 weeks postarthroplasty. Also, there were no differences in body mass index (t₁₅₈=0.68, P=.50), age (t₁₆₀=1.13, P=.26), or LEFS (t₁₆₀=1.16, P=.25), WOMAC-PF (t₁₆₀=0.68, P=.49), 6MWT (t₁₆₀=1.69, P=.09), and TUG (t₁₆₀=1.42, P=.16) scores between those patients who were assessed and those who were not assessed 9 to 13 weeks postarthroplasty.
Descriptive Data
Table 1 provides a summary of the participants' characteristics. Forty of the 85 patients contributing prearthroplasty and postarthroplasty data were women. Forty-seven patients, of whom 24 were women, had a total knee replacement. All patients were assessed prearthroplasty, and each patient was assessed over the postarthroplasty interval of 9 to 13 weeks. The frequencies of postarthroplasty assessments were 32 patients in week 9, 22 patients in week 10, 17 patients in week 11, 10 patients in week 12, and 4 patients in week 13. The postarthroplasty LEFS and WOMAC-PF scores demonstrated a statistically significant improvement (ie, the 95% confidence intervals did not include zero) in function compared with the prearthroplasty values; the 6MWT distances and TUG times did not differ between prearthroplasty and postarthroplasty.
Table 2 reports the prearthroplasty and postarthroplasty correlations between self-report and performance measures. The negative sign reflects the opposite scale orientations of the LEFS and the WOMAC-PF. The pair-wise correlations between measures (eg, LEFS and TUG) did not differ significantly prearthroplasty and postarthroplasty (ie, 95% confidence interval on difference included zero).
Main Results
The analyses assessing the extent to which the relationship between self-report and performance measures was consistent over the postarthroplasty interval of 9 to 13 weeks showed no significant (P>.05) week effect or performance measure × week interaction (ie, β2=β3=0). This finding supported collapsing the data gathered over the postarthroplasty period of 9 to 13 weeks to represent a single occasion for the subsequent main analysis.
Table 3 summarizes the regression analyses that compared the relationship between self-report and performance measures prearthroplasty and postarthroplasty. There were no significant performance measure × occasion interactions (ie, β3 did not differ from 0); however, an occasion effect was present (ie, β2≠0) in each analysis. In the absence of an interaction, the occasion effect identified by the β2 coefficient reports the systematic difference in LEFS or WOMAC-PF points between prearthroplasty and postarthroplasty assessments for a given 6MWT or TUG value. As shown in Table 3, for the LEFS, the difference was 11.81 points based on the 6MWT and 10.34 points based on the TUG for patients who received a TKA and 13.08 points based on the 6MWT and 13.25 points based on the TUG for patients who received a THA. For the WOMAC-PF, the difference was 12.65 points based on the 6MWT and 11.32 points based on the TUG for patients who received a TKA and 18.94 points based on the 6MWT and 19.00 points based on the TUG for patients who received a THA. For example, the Figure illustrates the systematic difference of 12.65 points for the WOMAC-PF associated with the 6MWT for participants who received a TKA.
Figure. Illustration of the systematic difference between prearthroplasty and postarthroplasty Western Ontario and McMaster Universities Osteoarthritis Index physical function subscale (WOMAC-PF) scores for participants who received a total knee arthroplasty. 6MWT=Six-Minute Walk Test.
Discussion
Key Results
Our goal was to contribute to the interpretation of LEFS and WOMAC-PF scores by referencing their values to 6MWT distances and TUG times. We examined regression equations generated prearthroplasty and postarthroplasty and found that for each measure the regression slope was similar at both time points. However, there was a systematic difference in self-report values between occasions. For a given 6MWT distance or TUG time, patients reported substantially greater functional status levels for the LEFS and WOMAC-PF postarthroplasty compared with prearthroplasty. The magnitude of the systematic difference was similar for both performance test reference standards. For the LEFS, the difference was approximately 11 points for patients who received a TKA and 13 points for patients who received a THA. For the WOMAC-PF, the difference was approximately 12 points for patients who received a TKA and 19 points for patients who received a THA.
Limitations
Our study has several potential limitations. One limitation is that this study investigated the relationship between self-report and performance measures at the postarthroplasty interval of 9 to 13 weeks only. Accordingly, it should not be inferred that the reported systematic differences exist to the same extent outside the time frame reported in this study. A second potential limitation is the application of 2 performance measures as the reference standards for the ability of a patient to move around. However, the 6MWT and TUG have been reported often as representations of the ability of patients to move around,12–14,26,32 they have been shown to be sensitive to change in similar patient groups and under similar conditions,7,19 and reference values exist for these measures.11,18,19 A third limitation is that our sample was one of convenience, which resulted in confidence intervals that are slightly wider than ideal. Finally, the validity of our findings is dependent on our sample of 85 patients being an unbiased representation of the larger number of patients in our database. Although we did not detect a difference in characteristics between the 85 patients who contributed data to the current study and the 77 patients who did not contribute data, we cannot be certain that the 2 groups did not differ on some other unmeasured characteristic.
Interpretation
The finding of a systematic difference may be of little consequence, provided its magnitude is small and clinically unimportant. To determine whether the magnitude of the identified systematic differences was clinically important, we sought out literature-based estimates of the within-patient minimal clinically important change (MCIC). For the 80-point LEFS and the 68-point WOMAC-PF, the reported within-patient MCIC is approximately 9 points, which is less than the point estimates of the systematic difference for both self-report measures.5,29,33 We interpret the magnitude of the systematic difference for both self-report measures to be clinically important and suggest that it should be incorporated into decisions concerning the ability of patients to move around at follow-up assessments 9 to 13 weeks postarthroplasty. Dependence on self-report measures alone will result in an overestimation of the ability of patients to move around postarthroplasty. A previous investigation5 has shown that self-report and performance measures assess different aspects of functional status, the definition of which is not always declared or restricted to a patient's ability to move around. Accordingly, to gain a comprehensive evaluation of lower-extremity functional status, we believe that self-report and performance measures provide complementary information and view both as essential components of a patient's assessment. Furthermore, we believe that no single performance measure can adequately represent lower-extremity functional status at all time points postarthroplasty and recommend that a battery of performance tests consisting of essential and diverse activities (eg, ambulation, stair climbing, standing, transferring) be considered.9
Much of the existing literature examining recovery after TKA or THA has assessed outcome by applying only self-report measures of physical function. In part, this is a likely consequence of the OMERACT III position that identified self-reports of physical function as essential and performance-based assessments as optional.6 One application of our systematic difference estimates is to augment the interpretation of studies that provided only self-reports of patients' ability to move around. For example, in a study of preoperative physical therapy on outcome following hip arthroplasty, Ferrara et al27 reported the following WOMAC-PF values: 33.7 and 43.5 points prearthroplasty for the study and control groups, respectively, and 18.3 and 28.5 points 3 months postarthroplasty for the study and control groups, respectively. Ferrara et al concluded that both groups showed significant improvement 3 months postarthroplasty. However, application of the systematic difference estimate of 19 points for the WOMAC-PF suggests that physical function was somewhat worse postarthroplasty.
In another study, Harrington et al28 examined differences in the outcome of patients following knee arthroplasty with either a fixed- or mobile-bearing prosthesis. Prearthroplasty, these investigators reported WOMAC-PF values of 32.2 and 33.2 points for the fixed- and mobile-bearing prostheses, respectively. At 3 months postarthroplasty, the WOMAC-PF values were 18.0 and 17.9 points for the fixed- and mobile-bearing prostheses, respectively. Once again, applying a systematic difference estimate of 12 points for the WOMAC-PF suggests that the ability of patients to move around 3 months postarthroplasty had improved slightly rather than a large amount compared with their prearthroplasty levels. In these examples, the adjusted 3-month follow-up values were reasonably similar to the prearthroplasty values; however, it is unlikely that this will always be the case. The salient point is that failure to adjust postarthroplasty scores will result in overestimating the change.
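The arithmetic behind these 2 reinterpretations can be written out as a short sketch. It assumes the adjustment is a simple addition of the rounded systematic difference to the postarthroplasty WOMAC-PF score (higher scores indicate lower functional status), which reproduces the direction of the conclusions drawn above; the function and variable names are illustrative only.

```python
def adjust_post_womac_pf(post_score, systematic_difference):
    """Add the systematic difference to a postarthroplasty WOMAC-PF score.

    Assumption: a simple additive adjustment; higher WOMAC-PF = worse function.
    """
    return round(post_score + systematic_difference, 1)


# (label, prearthroplasty score, 3-month score, systematic difference used in the text)
reported = [
    ("Ferrara THA, study group", 33.7, 18.3, 19),
    ("Ferrara THA, control group", 43.5, 28.5, 19),
    ("Harrington TKA, fixed bearing", 32.2, 18.0, 12),
    ("Harrington TKA, mobile bearing", 33.2, 17.9, 12),
]

adjusted = {}
for label, pre, post, diff in reported:
    adjusted[label] = adjust_post_womac_pf(post, diff)
    direction = "worse" if adjusted[label] > pre else "better"
    print(f"{label}: pre {pre}, adjusted post {adjusted[label]} ({direction} than pre)")
```

Under this assumption the adjusted hip values (37.3 and 47.5) sit slightly above the prearthroplasty values, and the adjusted knee values (30.0 and 29.9) sit slightly below them.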
Generalizability
Our findings are specific to patients who receive a primary TKA or THA due to end-stage OA and who are functioning at a reasonably high level—as indicated by the 6MWT distances and TUG times—prior to arthroplasty and who are assessed 9 to 13 weeks postarthroplasty.
Conclusion
This study demonstrated a consistent relationship between the identified self-report and performance measures over the interval specified. Applications of the information provided in this study could assist in the reinterpretation of studies reporting only scores for WOMAC-PF and LEFS self-report measures. In clinical practice, dependence on unadjusted scores for self-report measures could lead to erroneous conclusions concerning the ability of patients to move around postarthroplasty.
The Bottom Line
What do we already know about this topic?
There is consistent evidence supporting the premise that self-report measures overestimate the ability of patients to move around after a hip or knee arthroplasty. One reason for this overestimation of function is the marked reduction in pain following arthroplasty.
What new information does this study offer?
Using performance measures as a standard, the current study provides estimates of systematic differences between functional tests and scores on the Lower Extremity Functional Scale and the Western Ontario and McMaster Universities Osteoarthritis Index physical function subscale.
If you're a patient, what might these findings mean for you?
Physical therapists may use both functional tasks and questionnaires to evaluate your function. Both provide useful information about your state of recovery.
Footnotes
- All authors provided concept/idea/research design and writing. Professor Stratford provided data analysis. Ms Kennedy provided project management, fund procurement, facilities/equipment, and institutional liaisons. Ms Kennedy, Dr Maly, and Dr MacIntyre provided consultation (including review of manuscript before submission).
- Ethical approval was obtained from the Sunnybrook Health Sciences Centre's Research Ethics Board prior to initiating data collection.
- * StataCorp LP, 4905 Lakeway Dr, College Station, TX 77845.
- Received February 8, 2010.
- Accepted May 2, 2010.
- © 2010 American Physical Therapy Association