Abstract
Background The upper-extremity portion of the Fugl-Meyer Scale (UE-FM) is one of the most established and commonly used outcome measures in stroke rehabilitative trials. Empirical work is needed to determine the amount of change in UE-FM scores that can be regarded as important and clinically meaningful for health professionals, patients, and other stakeholders.
Objective This study used anchor-based methods to estimate the clinically important difference (CID) for the UE-FM in people with minimal to moderate impairment due to chronic stroke.
Method One hundred forty-six individuals with stable, mild to moderate upper-extremity (UE) hemiparesis were administered the UE-FM before and after an intervention targeting their affected UEs. The treating therapists rated each participant's perceived amount of UE motor recovery on a global rating of change (GROC) scale evaluating several facets of UE movement (grasp, release, move the affected UE, perform 5 important functional tasks with the affected UE, overall UE function). The estimated CID of the UE-FM scores was calculated using a receiver operating characteristic (ROC) curve with the GROC scores as the anchor.
Results The ROC curve analysis revealed that change in UE-FM scores during the intervention period distinguished participants who experienced clinically important improvement from those who did not based on the therapists' GROC scores. The area under the curve ranged from 0.61 to 0.70 for the different facets of UE movement.
Conclusions The estimated CID of the UE-FM scores ranged from 4.25 to 7.25 points, depending on the different facets of UE movement.
Following stroke, upper-extremity (UE) hemiparesis is one of the most commonly exhibited impairments.1 Upper-extremity hemiparesis also may be the most disabling stroke-induced impairment due to its impact on performance of valued activities. Indeed, 50% of patients retain some degree of hemiparesis at 6 months after stroke,2 and up to 70% of patients remain unable to use their affected UEs functionally after discharge from rehabilitative therapies.3
To increase affected UE movement, several promising rehabilitative interventions have been tested.4–8 In testing stroke rehabilitative regimens, authors frequently have used the UE motor section of the Fugl-Meyer Assessment of Sensorimotor Impairment (UE-FM)9 to determine intervention response. Yet, although it is straightforward to determine the statistical significance of a motor change using the Fugl-Meyer Scale (FM), placing the magnitude of these changes in a context that is meaningful for busy rehabilitation clinicians is more difficult. Determining the magnitude of change that corresponds to an important UE movement change would help to address this shortfall.
Given the FM's recommended use in stroke rehabilitative trials,10 our overall goal was to contextualize how an FM score change would likely translate to a functional change in the clinic. As a first step toward this goal, the purpose of this study was to estimate the clinically important difference (CID) of UE-FM scores in patients with chronic stroke exhibiting minimal to moderate UE hemiparesis, a frequently targeted population of many stroke rehabilitation interventions. Anchor-based CID computation methods, which examine the relationship between an outcome measure (eg, the FM) and a comparison measure (or “anchor”), were used to elucidate the meaning of a particular degree of change in FM scores. Such methods require an independent standard (or anchor) that is itself interpretable and correlated with the instrument being explored. In the current study, we used physical therapists' and occupational therapists' global ratings of change (GROCs) as anchors. Although the UE-FM constitutes one of the most established and commonly used outcome measures in stroke trials, to our knowledge, this was the first study examining UE-FM CIDs.
Method
Study Design
The present study was a secondary analysis of data from the Everest randomized controlled trial of implanted cortical stimulation for UE function in chronic stroke.11 As described elsewhere,11 participants were randomized to either a control group or a treatment group. In the control group, motor learning-based, repetitive task-specific training (RTP) targeting the affected UE was administered for approximately 2.5 hours per weekday over a 6-week period. For the first 4 weeks, rehabilitation therapy was conducted every weekday. For the subsequent 2 weeks, rehabilitation therapy was conducted 3 days per week with 1 to 2 sessions per day. Thus, patients received a total of 26 days of rehabilitation therapy. In the treatment group, the RTP regimen was coadministered with electrical cortical stimulation using the Northstar Stroke Recovery System (Northstar Neuroscience Inc, Seattle, Washington). As described elsewhere,11 the system delivers targeted epidural electrical stimulation to the cortex using electrodes and implantable pulse generators. Outcome measures—including the UE-FM—were administered before and after participation in the interventions.
Participants
Patients were recruited for the intervention trial from across the United States using several strategies, including print advertisements (eg, pamphlets) placed in clinics near enrolling sites, radio advertisements in the markets of enrolling sites, and print advertisements placed in national magazines whose primary subscribers were survivors of stroke. As volunteers came forward, the screening criteria shown in Table 1 were applied.
Table 1. Screening Criteria
Using these study criteria, 146 patients (87 men, 59 women) were included in the current analysis. The mean age of all patients was 57.1 years (SD=10.96, range=29–83), and their mean time since stroke onset was 59.37 months (SD=63.22). Eighty-eight patients had hemiparesis affecting their right UEs, and 82 patients had hemiparesis affecting their dominant UEs.
Instruments
The following measures were administered by a blinded rater at one of the participating centers at which the Everest study was being conducted. All raters were certified on the outcome measures and recertified every 3 months using standardized, video-based interrater reliability checks at the main study center.
The UE-FM9 was used to assess UE impairment before and after the interventions previously described. Data are derived from a 3-point ordinal scale (0=cannot perform, 1=can partially perform, 2=can perform fully) applied to each item, and the item scores are summed to provide a maximum score of 66. The FM's scores have been shown to have high test-retest reliability (total=.98–.99; subtests=.87–1.00), interrater reliability, and construct validity in contexts similar to those of this study (ie, subacute and chronic stroke).12,13
Additionally, immediately after the 6-week intervention period, each treating therapist rated the amount of motor improvement exhibited by each participant in 5 different aspects of affected UE function. This rating was accomplished using a 5-point, observation-based GROC scale (Tab. 2). Global rating of change scales are commonly used as an anchor to examine important change in outcome measures.14–17 The therapists rated improvement in the affected UE in the following functional areas: ability to grasp objects, ability to release objects, ability to move the affected UE, ability to perform the 5 most important activities identified by the individual in the Canadian Occupational Performance Measure (COPM),18 and overall arm and hand function.
Table 2. Global Rating of Change Scale
Two additional outcome measures were administered at baseline and at the end of the 6-week intervention period: the Arm Motor Ability Test (AMAT)19 and the Stroke Impact Scale (SIS). The AMAT is a 13-item test in which activities of daily living (ADLs) are rated according to a functional ability scale that examines paretic limb use (0=does not perform with paretic UE, 5=does use arm at a level comparable to less affected side) and a quality of movement scale (0=no movement initiated, 5=normal movement). The ADLs, which are further divided into subactivities to be rated, include using a knife and fork, eating with a spoon, combing hair, and tying shoelaces. The AMAT is a valid, stable, and reliable scale, and its scores correlate positively with those of other stroke-specific functional scales. The SIS20 is a 64-item self-report measure that assesses 8 domains (strength, hand function, ADLs and instrumental ADLs, mobility, communication, emotion, memory and thinking, and participation). In a previous study,20 SIS domains were examined by comparing the SIS with existing stroke measures and by comparing differences in SIS scores across Rankin Scale levels. Using these techniques, each domain met or approached the standard of 0.9 alpha coefficient for comparing the same patients across time.16
Data Analysis
Baseline data on the UE-FM, AMAT, and SIS were analyzed using descriptive statistics to characterize the sample. Participants were dichotomized into 2 groups based on the score from their treating therapists' ratings using the previously described 5-point ordinal GROC scale for each of the 5 different aspects of UE function that were rated: those who received a score of 5 were considered to have experienced clinically important improvement, and those who scored below 5 were considered to be stable and not to have experienced clinically important improvement. We chose a cutoff score of 5 to identify clinically important change because other researchers have indicated a need to identify changes in outcome measures that are more than minimally clinically important.21–23
Receiver operating characteristic (ROC) curves were constructed by plotting sensitivity values (true positive rate) on the y-axis and 1−specificity values (false positive rate) on the x-axis for different changes in UE-FM scores for distinguishing participants who demonstrated important improvement from those who did not have important improvement. Separate ROC curves were constructed for the different aspects of UE function (grasp, release, move UE, COPM score, and overall) that were rated by the therapists. The area under the curve (AUC) and 95% confidence intervals (CIs) were obtained as a method for describing the ability of the UE-FM to distinguish participants who improved from those who did not improve. An AUC of 0.50 or less indicates that change in UE-FM scores has no ability to distinguish improved participants from stable participants (as defined by the GROC score) beyond chance, whereas a value of 1.0 indicates perfect ability to distinguish between participants who improved and those who did not improve.
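As a rough illustration of the procedure described above, the following sketch dichotomizes participants on a GROC score of 5, builds ROC points from UE-FM change scores, and computes the AUC with the trapezoidal rule. The function names and the toy data are hypothetical and are not taken from the study; the study's own software and settings are not reported here.

```python
def roc_points(change_scores, improved):
    """Return (false_positive_rate, true_positive_rate) pairs, one per cutoff.

    change_scores: UE-FM change (post minus pre) for each participant.
    improved: True where the therapist's GROC rating was 5, else False.
    A participant is classified as improved when change >= cutoff.
    """
    n_pos = sum(improved)
    n_neg = len(improved) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one improved and one stable participant")
    points = [(0.0, 0.0)]  # cutoff above every observed change score
    for cutoff in sorted(set(change_scores), reverse=True):
        tp = sum(1 for c, y in zip(change_scores, improved) if y and c >= cutoff)
        fp = sum(1 for c, y in zip(change_scores, improved) if not y and c >= cutoff)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical toy data (NOT study data): UE-FM change and GROC rating (1-5).
change = [2, 3, 4, 5, 6, 7, 8, 9]
groc = [3, 4, 5, 4, 5, 4, 5, 5]
improved = [g == 5 for g in groc]  # GROC of 5 = clinically important improvement

points = roc_points(change, improved)
auc = auc_trapezoid(points)
print(f"AUC = {auc:.4f}")
```

An AUC near 0.50 in such a sketch would mean the change scores separate the two groups no better than chance, whereas a value near 1.0 would mean near-perfect separation.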
The CID in the UE-FM scores was estimated by identifying the point on the ROC curve nearest the upper left-hand corner, which is considered to be the best cutoff score for distinguishing improved participants from those who are stable or not improved. Sensitivity and specificity and positive and negative likelihood ratios (+LR and −LR) were calculated for the estimated CID values. These calculations were performed for all of the different aspects of UE function (grasp, release, move UE, COPM score, and overall) that were rated by the therapists.
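The cutoff selection and accuracy statistics described above can be sketched as follows. This is a minimal illustration under stated assumptions: candidate cutoffs are taken to be the observed change scores, distance to the upper left-hand corner is Euclidean, and the function names and toy data are hypothetical rather than drawn from the study.

```python
import math


def best_cutoff(change_scores, improved):
    """Pick the cutoff whose ROC point lies nearest (FPR=0, TPR=1)."""
    n_pos = sum(improved)
    n_neg = len(improved) - n_pos
    best = None
    for cutoff in sorted(set(change_scores)):
        tp = sum(1 for c, y in zip(change_scores, improved) if y and c >= cutoff)
        fp = sum(1 for c, y in zip(change_scores, improved) if not y and c >= cutoff)
        sens = tp / n_pos
        spec = 1 - fp / n_neg
        # Euclidean distance from (1 - specificity, sensitivity) to (0, 1).
        dist = math.hypot(1 - spec, 1 - sens)
        if best is None or dist < best["distance"]:
            best = {"cutoff": cutoff, "sensitivity": sens,
                    "specificity": spec, "distance": dist}
    return best


def likelihood_ratios(sens, spec):
    """Return (+LR, -LR) for a given sensitivity and specificity."""
    pos_lr = sens / (1 - spec) if spec < 1 else math.inf
    neg_lr = (1 - sens) / spec if spec > 0 else math.inf
    return pos_lr, neg_lr


# Hypothetical toy data (NOT study data).
change = [2, 3, 4, 5, 6, 7, 8, 9]
improved = [False, False, True, False, True, False, True, True]

best = best_cutoff(change, improved)
pos_lr, neg_lr = likelihood_ratios(best["sensitivity"], best["specificity"])
print(best["cutoff"], best["sensitivity"], best["specificity"], pos_lr, neg_lr)
```

In practice, the same routine would be run once per GROC anchor (grasp, release, move UE, COPM score, and overall), yielding one estimated CID per anchor.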
Role of the Funding Source
This study was supported by funding from Northstar Neuroscience Inc.
Results
Baseline scores on the UE-FM, AMAT, hand and ADL sections of the SIS, and total SIS for participants with clinically important change (GROC score=5) and those who did not experience clinically important change (GROC score of <5) are presented in Table 3.
Table 3. Mean (SD) Baseline Scores of Participants With Clinically Important Change (Global Rating of Change [GROC] Score=5) and Those Who Did Not Experience Clinically Important Change (GROC Score <5)
The ROC analysis revealed that the change in UE-FM scores during the intervention period distinguished participants who experienced clinically important improvement from those who did not based on therapists' GROC scores. The AUC values were statistically significant (>0.50) for all GROC anchors, except grasping ability (Figure). The CID estimates ranged from 4.25 to 7.25 for the 5 different aspects of UE function, with corresponding sensitivity values ranging from 0.53 to 0.64, specificity values ranging from 0.61 to 0.83, +LR values ranging from 1.6 to 3.1, and −LR values ranging from 0.57 to 0.69 (Tab. 4).
Figure. Receiver operating characteristic (ROC) curve for different changes in upper-extremity portion of the Fugl-Meyer Scale scores for distinguishing participants who demonstrated important improvement from those who did not have important improvement. Sensitivity values (true positive rate) are plotted on the y-axis and 1 − specificity values (false positive rate) on the x-axis. AUC=area under curve, COPM=Canadian Occupational Performance Measure.
Table 4. Mean (SD) Estimated Clinically Important Difference of the Upper-Extremity Portion of the Fugl-Meyer Scale With Corresponding Area Under the Curve, Sensitivity, Specificity, and Positive and Negative Likelihood Ratios
Discussion
Although the FM is used commonly in clinical trials to assess impairment and determine efficacy, no previous studies were found that estimated the threshold for clinically important change. The present study provides an estimate of the CID of the UE-FM scores using data from people with chronic stroke exhibiting minimal to moderate UE impairment. The CID values reported provide a basis for clinicians and researchers to quantitatively interpret the clinical importance of FM changes that they observe based on the change from the patient's baseline FM scores.
Clinicians and researchers can interpret change in FM scores after rehabilitation intervention targeted at the UE using the estimated CID values of the UE-FM scores presented here. For example, if an individual's initial UE-FM score was 32 and, after rehabilitation, it improved to 40, it is likely that this improvement was clinically meaningful. The 8-point improvement in UE-FM scores exceeded the estimated CID in all 5 of the aspects of UE movement and function (grasp, release, ability to move the arm, ability to perform activities identified on the COPM, and overall arm and hand function) that were rated by the therapists in this study. Therapists also can administer the FM longitudinally to monitor the proportion of patients who meet or exceed the CID, as estimated in this study. These measurements would provide an objective and empirically based estimate of therapeutic “success” in terms of impairment-based motor recovery. Likewise, researchers can use the CID estimated here as a criterion for efficacy of a particular intervention, such as modified constraint-induced movement therapy,6 because this work has habitually used the FM as an outcome measure with patients who match the motor inclusion criteria used in this study.
The findings of this study are useful in that CID knowledge enables researchers to express their findings on the UE-FM in terms of the proportion of patients in the experimental group who exceeded the estimated CID values compared with the same proportion of patients in the comparison group. Using the preliminary estimated CID values for the UE-FM presented here, it would be the proportion of participants meeting our study criteria whose UE-FM scores improved by 4.25 to 7.25 or more. From these percentages, it is possible to calculate the number needed to treat, which may provide a more clinically relevant method of examining differences between intervention strategies.24
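The responder analysis and number-needed-to-treat calculation described above can be sketched as follows. The change scores are hypothetical, the CID of 7.25 points is simply the upper end of the range estimated in this study, and the function names are illustrative assumptions.

```python
import math


def responder_rate(changes, cid):
    """Proportion of participants whose UE-FM change meets or exceeds the CID."""
    return sum(1 for c in changes if c >= cid) / len(changes)


def number_needed_to_treat(rate_experimental, rate_comparison):
    """NNT = 1 / absolute difference in responder proportions."""
    diff = rate_experimental - rate_comparison
    if diff <= 0:
        raise ValueError("experimental responder rate must exceed comparison rate")
    return 1 / diff


# Hypothetical UE-FM change scores (NOT study data).
experimental = [8, 3, 9, 6, 10, 2, 7.5, 8, 1, 9]
comparison = [2, 8, 4, 1, 9, 3, 5, 0, 6, 2]
CID = 7.25  # assumed threshold: upper end of the estimated CID range

rate_exp = responder_rate(experimental, CID)
rate_cmp = responder_rate(comparison, CID)
nnt = number_needed_to_treat(rate_exp, rate_cmp)
# NNT is conventionally rounded up to the next whole patient.
print(rate_exp, rate_cmp, round(nnt, 2), math.ceil(nnt))
```

Because the estimated CID differs by anchor, the responder proportions (and therefore the number needed to treat) would need to be reported alongside the specific threshold chosen.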
In obtaining these results, this study used a large sample, and thus a large number of observations were incorporated into the analyses. This large number of observations is a study strength, as it means that the values we determined to be clinically meaningful are likely generalizable when administering the FM to patients with stroke exhibiting minimal to moderate impairment of the affected UE. Given that this impairment group was eligible for several recently developed rehabilitative therapies,4–6 such information is expected to be useful in determining whether the therapies render clinically meaningful benefits. However, even with these strengths, the sensitivity, specificity, and LR values depicted in Table 4 were somewhat lower than anticipated, although, as stated elsewhere in this article, the AUCs were significantly higher than chance for all anchors except grasping ability. The lower-than-anticipated values may have arisen because there is considerable variability in UE recovery after stroke. Such variability, and the heterogeneity of affected UE recovery patterns, even among patients scoring similarly on the FM, will likely reduce sensitivity and specificity. This is a study limitation, but is likely to be seen with many measures of UE recovery after stroke. The use of multiple statistical comparisons also constitutes a minor study limitation, mostly because it may undermine power to a small extent. Again, though, this concern is outweighed by the number of participants used in this analysis and the validity of findings (as discussed below).
Clinicians and researchers have proposed that minimally clinically important differences (MCIDs) of UE measures (including the FM) for people with chronic stroke are likely about 10% of the scale range.10,25,26 Interestingly, the findings of the present study appear to confirm this estimate, as our CID estimates correspond to 7.2% to 11.0% of the FM score range. To our knowledge, only one previous study has estimated MCIDs for poststroke UE outcome measures. Lang et al17 reported MCIDs ranging from 16% to 30% of the scale range for several UE tests (not including the FM) among patients with acute stroke (ie, <30 days post-ictus). The fact that lower relative CIDs were found in the present study is likely attributable to the chronicity of stroke in the population. Patients in the acute phase of stroke recovery are thought to have higher MCIDs due to expectations of rapid, spontaneous recovery early after stroke.
In the present study, CIDs were calculated using 5 different GROC anchors: the therapist's impression of patient improvement in grasping ability, releasing ability, ability to move the arm, ability to perform COPM identified activities, and overall UE function. The FM distinguished individuals with meaningful improvement from those without meaningful improvement significantly better than chance for all anchors except the grasping ability GROC, whose low end of the 95% CI for the AUC was at 0.50. This finding indicates that the FM may be less sensitive to changes in grasping ability than to changes in the other aspects of UE movement and function. Our estimated CID of the UE-FM scores for improvement in grasping ability is likely less valid than CIDs from the other anchors. However, eliminating the CID from the grasping ability GROC does not change the CID range calculated in this study (4.75–7.25). Moreover, the narrowness of this range further reinforces the validity of our findings.
Although the CID can be used to assess the clinical importance of outcome measure changes, it does not assess the reliability of the measure and cannot differentiate real change from chance variation and measurement error. The minimal detectable change (MDC), on the other hand, is the smallest change in an outcome measure that exceeds chance variation and measurement error.15 The MDC is calculated from test-retest reliability data and is distribution based. Thus, it is associated with a CI. For example, the MDC95 is the threshold for real change with 95% certainty. Ideally, the MDC95 should be less than or equal to the CID so that the important change represented by the CID is not attributable to chance or measurement error. Previous authors have reported MDC95 values of 5.2 to 7.2 for the FM,27,28 which are almost identical to the range of CIDs found in the present study. Therefore, improvement equal to or greater than 6 to 8 points on the FM can be considered both real and clinically meaningful for people with chronic stroke.
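For readers unfamiliar with how an MDC95 is derived, the sketch below uses a common distribution-based formula: SEM = SD × √(1 − r) and MDC95 = 1.96 × √2 × SEM, where r is the test-retest reliability coefficient. This formula is an assumption here, because the article does not state how the cited MDC95 values were computed, and the SD and reliability inputs are hypothetical rather than study data.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - r), with r the test-retest reliability coefficient."""
    return sd * math.sqrt(1 - reliability)


def mdc95(sd, reliability):
    """Minimal detectable change at the 95% confidence level."""
    return 1.96 * math.sqrt(2) * standard_error_of_measurement(sd, reliability)


# Hypothetical inputs (NOT study data): baseline SD of 10 points, r = 0.95.
mdc = mdc95(sd=10.0, reliability=0.95)
CID = 7.25  # upper end of the CID range estimated in this study

print(f"MDC95 = {mdc:.2f} points")
# Ideally the MDC95 does not exceed the CID, so that a change of CID size
# is unlikely to reflect measurement error alone.
print("MDC95 <= CID:", mdc <= CID)
```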
Although this study forwards important and clinically useful information, it should be noted that CID values should be used only to interpret outcome measure changes for people with characteristics similar to those of the participants in the current study. In particular, the use of a well-defined population meeting well-defined study criteria (ie, individuals with chronic, minimal to moderate UE impairment) constitutes a strength that makes the findings likely to be “true” in this group of survivors of stroke. However, this group of patients is not representative of all survivors of stroke. Thus, generalizability of our findings to other groups should not be assumed, and FM CIDs are likely to vary by impairment group. Future studies are needed to examine FM CIDs in other impairment groups. Similarly, although the CID values reported here are likely valid, the values that we computed did not incorporate participants' input. Global rating of change scales are commonly used as an anchor to examine patients' perspective on change to enhance interpretability of clinical outcome measures.14–17 However, we obtained GROC scores only from the participants' treating therapists. Thus, it remains to be seen whether the scores reported here are representative of changes that patients endorse as “clinically meaningful.” Future clinicians may want to compare the GROC ratings obtained by therapists with those obtained by patients to determine the level of agreement between the 2 groups.
Conclusions
This is the first study to date to determine UE-FM CID values in people with chronic stroke. The data suggest that improvements in UE-FM scores of 4.25 to 7.25 points represent clinically important changes.
The Bottom Line
What do we already know about this topic?
The Fugl-Meyer Scale is an established, often-used measure of upper-extremity impairment after stroke. Its reliability and validity are widely known.
What new information does this study offer?
This study provides clinicians with information on the smallest amount of change that scores on the Fugl-Meyer Scale must have for that change to be clinically important and meaningful to people with stroke.
Footnotes
- Dr Page and Dr Fulk provided concept/idea/research design. All authors provided writing and data analysis. Dr Page provided project management and fund procurement. Dr Page and Dr Boyne provided consultation (including review of manuscript before submission).
- Received January 14, 2011.
- Accepted January 22, 2012.
- © 2012 American Physical Therapy Association