Abstract
Background. The group-level responsiveness of the Postural Assessment Scale for Stroke Patients (PASS) is similar to that of the short-form PASS (SFPASS). This result is counterintuitive because the PASS has more items (12) and response levels (4) than does the SFPASS (5 items and 3 response levels).
Objective. The purpose of this study was to compare individual-level responsiveness between the 2 measures to determine whether the SFPASS can detect change with as much sensitivity as the PASS.
Study Design and Setting. Two hundred fifty-one patients were assessed using the PASS at 14 and 30 days after stroke onset in a medical center.
Methods. The SFPASS scores were calculated from the patients' responses on the PASS. Individual-level responsiveness was calculated on the basis of the value of minimal detectable change (MDC). If a patient's change score was greater than the MDC of the PASS or SFPASS, his or her improvement was considered significant. The 2 measures were compared on the number of patients whose change scores exceeded the MDC and on the number of MDC units by which the patients improved (the MDC ratio).
Results. Fifty-three percent of the patients scored greater than the MDC of the PASS, whereas 43.0% of the patients scored greater than the MDC of the SFPASS. The difference was significant. The mean (±SD) MDC ratio of the PASS (1.8±1.7) was significantly higher than that of the SFPASS (1.2±1.3).
Limitations. The scores of the SFPASS were retrieved from those of the PASS, which limits the generalization of our findings.
Conclusions. The PASS has better individual-level responsiveness than does the SFPASS. To comprehensively report effects of clinical trials, future studies using the PASS should report the individual-level effect (eg, number of patients scoring greater than the MDC).
A short and psychometrically sound measure allows clinicians and researchers to efficiently quantify patients' outcomes. Previous studies have shown that the psychometric properties of short-form measures, particularly responsiveness (a critical index of an outcome measure's ability to detect change), are similar to those of their original or long-form measures.1–4 However, Hobart et al5 argued that such a similarity in responsiveness between short forms and long forms could be ascribed to the use of group-level indexes (eg, effect size) for comparisons. They used the standard error generated from item response theory (IRT) for each participant to calculate individual-level responsiveness (ie, how many people achieve significant improvement or deterioration). Their results showed that the individual-level responsiveness of a short measure (the 10-item, 2- to 4-response-level Barthel Index) was lower than that of a long measure (the 13-item, 7-response-level Functional Independence Measure).5 Thus, individual-level responsiveness is critical for clinicians and researchers in selecting among competing outcome measures.
Although IRT-based standard error appears useful for calculating individual-level responsiveness, most current measures have been developed and examined using classical test theory.6 Classical test theory also can generate a similar index for random error (called minimal detectable change [MDC] or smallest real difference). The MDC is the smallest threshold of change scores that is greater than random error at a certain level of confidence (usually 95%).7,8 Thus, the MDC95 can be used as a threshold for identifying statistically significant individual changes.7,9 The MDC95 is simple and useful for estimating the individual-level responsiveness of a measure.
We have shown that the group-level responsiveness of the Postural Assessment Scale for Stroke Patients (PASS) is similar to that of the short-form PASS (SFPASS).1 The 5 items of the SFPASS were selected from the 12 items of the PASS. Furthermore, the middle response level of the 3-level SFPASS was created by combining the middle 2 levels (1 and 2) of the 4-level PASS.1 Therefore, the similar group-level responsiveness of both measures is counterintuitive, as the PASS has more items and response levels than does the SFPASS. Every item and response level of the PASS is different from the others and thus provides unique information for assessment of balance. The additional items (7) and response level (1) should therefore make the PASS better able to detect change than the SFPASS. The individual-level responsiveness of the 2 measures remains unknown, which complicates the choice between these competing measures for clinicians and researchers. Thus, the purpose of this study was to compare the individual-level responsiveness of the PASS and SFPASS. We hypothesized that the PASS would detect significantly more patients with significant balance improvement than would the SFPASS (P<.05). The group-level responsiveness of both measures also was compared. We expected that both measures would have similar group-level responsiveness (overlapping confidence intervals of group-level indexes). The results would be valuable for clinicians and researchers in selecting a responsive balance measure.
Method
Participants
Data were available from a previous longitudinal follow-up study.10 Each participant in the study was assessed at 14 days after stroke onset and reassessed at 30 days after onset to characterize his or her balance ability (eg, as measured with the PASS) and recovery of neurological impairments. We recruited participants who met the following criteria: (1) first or recurrent onset of cerebrovascular accident without other major diseases (eg, cancer, dementia, severe rheumatoid arthritis), (2) ability to follow verbal instructions to complete the PASS, and (3) ability to provide informed consent personally or by proxy. Patients were excluded if they had another stroke or other major diseases during the follow-up period. We also excluded patients with the highest possible scores of the PASS and the SFPASS (ie, 36 and 15, respectively) because these patients had no room to improve on the PASS or SFPASS.
Procedure
The PASS was administered by an occupational therapist who was not informed of the purpose of this study. The patients were assessed at a hospital or in their homes. The scores of the SFPASS were obtained from the patients' responses on the PASS.
Measures
The PASS was specifically developed to assess balance function in people with stroke.11 The PASS contains 12 four-level (0-1-2-3) items assessing a person's balance performance in situations with varying difficulties (ie, maintaining or changing a lying, sitting, or standing position). Its total score ranges from 0 to 36, and the psychometric properties (including reliability, concurrent validity, predictive validity, and group-level responsiveness) of the PASS are satisfactory when used to assess people with stroke.10,11 The MDC95 of the PASS is 3.2, which was estimated by repeated assessments on a group of patients with stable condition (ie, 52 patients with chronic stroke).12 Because the MDC of a measure is considered sample independent, the MDC of the PASS was used to examine individual-level responsiveness in this study.
The SFPASS has 5 three-level items, which are listed in the Appendix. The 5 items are those from the original PASS that have the best measurement properties (ie, higher internal consistency and greater responsiveness). The middle level of the SFPASS was created by combining the middle 2 levels of the original PASS. Thus, both the items and scores of the SFPASS can be obtained from the scores of the PASS. The score of the SFPASS ranges from 0 to 15. The psychometric properties (including reliability, concurrent validity, and group-level responsiveness) of the SFPASS are very similar to those of the original PASS. The MDC95 of the SFPASS is 2.2, based on the same patients used for estimating the MDC of the PASS.13
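The derivation of SFPASS scores from PASS responses described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' scoring procedure: the item positions in `SFPASS_ITEMS` are hypothetical placeholders (the actual 5 items are those listed in the Appendix), and the 0/1.5/3 scoring of the 3 collapsed levels is an assumption chosen only because it reproduces the stated 0 to 15 score range.

```python
# Hypothetical 0-based positions of the 5 SFPASS items among the 12 PASS items.
# The real item selection is given in the Appendix; these are placeholders.
SFPASS_ITEMS = (0, 2, 5, 8, 11)

# Assumed recoding: PASS levels 1 and 2 collapse into a single middle level.
# The 0 / 1.5 / 3 values are an assumption consistent with a 0-15 total.
LEVEL_MAP = {0: 0.0, 1: 1.5, 2: 1.5, 3: 3.0}


def sfpass_score(pass_items, items=SFPASS_ITEMS):
    """Return an SFPASS total computed from 12 PASS item scores (each 0-3)."""
    if len(pass_items) != 12:
        raise ValueError("the PASS has 12 items")
    if any(level not in LEVEL_MAP for level in pass_items):
        raise ValueError("PASS item scores must be 0, 1, 2, or 3")
    return sum(LEVEL_MAP[pass_items[i]] for i in items)
```

Under these assumptions, a patient scoring 1 on every PASS item and a patient scoring 2 on every PASS item receive the same SFPASS total, which illustrates the information lost by merging the 2 middle levels.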
Data Analysis
Group-level comparison.
Two indicators were used to examine the group-level responsiveness. First, Kazis' effect size14 was calculated by dividing the mean changes by the standard deviation of the baseline scores obtained at 14 days after stroke onset. Second, the standardized response mean (SRM) (another type of effect size) was calculated by dividing the mean changes by the standard deviation of the change in scores. According to Cohen's criteria, an effect size greater than 0.8 is large, 0.5 to 0.8 is moderate, and 0.2 to 0.5 is small.15
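The 2 group-level indexes defined above can be computed as in the following minimal sketch. The function names are illustrative, the sample standard deviation (n − 1 denominator) is assumed, and the handling of values falling exactly on a category boundary, as well as the "negligible" label for values below 0.2, are assumptions not stated in the text.

```python
from statistics import mean, stdev


def kazis_effect_size(baseline, follow_up):
    """Mean change divided by the SD of the baseline scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the SD of the change scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


def cohen_label(effect_size):
    """Label an effect size using the cutoffs quoted in the text."""
    size = abs(effect_size)
    if size > 0.8:
        return "large"
    if size >= 0.5:  # boundary handling is an assumption
        return "moderate"
    if size >= 0.2:
        return "small"
    return "negligible"  # label for values below 0.2 is an assumption
```

Both indexes share the same numerator (the mean change), so any difference between them comes from the denominator: between-patient variability at baseline for Kazis' effect size and variability of the change itself for the SRM.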
In addition, to compare the responsiveness of the PASS and SFPASS, we estimated the 95% confidence intervals (95% CIs) of Kazis' effect size and the SRM for each measure from 10,000 bootstrap samples16 and used these intervals to test the differences between the 2 measures.
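A bootstrap confidence interval of this kind can be sketched as below. The text does not specify which bootstrap interval was used, so this example assumes a simple percentile interval with patients resampled as baseline/follow-up pairs; the function names, the fixed seed, and the decision to skip degenerate resamples are all illustrative choices.

```python
import random
from statistics import mean, stdev


def srm(baseline, follow_up):
    """Standardized response mean: mean change / SD of change scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


def bootstrap_ci(baseline, follow_up, statistic, n_boot=10_000, seed=0):
    """Percentile 95% CI of `statistic`, resampling patients with replacement."""
    rng = random.Random(seed)
    pairs = list(zip(baseline, follow_up))
    estimates = []
    for _ in range(n_boot):
        sample = [rng.choice(pairs) for _ in pairs]
        b = [p[0] for p in sample]
        f = [p[1] for p in sample]
        try:
            estimates.append(statistic(b, f))
        except ZeroDivisionError:
            continue  # skip resamples with zero variability (assumption)
    estimates.sort()
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[int(0.975 * len(estimates)) - 1]
    return lower, upper


def intervals_overlap(ci_a, ci_b):
    """True if two (lower, upper) intervals share at least one point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]
```

Calling `bootstrap_ci` once per measure and passing the 2 intervals to `intervals_overlap` mirrors the overlap criterion used for the group-level comparison.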
Individual-level comparison.
The MDC95 values of both the PASS and the SFPASS (3.2 and 2.2, respectively) were retrieved from recent studies,12,13 in which 52 patients in a stable condition were tested twice, 1 week apart. The scores of the SFPASS were obtained from the patients' scores on the PASS.13 In brief, the MDC was estimated on the basis of test-retest reliability investigation. The MDC based on the standard error of measurement (SEM) was calculated using the following formulas:
The z score (equation 1) represents the confidence interval (CI) from a standard normal distribution (1.96 for 95% CI was used in this study). The SEM was calculated by the square root of the error variance including systematic differences (equation 2), which was obtained from the analysis of variance table.17
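Under the conventional definition of the MDC, equations 1 and 2 described above can be written as follows. The symbol \sigma^{2}_{\mathrm{error}} for the error variance is assumed notation, and the form shown is the standard one rather than a verbatim copy of the authors' typesetting.

```latex
\begin{align}
\mathrm{MDC} &= z \times \mathrm{SEM} \times \sqrt{2} \tag{1} \\
\mathrm{SEM} &= \sqrt{\sigma^{2}_{\mathrm{error}}} \tag{2}
\end{align}
```

With z = 1.96, equation 1 reduces to MDC95 ≈ 2.77 × SEM. Under this form, the reported MDC95 values of 3.2 (PASS) and 2.2 (SFPASS) would correspond to SEMs of roughly 1.15 and 0.79 points, respectively; these SEMs are back-calculated here for illustration and are not values reported in the text.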
The relative responsiveness of the PASS and SFPASS was compared at the individual level in 4 steps. First, we calculated each patient's change score (“score at 30 days after onset” − “score at 14 days after onset”). Second, we examined whether the change score reached the MDC95. Third, we assigned each patient to 1 of 3 groups according to the size and direction of the change score: significant improvement (change score ≥MDC95), nonsignificant improvement (0 < change score < MDC95), or others (no change or worsening). Fourth, the distributions of patients across these 3 groups were compared between the 2 measures using a test of marginal homogeneity (a likelihood-ratio test on the G2 goodness-of-fit statistics). For any specific category, the difference between the 2 proportions, with its 95% CI, was used to test for statistical significance.
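The first 3 steps above can be sketched as follows, using the MDC95 values given in the text (3.2 for the PASS and 2.2 for the SFPASS). The function and label names are illustrative. A change score of exactly 0 is placed in the third group, following the "no change or worsening" label; this placement is an assumption. The marginal homogeneity test of the fourth step is not implemented here.

```python
from collections import Counter

# MDC95 values reported in the text for each measure.
MDC95 = {"PASS": 3.2, "SFPASS": 2.2}

LABELS = (
    "significant improvement",
    "nonsignificant improvement",
    "no change or worsening",
)


def classify_change(change, mdc):
    """Assign one patient's change score to 1 of the 3 groups."""
    if change >= mdc:
        return LABELS[0]
    if change > 0:
        return LABELS[1]
    return LABELS[2]  # zero or negative change (assumed placement of zero)


def group_proportions(baseline, follow_up, mdc):
    """Proportion of patients falling in each of the 3 groups."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    counts = Counter(classify_change(c, mdc) for c in changes)
    return {label: counts[label] / len(changes) for label in LABELS}
```

For example, a 3-point gain on the PASS is a nonsignificant improvement (3 < 3.2), whereas a 3-point gain on the SFPASS is a significant one (3 ≥ 2.2), which shows why the 2 measures must each be judged against their own MDC95.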
Because the above individual-level responsiveness index used the MDC95 only to categorize participants into 3 groups, we further calculated the MDC ratio for each participant's change score (ie, change score divided by MDC95). The MDC ratio is a continuous index and is thus more sensitive than the 3-group classification. We then compared the MDC ratios of the 2 measures using a paired t test.
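The MDC ratio and the paired comparison can be sketched as below. The function names are illustrative, and the sketch returns only the t statistic and degrees of freedom; obtaining a P value would additionally require the t distribution, which is omitted here to keep the example free of external libraries.

```python
from math import sqrt
from statistics import mean, stdev


def mdc_ratios(changes, mdc):
    """Express each change score in units of the measure's MDC95."""
    return [c / mdc for c in changes]


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1
```

In use, the PASS change scores would be divided by 3.2 and the SFPASS change scores by 2.2, and the 2 resulting lists (one ratio per patient on each measure) would be passed to `paired_t`. Dividing by the MDC95 puts both measures on a common scale even though their raw score ranges differ.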
Results
Demographic and clinical characteristics of the participants are summarized in Table 1. Three hundred one patients were assessed using the PASS at 14 days after a recent stroke onset. A total of 50 patients were not followed because they achieved the highest possible score of the SFPASS (13 patients), were in an unstable condition (16 patients), or were discharged without prior notice (21 patients). In total, 251 patients were assessed at both time points (ie, 14 and 30 days after stroke onset), and their data were used for further analyses. These patients had a wide range of balance impairment (from bedridden to nearly able to stand on the affected leg for 10 seconds).
Table 1. Demographic and Clinical Characteristics of the Participants
The SFPASS showed a notable floor effect for patients at 14 days after onset. The ceiling effect of the SFPASS was more notable than that of the PASS for the participants at 30 days after onset.
Group-Level Comparison
Kazis' effect size and SRM showed moderate to large responsiveness (0.46–0.91) of both measures in detecting changes from 14 days to 30 days after stroke (Tab. 2). In particular, bootstrap analyses showed that the 95% CIs of the 2 effect size indexes of both measures largely overlapped.
Table 2. Group-Level Responsiveness of Both Balance Measures (n=251)
Individual-Level Comparison
Table 3 shows that the distributions of participants across the levels of improvement differed significantly between the 2 balance measures (P<.001). Specifically, 53.0% of the participants scored greater than the MDC95 of the PASS (ie, significant improvement group), and 43.0% of the participants scored greater than that of the SFPASS. The PASS detected a significantly greater proportion of participants as showing significant improvement than did the SFPASS (P<.001). In addition, Table 4 shows similar trends at the other levels of confidence (ie, 70%, 75%, 80%, 85%, 90%, and 99%): the PASS detected a greater proportion of participants as showing significant improvement and a smaller proportion as showing nonsignificant improvement than did the SFPASS. The MDC ratio of the PASS (X̅±SD=1.8±1.7) was significantly higher than that of the SFPASS (X̅±SD=1.2±1.3) (P<.001).
Table 3. Individual Patient-Level Responsiveness of the Postural Assessment Scale for Stroke Patients (PASS) and Short-Form PASS (SFPASS) (n=251)
Table 4. Individual Patient-Level Responsiveness of Both Balance Measures Calculated at Certain Confidence Levels (n=251)
Discussion
The main purpose of this study was to determine whether the SFPASS can detect change as sensitively as the PASS at an individual level. Particularly, the PASS has more items and response categories and shows more potential to detect change than does the SFPASS. We found the responsiveness of the PASS and that of the SFPASS, as expected, to be similar at the group level, as shown by Kazis' effect size and SRM. These results were confirmed by 10,000 bootstrap samples. The similarity in group-level responsiveness of the PASS and SFPASS has been reported previously.1 However, individual-level responsiveness of the PASS was better than that of the SFPASS. The PASS could detect significant recovery of balance function in more patients than the SFPASS could. Furthermore, the MDC ratio (ie, how many units of MDC had improved in each participant) of the PASS was significantly higher than that of the SFPASS. Thus, the PASS was better able to detect change compared with the SFPASS. This finding supports the intuitive sense that the higher number of items and additional response level of the PASS should make it a better outcome measure than the SFPASS.
Our results of individual-level responsiveness can provide unique information that traditional indexes of group-level responsiveness cannot provide. Thus, we raise 2 issues for researchers in examining the responsiveness of an outcome measure. First, it is strongly recommended that individual-level responsiveness be included in examinations of the responsiveness of outcome measures, and particularly for comparing competing measures (eg, short forms versus long forms, new measures versus legacy measures). The results will be critical for clinicians and researchers in the selection of competing measures on the basis of comprehensive empirical evidence.
Second, our findings support the argument that group-level indicators of responsiveness (eg, Kazis' effect size and SRM) are inappropriate or limited.5,18 The group-level indicators of responsiveness could not demonstrate differences in levels of responsiveness between a short measure (eg, the Barthel Index) and a long measure (eg, the Functional Independence Measure).5,18,19 However, the superiority of the Functional Independence Measure over the Barthel Index in detecting change is demonstrated by individual-level analyses.5 These observations indicate that group-level indexes of responsiveness can be misleading.
It is still unclear why the results of group-level responsiveness (eg, Kazis' effect size and SRM) and individual-level responsiveness (eg, MDC as a cutoff score, MDC ratio) were different.5 However, the PASS, which has more items and response levels than does the SFPASS, should be better able to detect change than the SFPASS. Further research is warranted to confirm our findings and to determine the causes of the differences.
Our findings imply that studies using the PASS or SFPASS as an outcome measure should report the individual-level effect (ie, number of patients scoring greater than the MDC or the MDC ratio) in addition to the group-level effect (eg, effect size) in order to comprehensively report the effects of clinical trials. The MDC is the smallest threshold of change scores that is detectable and greater than random error at a certain level of confidence (usually 95%).7 Reporting the number of patients in the experimental group of a clinical trial who improve by more than the MDC would help clinicians weigh evidence-based results against the measurement error that is inherent in any kind of measurement. Both researchers and clinicians need to interpret their observations on the basis of objective measurement properties (ie, the change observed in each patient has to be greater than random measurement error [eg, MDC]).
There were 2 limitations in this study. First, our patients were followed at the subacute stage. Only a few patients, as expected, deteriorated during the follow-up periods. Thus, the differences in the abilities of the 2 measures to detect deterioration remain unknown. Second, the scores of the SFPASS were retrieved from those of the PASS. Future studies are needed to validate our findings using the PASS and SFPASS independently.
In brief, the individual-level responsiveness of the PASS was better than that of the SFPASS, so the PASS is recommended for clinical trials and clinical settings. Future studies using the PASS should report the individual-level effect (ie, number of patients scoring greater than MDC) in addition to the group-level effect (eg, effect size) in order to comprehensively report the effects of clinical trials.
Appendix.
Five-Item Short-Form Postural Assessment Scale for Stroke Patients (SFPASS) and Criteria for Scoring
Footnotes
All authors provided concept/idea/research design. Ms Hsueh and Dr Hsieh provided writing. Dr Hsieh provided data collection, project management, fund procurement, and clerical support. Mr Chou, Dr Wang, and Dr Hsieh provided data analysis. Dr Wang provided study participants. Ms Hsueh, Dr Chen, Mr Chou, and Dr Hsieh provided consultation (including review of manuscript before submission).
The study was approved by the Institutional Review Board of National Taiwan University Hospital.
This study was supported by research grants from the National Science Council (NSC 99-2314-B-002-037-MY3) and the E-Da Hospital (97-EDN11, 100-EDN10) in Taiwan.
- Received February 6, 2013.
- Accepted May 28, 2013.
- © 2013 American Physical Therapy Association