Responsiveness of the Balance Evaluation Systems Test (BESTest) in People With Subacute Stroke
- Butsara Chinsongkram,
- Nithinun Chaikeeree,
- Vitoon Saengsirisuwan,
- Fay B. Horak and
- Rumpa Boonsinsukh
- B. Chinsongkram, PhD, Faculty of Physical Therapy, Rangsit University, Pathum Thani, Thailand.
- N. Chaikeeree, PhD, Division of Physical Therapy, Faculty of Health Science, Srinakharinwirot University, Nakhonnayok, Thailand.
- V. Saengsirisuwan, PhD, Department of Physiology, Faculty of Science, Mahidol University, Bangkok, Thailand.
- F.B. Horak, PhD, Balance Disorders Laboratory, Department of Neurology, Oregon Health and Science University, Beaverton, Oregon, and Portland VA Medical Center, Portland, Oregon.
- R. Boonsinsukh, PhD, Division of Physical Therapy, Faculty of Health Science, Srinakharinwirot University, 63 Moo 7, Nakhonnayok, Thailand.
- Address all correspondence to Dr Boonsinsukh at: rumpa{at}swu.ac.th.
Abstract
Background The reliability and convergent validity of the Balance Evaluation Systems Test (BESTest) in people with subacute stroke have been established, but its responsiveness to rehabilitation has not been examined.
Objective The study objective was to compare the responsiveness of the BESTest with those of other clinical balance tools in people with subacute stroke.
Design This was a prospective cohort study.
Methods Forty-nine people with subacute stroke (mean age=57.8 years, SD=11.8) participated in this study. Five balance measures—the BESTest, the Mini-BESTest, the Berg Balance Scale, the Postural Assessment Scale for Stroke Patients, and the Community Balance and Mobility Scale (CB&M)—were used to measure balance performance before and after rehabilitation or before discharge from the hospital, whichever came first. The internal responsiveness of each balance measure was classified with the standardized response mean (SRM); changes in Berg Balance Scale scores of greater than 7 were used as the external standard for determining the external responsiveness. Analysis of the receiver operating characteristic curve was used to determine the accuracy and cutoff scores for identifying participants with balance improvement.
Results Participants received 13.7 days (SD=9.3, range=5–44) of physical therapy rehabilitation. The internal responsiveness of all balance measures, except for the CB&M, was high (SRM=0.9–1.2). The BESTest had a higher SRM than the Mini-BESTest and the CB&M, indicating that the BESTest was more sensitive for detecting balance changes than the Mini-BESTest and the CB&M. In addition, compared with other balance measures, the BESTest had no floor, ceiling, or responsive ceiling effects. The results also indicated that the percentage of participants with no change in scores after rehabilitation was smaller with the BESTest than with the Mini-BESTest and the CB&M. With regard to the external responsiveness, the BESTest had higher accuracy, sensitivity, specificity, and posttest accuracy than the Postural Assessment Scale for Stroke Patients and the CB&M for identifying participants with balance improvement. Changes in BESTest scores of 10% or more indicated changes in balance performance.
Limitations A limitation of this study was the difference in the time periods between the first and the second assessments across participants.
Conclusions The BESTest was the most sensitive scale for assessing balance recovery in participants with subacute stroke because of its high internal and external responsiveness and lack of floor and ceiling effects.
Balance impairment in patients with stroke stems from diverse mechanisms involved in postural control.1–6 Using a comprehensive balance measure, clinicians can precisely identify balance systems underlying balance impairments and select a specific intervention for each balance problem.7 Several balance scales, such as the Berg Balance Scale (BBS), the Postural Assessment Scale for Stroke Patients (PASS), and the Community Balance and Mobility Scale (CB&M), have been used to evaluate balance performance in patients with stroke.
The BBS is considered to be a reference standard for assessing balance in patients with stroke. It consists of functional balance tasks involved in the maintenance of a specified position as well as anticipatory postural adjustments.8,9 The BBS was shown to be reliable, valid, and sensitive to change in patients with stroke.10–14 The PASS is a functional balance test developed specifically for patients with stroke and very poor balance. Its psychometric properties were found to be satisfactory in patients during the first 3 months after stroke.15 The CB&M was developed for assessing balance and mobility in people with moderate to high levels of function after stroke.16 This scale includes tasks that are commonly performed in the community.16 The CB&M was shown to have an excellent ability to detect change in patients with chronic stroke.17 However, these 3 functional balance scales have floor and ceiling effects that might affect sensitivity for detecting changes across stroke severity.10,13,18,19
The Balance Evaluation Systems Test (BESTest) is a clinical balance assessment tool that provides information on postural control systems underlying balance impairments in patients with neurological disorders.7 Six interacting postural control systems—biomechanical constraints, stability limits and verticality, anticipatory postural adjustments, automatic postural responses, sensory organization, and stability in gait—are evaluated in the BESTest. Each of the 36 items of the BESTest is scored from 0 (severe impairment) to 3 (no impairment), creating a total score of 108 points or 100%.7 Because of the lengthy time required for administering the BESTest, a shortened version of the BESTest—the Mini-BESTest—was developed to specifically assess dynamic balance.20 Factor analysis and Rasch analysis were used to reduce the numbers of items and rating categories.20
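The scoring rule described above (36 items, each scored 0 to 3, for a total of 108 points or 100%) can be sketched as follows. This is a minimal illustration only; the function name `bestest_percent` and the input checks are assumptions made for this sketch and are not part of the published scoring instructions.

```python
BESTEST_ITEM_COUNT = 36      # number of BESTest items, as stated in the text
BESTEST_MAX_ITEM_SCORE = 3   # 0 = severe impairment, 3 = no impairment


def bestest_percent(item_scores):
    """Convert 36 item scores (0-3 each) to a percentage of the 108-point total.

    Hypothetical helper written for illustration only.
    """
    if len(item_scores) != BESTEST_ITEM_COUNT:
        raise ValueError("expected 36 item scores")
    if any(not 0 <= s <= BESTEST_MAX_ITEM_SCORE for s in item_scores):
        raise ValueError("each item score must be between 0 and 3")
    max_total = BESTEST_ITEM_COUNT * BESTEST_MAX_ITEM_SCORE  # 108 points
    return 100.0 * sum(item_scores) / max_total
```

For example, a participant scoring 3 on half of the items and 0 on the rest would obtain 54 of 108 points, or 50%.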
Previous studies demonstrated excellent reliability and validity of the BESTest and the Mini-BESTest for assessing balance performance in patients with various neurological conditions, such as Parkinson disease, vestibular loss, and peripheral neuropathy.7,21 For psychometric property testing in patients with stroke, both the BESTest and the Mini-BESTest showed excellent intrarater reliability and interrater reliability (intraclass correlation coefficient=.99), as well as excellent validity with other balance measures, such as the BBS, PASS, CB&M, functional gait assessment, and Timed “Up & Go” Test.7,21–23 No floor or ceiling effect was observed with the BESTest in patients with subacute stroke.22 In contrast, the Mini-BESTest showed a floor effect in patients with subacute stroke and low functional ability,22 but this scale showed no floor or ceiling effect in people with chronic stroke.23 Although the BESTest was proven to be reliable and valid for measuring balance performance in patients with stroke, the ability of the BESTest to detect a change in balance performance as a result of a rehabilitation program has not been established.
Responsiveness, or the ability to detect true change over time, is a key psychometric property of clinical scales for evaluating the effectiveness of rehabilitation after stroke.24–27 Two types of responsiveness—internal responsiveness and external responsiveness—are commonly assessed.26,28 Internal responsiveness refers to the possibility of detecting any statistical change with a single-group, repeated-measures design, in which patients are assessed before and after a known treatment.24,26 The limitation of internal responsiveness is that it does not provide information on the quality of changes, such as worsening, improvement, or clinical relevance.25 In contrast, external responsiveness is associated with the concept of clinical relevance, which depends on the magnitude of a change and the choice of an external standard.25 For example, the BBS has been frequently used as an external standard for changes in balance performance.10,29,30 The assessment of both internal responsiveness and external responsiveness ensures that a change in performance over time will be large enough to be statistically significant for research purposes and precise enough to reflect meaningful change in an external criterion for clinical applications.24
The purpose of this study was to compare the internal responsiveness and external responsiveness of the BESTest with those of other balance measures (BBS, PASS, CB&M, and Mini-BESTest) in people with subacute stroke. We hypothesized that the BESTest would have significantly better internal and external responsiveness than the other balance measures tested.
Method
Participants
This cohort study is a continuation of a previous validity study.22 Participants were people who had subacute stroke (defined as the time since stroke of 48 hours–4 months)31,32 and were referred for physical therapy services at Prasat Neurological Institute, Bangkok, Thailand, from November 1, 2012, to January 31, 2014. Potential participants were included in this study if they met the following criteria, as screened by a physician: diagnosis of cerebral hemorrhage or cerebral infarction with stable medical conditions; age between 25 and 90 years; first unilateral hemispheric stroke; and ability to follow instructions to complete the assessment. We excluded people with the following conditions: cognitive impairment, defined as a score of less than 24 on the Mini-Mental State Examination (MMSE)33,34; cerebral aneurysm; lesion in the brain stem, involving the sleep-wake and respiratory control center, or cerebellum; aphasia, as diagnosed by a physician with a Thai adaptation of the Western Aphasia Battery35; a neurological disorder other than stroke; and the presence of major peripheral neuropathy or a musculoskeletal problem sufficient to disturb balance. Written informed consent was obtained from each participant before participation.
Outcome Measures
To determine the level of functional ability, we selected the Fugl-Meyer Assessment motor subscale.35,36 This assessment is used for both the upper and the lower extremities; a total of 100 points can be scored on a 3-point ordinal scale, ranging from 0 (“cannot perform”) to 2 (“performs fully”).36 Five balance evaluation scales (BESTest, Mini-BESTest, BBS, PASS, and CB&M) were administered to each participant to determine balance ability.
The Mini-BESTest consists of 14 items focusing on dynamic balance; it contains some items from section 3 to section 6 of the original BESTest. The score for each item ranges from 0 to 2; 0 means “severe impairment” and 2 means “no impairment.” The total possible score on the Mini-BESTest is 28 points.20
The BBS consists of 14 functional balance items. The score for each item ranges from 0 to 4; 0 means “unable to perform” and 4 means “able to complete the task.”9 The total possible score on the BBS is 56 points.10
The PASS is a 12-item test designed to measure balance performance in people with poor functional ability. The score for each item ranges from 0 to 3; 0 means “unable to perform” and 3 means “able to complete the task.”15 The total possible score on the PASS is 36 points.
The CB&M consists of 19 tasks, including advanced functional balance and mobility activities aimed at people who have had a stroke and are ambulatory or dwell in the community.16 Items are scored on an ordinal scale from 0 (“unable to perform”) to 5 (“performs independently”). The maximum score is 96 points because an extra point is given for being able to carry a basket while descending stairs.
Procedure
Each participant was assessed twice, before and after the physical therapy rehabilitation program. The assessment before rehabilitation was completed within 2 days after admission to the rehabilitation ward; the assessment after rehabilitation was done either at the end of rehabilitation or before discharge from the hospital, whichever came first. Each participant's demographic and clinical information (eg, Barthel Index [BI] and Fugl-Meyer Assessment motor subscale scores) was assessed by a trained physical therapist (N.C.). All balance measures were administered by a different, trained physical therapist (B.C.). Items from all balance measures were grouped by the test position: lying, sitting, and walking. Next, the position group was randomly assigned as the test sequence for each participant. Any test item that was duplicated in the tests was performed only once and then scored with the criteria from each test.21 The evaluations were performed in the same laboratory setting, and all participants received the same verbal instructions. Rest was allowed between the test items to avoid fatigue.
Each participant received physical therapy rehabilitation for 1 hour per day, 5 days per week. One hour of physical therapy training included passive stretching and active exercise as well as balance training and functional training, such as bed mobility, sit-to-stand transfer, and gait training. Of the 69 original participants from the previous validity study,22 20 participants were lost during the follow-up because they did not participate in the second assessment (n=5) or did not require continued rehabilitation (n=15). Therefore, 49 participants were included in this responsiveness study.
Data Analysis
We conducted descriptive statistical analyses of the demographic and baseline clinical characteristics of the participants. Statistical analysis was performed using IBM SPSS version 20.0 (IBM Corp, Armonk, New York). The percentage of participants who showed no change after rehabilitation was calculated from those whose scores did not differ between the first and the second assessments. The floor and ceiling effects were calculated as the percentages of the sample scoring the minimum and the maximum possible scores, respectively. Ceiling and floor effects of 20% or greater were considered significant.37 The responsive ceiling effect was calculated as the percentage of participants who scored within the top 10% of the test.21 The McNemar test was used to compare differences between the BESTest and the other balance measures, in terms of the proportions of participants for whom floor, ceiling, and responsive ceiling effects were found and the proportion of participants who showed no change after rehabilitation. To control for the overall P value at .05, we set the significance level for each paired comparison at less than .0125 (α/number of tests).
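The floor, ceiling, and responsive ceiling calculations described above can be sketched as follows. This is an illustrative sketch, not the authors' SPSS procedure. The function names are hypothetical, the example scores are invented, and the reading of "within the top 10% of the test" as a score at or above 90% of the scale range is an assumption.

```python
def score_distribution_effects(scores, min_score, max_score, top_fraction=0.10):
    """Return floor, ceiling, and responsive ceiling effects as percentages."""
    n = len(scores)
    if n == 0:
        raise ValueError("scores must not be empty")
    # Assumption: "top 10% of the test" means a score at or above
    # 90% of the scale range.
    threshold = max_score - top_fraction * (max_score - min_score)
    return {
        "floor": 100.0 * sum(s == min_score for s in scores) / n,
        "ceiling": 100.0 * sum(s == max_score for s in scores) / n,
        "responsive_ceiling": 100.0 * sum(s >= threshold for s in scores) / n,
    }


def is_significant_effect(effect_percent, criterion=20.0):
    """Floor and ceiling effects of 20% or greater are considered significant."""
    return effect_percent >= criterion


def per_comparison_alpha(overall_alpha, n_tests):
    """Significance level per paired comparison (alpha / number of tests)."""
    return overall_alpha / n_tests


# Invented Mini-BESTest totals (0-28 points) for 10 hypothetical participants.
example_scores = [0, 0, 10, 28, 27, 26, 14, 5, 20, 28]
effects = score_distribution_effects(example_scores, min_score=0, max_score=28)
alpha = per_comparison_alpha(0.05, 4)  # 4 comparisons against the BESTest
```

With 4 paired comparisons, the per-comparison level works out to .05/4 = .0125, which matches the threshold stated in the text.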
The percentages of minimum and maximum scores on each item of the BESTest were calculated as the percentages of participants who achieved minimum (score=0) and maximum (score=3) scores on each item. These percentages represented the difficulty of the item, such that a high percentage (>80%) of minimum scores reflected a difficult item and a high percentage of maximum scores indicated an easy item.22,23
Paired t tests were used to compare balance scores before and after treatment. The standardized response mean (SRM), calculated by dividing the mean change score by the standard deviation of the change scores in the same participants, was used to indicate the internal responsiveness. According to the Cohen criteria,38 an SRM of greater than 0.8 indicates a large change, an SRM of 0.5 to 0.8 indicates a moderate change, and an SRM of 0.2 or less indicates a small change. Moderate and high SRMs were considered to indicate sufficient internal responsiveness. To compare the SRM of the BESTest with the SRMs of the other balance measures, paired t tests were used, with a significance level of less than .0125 for each paired comparison.
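As an illustration of the SRM calculation, the following sketch divides the mean change score by the sample standard deviation of the change scores. The function names and the example scores are hypothetical. The label "below moderate" for SRMs under 0.5 is an assumption of this sketch, because the criteria quoted above do not label the range between 0.2 and 0.5.

```python
from statistics import mean, stdev


def standardized_response_mean(pre_scores, post_scores):
    """SRM = mean of change scores / standard deviation of change scores."""
    if len(pre_scores) != len(post_scores):
        raise ValueError("pre and post scores must be paired")
    changes = [post - pre for pre, post in zip(pre_scores, post_scores)]
    return mean(changes) / stdev(changes)  # stdev uses the n - 1 denominator


def classify_srm(srm):
    """Label the SRM magnitude using the thresholds quoted in the text."""
    magnitude = abs(srm)
    if magnitude > 0.8:
        return "large"
    if magnitude >= 0.5:
        return "moderate"
    return "below moderate"  # assumption: not sufficient responsiveness


# Invented scores for 4 hypothetical participants.
pre = [10, 20, 30, 40]
post = [15, 22, 38, 45]
srm = standardized_response_mean(pre, post)
label = classify_srm(srm)
```

In this invented example, the change scores are 5, 2, 8, and 5, giving a mean change of 5 and a standard deviation of about 2.45, so the SRM is about 2.04.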
To determine the external responsiveness, we selected the BBS as the external standard; a score change of greater than 7 was considered to indicate real clinical improvement.14,29 The receiver operating characteristic curve approach was selected for evaluating the external responsiveness through determination of the relative balance scores for classifying participants into 2 groups: participants who did not have a balance change (BBS score change of ≤7) and those who did have a balance change (BBS score change of >7). The accuracy of the group classification was assessed with the area under the curve (AUC). An AUC of greater than or equal to 0.9 is considered to have excellent discrimination.39 Sigmaplot 13 software (Systat Software, San Jose, California) was used to compare differences in the AUC of the BESTest and the AUCs of the Mini-BESTest, PASS, and CB&M. The cutoff point was chosen by selecting the score that provided the best balance between high sensitivity and high specificity.24 A likelihood ratio also was calculated to confirm the usefulness of the selected cutoff point. A positive likelihood ratio (LR+) of greater than 5 and a negative likelihood ratio (LR−) of less than 0.2 indicated that the cutoff point was useful.24 The accuracy of using the selected cutoff point for correctly identifying participants who showed balance improvement represented posttest accuracy.
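The external responsiveness analysis can be sketched as follows. This is an illustrative sketch under stated assumptions, not the procedure run in the statistical software named above. The function names and example data are invented, the AUC is computed with the rank-based (Mann-Whitney) formulation, and the "best balance" between sensitivity and specificity is operationalized here as the maximum of sensitivity + specificity − 1 (the Youden index), which the text does not specify.

```python
def roc_auc(score_changes, improved):
    """AUC as the probability that an improved participant has a larger
    score change than a non-improved participant (ties count as 0.5)."""
    pos = [c for c, y in zip(score_changes, improved) if y]
    neg = [c for c, y in zip(score_changes, improved) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cutoff_statistics(score_changes, improved, cutoff):
    """Classify a change greater than the cutoff as improvement and
    return sensitivity, specificity, likelihood ratios, and accuracy."""
    pairs = list(zip(score_changes, improved))
    tp = sum(c > cutoff and y for c, y in pairs)
    fn = sum(c <= cutoff and y for c, y in pairs)
    fp = sum(c > cutoff and not y for c, y in pairs)
    tn = sum(c <= cutoff and not y for c, y in pairs)
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "lr_pos": sens / (1 - spec) if spec < 1 else float("inf"),
        "lr_neg": (1 - sens) / spec if spec > 0 else float("inf"),
        "accuracy": (tp + tn) / len(pairs),
    }


def best_cutoff(score_changes, improved):
    """Assumption: choose the cutoff that maximizes the Youden index."""
    def youden(cutoff):
        stats = cutoff_statistics(score_changes, improved, cutoff)
        return stats["sensitivity"] + stats["specificity"] - 1
    return max(sorted(set(score_changes)), key=youden)


# Invented change scores for 9 hypothetical participants.
bbs_change = [2, 9, 12, 5, 8, 3, 10, 1, 4]
improved = [c > 7 for c in bbs_change]          # external standard: BBS change > 7
bestest_change = [4, 12, 15, 11, 9, 3, 14, 6, 2]  # BESTest change, in % points

auc = roc_auc(bestest_change, improved)
cutoff = best_cutoff(bestest_change, improved)
stats = cutoff_statistics(bestest_change, improved, cutoff)
```

The returned `lr_pos` and `lr_neg` values can then be checked against the usefulness criteria quoted above (LR+ of greater than 5 and LR− of less than 0.2), and `accuracy` corresponds to the proportion of participants correctly classified at the chosen cutoff.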
Role of the Funding Source
This project was supported by the Thailand Research Fund, the Office of the Higher Education Commission, Srinakharinwirot University (grant no. RSA5580002), and by NIH and VA Merit Award 1075 (grant no. AG006457).
Results
Forty-nine participants with stroke participated in the present study. The demographic and clinical characteristics of the participants are shown in Table 1. The participants had a wide range of age, time since stroke, level of functional ability, and amount of rehabilitation. Table 2 shows the floor, ceiling, and responsive ceiling effects before and after rehabilitation for all participants. A significant floor effect was found before rehabilitation with the Mini-BESTest (32.7%) and the CB&M (69.4%). After rehabilitation, the PASS was the only measure that showed a significant ceiling effect (20.4%), whereas a significant floor effect still was found for the CB&M (42.9%). Although the BBS did not show a significant ceiling effect after rehabilitation, it showed a higher responsive ceiling effect than the BESTest, Mini-BESTest, and CB&M (P<.01). The BESTest did not show any floor, ceiling, or responsive ceiling effect.
Table 1. Demographic and Clinical Characteristics of Participants (N=49)
Table 2. Floor, Ceiling, and Responsive Ceiling Effects of the Balance Evaluation Systems Test (BESTest) and Other Balance Measures
Internal Responsiveness
After rehabilitation, all participants showed improvement in their balance, as demonstrated by significant increases in the scores on all balance measures, except for the CB&M (Tab. 3). Significant changes in scores corresponded to SRMs, such that the SRMs of the BESTest, Mini-BESTest, BBS, and PASS were high—ranging from 0.9 to 1.2—whereas the SRM of the CB&M was the lowest (P<.01), reflecting the limited internal responsiveness of the CB&M for participants with subacute stroke. In addition, the BESTest showed a significantly higher SRM than the Mini-BESTest (P<.001) and CB&M (P<.01), indicating that the BESTest was more sensitive for detecting balance change over time than the Mini-BESTest and CB&M. These data corresponded to the percentages of participants with no change; the percentage was smaller for the BESTest than for the Mini-BESTest and CB&M (P<.01).
Table 3. Internal Responsiveness and Score Distribution of the Balance Evaluation Systems Test (BESTest) and Other Balance Measures
The score change and section SRM for the BESTest are shown in Table 4. The BESTest section score change ranged from 8.5 to 18.2. The sensory organization and postural response sections showed the largest score changes, whereas the biomechanical constraints section showed the smallest score change. The SRMs of all sections were high, ranging from 0.8 to 1.0, except for the stability-in-gait section, which had moderate internal responsiveness (SRM=0.6). All item scores, with the exception of those for the items “base of support” and “sitting verticality, left and right,” were significantly higher after rehabilitation. The items on which larger numbers of participants improved were “center-of-mass alignment,” “functional reach, forward and lateral,” “sit-to-stand transfer,” “in-place postural responses, forward and backward,” and “standing on a firm surface with eyes closed.” In contrast, more than 80% of participants achieved maximum scores before rehabilitation on the items “base of support” and “sitting verticality,” resulting in limited responsiveness for these items.
Table 4. Balance Evaluation Systems Test (BESTest) Item Scores and Changes in Scores After Rehabilitation
External Responsiveness
The receiver operating characteristic curve analysis and the receiver operating characteristic curve plot are shown in Table 5 and the Figure, respectively. On the basis of a BBS score change of greater than 7 for identifying participants who showed balance improvement, the BESTest had an excellent AUC (0.92), which was significantly larger than those of the PASS (P<.05) and CB&M (P<.01). Among the 4 balance measures, only the cutoff scores for the BESTest (>10%) and Mini-BESTest (>3 points) were clinically meaningful, as confirmed by the likelihood ratios (LR+ of >5 and LR− of ≤0.2) and posttest accuracy.
Table 5. Cutoff Scores and Associated Sensitivity, Specificity, LRs, and AUCs of Various Tests for Classifying Participants With Balance Improvement
Figure. Receiver operating characteristic curve plot of score changes for the Balance Evaluation Systems Test (BESTest), Mini-BESTest, Postural Assessment Scale for Stroke Patients (PASS), and Community Balance and Mobility Scale (CB&M). The arrows indicate the cutoff points for identifying participants who showed balance improvement with the balance measures.
Discussion
The present study provides novel information on the responsiveness of the BESTest in people with subacute stroke. Responsiveness is a crucial psychometric property of any measurement scale, as it reports the effectiveness of the intervention or the recovery of the patients.17 Our results demonstrated that the BESTest had higher responsiveness than the Mini-BESTest, PASS, and CB&M in people with subacute stroke at an average of 39 to 52 days after stroke. Such high responsiveness in patients with stroke also was found for the BBS and PASS when patients were evaluated at 14 to 30 days after stroke.13,40,41 However, the responsiveness was reported to decline further when the measurement was done at a later stage after stroke, for example, 90 or 180 days after stroke.13,40,41
To determine the external responsiveness, we decided to use the BBS as the external standard instead of the global rating of change (GRC) because the GRC could be affected by recall bias and a participant's ability to understand the context of improvement.42 The BESTest with a cutoff score of 10% showed significantly better accuracy for identifying participants with balance improvement than the PASS and CB&M. Therefore, neither the PASS nor the CB&M would be a scale of choice for people with subacute stroke. Moreover, the PASS had a significant ceiling effect after treatment, and the CB&M had a significant floor effect both before and after rehabilitation. These findings were in accordance with previous literature showing a ceiling effect of the PASS (38%) at 90 days after stroke.15 Knorr et al17 suggested that the CB&M was more suitable for people who had sustained a stroke and dwelled in the community, as neither a floor effect nor a ceiling effect was observed in this group of people. Although a previous study23 suggested the use of the Mini-BESTest in people who had sustained a stroke and dwelled in the community, the present study demonstrated that the Mini-BESTest might not be appropriate for people with subacute stroke because of a significant floor effect of the Mini-BESTest before rehabilitation, lower internal responsiveness, and a larger number of participants with no change after rehabilitation, relative to the results for the BESTest.
Each section of the BESTest also showed high responsiveness, except for the stability-in-gait section, which showed moderate responsiveness. This result suggested that, with the exception of the stability-in-gait section, each section could be used separately to evaluate change in balance performance. The lower responsiveness of the stability-in-gait section could have been due to the slow recovery of participants' walking ability. Approximately 10% of the participants were able to walk independently before rehabilitation, and this number increased to only 40% after rehabilitation. Gait recovery after stroke usually developed later than other abilities. The sequence of functional recovery started with the ability to maintain the position of standing upright with asymmetrical weight bearing and postural sway43,44 and then progressed to the restoration of paretic leg muscle function,45 improved stabilization of the head and trunk in space,45 more effective muscular compensation through the nonparetic leg for weight bearing and response to internal perturbation,44,45 reduction of visual dependency,1 progressive internalization of the altered body,45 and an ability to walk independently.43 In addition, not all of the BESTest item scores increased after rehabilitation. The items that showed the greatest improvement were “center-of-mass alignment,” “functional reach, forward and lateral,” “sit-to-stand transfer,” “in-place postural responses, forward and backward,” and “standing on a firm surface with eyes closed.” These improvements followed the aforementioned sequence of functional recovery in participants with stroke. In contrast, the items “base of support” and “sitting verticality, left and right” showed no improvement after rehabilitation, as the participants achieved almost the maximum scores before rehabilitation, so the range for improvement was minimal.
Information about floor and ceiling effects is important for selecting the appropriate scale. Our results confirmed that the BESTest did not have a floor, ceiling, or responsive ceiling effect in participants with subacute stroke. The BBS also did not have a significant floor or ceiling effect, suggesting that the BBS also is appropriate for use in the subacute phase of stroke. However, clinicians should be aware of an increased responsive ceiling effect for the BBS after rehabilitation; this effect was significantly higher than those for the BESTest and Mini-BESTest. The ceiling effect of the BBS might be more evident during a longer follow-up period. Supporting this observation, previous studies demonstrated a significant ceiling effect (26%) of the BBS at 38 days after stroke45 and increases in that ceiling effect to 34.1% and 47.7% at 3 and 8 months after stroke, respectively.17
In the present study, we examined responsiveness as a result of rehabilitation; the rehabilitation varied in quantity among the participants, as we could not schedule the same amount of rehabilitation for each participant because of differences in the severity of their health conditions. This scenario corresponds to real clinical situations, which involve heterogeneity of people with stroke and adjustments in the amount of therapy on the basis of the disease severity and a person's recovery rate. Because of variations in the time periods between the first and the second assessments (ranging from 5 to 44 days), it is possible that participants who had longer follow-up periods had greater increases in balance scores. Supporting this notion, a previous meta-analysis study showed a positive relationship between increased therapy time and increased functional improvement after stroke.46 Therefore, clinicians should apply the results of the present study in light of variations in the quantity of rehabilitation among people with stroke.
As the balance assessments were administered only twice, before and at the end of rehabilitation, the present study provides general information on the responsiveness of the balance measures as a result of rehabilitation. We could not specifically determine whether the BESTest is more responsive in the early phase or the later phase of rehabilitation. Future studies should assess the change in balance performance at several time points during the process of recovery after stroke. In addition, the present study demonstrated that the BESTest was the preferred balance measure for participants with subacute stroke. However, the drawback of the BESTest is the amount of time required to complete the scale. We found that for some items of the BESTest (ie, “base of support” and “sitting verticality”), maximum scores were achieved during the first assessment, so these items could not be used to report any improvement for participants after rehabilitation. Further statistical analysis should be conducted to eliminate the least sensitive and redundant items of the BESTest, which would result in a more concise and less time-consuming version of the BESTest for people with subacute stroke.
In conclusion, the BESTest is the most appropriate scale for assessing balance in people with subacute stroke because of its high responsiveness and lack of floor and ceiling effects, relative to the BBS, PASS, Mini-BESTest, and CB&M. A change in the BESTest score of 10% or more is an indicator of a change in balance performance.
Footnotes
Dr Chinsongkram, Dr Horak, and Dr Boonsinsukh provided concept/idea/research design. Dr Chinsongkram, Dr Saengsirisuwan, Dr Horak, and Dr Boonsinsukh provided writing. Dr Chinsongkram and Dr Chaikeeree provided data collection. Dr Chinsongkram and Dr Boonsinsukh provided data analysis. Dr Boonsinsukh provided project management, fund procurement, institutional liaisons, and administrative support. Dr Chaikeeree and Dr Boonsinsukh provided facilities/equipment. Dr Saengsirisuwan, Dr Horak, and Dr Boonsinsukh provided consultation (including review of manuscript before submission).
The authors thank Prasat Neurological Institute, Department of Medical Services, Ministry of Public Health, Bangkok, Thailand, for offering facilities and space. They express sincere gratitude to Lawan Panichareon and physical therapists at the Rehabilitation Medicine Department, Prasat Neurological Institute, for helping recruit the participants.
This study received ethical approval from the Human Research Protection Committee at Prasat Neurological Institute Research Center, Bangkok, Thailand.
- Received November 10, 2015.
- Accepted April 14, 2016.
- © 2016 American Physical Therapy Association