Abstract
Background The Berg Balance Scale (BBS) is a balance measure commonly used for people with multiple sclerosis (MS). The Mini-BESTest is an alternative based on balance systems.
Objective The study objective was to compare the BBS and the Mini-BESTest for sensitivity to change, likelihood ratios for walking aid use and falls, and associations with clinical variables in people who have MS and are ambulatory.
Design This was a cohort study with measurements before and after exposure to 8 weeks of routine physical therapy intervention.
Methods For 52 participants who had a primary diagnosis of MS and who were independently mobile, with or without an aid, demographic details and a history of falls and near falls were collected. Participants completed the Mini-BESTest, Multiple Sclerosis Impact Scale-29, Multiple Sclerosis Walking Scale-12, BBS, Modified Fatigue Impact Scale, and Six-Minute Walk Test.
Results No participant started with a baseline Mini-BESTest maximum score of 28, whereas 38.5% (n=20) started with a baseline BBS maximum score of 56. Statistically significant changes in the Mini-BESTest score (X̅=5.31, SD=3.5) and the BBS score (X̅=1.4, SD=1.9) were demonstrated. Effect sizes for the Mini-BESTest and the BBS were 0.70 and 0.37, respectively; standard response means for the Mini-BESTest and the BBS were 1.52 and 0.74, respectively. Areas under the receiver operating characteristic curves for the Mini-BESTest and the BBS were 0.88 and 0.77, respectively, for detecting mobility device use and 0.88 and 0.75, respectively, for detecting self-reported near falls. The Mini-BESTest had a higher correlation for each secondary measure than did the BBS.
Limitations This study involved a sample of convenience; 61% of the participants did not use a walking aid. The order of testing was not randomized, and fall status was obtained through retrospective recall.
Conclusions The Mini-BESTest had a lower ceiling effect and higher values on responsiveness tests. These findings suggest that the Mini-BESTest may be better at detecting changes in balance in people who have MS, are ambulatory, and have relatively little walking disability.
Multiple sclerosis (MS) is a chronic, progressive disorder of the central nervous system that commonly affects young adults.1,2 Many people with MS report balance problems.3,4 Such balance impairments may occur in the initial stages of the disease and in people who have MS and minimal clinical disability.5,6 Balance is a complex phenomenon involving many systems and domains,7 and the early detection of balance impairments in people with MS is essential for optimally managing this lifelong condition.
Various balance measurements have been used in studies involving people with MS.8 One commonly used clinical balance measurement is the Berg Balance Scale (BBS).9 Although studies have reported favorable psychometric properties of the BBS in people with MS,10,11 limitations such as ceiling effects,12,13 reduced responsiveness,13 and problems with its rating scale design have been reported.14 The BBS also is not designed to assess many balance elements, such as reactions to external perturbations, dynamic gait, and dual tasks—all required for functional balance in people with MS.8,15,16 Such limitations should be considered in the choice of an appropriate measure in research and clinical practice.
An alternative to the BBS is the BESTest, a 27-item balance measure.16 However, the main limitation of the BESTest is that it can take up to 30 minutes to administer. A shorter version, the 14-item Mini-BESTest,15 takes 10 to 15 minutes to administer. The Mini-BESTest takes into account dynamic balance, including anticipatory transitions, postural responses, sensory orientation, and dynamic gait—all required for functional balance.15 The Mini-BESTest has been shown to have good psychometric properties in various neurological conditions.17–21 The Mini-BESTest may be more appropriate than the BBS for use in people with MS because its wider system-specific assessment may ensure that a high-level deficit is not overlooked and that changes in these domains are detected after treatment.
Recent literature13,19,22 has compared various psychometric properties of the Mini-BESTest and the BBS in people with Parkinson disease and people with mixed neurological conditions. All of those studies demonstrated more favorable results (ie, responsiveness, validity, reliability, sensitivity, and specificity) for the Mini-BESTest than for the BBS. To our knowledge, a comparison of the BBS and the Mini-BESTest in a sample of only people who have MS and are ambulatory has not been done.
Therefore, the aim of this study was to compare the Mini-BESTest and the BBS for assessing balance in people who have MS and are ambulatory. Specifically, our objectives were to investigate which balance measure is better at detecting a change in balance after routine physical therapy care; whether each measure can distinguish people who report falls, near falls, or the use of a mobility device; whether a change in balance score is associated with a patient's or a therapist's (or both) impression of change; and whether each measure is associated with other clinical measures. Our hypothesis was that the Mini-BESTest would perform better in all areas than would the BBS.
Method
This was a descriptive cohort study, and STROBE (STrengthening the Reporting of OBservational studies in Epidemiology) guidelines were followed to standardize the conduct and reporting of the research. Participants gave informed consent before assessments.
People who had MS and attended outpatient clinics at an acute care teaching hospital for physical therapy treatment from September to November 2013 were invited to take part. All participants had a primary diagnosis of MS; were medically stable; were independently mobile, with or without an aid; and were more than 18 years of age. We aimed to recruit a sample of 50 participants. The sample size was chosen on the basis of 2 considerations: (1) a sample of 44 would provide 90% power to detect a moderate change from baseline (Cohen d=0.5) with a paired t test at a 2-sided significance level of .05, and (2) a sample of 47 would provide 80% power to detect a Pearson correlation coefficient (r) of .4, against the null hypothesis of r=.0, with a Fisher z test at a 2-sided significance level of .05. Sample size calculations were done with nQuery 7 software (Statistical Solutions Ltd, Cork, Ireland).
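The two sample size considerations can be approximated with closed-form formulas. The sketch below is not the nQuery computation: it assumes a normal approximation with Guenther's small-sample correction for the paired t test and the usual Fisher z formula for the correlation, and the function names are illustrative.

```python
from math import atanh, ceil
from statistics import NormalDist


def paired_t_sample_size(d, alpha=0.05, power=0.90):
    """Approximate n for a 2-sided paired t test on a standardized change d.

    Normal approximation plus Guenther's correction (z_alpha^2 / 2);
    this is an assumption, not the algorithm used by nQuery.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)
    z_beta = z(power)
    n = ((z_alpha + z_beta) / d) ** 2 + z_alpha ** 2 / 2
    return ceil(n)


def correlation_sample_size(r, alpha=0.05, power=0.80):
    """Approximate n to detect a Pearson r against r = 0 (Fisher z test)."""
    z = NormalDist().inv_cdf
    n = ((z(1 - alpha / 2) + z(power)) / atanh(r)) ** 2 + 3
    return ceil(n)


print(paired_t_sample_size(0.5))     # d = 0.5, 90% power
print(correlation_sample_size(0.4))  # r = .4, 80% power
```

Under these approximations the two calls return 44 and 47, which agrees with the sample sizes quoted above.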
Each participant attended the first testing day as part of routine physical therapy care, and demographic data (ie, age, sex, time since diagnosis, mobility status, and self-reported history of falls or near falls) were collected. Falls were assessed by asking the following 2 questions: (1) “Have you had a fall in the last 3 months?” and (2) “Have you had a near fall in the last 3 months?” A fall was defined as an unexpected event in which the participants came to rest on the ground, floor, or lower level, and a near fall was defined as an unexpected event in which the participants nearly came to rest on the ground, floor, or lower level.
A senior physical therapist with experience in MS (E.R.) carried out the testing. Each participant was assessed before and after the completion of routine physical therapy. The therapy included individual sessions, home exercise programs, or group classes (or a combination of these interventions) incorporating neuromuscular stimulation, Nintendo Wii Fit games (Nintendo, Redmond, Washington), and specific strengthening and aerobic training, individually prescribed as needed by the lead investigator (E.R.).
After 8 weeks of physical therapy, participants completed the outcome measures again in the same order. The lead investigator and the participants then rated their change in balance using a 7-point global rating of change (GRC) scale.
Outcome Measures
Demographic variables were collected from the participants' medical or physical therapy notes. Any outstanding information was collected from participants verbally before assessments. In addition to the BBS and the Mini-BESTest, several measures suggested as part of a core set of outcomes23 for people with MS were administered.
The BBS is a 14-item balance test that takes approximately 10 to 15 minutes to administer. Performance on each item is rated from 0 (cannot perform) to 4 (normal performance). The total BBS score ranges from a minimum of 0 points to a maximum of 56 points. A higher score on the BBS indicates better balance. The BBS was shown to have good test-retest reliability (intraclass correlation coefficient=.96) and interrater reliability (intraclass correlation coefficient=.96) in people with MS.11 The BBS also was shown to have acceptable concurrent validity with other balance measures, such as the Timed “Up & Go” Test, Dynamic Gait Index, Hauser Deambulation Index, Dizziness Handicap Inventory, and Activities-specific Balance Confidence Scale, in people with MS.10
The 14-item Mini-BESTest15 was designed to investigate the following 4 dynamic balance domains: anticipatory postural adjustments, postural responses to perturbations, sensory orientation, and balance during gait with and without a cognitive task. The standard protocol for administering the Mini-BESTest was used in this study.15,16 All items are scored on an ordinal scale (0=severe, 1=moderate, and 2=normal performance). The total Mini-BESTest score ranges from a minimum of 0 points to a maximum of 28 points. A higher score on the Mini-BESTest indicates better balance. Preliminary evidence supports the between-rater reliability21 and the test-retest reliability and convergent validity19,24 of the Mini-BESTest in heterogeneous neurological populations, which included people with MS.
The Multiple Sclerosis Impact Scale-29 (MSIS-29) is a 29-item self-report questionnaire for assessing the impact of MS on physical (20 items) and psychological (9 items) domains over the preceding 4 weeks in people with MS.25 Each item has 5 potential responses, from 1 (not at all) to 5 (extremely). Both domains of the scale are scored by summing all of the responses across the items and then converting the values to a scale from 0 to 100, where 100 indicates a greater impact of disease on daily function (worse health). The MSIS-29 has been shown to have acceptable psychometric properties in people with MS.26
The Multiple Sclerosis Walking Scale-12 (MSWS-12)27 is a 12-item self-report questionnaire for rating the impact of MS on walking domains over the preceding 2 weeks. Scores on each individual walking item range from 1 to 5, with 1 meaning no limitation and 5 meaning extreme limitation. A higher score indicates a greater impact of MS on walking ability. The MSWS-12 has been shown to be reliable and valid for people with MS.28,29
The Modified Fatigue Impact Scale30 is a 21-item self-report questionnaire for rating the impact of MS on physical, cognitive, and psychosocial domains of fatigue. It takes 5 to 10 minutes to complete. Each item is rated on a 5-point Likert scale (0–4). The total score (0–84) and scores on subscales for physical (0–36), cognitive (0–40), and psychosocial (0–8) functioning are totaled and converted to a score out of 100. A higher value indicates greater fatigue. The Modified Fatigue Impact Scale is commonly used in people with MS31,32 and has been shown to have acceptable validity.33
The Six-Minute Walk Test (6MWT)34,35 is a submaximal measure of walking endurance. Participants walk as far as possible in 6 minutes along a 30-m hallway, turning around cones at each end, with their habitual assistive device, if needed. The distance walked is documented. Walking improvement on the 6MWT is indicated by positive change scores (in meters). The average value for people who are healthy is 900 m. The 6MWT has been shown to be reliable and valid in people with MS.36,37
Global rating of change scales are designed to quantify a person's impression of improvement or deterioration over time, usually either to determine the effect of an intervention or to chart the clinical course of a condition.38 For this study, two 7-point scales were designed for both the lead investigator and participants to quantify the participants' change in balance after routine physical therapy care. The 2 questions asked were: (1) “With respect to your balance, due to your multiple sclerosis, how would you describe yourself now compared to the beginning of this intervention period?” and (2) “Compared with the beginning of this intervention period, how would you describe the participants' balance, due to their multiple sclerosis, now?” The possible responses were: very much worse, much worse, minimally worse, no change, minimally improved, much improved, and very much improved. The responses “minimally improved,” “much improved,” and “very much improved” were combined to create the variable “improved balance.”
IBM SPSS Statistics version 20 (IBM SPSS, Armonk, New York) and Microsoft Excel (Microsoft Corp) were used for data analysis. Demographic data were expressed as the mean and standard deviation or as the median and interquartile range. P values of less than .05 were considered significant. Paired t tests were used to examine the changes in Mini-BESTest and BBS scores after treatment. Floor and ceiling effects were calculated as the percentages of participants with minimum and maximum scores, respectively. The McNemar test for binary matched-pairs data was used to compare ceiling effects.
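The ceiling effect and the matched-pairs comparison can be illustrated with a short sketch. It assumes an exact binomial form of the McNemar test, which may differ from the variant SPSS applied, and it uses the baseline counts reported in the Results (20 of 52 participants at the BBS maximum and none at the Mini-BESTest maximum). The function names are illustrative.

```python
from math import comb


def ceiling_effect_pct(n_at_max, n_total):
    """Percentage of participants scoring the scale maximum."""
    return 100 * n_at_max / n_total


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar P value from the discordant pair counts.

    b = pairs at maximum on measure A only, c = pairs at maximum on
    measure B only. Concordant pairs do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# 20 participants at the BBS maximum only, 0 at the Mini-BESTest maximum only.
print(round(ceiling_effect_pct(20, 52), 1))
print(mcnemar_exact_p(20, 0))
```

With 20 discordant pairs all falling in one direction, the exact P value is far below .001.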
The standard response mean and the effect size also were used to investigate the ability of each measure to detect change.39 The standard response mean is determined by dividing the mean change in scores at 2 time points by the standard deviation of the change. Standard response means of 0.20, 0.50, and 0.80 or greater have been proposed to represent small, moderate, and large levels of responsiveness, respectively.40
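As a worked illustration of this definition, the sketch below computes the standard response mean from a mean change and the standard deviation of the change, then applies the thresholds quoted above. The function names and the "negligible" label for values under 0.20 are assumptions added for the example. The inputs are the rounded summary statistics given in the Results.

```python
def standard_response_mean(mean_change, sd_change):
    """Mean change between 2 time points divided by the SD of the change."""
    return mean_change / sd_change


def responsiveness_label(srm):
    """Map an SRM to the proposed small/moderate/large bands."""
    magnitude = abs(srm)
    if magnitude >= 0.80:
        return "large"
    if magnitude >= 0.50:
        return "moderate"
    if magnitude >= 0.20:
        return "small"
    return "negligible"  # label assumed for this sketch


for name, mean_change, sd_change in [("Mini-BESTest", 5.31, 3.5), ("BBS", 1.4, 1.9)]:
    srm = standard_response_mean(mean_change, sd_change)
    print(name, round(srm, 2), responsiveness_label(srm))
```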
The equation used to calculate the effect size was:
$$\text{effect size} = \frac{t^{2}}{t^{2} + (N - 1)}$$
where t=the test statistic value and N=the sample size. Guidelines41 have stated that values of 0.2 to 0.3 represent a small effect, a value of 0.5 represents a medium effect, and a value of 0.8 represents a large effect.
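A rough numerical check is sketched below. It assumes the effect size takes the eta-squared-type form t²/(t² + N − 1), which depends only on t and N, and it recovers the paired t statistic from the rounded summary statistics in the Results (mean change, SD of change, and the 47 participants with follow-up data). Both the assumed form and the function names are illustrative rather than confirmed details of the analysis.

```python
from math import sqrt


def paired_t_from_summary(mean_change, sd_change, n):
    """Paired t statistic: mean change divided by its standard error."""
    return mean_change / (sd_change / sqrt(n))


def effect_size_from_t(t, n):
    """Assumed form: t^2 / (t^2 + N - 1)."""
    return t ** 2 / (t ** 2 + (n - 1))


n = 47  # participants with follow-up data
mini = effect_size_from_t(paired_t_from_summary(5.31, 3.5, n), n)
bbs = effect_size_from_t(paired_t_from_summary(1.4, 1.9, n), n)
print(round(mini, 2), round(bbs, 2))
```

Under this assumption the Mini-BESTest value rounds to the reported 0.70. The BBS value comes out at about 0.36, within rounding error of the reported 0.37 given that the inputs are themselves rounded.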
The receiver operating characteristic curve and the area under this curve (AUC), with its 95% confidence interval (95% CI), were calculated42 to investigate the ability of the BBS and the Mini-BESTest to distinguish reported improvements in balance (scores of 5, 6, and 7 on the GRC scale), reported falls or near falls, or the use of a mobility aid. The AUC can range from 0.5 to 1.0; a value of 0.5 indicates no accuracy in distinguishing “improved” from “not improved,” whereas a value of 1 indicates perfect accuracy.39
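The AUC has a simple rank interpretation: it is the probability that a randomly chosen person with the condition scores worse than a randomly chosen person without it, with ties counted as one half. The sketch below applies that idea to hypothetical balance scores in which lower scores indicate the condition (for example, mobility aid use). The data and function name are invented for illustration, and no confidence interval is computed.

```python
def auc_lower_is_positive(pos_scores, neg_scores):
    """AUC when LOWER scores indicate the condition of interest.

    Counts the proportion of (positive, negative) pairs in which the
    positive case has the lower score; ties contribute 0.5.
    """
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p < q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores, not study data.
aid_users = [12, 15, 18, 20]
non_users = [17, 22, 24, 26, 27]
print(auc_lower_is_positive(aid_users, non_users))  # 18 of 20 pairs ordered correctly
```

Perfect separation of the two groups gives 1.0, and identical score distributions give 0.5.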
Likelihood ratios were computed for all possible cutoff scores. Positive likelihood ratios, computed as sensitivity/(1 − specificity), indicate the increase in the odds of having a condition given a positive test result. Negative likelihood ratios, computed as (1 − sensitivity)/specificity, indicate the decrease in the odds of having a condition given a negative test result.
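These definitions can be sketched for a single cutoff as follows. The 2×2 counts are hypothetical (not taken from the study's data), and the function name is illustrative. A score below the cutoff is treated as a positive test result.

```python
def likelihood_ratios(tp, fn, fp, tn):
    """Return (LR+, LR-) from the counts of a 2x2 classification table.

    tp/fn: people with the condition who test positive/negative.
    fp/tn: people without the condition who test positive/negative.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Hypothetical counts: 20 people with the condition, 32 without.
lr_pos, lr_neg = likelihood_ratios(tp=17, fn=3, fp=6, tn=26)
print(round(lr_pos, 2), round(lr_neg, 2))
```

Repeating the calculation over every candidate cutoff gives the full set of ratios from which a cutoff can be chosen. A cutoff with perfect specificity would make the positive ratio undefined (division by zero), which a fuller implementation would need to handle.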
An analysis of variance was used to compare the mean differences between GRC score categories. The nonparametric Spearman rank order correlation was used to compare baseline balance score measurements with each other and to compare baseline and change scores on the Mini-BESTest and the BBS with those on all of the aforementioned secondary measures used in this study.
Role of the Funding Source
Ms Ross was the recipient of the Chartered Physiotherapists in Neurology and Gerontology Research Bursary.
Results
Over a 3-month period, 62 consecutive people with MS were invited to participate in the present study. Of those, 53 expressed an interest and 52 were eligible for the study. Data from 52 participants at baseline (mean age=45.73 years, SD=5.65; 37 women) and from 47 participants at follow-up (mean age=41.09 years, SD=5.65; 33 women) were used. Participants for whom data at follow-up were missing were omitted from the relevant analyses. Baseline demographic data are shown in Table 1. Reasons for dropouts (n=5, 9.6%) included family commitments (n=2), transportation (n=1), work (n=1), and other medical issues (n=1).
Table 1. Baseline Demographic Data for 52 Participants
Descriptive statistics for the Mini-BESTest and the BBS are shown in Table 2. No participant had the maximum baseline Mini-BESTest score of 28 points. Twenty participants (38.5%) had the maximum baseline BBS score of 56 points; those 20 participants had baseline Mini-BESTest scores ranging from 12 to 27 points. The proportion of participants who had the maximum score on the BBS at baseline was significantly larger than the proportion of participants who had the maximum score on the Mini-BESTest at baseline (McNemar test, P<.001). The Figure shows the frequencies of scores across both measures, with the Mini-BESTest scores doubled, so that scores on both measures are out of 56.
Table 2. Descriptive Statistics for Berg Balance Scale and Mini-BESTest
Figure. Frequencies of scores across Mini-BESTest and Berg Balance Scale (BBS) at baseline.
All 47 participants who completed the study showed improvements in Mini-BESTest scores. Twenty-seven participants (57.4%) showed improvements in BBS scores. Nineteen participants (40.4%) did not show changes in baseline BBS scores after the intervention; 16 of these 19 participants still had the maximum BBS score of 56.
There were statistically significant improvements in the Mini-BESTest score (X̅=5.31, SD=3.5) (P<.01) and in the BBS score (X̅=1.4, SD=1.9) (P<.01) after treatment. The effect sizes for the Mini-BESTest and BBS changes were 0.70 and 0.37, respectively. The standard response means for the Mini-BESTest and the BBS were 1.52 and 0.74, respectively. Receiver operating characteristic curve analysis could not be completed for the Mini-BESTest change score because all responses were positive.
With regard to the ability of the BBS change to detect a participant's impression of a balance change (a participant whose GRC score indicated improvement), the AUC was 0.56 (not significant) (P=.59). With regard to the ability of the Mini-BESTest and the BBS at baseline (N=52) to distinguish between participants who used a mobility device and those who did not, the AUCs were 0.88 (P<.01) and 0.88 (P<.01), respectively. A Mini-BESTest score below a cutoff of 19.5 classified participants as requiring a mobility aid with a positive likelihood ratio of 4.53 (95% CI=2.15, 9.54) and a negative likelihood ratio of 0.18 (95% CI=0.06, 0.53). A BBS score below a cutoff of 54.5 classified participants as requiring a mobility aid with a positive likelihood ratio of 3.89 (95% CI=1.97, 7.67) and a negative likelihood ratio of 0.19 (95% CI=0.07, 0.55).
Near falls were reported by 65.4% of participants (34/52). With regard to the detection of self-reported near falls, the AUCs for the baseline Mini-BESTest and BBS scores were 0.77 (P<.01) and 0.75 (P=.03), respectively. A Mini-BESTest score below a cutoff of 22.5 classified a participant nearly falling with a positive likelihood ratio of 2.86 (95% CI=1.33, 6.14) and a negative likelihood ratio of 0.19 (95% CI=0.07, 0.55). A BBS score below a cutoff of 55.5 classified a participant nearly falling with a positive likelihood ratio of 1.89 (95% CI=1.02, 3.49) and a negative likelihood ratio of 0.43 (95% CI=0.22, 0.85). A history of falls was reported by 28.8% of participants (15/52). With regard to the detection of self-reported falls, the AUCs for the Mini-BESTest and BBS scores were 0.65 and 0.592, respectively (not significant) (P=.092 and P=.302, respectively).
Table 3 shows the frequencies of GRC scores for both the therapist and the participant (n=47). There was no statistically significant difference in the means of either Mini-BESTest or BBS scores between the GRC categories (analysis of variance). There was a trend for increasing participant GRC balance rating scores and therapist GRC balance rating scores to be associated with an increasing change in the Mini-BESTest balance score. This linear trend was not replicated with a change in the BBS.
Table 3. Global Ratings of Change by Participant and Physical Therapist and Corresponding Balance Scores
The baseline mean Mini-BESTest score was significantly associated with the baseline mean BBS score (r=.78, P=.01). At baseline, both balance measures were significantly correlated with all 5 secondary outcome measures. The baseline Mini-BESTest score had a higher correlation coefficient for each secondary measure than did the baseline BBS score (Tab. 4).
Table 4. Correlation Coefficients (Spearman Rho) for Baseline Mini-BESTest and Berg Balance Scale (BBS) Scores and Secondary Outcome Measures
The associations between changes in Mini-BESTest and BBS scores and changes in secondary measure scores are shown in Table 5. A change in the BBS score was not significantly associated with a change in any of the secondary measures. A change in the Mini-BESTest score was significantly associated with a change in the MSIS-29 (physical) score (r=.355, P=.014).
Table 5. Correlation Coefficients (Spearman Rho) for Changes in Mini-BESTest and Berg Balance Scale (BBS) Scores and Secondary Outcome Measures
Discussion
Both balance measures revealed statistically significant changes in balance after a routine physical therapy intervention. The BBS had a higher ceiling effect, a smaller effect size, and a lower standard response mean than did the Mini-BESTest. Furthermore, the change in the BBS was not associated with the overall trend for a participant's impression of a balance change. Other studies in which the Mini-BESTest and the BBS were compared reported similar, more favorable findings for the Mini-BESTest in people with Parkinson disease13 and people with other, mixed neurological conditions. Therefore, the present study provides further evidence that the ability of the Mini-BESTest to detect a balance change after physical therapy is better than that of the BBS in people who have MS and who are predominantly independently mobile, with or without a unilateral aid.
Both balance measures were similarly able to distinguish people who reported near falls or used mobility devices but not people who reported falls. The cutoff points for walking aid use were similar to those of King et al,13 who compared the abilities of the BBS and the Mini-BESTest to detect people with and people without postural responses in Parkinson disease. They reported a cutoff point of 21 for the Mini-BESTest, yielding positive and negative likelihood ratios of 4.68 and 0.21, respectively, and a cutoff point of 52 for the BBS, yielding positive and negative likelihood ratios of 2.96 and 0.34, respectively. In another study,17 a cutoff score of 20 was reported for the Mini-BESTest, yielding a positive likelihood ratio of 4.00 and a negative likelihood ratio of 0.25 for identifying people with a history of falls in Parkinson disease. Although we acknowledge the limitation of using self-reported falls and near falls in the present study, we suggest that the mobility device use cutoff values for both measures may be clinically useful. To our knowledge, the present study is the first to compare the abilities of both balance measures to detect near falls or mobility aid use exclusively in people who have MS and are ambulatory.
There was a positive, nonsignificant trend for participants' and physical therapists' GRC balance scores to be associated with an increasing Mini-BESTest score; such a trend was not seen for the BBS. Godi et al19 also demonstrated more favorable results in their sample of people with mixed neurological conditions. They found that the Mini-BESTest was better at determining people who reported a balance improvement on the GRC scale than was the BBS. This finding suggested that a change on the Mini-BESTest may be better associated with a person's impression of improved balance. Further studies with larger sample sizes and the GRC scale are needed to calculate the minimum clinically important difference for the Mini-BESTest.
There was a high correlation (r=.78) between the baseline Mini-BESTest score and the baseline BBS score. This finding concurs with that of King et al,13 who found a high correlation (.78) between the Mini-BESTest and the BBS in people with Parkinson disease. Both balance measures were significantly correlated with all 5 secondary outcome measures. These findings support the concurrent validity of both the Mini-BESTest and the BBS. The Mini-BESTest had a higher correlation for 2 secondary measures, the 6MWT and the MSWS-12, than did the BBS. Both the 6MWT and the MSWS-12 take into account balance during gait. These findings suggest that the Mini-BESTest may have a greater association with dynamic balance; this suggestion is supported by the fact that the Mini-BESTest takes into account additional systems required for postural stability in people with MS. Only a change in the Mini-BESTest was associated with a change in the MSIS-29 physical domain. The weak and nonsignificant associations between balance improvements and changes in other variables, such as walking, fatigue, and the psychological impact of MS, warrant further investigation.
Limitations
The sample in the present study was one of convenience (all participants ambulated independently, with or without an aid); therefore, the findings may not be generalizable to a wider population of people with MS. Retrospective data were collected for a history of falls and near falls. Prospective data would provide a more accurate account of fall status. It was not possible to calculate the minimum clinically important difference for the Mini-BESTest because of the small sample size. We also did not compare or investigate various reliability estimates for both measures. The same rater collected data for both outcome measures and may have been more positively biased toward the Mini-BESTest.
In conclusion, the Mini-BESTest had a lower ceiling effect, better responsiveness, and a slightly better ability to detect people who nearly fell and people who used mobility aids than did the BBS. The Mini-BESTest scores at baseline also were more correlated with the secondary measures than were the baseline BBS data; this result added to the validity of the Mini-BESTest. These findings suggest that the Mini-BESTest was a clinically useful measure of dynamic balance after physical therapy in people who had MS and were ambulatory, mostly without a mobility aid.
Footnotes
Ms Ross, Dr Uszynski, and Dr Coote provided concept/idea/research design. Ms Ross, Dr Uszynski, Dr Hayes, Ms Casey, Ms Browne, and Dr Coote provided writing. Ms Ross provided data collection and participants. Dr Purtill, Dr Uszynski, Dr Hayes, Ms Browne, and Dr Coote provided data analysis. Ms Ross and Dr Coote provided project management, facilities/equipment, and institutional liaisons.
The Research Ethics Committee of St. James Hospital approved this study.
- Received August 17, 2015.
- Accepted February 13, 2016.
- © 2016 American Physical Therapy Association