Abstract
Background The Mini-Balance Evaluation Systems Test (Mini-BESTest) is a new balance assessment, but its psychometric properties have not been specifically tested in individuals with stroke.
Objectives The purpose of this study was to examine the reliability and validity of the Mini-BESTest and its accuracy in categorizing people with stroke based on fall history.
Design An observational measurement study with a test-retest design was conducted.
Methods One hundred six people with chronic stroke were recruited. Intrarater reliability was evaluated by repeating the Mini-BESTest within 10 days by the same rater. The Mini-BESTest was administered by 2 independent raters to establish interrater reliability. Validity was assessed by correlating Mini-BESTest scores with scores of other balance measures (Berg Balance Scale, one-leg-standing, Functional Reach Test, and Timed “Up & Go” Test) in the stroke group and by comparing Mini-BESTest scores between the stroke group and 48 control participants, and between fallers (≥1 falls in the previous 12 months, n=25) and nonfallers (n=81) in the stroke group.
Results The Mini-BESTest had excellent internal consistency (Cronbach alpha=.89–.94), intrarater reliability (intraclass correlation coefficient [3,1]=.97), and interrater reliability (intraclass correlation coefficient [2,1]=.96). The minimal detectable change at 95% confidence interval was 3.0 points. The Mini-BESTest was strongly correlated with other balance measures. Significant differences in Mini-BESTest total scores were found between the stroke and control groups and between fallers and nonfallers in the stroke group. In terms of floor and ceiling effects, the Mini-BESTest was significantly less skewed than other balance measures, except for one-leg-standing on the nonparetic side. The Berg Balance Scale showed significantly better ability to identify fallers (positive likelihood ratio=2.6) than the Mini-BESTest (positive likelihood ratio=1.8).
Limitations The results are generalizable only to people with mild to moderate chronic stroke.
Conclusions The Mini-BESTest is a reliable and valid tool for evaluating balance in people with chronic stroke.
Stroke is a major cause of disability and global disease burden.1 Dysfunction in balance control is one of the most common physical impairments observed after stroke.2,3 Compromised balance ability has been associated with reduced ambulatory function,4 poorer performance in activities of daily living (ADL),5 and restricted societal participation.6 Impaired balance also is a significant predictor of falls7 and long-term institutionalization.8
Much effort has been directed toward enhancing balance function in people with stroke.9–11 Balance control is complex and involves various aspects such as ability to maintain a body position, postural responses to external perturbations, anticipatory postural adjustments, and sensory integration.12 To obtain a clearer understanding of balance dysfunctions after a stroke and to better assess the effect of intervention programs, a standardized assessment of balance function is essential. Many clinical tools are available to assess balance in individuals with stroke.13,14 Some of the most commonly used balance assessment tools in stroke rehabilitation are the Berg Balance Scale (BBS),15 Functional Reach Test (FRT),16 Timed “Up & Go” Test (TUG),17 and one-leg standing (OLS).18,19 However, they are not without their limitations. For example, important aspects of dynamic balance control that reflect balance challenges during ADL are missing in the BBS.20 Leroux et al21 found that among ambulatory patients with chronic stroke, improvement in postural stability observed after exercise intervention was poorly correlated with change in the BBS score. On the other hand, OLS, FRT, and TUG, being single-task assessments, are unable to provide information on which postural control subsystem is dysfunctional and have a limited role in directing treatment.13 Significant floor or ceiling effects also have been identified in the BBS, OLS, and FRT.22–24 Furthermore, the BBS25,26 and TUG27 have been criticized for their limited ability to predict falls in people with stroke. Certain balance assessment tools that are specifically designed for people with stroke also have similar limitations. For example, the balance subscale of the Fugl-Meyer test28 has been shown to have significant floor effects.22
The Balance Evaluation Systems Test (BESTest) is a relatively new multi-task balance assessment developed to identify specific postural control problems (ie, biomechanical constraints, stability limits, postural responses, anticipatory postural adjustments, sensory orientation, dynamic balance during gait, and cognitive effects).20,29 However, this 36-item assessment takes 30 to 35 minutes to complete and may not be feasible in real clinical settings, where time constraints are often a major concern. A shorter version of the test, the 14-item Mini-BESTest, has recently been developed.20 It takes only 10 minutes to complete, and good intrarater and interrater reliability have been reported in a sample of people with mixed conditions.30 Recent studies further showed that the Mini-BESTest has good interrater and intrarater reliability and concurrent validity31,32 and is useful in predicting falls33,34 in patients with Parkinson disease (PD). However, the psychometric properties of the Mini-BESTest have not been specifically evaluated in the stroke population. Additionally, no study has evaluated the ability of the Mini-BESTest to distinguish fallers from nonfallers among individuals with stroke. The current study was undertaken to (1) examine the reliability and validity of the Mini-BESTest and (2) compare the Mini-BESTest with 4 other balance measures based on the floor and ceiling effects and on sensitivity and specificity for distinguishing between individuals with and without a history of falls in a group of community-dwelling people with chronic stroke.
Method
Study Overview
This was an observational measurement study. Floor and ceiling effects, reliability (internal consistency, intrarater and interrater), and validity (concurrent, convergent, discriminant, known-groups) of the Mini-BESTest were assessed in a sample of people with stroke. To establish known-groups validity, a control group was included to enable us to assess the differences in Mini-BESTest scores between the stroke group and control group. The ability of the Mini-BESTest to distinguish between people with stroke with and without a history of falls also was examined and compared with that of 4 other balance measures (ie, BBS, TUG, OLS, and FRT). All of the raters involved in the study were physical therapists who had more than 10 years of relevant experience and were well trained to administer all of the balance assessment tools used in this study.
Participants and Sample Size Calculations
Participants were recruited between June 2009 and December 2010. Individuals with stroke were recruited from a local rehabilitation center and community self-help groups on a volunteer basis (ie, convenience sampling). Each participant was interviewed during the first assessment session. Ability to understand verbal instructions was one of the inclusion criteria. An individual was considered to have fulfilled this criterion if he or she managed to carry out a normal conversation with the assessor. Other inclusion criteria for the stroke group were: a diagnosis of stroke for more than 6 months, community-dwelling, and aged 18 years or older. The exclusion criteria were: pain during performance of daily activities, neurological conditions in addition to stroke, other conditions that affect balance (eg, Ménière disease), and any other serious illnesses that precluded participation. Control individuals were recruited from the community for comparison. The eligibility criteria were the same as those used in the stroke group, except that the control participants did not have a history of stroke. All participants provided written informed consent before enrollment in the study. All procedures were conducted in accordance with the Declaration of Helsinki.
All sample size calculations were done prior to enrollment of participants and were based on an alpha level of .05 (2-tailed) and a power of 0.8 (NCSS and PASS 2005, NCSS LLS Co, Kaysville, Utah). For reliability analysis, a coefficient of .75 or greater was generally considered to be acceptable.35 Leddy et al32 found that the Mini-BESTest had excellent intrarater and interrater reliability in people with PD, with intraclass correlation coefficient (ICC) values of .92 and .91, respectively. A similar reliability coefficient was expected in this study. Thus, the acceptable reliability and expected reliability were set at ICC=.75 and ICC=.90, respectively.32 For establishing interrater reliability between 2 raters, a sample of 26 patients with stroke was required. As establishing intrarater reliability required 2 assessment sessions, a 10% attrition rate was estimated, yielding a minimum sample of 30 participants.
A study by King et al31 showed a strong correlation between the Mini-BESTest and the BBS in patients with PD (r=.79; large effect size). Therefore, for analysis of concurrent and convergent validity, a large effect size was expected when the Mini-BESTest was correlated with other balance and related measures in individuals with stroke. Using the conventional value of a large effect size (r=.5) in the sample size calculation,35 the minimum number of participants required for the analysis of concurrent validity would be 26.
The Mini-BESTest scores obtained from the stroke group were compared with those from the control group to establish known-groups validity. Horak et al29 compared the BESTest total score between patients with different balance problems (X̅=74.5, SD=9.0) and controls without disabilities (X̅=90.6, SD=4.8), and the effect size was large (Cohen d=1.8). We expected the Mini-BESTest to also have good ability to discriminate between the 2 groups. Using the conventional value of a large effect size (Cohen d=0.8) for calculation,35 a minimum of 26 participants per group would be required for this analysis.
We also were interested in determining whether the Mini-BESTest scores and other balance tests could differentiate people with stroke with and without a history of falls. Receiver operating characteristic (ROC) curve plots were used for this analysis.35 An area under the curve (AUC) value of 0.7 to 0.8 was generally considered to be acceptable.36 Duncan et al34 showed that the Mini-BESTest had good ability to identify fallers among patients with PD, with an AUC value of 0.86. The acceptable and expected AUC values thus were set at 0.7 and 0.9, respectively.36 Previous studies in community-dwelling individuals with stroke demonstrated a fall rate of 23% to 73%.7,37–39 Assuming that the proportion of fallers was 30% in our stroke group, a minimum of 60 individuals with stroke (fallers: n=18; nonfallers: n=42) would be required for ROC curve plots. In summary, a minimum of 60 and 26 individuals would be recruited from the stroke and control groups, respectively.
Procedure
Stroke group.
In the initial assessment (session 1), relevant demographic data (eg, age, medical history) and fall history were obtained from interviewing the participants. To calculate body mass index (BMI, in kg/m2), height (in meters) and weight (in kilograms) were measured with a stadiometer (Health O Meter, Alsip, Illinois). Each participant was evaluated with the Mini-BESTest, 4 additional balance assessments (BBS, FRT, OLS, and TUG) and other measures (Chedoke-McMaster Stroke Assessment, Modified Ashworth Scale [MAS], Activities-specific Balance Confidence [ABC] Scale, Abbreviated Mental Test [AMT], Geriatric Depression Scale–short form [GDS], and Oxfordshire Community Stroke Project Classification). Either rater 1 or rater 2 conducted the assessments in session 1.
The first 30 participants assessed by rater 2 in session 1 also were evaluated with the Mini-BESTest a second time by another independent rater (rater 3) in the same session. Whether rater 2 or rater 3 administered the Mini-BESTest first was determined randomly by drawing lots. Intermittent rest periods were given throughout the session. The typical duration of session 1 was 2.5 hours, including the rest periods. Interrater reliability of the Mini-BESTest was determined by comparing the scores given by raters 2 and 3 in session 1.
The 30 participants with stroke who were evaluated for interrater reliability also participated in the intrarater reliability experiments. A second assessment session (session 2) was held within 10 days after session 1. The participants did not receive any physical therapy intervention during the period between sessions 1 and 2. In session 2, each of the 30 participants was evaluated with the Mini-BESTest once by rater 2. Session 2 was typically 20 minutes in duration. Intrarater reliability was established by comparing the Mini-BESTest scores given by rater 2 in sessions 1 and 2.
Control group.
The participants in the control group underwent one assessment session conducted by rater 1. Demographic data (eg, age, medical history), height, and weight were obtained using the same methods as in the stroke group described above. The Mini-BESTest was administered once. Comparing the Mini-BESTest scores of the control group with those of the stroke group would be useful in determining the known-groups validity. No other measures were administered to the control group.
Measures
Fall history.
Information on fall history was obtained through interview of participants. Those who had experienced one or more falls in the previous 12 months were considered to have a positive fall history.
Mini-BESTest.
The Mini-BESTest is a 14-item performance-based measure of balance disorders. The tasks involved varied in difficulty and covered different balance subsystems, including responses to external perturbations, anticipatory postural adjustments, stability in gait, and sensory orientation. Each task was rated on an ordinal scale of 0 to 2. Items 3 (stand on one leg) and 6 (compensatory stepping correction in lateral direction) assessed both sides, and only the side with a lower score was used for calculating the total score.20 When reporting the item scores, however, the results of both the paretic and nonparetic sides were shown for these 2 items. The total score ranged from 0 to 28, with higher scores denoting better balance ability.
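The scoring rule described above can be sketched in a few lines of Python. This is only an illustration: the function name, the dictionary layout, and the example scores are hypothetical and are not part of the published test materials.

```python
from typing import Dict, Sequence

# Items assessed on both sides, as described in the text above.
BILATERAL_ITEMS = (3, 6)


def mini_bestest_total(unilateral: Dict[int, int],
                       bilateral: Dict[int, Sequence[int]]) -> int:
    """Sum the 14 item scores (0-2 each); bilateral items count the lower side."""
    items = dict(unilateral)
    for item, sides in bilateral.items():
        items[item] = min(sides)  # only the lower-scoring side enters the total
    if sorted(items) != list(range(1, 15)):
        raise ValueError("expected exactly items 1-14")
    if any(score not in (0, 1, 2) for score in items.values()):
        raise ValueError("each item must be scored 0, 1, or 2")
    return sum(items.values())


# Hypothetical participant: every single-sided item scored 2.
unilateral = {i: 2 for i in range(1, 15) if i not in BILATERAL_ITEMS}
bilateral = {3: (1, 2), 6: (0, 2)}  # (paretic, nonparetic), illustrative values
total = mini_bestest_total(unilateral, bilateral)
print(total)  # 12 items x 2, plus 1 and 0 from the lower sides = 25
```

Keeping both sides in the input while summing only the lower one mirrors the reporting convention in the text, where item-level results are shown for each side but the total uses the poorer side.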
Other balance measures.
The BBS is a 14-item assessment of functional balance. Each task was rated from 0 to 4, yielding a possible maximum total score of 56. Higher scores are indicative of better balance.15 The BBS has shown good interrater and intrarater reliability (ICC>.90) and concurrent validity (correlation with Postural Assessment Scale for Stroke Patients: r=.92–.95) in individuals with stroke.15,22,40
The FRT measures balance by assessing the limit of stability.16 The maximum distance (in centimeters) an individual could reach forward beyond arm's length on a fixed base of support was measured. Its interrater reliability (ICC=.99) and validity (correlation with the BBS: r=.619) in people with stroke are well established.40 A score of 0 cm was given for participants who were unable to maintain the standing position without external support.
The OLS test measures the time (in seconds) an individual can stand on one leg (either side).18 Participants were asked to stand on one leg with eyes open and hands placed on the hips. Using a stopwatch, timing commenced when the foot left the ground and stopped when the same foot touched the ground, when the individual's hand swung away from the hips, or when OLS was maintained for a period of 1 minute. One-leg standing was tested on both sides in the current study. One-leg standing has shown good intrarater reliability (nonparetic side: ICC=.88, paretic side: ICC=.92) and significant correlation with the BBS (r=.65) in people with stroke.18 A score of 0 seconds was given for participants who were unable to maintain the standing position without external support.
The TUG measures the time (in seconds) an individual required to get up from an armed chair, walk 3 m with normal walking pace, turn around, walk back, and sit down again.17 Use of a walking aid was allowed if necessary. The TUG has shown good test-retest reliability (ICC=.96) and concurrent validity (correlation with Community Balance and Mobility Scale: rho=−.75) in individuals with stroke.41,42
Measures of other related functions.
The Impairment Inventory of the Chedoke-McMaster Stroke Assessment was used to assess the motor recovery of arm, hand, leg, and foot in the stroke group.43 Each of the 4 body parts was rated on a 7-point scale, with a higher score indicating better motor recovery. Good intrarater (ICC=.98) and interrater reliability (ICC=.97) have been reported in people with stroke.43
The MAS, a 6-point ordinal scale, was used for assessing muscle tone around the ankle joint of the affected leg (0=no increase in muscle tone, 4=part rigid in flexion and extension).44 The intrarater and interrater reliability of the MAS in people with stroke are well established (kappa>.8).44
The ABC Scale was used for measuring balance confidence.45 Participants were asked to rate their confidence in their balance associated with performing 16 listed daily tasks from 0% (absolutely no confidence) to 100% (fully confident). The average score of the 16 items was calculated. The ABC Scale has shown high test-retest reliability (ICC=.87) and concurrent validity (correlation with the BBS: ρ=.36 and with gait speed: ρ=.48) among individuals with chronic stroke.46,47
Other measures.
The Oxfordshire Community Stroke Project Classification was used to identify the clinical stroke subtypes.48 The intrarater agreement and interrater agreement for the classification were moderate to good, with kappa values of .48 to .83 and .54 to .64, respectively.49,50
The AMT was used to assess cognitive function.51 The AMT has shown good internal consistency (Cronbach α=.81), interrater reliability (ICC=.99), and concurrent validity (correlation with Mini-Mental State Examination: r=.86) among older adults.52 It also is able to differentiate between individuals with and without cognitive impairments (P≤.001).52
The 15-item GDS was used to indicate the severity of depressive symptoms (0–4=no depression, 5–10=mild depression, and ≥11=severe depression).53,54 The GDS has shown good test-retest reliability (ICC=.75) in people with stroke.54
Data Analysis
All statistical analyses were performed using SPSS 18.0 software (SPSS Inc, Chicago, Illinois), unless otherwise indicated. The significance level was set a priori at ≤.05.
Floor and ceiling effects.
The skewness (γ1) of the distribution of scores was first assessed for each balance measure. Positive skewness reflects a floor effect and negative skewness indicates a ceiling effect for the Mini-BESTest, BBS, OLS, and FRT, whereas the opposite is true for the TUG.31 R Statistical Software with Bootstrapping methods (version 2.15.2, Bell Laboratories, Murray Hill, New Jersey) was used to compare the degree of skewness in distribution of scores between the Mini-BESTest and other balance measures.31 To further explore the floor and ceiling effects, the proportion of participants with the lowest and highest possible scores was examined.23 Floor or ceiling effects greater than 20% were considered to be significant.23
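As a rough illustration of the two descriptive checks above, the sketch below computes a skewness coefficient and the percentage of participants at the lowest and highest possible scores. It assumes the simple moment coefficient of skewness (statistical packages often report a bias-adjusted variant), it does not reproduce the bootstrap comparison between measures, and the scores are hypothetical.

```python
from typing import Sequence, Tuple


def skewness(scores: Sequence[float]) -> float:
    """Moment coefficient of skewness: m3 / m2**1.5."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    return m3 / m2 ** 1.5


def floor_ceiling(scores: Sequence[float],
                  lowest: float, highest: float) -> Tuple[float, float]:
    """Percentage of participants at the lowest and highest possible score."""
    n = len(scores)
    floor_pct = 100 * sum(s == lowest for s in scores) / n
    ceiling_pct = 100 * sum(s == highest for s in scores) / n
    return floor_pct, ceiling_pct


# Hypothetical scores on a 0-56 scale, clustered near the maximum.
scores = [56, 56, 56, 55, 54, 52, 50, 45, 40, 30]
g1 = skewness(scores)
floor_pct, ceiling_pct = floor_ceiling(scores, lowest=0, highest=56)
print(g1 < 0, floor_pct, ceiling_pct)  # negative skew; ceiling share above 20%
```

For a measure where higher scores mean better balance, a negative coefficient together with a ceiling share above the 20% threshold would point to a ceiling effect.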
Reliability.
Using the data obtained from the stroke group, the internal consistency of the Mini-BESTest was assessed by Cronbach alpha. Intraclass correlation coefficients were used to determine the intrarater (ICC [3,1]) and interrater (ICC [2,1]) reliability of the Mini-BESTest total score. An ICC >.75 is indicative of good reliability, and an ICC of .5 to .75 is indicative of moderate reliability.55 The kappa statistic was used to examine the intrarater and interrater reliability of each individual test item (kappa: >.8=almost perfect agreement, .61–.8=substantial agreement, .41–.6=adequate agreement, .21–.4=fair agreement, and 0–.2=slight agreement).35 Using the intrarater reliability results, the minimal detectable change at the 95% confidence interval (MDC95) was computed using the following formula35:
MDC95 = 1.96 × √2 × SEM
The standard error of measurement (SEM) value of the Mini-BESTest total score was derived from the following formula35:
SEM = Sx × √(1 − rxx)
where Sx is the standard deviation of the Mini-BESTest total score and rxx is the reliability coefficient.
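A short worked example may help make the calculation concrete. The sketch assumes the standard forms SEM = Sx × √(1 − rxx) and MDC95 = 1.96 × √2 × SEM. The reliability coefficient is the intrarater ICC reported in this study, whereas the standard deviation is a hypothetical value chosen only for illustration.

```python
import math


def sem(sd: float, reliability: float) -> float:
    """Standard error of measurement: Sx * sqrt(1 - rxx)."""
    return sd * math.sqrt(1.0 - reliability)


def mdc95(sd: float, reliability: float) -> float:
    """Minimal detectable change at the 95% confidence level."""
    return 1.96 * math.sqrt(2.0) * sem(sd, reliability)


icc_intrarater = 0.97   # intrarater ICC (3,1) reported in this study
sd_hypothetical = 6.0   # hypothetical SD of total scores, NOT a study value

print(round(sem(sd_hypothetical, icc_intrarater), 2))
print(round(mdc95(sd_hypothetical, icc_intrarater), 1))
```

The factor √2 reflects that a change score combines measurement error from 2 test occasions, so a difference smaller than the MDC95 cannot be separated from measurement noise.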
Validity.
For the stroke group data, the Spearman rho was used to examine the degree of association of the Mini-BESTest total scores (measured in the first session) with the following: (1) other established balance measures (ie, BBS, FRT, TUG, and OLS) (ie, concurrent validity), (2) instruments measuring attributes that supposedly are related to balance function (ie, Chedoke-McMaster Stroke Assessment leg and foot impairment score and ABC Scale) (ie, convergent validity), and (3) measures that assess unrelated characteristics (ie, GDS and AMT) (ie, discriminant validity).
In addition to assessing convergence and discrimination, another way to examine the construct validity of the Mini-BESTest was to evaluate the known-groups validity. A test with good known-groups validity should be able to distinguish individuals with good balance ability from those with poor balance ability. Comparisons of Mini-BESTest total and item scores were made between the stroke and control groups, and between participants with and without a history of falls in the stroke group, using the Mann-Whitney U test, as the total scores were not normally distributed (checked by Kolmogorov-Smirnov test) and the item scores were ordinal in nature. In the Mann-Whitney U test, the between-group comparison was based on rank ordering of the raw scores.35 Considering the data of the 2 groups together, the scores were ranked from the smallest to largest. For example, the lowest score was assigned the rank of 1, and the next smallest value was assigned the rank of 2. When 2 or more scores were tied, they were each given the same rank, which was the average of the ranks they occupied. For example, if there were 3 scores with the smallest value, they occupied ranks 1, 2, and 3. Thus, they were each given the rank of 2 (the average of 1, 2, and 3).35 The rank scores of each group then were summed and divided by the number of participants in the group to yield the mean rank score. A higher mean rank reflected an overall better balance ability as a group.
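The ranking procedure described above, including the handling of ties, can be sketched as follows. The function names and the 2 small groups of scores are hypothetical, and the sketch stops at the mean ranks rather than computing the U statistic or its P value.

```python
from typing import Dict, List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank from smallest to largest; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mean_ranks(groups: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Pool all groups, rank the pooled scores, and average the ranks per group."""
    labels: List[str] = []
    pooled: List[float] = []
    for label, scores in groups.items():
        labels.extend([label] * len(scores))
        pooled.extend(scores)
    ranks = average_ranks(pooled)
    result = {}
    for label in groups:
        own = [r for r, l in zip(ranks, labels) if l == label]
        result[label] = sum(own) / len(own)
    return result


# Tie example from the text: 3 scores tied for smallest each receive rank 2.
print(average_ranks([5, 5, 5, 9]))  # [2.0, 2.0, 2.0, 4.0]

# Hypothetical total scores for 2 small groups.
result = mean_ranks({"stroke": [12, 15, 15, 20], "control": [15, 24, 26, 27]})
print(result)  # {'stroke': 3.0, 'control': 6.0}
```

In this illustration the control group has the higher mean rank, which under the convention above would indicate better balance ability as a group.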
To further compare the Mini-BESTest with other balance measures in differentiating between people with stroke with and without a history of falls, ROC curves were constructed. The AUC derived from the Mini-BESTest data then was compared with that of other balance measures, using the chi-square test for comparing the areas under 2 or more correlated ROC curves (SigmaPlot version 12.3, Systat Software Inc, San Jose, California).56 For each ROC curve, the score that yielded the largest Youden index (sensitivity + specificity − 1) was chosen as the cutoff score. The positive and negative likelihood ratios (LR+ and LR−) and their 95% confidence intervals (95% CI) were computed using an online CI calculator.57 As 4 participants were unable to ambulate without manual assistance and thus did not complete the TUG, their data were not included for the comparison of skewness and AUC between the Mini-BESTest and the TUG.
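The cutoff selection can be illustrated with a minimal sketch that scans candidate cutoffs and keeps the one with the largest Youden index, taken here as sensitivity + specificity − 1. It assumes that lower scores indicate poorer balance, so a score below the cutoff is classified as a faller (the direction would be reversed for a timed test such as the TUG). The scores are hypothetical, and the sketch does not compute the AUC or its confidence interval.

```python
from typing import Sequence, Tuple


def youden_cutoff(faller_scores: Sequence[float],
                  nonfaller_scores: Sequence[float]
                  ) -> Tuple[float, float, float, float]:
    """Return (cutoff, Youden index, sensitivity, specificity) at the best cutoff.

    Candidate cutoffs are midpoints between adjacent observed scores;
    a score below the cutoff is classified as 'faller'.
    """
    values = sorted(set(faller_scores) | set(nonfaller_scores))
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for c in candidates:
        sens = sum(s < c for s in faller_scores) / len(faller_scores)
        spec = sum(s > c for s in nonfaller_scores) / len(nonfaller_scores)
        j = sens + spec - 1
        if best is None or j > best[1]:
            best = (c, j, sens, spec)
    return best


# Hypothetical total scores, for illustration only.
fallers = [8, 11, 13, 16, 21]
nonfallers = [12, 17, 19, 22, 24, 26]
cutoff, j, sens, spec = youden_cutoff(fallers, nonfallers)
print(cutoff, round(j, 3), sens, round(spec, 3))
```

Using midpoints between observed scores explains why a cutoff can be a half point on a scale that only takes whole-number values.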
Results
A total of 106 individuals with stroke (73 men, 33 women) and 48 controls (28 men, 20 women) participated in the study. The participant characteristics are shown in Table 1. Seventy participants (66.0%) in the stroke group did not require any walking aid for ambulation. Twenty-five individuals (23.6%) in the stroke group had a history of falls, 7 (6.6%) of whom were recurrent fallers (ie, 2 or more falls during the previous 12 months).
Table 1. Characteristics of Participants
Four participants required physical assistance to ambulate and thus were unable to complete the TUG. Three individuals were unable to maintain the standing position without external support and were given a score of 0 for the OLS and FRT. There were no significant differences in any of the demographic variables (eg, age, proportion of men and women, BMI) between the stroke and control groups.
Score Distribution and Ceiling and Floor Effects
The score distribution of the Mini-BESTest within the stroke group is shown in Figure 1A, and those of the BBS, FRT, TUG, and OLS are shown in Figure 1B–F. We found that the Mini-BESTest had significantly less skewness than other balance measures (P≤.001), except OLS on the nonparetic side (P=.965) (Tab. 2). The proportion of participants with the lowest and highest possible Mini-BESTest scores was 0% and 0.9%, respectively. The BBS had the most severe ceiling effect, with 32% of the individuals achieving the highest possible score.
Figure 1. Score distribution of the balance tests. Frequency distributions of scores on the (A) Mini-Balance Evaluation Systems Test (Mini-BESTest), (B) Berg Balance Scale (BBS), (C) Functional Reach Test (FRT), (D) Timed “Up & Go” Test (TUG), (E) one-leg standing (OLS) (paretic side), and (F) OLS (nonparetic side) are shown. The data of 106 individuals with stroke are shown, except for the TUG, which was based on 102 participants with stroke only, as 4 participants were unable to walk without manual assistance.
Table 2. Comparison of Mini-BESTest With Other Balance Measures: Floor and Ceiling Effects
Reliability Analysis
Thirty individuals with stroke participated in the reliability assessment. The Mini-BESTest demonstrated good internal consistency, with Cronbach alpha values of .89, .93, and .94 for raters 1, 2, and 3, respectively. Intrarater reliability of the Mini-BESTest total score was excellent (ICC [3,1]=.97, P≤.001), yielding an MDC95 value of 3.0 points. The Mini-BESTest total score also showed excellent interrater reliability (ICC [2,1]=.96, P≤.001). When the test items were analyzed separately, adequate to excellent intrarater and interrater reliability were found for all items (Tab. 3), except for item 5 (compensatory stepping correction in backward direction), item 6 (compensatory stepping correction in lateral direction), and item 8 (stand on foam surface with eyes closed), which showed fair reliability (kappa=.30–.40).
Table 3. Intrarater and Interrater Reliability of the Mini-BESTest
Validity Analysis
Concurrent validity.
In the stroke group, significant relationships were found between the Mini-BESTest total score and the BBS (rho=.83, P≤.001), FRT (rho=.55, P≤.001), OLS on the paretic side (rho=.83, P≤.001), OLS on the nonparetic side (rho=.54, P≤.001), and TUG (rho=−.82, P≤.001).
Convergent and discriminant validity.
In the stroke group, the Mini-BESTest total score was significantly correlated with the Chedoke-McMaster Stroke Assessment leg score (rho=.53, P≤.001) and foot score (rho=.64, P≤.001), MAS (rho=−.22, P=.02), and ABC Scale (rho=.50, P≤.001), but not with the GDS (rho=−.17, P=.08) and AMT (rho=.08, P=.42), thus demonstrating good convergent and discriminant validity.
Known-groups validity.
Significant differences in the Mini-BESTest total score and most individual item scores were found between the stroke and control groups and between fallers and nonfallers in the stroke group (Tab. 4).
Table 4. Known-Groups Validity of the Mini-BESTest
ROC curve analysis.
Receiver operating characteristic curves were constructed to assess the ability of the various balance measures to distinguish people with stroke with and without a history of falls (Tab. 5). The cutoff score for the Mini-BESTest was 17.5, and the ROC curve yielded an AUC of 0.64 (95% CI=0.51–0.77), a sensitivity of 64.0% (95% CI=44.5–79.7), and a specificity of 64.2% (95% CI=53.3–73.7). The associated LR+ and LR− values were 1.8 (95% CI=1.2–2.7) and 0.6 (95% CI=0.3–1.0), respectively. The AUC value of the Mini-BESTest then was compared with that of the BBS, TUG, OLS, and FRT. We found that the AUC of the Mini-BESTest was significantly smaller than that of the BBS (χ2=7.36, P=.01). The AUC of the Mini-BESTest was not significantly different from that of the TUG (χ2=0.05, P=.82), OLS on the paretic side (χ2=0.80, P=.37), OLS on the nonparetic side (χ2=0.01, P=.90), and FRT (χ2=0.48, P=.49).
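As a quick arithmetic check, the likelihood ratios follow directly from the sensitivity and specificity reported above, using the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. The sketch below is illustrative only and does not reproduce the confidence intervals, which were obtained from an online calculator.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Positive and negative likelihood ratios from sensitivity and specificity."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Sensitivity and specificity reported for the Mini-BESTest cutoff of 17.5.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.640, specificity=0.642)
print(round(lr_pos, 1), round(lr_neg, 1))  # 1.8 0.6
```

Both values agree with the reported LR+ of 1.8 and LR− of 0.6 after rounding to 1 decimal place.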
Table 5. Comparison of Mini-BESTest With Other Balance Measures: Differentiating Between Fallers and Nonfallers in the Stroke Group
Discussion
In this study, the psychometric properties of the Mini-BESTest for people with chronic stroke were examined. The ceiling and floor effects and ability of the Mini-BESTest to identify fallers among individuals with chronic stroke also were systematically compared with those of 4 other balance measures for the first time. The study showed that the Mini-BESTest is a reliable and valid measure of balance performance for community-dwelling individuals with chronic stroke, with no significant floor or ceiling effects. The association between the Mini-BESTest and fall history, however, is limited.
Score Distribution and Ceiling and Floor Effects
Our results showed that among the various balance measures, the Mini-BESTest had the smallest floor and ceiling effects, as indicated by both the degree of skewness and the proportion of participants with minimum and maximum possible scores. In contrast, a significant ceiling effect was found for the BBS (32.5%). Mao et al22 found a similar ceiling effect of the BBS among patients with chronic stroke (at 180 days after discharge) (28.8%). A study comparing the Mini-BESTest with the BBS in patients with PD also showed that the score distribution for the Mini-BESTest was significantly less skewed than that for the BBS.31 Our data revealed that the score distribution for the TUG demonstrated substantial skewness (Tab. 2), with almost half of our participants with stroke being able to complete the task within 15 seconds (ie, ceiling effect) (Fig. 1D). The BBS consists of a good number of relatively less demanding tasks such as sitting unsupported, standing unsupported, and moving from sitting to standing, whereas the TUG is a single-item assessment involving only moving from sitting to standing, walking, and turning. The majority of our participants, however, had regained their ambulatory function, thus leading to a ceiling effect. In contrast, the inclusion of more challenging tasks such as postural responses to external perturbations (items 4–6) and walking balance tasks (items 11–14) in the Mini-BESTest may have improved the discrimination between participants. The OLS (paretic side) showed considerable positive skewness, indicating a possible floor effect. This finding suggests that maintaining balance while standing on the paretic leg remains a very difficult task for many individuals with stroke, despite all of our participants being community-dwelling. Eighty-three (78%) of our participants with stroke had an OLS time of less than 5 seconds, and 14 (13%) of these individuals were even unable to perform the task (ie, score of 0 seconds) (Fig. 1E).
Reliability
The Mini-BESTest had high internal consistency (Cronbach alpha=.89–.94), indicating all of the items measure the same underlying attribute. The intrarater and interrater reliability of the Mini-BESTest also were excellent when administered to people with stroke, comparable to those of the BBS (intrarater=.92–.98, interrater=.93–.99),15,22,30,40 TUG (intrarater=.96),40 OLS (intrarater=.88–.92),18 and FRT (interrater=.99)40 previously reported in people with stroke. Our results are thus in line with those of Godi et al,30 who found that the Mini-BESTest had excellent intrarater reliability (ICC=.96) and interrater reliability (ICC=.98) in a sample of people with different balance disorders. Leddy et al32 also evaluated both the intrarater and interrater reliability of the Mini-BESTest, and their results obtained from patients with PD are similar to ours (intrarater=.88–.91, interrater=.91–.96). The MDC95 obtained in our study was 3.0 points, which represents the minimum difference that would reflect a real change in the Mini-BESTest total score. Godi et al30 found a very similar MDC95 value (3.5 points) in their sample of participants with mixed conditions. The minimal detectable change established here would be useful for future stroke clinical trials in determining whether the experimental intervention has caused any real change in balance ability.
Item 5 (compensatory stepping correction in a backward direction), item 6 (compensatory stepping correction in a lateral direction), and item 8 (standing on a foam surface with eyes closed) showed only fair reliability. The discrepancies in scoring between the 2 testing sessions or between the 2 raters may have been partly due to actual change in patients' performance. These 3 items represent the more challenging tasks, with the majority of participants attaining a score of only 0 or 1 at initial assessment (Tab. 4). A patient's performance of these tasks thus might be more variable with repeated testing. For the compensatory stepping reaction tests (items 5 and 6), the lower agreement in scores also might be related to the consistency of the therapist in applying the displacement. A slight increase or decrease in magnitude of the displacing force applied by the therapist might elicit a very different balance response from the patient.
Validity
We found that the Mini-BESTest total score was significantly associated with other established balance measures (BBS, OLS, FRT, and TUG) and other measures evaluating related concepts (lower-limb motor recovery, ABC Scale), but not with measures assessing different attributes (eg, GDS, AMT), thus demonstrating good concurrent, convergent, and discriminant validity, respectively. Our results are in agreement with those of King et al,31 who found a strong association of the Mini-BESTest with the BBS (r=.79) and Unified Parkinson's Disease Rating Scale motor score (r=−.51) among patients with PD. The results showed that the Mini-BESTest total score was able to separate people with different balance abilities (ie, known-groups validity), as indicated by the significant difference in scores between the stroke and control groups and between people with stroke with and without a history of falls. Our results agree with the findings of King et al,31 who showed that the Mini-BESTest can effectively distinguish between individuals with and without postural response deficits as defined by the Hoehn and Yahr scale.
When comparing the ROC curves, however, the results show that the Mini-BESTest (AUC=0.64, 95% CI=0.51–0.77), similar to the TUG (AUC=0.66, 95% CI=0.53–0.80), OLS on the paretic side (AUC=0.67, 95% CI=0.54–0.80), OLS on the nonparetic side (AUC=0.64, 95% CI=0.52–0.77), and FRT (AUC=0.67, 95% CI=0.55–0.79), has a limited association with fall history (AUC <0.7). Only the BBS showed a reasonable AUC value of 0.72 (95% CI=0.61–0.83), which was significantly greater than that of the Mini-BESTest. Whether this statistically significant difference in AUC is clinically meaningful requires further study.
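As an aid to interpreting these AUC values, the sketch below computes an AUC as the probability that a randomly chosen faller has a lower balance score than a randomly chosen nonfaller, with ties counted as one half. This rank-based definition is equivalent to the area under the empirical ROC curve. The scores are hypothetical and are not study data, and the function name is illustrative; the sketch is not a reproduction of the statistical software used in the study.

```python
def auc_lower_score_indicates_faller(faller_scores, nonfaller_scores):
    """Rank-based AUC: P(faller score < nonfaller score) + 0.5 * P(tie).

    Lower scores are treated as indicating poorer balance, so a faller
    scoring below a nonfaller counts as a correctly ordered pair.
    """
    correctly_ordered = 0.0
    for f in faller_scores:
        for n in nonfaller_scores:
            if f < n:
                correctly_ordered += 1.0
            elif f == n:
                correctly_ordered += 0.5
    return correctly_ordered / (len(faller_scores) * len(nonfaller_scores))


# Hypothetical total scores, for illustration only.
hypothetical_fallers = [12, 15, 18, 20]
hypothetical_nonfallers = [14, 19, 21, 22, 25]

auc = auc_lower_score_indicates_faller(hypothetical_fallers, hypothetical_nonfallers)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to chance-level ordering and 1.0 to perfect separation of the two groups. On this scale, a value of 0.64 means that the faller had the lower score in roughly 64% of faller-nonfaller pairs.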
The limited association of the Mini-BESTest with fall history in people with stroke may have several explanations. First, it is well known that the causes of falls are multifactorial. Many factors other than balance ability, both intrinsic and extrinsic, may contribute to falls after stroke.58 For example, Harris et al27 found that ambulatory individuals with stroke who attained a low BBS score and used a wheelchair or walker for longer distances had lower risk for falls compared with those who had a higher BBS score and only used a cane for ambulation. Apparently, the relationship between balance and falls is not linear and involves the interplay of many other factors. This interplay may partly explain why balance assessment tools, when used alone, may not be effective in predicting falls in people with stroke. Indeed, a number of previous studies have shown that various balance assessment tools commonly used in stroke rehabilitation, such as the BBS and TUG, have limited ability to predict falls after chronic stroke.25–27,59 Second, the fall data were collected retrospectively, which is more susceptible to recall problems and bias than when a prospective design is used for fall data collection. For example, a fall that occurred earlier in the period (eg, 10 months previously) may be less likely to be reported than a fall that occurred more recently (eg, 2 weeks previously). Similarly, a participant may not recall a fall that was relatively inconsequential compared with a fall that necessitated medical attention. Further study should assess the utility of the Mini-BESTest for predicting future falls in patients with stroke.
Our results are in contrast to the findings of Duncan et al,34 who examined the relationship between the Mini-BESTest and recurrent falls during the previous 6 months (retrospective) and future 12 months (prospective) in a sample of 80 patients with PD. Their results showed a strong association of the Mini-BESTest with recurrent falls, both retrospectively and prospectively. The AUC values reported were 0.77 to 0.86, with a sensitivity of 0.62 to 0.88, a specificity of 0.74 to 0.78, an LR+ of 2.4 to 4.0, and an LR− of 0.15 to 0.52. The discordance in results between their study and ours may be explained by the different study populations and research methods. Their study enrolled patients with PD, whereas our sample consisted of only people with chronic stroke. In their study, the Mini-BESTest was used to predict recurrent fallers (those who experienced 2 or more falls), whereas the faller group included both single and recurrent fallers in our study. The fall rate reported also was higher in their study. The proportion of fallers in our study was 23.6%, and only 6.6% were recurrent fallers, whereas 27.5% and 32.5% of their study participants reported recurrent falls in the previous 6 months and the 12-month follow-up period, respectively. The lower fall rate in our study may be due to several factors. First, our sample was relatively young (mean age=57.1 years). Second, the time since the onset of stroke was more than 6 months for all of our participants (median=2.9 years). Thus, they likely had developed compensatory strategies in their adaptation to a chronic and presumably more stable condition. In contrast, the patients with PD in the study by Duncan et al34 were older (mean age=68.2 years) and were coping with a disease that was progressive in nature.
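The likelihood ratios cited above follow from sensitivity and specificity through the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. The sketch below applies these definitions to the endpoints of the ranges quoted from Duncan et al. Pairing the endpoints in this way is an assumption made for illustration, because the text reports only ranges and not which sensitivity accompanied which specificity.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


def negative_likelihood_ratio(sensitivity, specificity):
    """LR- = (1 - sensitivity) / specificity."""
    return (1.0 - sensitivity) / specificity


# Assumed pairings of the quoted range endpoints (illustrative only).
assumed_pairs = [(0.62, 0.74), (0.88, 0.78)]

for sens, spec in assumed_pairs:
    lr_pos = positive_likelihood_ratio(sens, spec)
    lr_neg = negative_likelihood_ratio(sens, spec)
    print(f"sensitivity={sens}, specificity={spec}: "
          f"LR+={lr_pos:.2f}, LR-={lr_neg:.2f}")
```

Under the assumed pairings, the computed values fall close to the quoted ranges for LR+ and LR−. Small differences are to be expected, because the published sensitivity and specificity values are themselves rounded.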
Limitations and Future Research Directions
This study has several limitations. First, because the participants in the stroke group were community-dwelling and most were ambulatory, the results are generalizable only to people with similar characteristics. Further research is needed to validate the Mini-BESTest in people who are in acute or subacute stages of stroke recovery, severely impaired, or institutionalized. Second, the ability to carry on a normal conversation was used as an eligibility criterion, but it may not be equivalent to being able to follow directions. Perhaps a cutoff score of a standardized assessment of cognition should have been used to determine eligibility. Third, the actual number of enrolled participants was higher than that derived from the sample size calculation described in the “Method” section. We received an overwhelming response, and a large number of people volunteered to participate in our study. As there were no substantial budgetary concerns, we decided to measure all volunteers who were eligible. Although the a priori power analysis helped us to determine the minimum sample size required to detect significant findings, a larger sample size presumably would have further increased the statistical power of the study. Indeed, with the current sample size of 106 people with stroke, the power was increased to 0.95, if the alpha level (.05) and acceptable and expected AUC (0.7 and 0.9, respectively) remained the same as originally planned.
We also acknowledge that other clinical balance scales are available for patients with stroke, including the Postural Assessment Scale for Stroke Patients, Trunk Control Test, and many others,14,28,60–62 but were not used for comparison with the Mini-BESTest in this study. We selected only the most commonly used balance assessment tools in stroke rehabilitation and research for comparison. In addition, feasibility of the study and patient fatigue would be concerns if more balance tests were added to the assessment battery. Another interesting research question has to do with the responsiveness of the Mini-BESTest. Godi et al30 found that the Mini-BESTest is more responsive to change in balance ability than the BBS in a sample consisting of patients with different balance disorders. Is the Mini-BESTest more responsive than other balance measures in detecting treatment effects among individuals with stroke at different stages of recovery? Further study is needed to address this interesting and important question.
Overall, although the association of fall history with the Mini-BESTest is limited, the Mini-BESTest remains a better option than other balance measures used in this study to assess balance function in community-dwelling people with chronic stroke who have mild to moderate neurological impairments, as it has excellent reliability and validity, with no significant floor and ceiling effects. Additionally, compared with single-item measures such as the TUG and OLS, the Mini-BESTest is useful in identifying specific postural control problems and directing treatment.
Footnotes
Ms Tsang and Dr Pang provided concept/idea/research design and project management. Ms Tsang, Mr Liao, and Dr Pang provided writing. Ms Tsang and Mr Liao provided data collection. All authors provided data analysis. Dr Pang provided fund procurement and facilities/equipment. Ms Tsang provided institutional liaisons. Ms Tsang, Dr Chung, and Dr Pang provided consultation (including review of manuscript before submission).
Ethics approval for the study was granted by the Ethics Review Committee of the Hong Kong Polytechnic University.
The preliminary data were presented in abstract format at the 21st European Stroke Conference; May 22–25, 2012; Lisbon, Portugal.
Mr Liao was supported by a full-time research studentship granted by the Hong Kong Polytechnic University.
- Received November 13, 2012.
- Accepted April 1, 2013.
- © 2013 American Physical Therapy Association