[Editor's note: Both the letter to the editor by Franchignoni and Giordano and the response by Padgett and colleagues are commenting on the accepted but unedited author manuscript version of this article that was published ahead of print on June 7, 2012.]
We read with interest the article by Padgett et al1 in which they presented a new short version of the BESTest (Brief-BESTest) and compared some of its metric properties against the established BESTest2 and the Mini-BESTest.3 A few shortcomings of the authors' work and a misinterpretation of our article3 prompt us to respond.
Padgett et al justified the need for a new shortened version of the BESTest, different from the Mini-BESTest, on the basis of anecdotal reports suggesting that the latter remains too lengthy (about 15 minutes) given increasing constraints on patient contact time in the clinic.1 Moreover, the authors perceived the lack of a scale more in line with the theoretical objective of the BESTest, which is to provide a global assessment of multiple constructs that influence postural control. Thus, the authors created the Brief-BESTest by selecting the “most representative item” in each of the 6 balance domains listed in the BESTest,2 based on item-total correlation.1 Although conceptually appealing, choosing only 1 item to cover each “domain” of postural control, as interpreted by the BESTest,2 may be psychometrically unsound and needs to be strongly justified: even in a short questionnaire, 3 or more items usually are needed to define a construct or dimension4 (eg, as in the BESTest), especially if the new scale is meant to inform clinicians “to direct interventions on the basis of these impairments.”1
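For readers unfamiliar with the selection rule described above, the following is a minimal sketch and not the authors' actual procedure. It assumes that "item-total correlation" means the corrected (item-rest) Pearson correlation against the full-scale total. The source does not say which variant was used. The item names, domain names, and scores are hypothetical toy values.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def select_items(scores, domains):
    """Return, for each domain, the item with the highest item-rest correlation.

    scores:  dict mapping item name -> list of scores, one per patient
    domains: dict mapping domain name -> list of item names in that domain
    Assumption: the total is the sum of ALL items, and each item is removed
    from the total before correlating (corrected item-total correlation).
    """
    n_patients = len(next(iter(scores.values())))
    totals = [sum(scores[item][p] for item in scores) for p in range(n_patients)]

    def item_rest_r(item):
        rest = [t - s for t, s in zip(totals, scores[item])]
        return pearson(scores[item], rest)

    return {domain: max(items, key=item_rest_r) for domain, items in domains.items()}


# Hypothetical toy data: 2 domains, 2 items each, 5 patients.
toy_scores = {
    "A1": [0, 1, 2, 3, 3],
    "A2": [2, 0, 3, 1, 2],
    "B1": [0, 1, 1, 2, 3],
    "B2": [3, 2, 2, 0, 1],
}
toy_domains = {"A": ["A1", "A2"], "B": ["B1", "B2"]}

print(select_items(toy_scores, toy_domains))
```

Because the chosen item depends entirely on the correlations observed in one sample, a different cohort can return a different item for the same domain.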
No dimensionality analysis, to our knowledge, has ever supported this 6-domain structure of the BESTest. Only when the dimensionality of the postural control system has been determined with appropriate samples and techniques may the researcher proceed to consider the creation and scoring of subscales (giving a profile instead of a summary measure) or the application of multidimensional models.5 In addition, the authors' statement “[t]he construction of the Mini-BESTest implies that postural control represents a single construct” is a misinterpretation of our position, which is clearly stated in our article on the Mini-BESTest3: “parts I ‘Biomechanical constraints’ and II ‘Stability limits’ of the BESTest warrant separate psychometric studies” because they “are also important facets of postural control, but appear to be independent of the construct ‘dynamic balance.’”
This means that comprehensive analysis of postural control probably requires a multidimensional approach, and the Mini-BESTest represents a tool with sound psychometric properties for measuring the dimension “dynamic balance.” Among these psychometric properties, as determined by Rasch analysis (RA), are unidimensionality and additivity: without them, the interpretation of the raw scale scores is ambiguous6 and potential confusion could arise in interpreting the meaningfulness of changes in the measure.7 Incidentally, the study confirms the robustness of the Mini-BESTest, even in a population designed to maximize the influence on postural control of mechanical constraints and limits of stability (which are not directly explored by the instrument). Another statement from the authors calls for a brief comment: “Alternative methods, such as classical test theory (CTT), therefore, may offer another approach to shortening the BESTest.” Rasch analysis and CTT are not alternative methods; they complement each other to provide a thorough analysis of an instrument, each contributing a wealth of psychometric information.8,9
Furthermore, we are not confident about the replicability of Padgett and colleagues' results. The case mix and the limited sample size of the cohort used to define the new scale prevent any generalization and substantial validation, with results that remain group- and test-dependent.7 No attempt was made to evaluate the robustness of the estimates, and the conclusion that the Brief-BESTest is superior to the Mini-BESTest in identifying “fallers” lacks formal testing for differences between the measures. The absence of summary statistics adds to the uncertainty. To check our doubts, we applied the item selection process proposed by the authors (based on item-total correlation) to the group of 115 patients with diverse neurological diagnoses and disease severity reported in our article on the Mini-BESTest.3
As we suspected, Padgett and colleagues' method would have produced a different short version of the BESTest in this sample. In addition, to analyze the psychometric performance of the Brief-BESTest in greater depth and allow, among other things, a broader range of interpretation at item level, we analyzed the 8 items of the Brief-BESTest (embedded in the BESTest) with the same mix of CTT (factor analysis) and RA approaches used in our article on the Mini-BESTest. The main results were as follows:
- The exploratory factor analysis showed the presence of at least 2 factors (with eigenvalues of 4.08 and 1.25), with the first 2 items (hip abduction and forward functional reach) loading lower than 0.4 on the first factor. A similar observation could be inferred from the principal component analysis of the standardized residuals performed by the RA.
- Item 1 (hip abduction) misfitted the Rasch model, further confirmation of an “off dimension” item.
- The person separation reliability of the scale was .76 (very low), with a Cronbach alpha of .84.
- The average item difficulty estimates spanned only from −1.11 (functional reach) to +0.87 (one leg standing) logits, showing limited capability of the Brief-BESTest items to analyze a wide range of patient capabilities (for comparison, the Mini-BESTest item difficulties ranged from −4 to +2.5 logits).
- Items 5 (leftward compensatory stepping) and 6 (rightward compensatory stepping) showed local dependence (ie, item responses are related to each other when trait level is controlled), a situation known to produce inflated relevance of the items in a short scale.
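As an illustration of the classical reliability index quoted in the list above, the following minimal sketch computes Cronbach alpha from an item-by-patient score table using the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total score). The function name and the toy scores are hypothetical. The sketch does not reproduce the analysis reported in this letter and says nothing about the Rasch-based person separation reliability.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach alpha for a list of items, each a list of scores (one per patient).

    Uses sample variances (n - 1 denominator) for both items and total score.
    """
    k = len(items)
    if k < 2:
        raise ValueError("Cronbach alpha needs at least 2 items")
    # Total score per patient: sum across items.
    totals = [sum(patient_scores) for patient_scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical toy scores: 2 items rated on 4 patients.
toy_items = [
    [1, 2, 3, 4],
    [2, 1, 4, 3],
]
print(round(cronbach_alpha(toy_items), 2))
```

Locally dependent items, such as two near-duplicate stepping items, raise the inter-item covariance and therefore alpha, which is one reason alpha alone can overstate the reliability of a short scale.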
In conclusion, all of these findings compare unfavorably with the Mini-BESTest, suggesting, at best, the need for further, more sophisticated analysis to develop alternate short forms of the BESTest. Given the complexity of postural control, if we want to assess it in the clinical setting (or tailor a physical therapy program to the individual patient), we should not be too preoccupied with “increasing constraints on patient contact time in the clinic”: “hastily done is ill done.”
Footnotes
- This letter was posted as a Rapid Response on July 9, 2012, at ptjournal.apta.org.
- © 2012 American Physical Therapy Association