Abstract
Background Physical performance tests are important for assessing the effect of physical activity interventions in older people with dementia, but their psychometric properties have not been systematically established within this specific population.
Objective The purpose of this study was to determine the relative and absolute test-retest reliability of the 6-m walk test, the Figure-of-Eight Walk Test (F8W), the Timed “Up & Go” Test (TUG), the Frailty and Injuries: Cooperative Studies of Intervention Techniques–4 (FICSIT–4) Balance Test, the Chair Rise Test (CRT), and the Jamar dynamometer. These tests are used to assess gait speed, dynamic balance, functional mobility, static balance, lower-limb strength, and grip strength, respectively.
Design This investigation was a prospective, nonexperimental study.
Methods Older people with dementia (n=58, age range=70–92 years) performed each test at baseline and again after 1 week. Intraclass correlation coefficients (ICC), standard error of measurement (SEM), minimal detectable change (MDC), and log-transformed limits of agreement of Bland-Altman plots were calculated.
Results The relative reliability of the F8W, TUG, and Jamar dynamometer was excellent (ICC=.90–.95) and good for the 6-m walk test, FICSIT–4, and CRT (ICC=.79–.86). The SEMs and MDCs were large for all tests. The absolute reliability of the TUG and CRT was significantly influenced by the level of cognitive functioning (as assessed with the Mini-Mental State Examination [MMSE]).
Limitations The specific etiology of dementia was not obtained.
Conclusions The physical performance tests evaluated are useful for detecting differences in performance between older people with mild to moderate dementia and, therefore, are suitable for cross-sectional or controlled intervention studies. They appear less suitable to monitor clinically relevant intra-individual performance changes. Future studies should focus on the development of more sensitive tests and the identification of criteria for clinically relevant changes in this rapidly growing population.
In the next few decades, the number of people with dementia will increase dramatically.1 Dementia leads not only to cognitive deficits but also to a decline in physical performance.2,3 Together, these declines will reduce the person's capacity to perform instrumental activities of daily living (eg, household activities) and, eventually, basic everyday activities (eg, bathing, eating, dressing).4 The ability to perform these activities is essential to a person's autonomy and, consequently, to his or her quality of life.5
Unfortunately, dementia cannot be cured, but the decline in physical performance can be slowed by physical activity interventions.6 Physical performance can be considered a construct that describes the basic abilities necessary to accomplish physically demanding tasks, with mobility, balance, and strength as the underlying domains.7 These domains can be evaluated by using speed measures or tasks that assess functional mobility,8–11 dynamic balance (eg, balance during walking),12,13 and static balance14 and tests that measure upper-limb15–17 and lower-limb18 strength.
To measure the effect of exercise on these 3 domains in people with dementia, a set of suitable and feasible tests is needed. Within the scope of the present study, “suitable and feasible” implies that the tests also need to be suitable for older people with varying degrees of cognitive impairment. Therefore, test instructions should be simple, and the tests should be easy to administer, perform, score, and interpret, as well as cost-effective. Crucially, the tests also need to be reliable to ensure that changes in test scores reflect changes in performance and are not caused by variability in the test. Apart from fatigue and learning effects, the reliability of such tests is also assumed to be influenced by the characteristics of the individual being assessed, such as age, sex, and level of cognitive impairment.11,19
In the current study, we evaluated the reliability of 6 widely used physical performance tests in older people diagnosed with dementia. Specifically, the focus of our investigations was on examining the tests with regard to their relative reliability (in terms of consistency of within-group position)11,20 and absolute reliability (as reflected in the degree of variation between repeated measurements).21,22 There were several reasons for this specific focus. First, there is evidence to suggest that cognitive impairment affects the reliability of different measurements.23 Second, there are few studies that have tested the reliability of common tests in our population of interest, with 2 studies solely examining their relative reliability in small and selective samples.22,24 The study by Ries and colleagues11 is the only study that systematically evaluated the reliability of functional mobility and endurance outcomes in older people with Alzheimer disease. The authors reported large between-subject variability and recommended minimal detectable change (MDC) scores at the 90% confidence interval (CI) to monitor performance and treatment outcomes.11
The 2-fold goal of our study accordingly was to investigate the relative and absolute test-retest reliability of 6 common physical performance tests gauging mobility, balance, and strength in a group of older people with dementia, while analyzing the effect of cognitive impairment on the reliability measures, and to provide and address the relevance of MDC scores for all outcome measures.
Method
Participants
Our study was approved by the local medical ethics committee. If individuals were eligible for participation, informed consent was obtained from their legal representatives. A total of 58 participants were recruited between 2009 and 2011 from 6 different nursing homes and 2 day care centers around the city of Groningen, the Netherlands. The study started within 2 months of the initial selection, during which time informed consent was obtained and assessments organized and scheduled. All participants were 70 years or older and diagnosed with dementia by the national Care Indication Center (CIZ), whose diagnosis and referral are mandatory in order to gain access to special geriatric care in the Netherlands. The diagnostic criteria from the CIZ are identical to the American Psychiatric Association's Diagnostic and Statistical Manual of Mental Disorders, fourth edition, (DSM-IV) criteria for dementia.25 Exclusion criteria were a score of 9 or lower on the Mini-Mental State Examination (MMSE)26 to prevent measurement errors based on the incapacity to adhere to the protocol,21,27 vision problems hampering mobility or test performance, a history of psychiatric illness (eg, schizophrenia or bipolar disorder), neurological illness (eg, stroke or epilepsy), alcoholism, systemic or other brain diseases that could account for the cognitive impairment, or the use of a wheelchair for mobility or physical problems that could potentially affect physical performance (eg, a sprained ankle or [severe] musculoskeletal disorders).
Physical Performance Tests
The participants performed the assessments of gait speed, functional mobility, and dynamic balance twice during each of the 2 test sessions, all without practice trials.
Gait speed was measured using the 6-m walk test,24 which requires participants to walk 6 m in a straight line at their normal pace. The use of assistive walking devices was allowed. The outcome measure was the mean duration of 2 attempts, converted to walking speed (m/s), with higher scores indicating better performance. The relative reliability of the 6-m walk test has previously been demonstrated to be excellent (intraclass correlation coefficient [ICC]=.92) in older women with moderate dementia (MMSE=17.79, SD=7.17).24
Dynamic balance was assessed with the Figure-of-Eight Walk Test (F8W),12,28,29 which requires participants to walk 2 laps of a standard, 10-m-long course shaped like a figure eight (with 15-cm-wide contours). They are instructed to walk and follow the contours as fast and accurately as possible.13 The fastest of 2 attempts, and thus the best performance, was noted.30 To our knowledge, the reliability of the F8W has not been investigated in older people with dementia, but 2 previous studies did demonstrate that in older people who were cognitively healthy, its relative reliability was excellent (ICC=.92, and ICC=.98, respectively).31,32
Functional mobility was evaluated with the Timed “Up & Go” Test (TUG),10 requiring participants to stand up from a chair, walk 3 m, turn around, walk 3 m back, and sit down again in the same chair, all at their normal pace. The use of hands and normal walking aids was allowed. The outcome measure was the mean (in seconds) of 2 trials, with shorter times indicating better performance. The TUG is reliable and valid for quantifying functional mobility10,33 and has been found to be reliable in older people with Alzheimer disease (ICC≥.95; standard error of measurement [SEM]=2.48; minimal detectable change [MDC]=4.86).11 We included the TUG to allow comparison with the study by Ries et al.11
Static balance was gauged with the Frailty and Injuries: Cooperative Studies of Intervention Techniques–4 (FICSIT–4).14 The participants were asked to adopt 4 different stances (ie, parallel, semi-tandem, tandem, and single-leg stances) with their eyes open and without assistive devices and to try to maintain each stance for 10 seconds, with stances being sequentially adopted. The FICSIT–4 scale score ranges from 0 to 5 (0 for unsuccessful and 1 for successful parallel stance, 2 for semi-tandem stance, 3 if tandem stance was maintained less than 10 seconds, 4 for tandem stance, and 5 for single-legged stance). If a participant maintained the parallel or semi-tandem stance less than 10 seconds but more than 3 seconds, an additional 0.5 point was awarded.14 Higher scores thus indicate better performance. The FICSIT–4 showed moderate reliability (r=.66)14 in older people who were healthy, with pretests and posttests scheduled 3 to 4 months apart. To our knowledge, the scale has not been studied in older people with dementia to date.
Lower-limb strength was assessed with a modified version of the 30-second sit-to-stand test from the Senior Fitness Test.34 To prevent confusion with the original test, we labeled our version the “Chair Rise” Test (CRT). We asked our participants to rise from the chair, stand up straight, and sit down again as often as possible within 30 seconds.18,34,35 To minimize anxiety, prevent differences in the execution of this test, and maximize between-subject comparisons, our participants (in contrast to the original protocol) were allowed to use their hands when rising. The total number of sit-to-stands34 constituted the outcome score, with higher scores indicating better performance. The original sit-to-stand test34 showed good relative reliability among older people who were cognitively healthy (ICC=.84 and ICC=.92, for male and female participants, respectively)18 and has, to our knowledge, not been studied in older people with dementia.
Grip strength was measured with a Jamar dynamometer (Sammons Preston Rolyan, Bolingbrook, Illinois). While standing and holding the dynamometer in their dominant hand, with the arm extended and the palm of their hand facing their leg, the participants were instructed to squeeze the grip as hard as possible. The strongest of 3 attempts (in kilograms) was recorded, with higher values reflecting better performance. The relative reliability of grip strength as measured with the Jamar dynamometer was earlier found to be excellent (ICC=.92)36 in elderly people without cognitive impairment, but moderate (ICC=.72) in older people with dementia.24
Global Cognitive Functioning
The participants' global cognitive abilities were assessed by the primary researcher (C.G.B.), who is a trained neuropsychologist, using the MMSE.26 All participants were assessed in the week prior to their first physical test. Scores on the MMSE range from 0 to 30, with a score below 10 being indicative of severe cognitive impairment and scores between 10 and 19 and between 20 and 24 reflecting moderate and mild cognitive impairment, respectively.37,38
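The MMSE cutoffs described above can be expressed as a small lookup. The following is a minimal sketch rather than part of the study protocol; the function name `mmse_category` and the label used for scores above 24 are hypothetical.

```python
def mmse_category(score):
    """Map an MMSE score (0-30) to the impairment bands described in the text."""
    if not 0 <= score <= 30:
        raise ValueError("MMSE scores range from 0 to 30")
    if score < 10:
        return "severe"      # below 10: severe cognitive impairment
    if score <= 19:
        return "moderate"    # 10-19: moderate cognitive impairment
    if score <= 24:
        return "mild"        # 20-24: mild cognitive impairment
    # Scores above 24 are not labeled in the text; this label is an assumption.
    return "above mild range"
```

Because a score of 9 or lower was an exclusion criterion, only the last 3 bands would occur in a sample selected in this way.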
Procedure
For the practical approaches to optimize the communication with our participants, we refer to the extensive description Ries and colleagues11 provided in their 2009 study of patients with Alzheimer disease. In short, creating a relaxed, pleasant atmosphere and using simple commands were key elements. Each assessment was first demonstrated to the patient, and, if necessary, cues or gestures were provided.11 To keep test conditions comparable, variations in staff training, time of day, location, and sequence of tests were kept to a minimum. To prevent bias, examiners were blinded to previous test scores and, if possible, to the level of cognitive functioning.
All participants performed the 6 physical tests in the same sequence at baseline and at the second session scheduled 1 week later. The tests were all administered at the patients' own nursing homes or day care centers by 5 trained bachelor degree and master degree students from the Human Movement Sciences program of the Center of Human Movement Sciences, University Medical Center Groningen, the Netherlands.
Two of the test sites had insufficient space for the F8W, and 12 participants did not perform this test. Another 6 participants were unable to perform the CRT due to arthritis, knee operations, or other knee problems. One participant could not perform the grip-strength test because of failure of the equipment.
Data Analysis
The data were analyzed using SPSS 16.0 for Windows (SPSS Inc, Chicago, Illinois) and Excel 2003 for Windows (Microsoft Corporation, Redmond, Washington). First, the data were analyzed for skewness, kurtosis, and heteroscedasticity using the Koenker test. When necessary (P<.05), the data were log transformed. Relative test-retest reliability was calculated with the ICC, which reflects the consistency with which the within-group position is maintained.11,20 The ICC was calculated using the 2-way, random, absolute agreement on single measures model with a 95% CI. An ICC above .70 is deemed sufficient for group comparison, but for individual monitoring, the ICC should exceed .90 to .95.39
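To illustrate the 2-way, random, absolute agreement, single measures ICC, the sketch below computes the coefficient from the mean squares of a 2-way ANOVA without replication, using the textbook formulation often denoted ICC(2,1). It is an illustration only: it is not the SPSS routine used in the study, it omits the 95% CI, and the function name `icc_2_1` is hypothetical.

```python
from statistics import mean

def icc_2_1(scores):
    """ICC(2,1): 2-way random effects, absolute agreement, single measures.

    scores: one row per participant, one column per session (eg, test, retest).
    """
    n = len(scores)       # number of participants
    k = len(scores[0])    # number of sessions
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Sums of squares of the 2-way ANOVA without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)               # between-participant mean square
    ms_cols = ss_cols / (k - 1)               # between-session mean square
    ms_err = ss_err / ((n - 1) * (k - 1))     # residual mean square

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
```

Because this is an absolute agreement model, a systematic shift between sessions lowers the coefficient even when the rank order of participants is fully preserved.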
Even with a high ICC, the trial-to-trial consistency of physical measurements can be poor, especially in heterogeneous data sets.20–22 Thus, we also considered their absolute reliability,21,22 which we calculated with the Bland-Altman 95% limits of agreement (LoA) and SEM.20,40,41 To facilitate interpretation of the results, the SEM is reported in the same quantity used for the original measurement (eg, kilograms for grip strength, meters per second for speed, seconds for time). It thus provides the range within which a participant's true score may fall.42 If the SEM is small, indicating high absolute reliability, the true score is close to the recorded score.20 The probabilities of the normal curve then can be applied to the SEM,11 meaning that, with a probability of 68%, the score on a next assessment will be within 1 SEM from the original score. Moreover, with a probability of 95%, the next score for the same participant will be within 2 SEMs from the first score. The following formula was used20:
The 95% CIs for the SEM were calculated as described by Stratford and Goldsmith43:
The abbreviations in the latter formula have the following meaning: SSE=the sum of squared errors in the analysis of variance (ANOVA) table; χ²(α,dfe)=the chi-square value for probability level α; and dfe=the degrees of freedom of the SSE provided in the ANOVA table.43 The square roots of these 2 values provide the borders for the 95% CI of the SEM.43
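The quantities defined above can be combined as in the following sketch. It rests on 2 assumptions that are not taken from the study: the SEM is taken as the square root of the ANOVA mean square error (SSE/dfe), and the interval for the error variance is taken in its commonly stated form, SSE/χ²(1−α/2,dfe) to SSE/χ²(α/2,dfe). To stay within the standard library, the chi-square quantile is approximated with the Wilson-Hilferty formula, so the bounds are approximate; all function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def chi2_quantile_approx(p, df):
    """Wilson-Hilferty approximation to the chi-square quantile (not exact)."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * sqrt(c)) ** 3

def sem_with_ci(sse, dfe, alpha=0.05):
    """Return (SEM, lower bound, upper bound) from the ANOVA error term.

    Assumption: SEM = sqrt(SSE / dfe), ie, the root mean square error.
    """
    sem = sqrt(sse / dfe)
    # Variance bounds are SSE divided by the upper and lower chi-square
    # quantiles; their square roots are the bounds of the SEM.
    lower = sqrt(sse / chi2_quantile_approx(1.0 - alpha / 2.0, dfe))
    upper = sqrt(sse / chi2_quantile_approx(alpha / 2.0, dfe))
    return sem, lower, upper
```

With exact chi-square quantiles from a statistics package, the bounds would differ slightly, mainly for small dfe.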
Finally, to be able to interpret changes in test scores, the MDC with 95% CI was calculated11:
The MDC is the required magnitude of observable change that exceeds the anticipated measurement error and within-subject variability.44 In other words, if a participant's score exceeds the value of the MDC, it can be said to reflect a true change in performance with 95% confidence.
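As an illustration of how an MDC can be derived from the SEM and applied to a pair of scores, the sketch below assumes the commonly used definition MDC95 = 1.96 × √2 × SEM; this definition is an assumption here, although it agrees with the factor of roughly 2.7 SEMs mentioned in the Discussion. The function names are hypothetical.

```python
from math import sqrt

def mdc95(sem):
    """MDC at the 95% level, assuming MDC95 = 1.96 * sqrt(2) * SEM."""
    return 1.96 * sqrt(2.0) * sem

def exceeds_mdc(baseline, retest, mdc):
    """True if the absolute change between 2 scores is larger than the MDC."""
    return abs(retest - baseline) > mdc

# Example with the TUG MDC reported for mild cognitive impairment (3.96 seconds):
# a change from 20.0 to 15.0 seconds exceeds it, a change to 17.0 seconds does not.
tug_large_change = exceeds_mdc(20.0, 15.0, 3.96)
tug_small_change = exceeds_mdc(20.0, 17.0, 3.96)
```

The √2 term reflects that a change score combines the measurement error of 2 assessments.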
The calculations were performed for the total group and stratified by level of cognition, distinguishing between participants with mild cognitive impairment (MMSE≥20) and those with moderate cognitive impairment (MMSE=10–19).37,38 No overlap in the CI of the ICC or the SEM was taken to indicate a statistically significant difference in performance scores for the groups with mild and moderate decline.45
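The non-overlap criterion can be written as a simple interval check. This is a minimal sketch with hypothetical function names and made-up interval values; it only encodes the decision rule described above and is not a formal significance test.

```python
def intervals_overlap(ci_a, ci_b):
    """True if 2 confidence intervals, given as (lower, upper), share any value."""
    lower_a, upper_a = ci_a
    lower_b, upper_b = ci_b
    return lower_a <= upper_b and lower_b <= upper_a

def groups_differ(ci_mild, ci_moderate):
    """Non-overlapping CIs are taken to indicate a significant difference."""
    return not intervals_overlap(ci_mild, ci_moderate)
```

For example, `groups_differ((0.80, 0.90), (0.91, 0.97))` returns `True`, whereas intervals of (0.80, 0.92) and (0.91, 0.97) overlap and would not be flagged.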
For a visual inspection of the similarity between the 2 measurements, Bland-Altman plots were created with the LoA. For nonskewed data, the following formula was used to calculate the LoA46:
Mean difference ± 1.96 SD.
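For nonskewed data, this calculation can be sketched as follows. The function name `limits_of_agreement` is hypothetical, and the sample SD (n−1 denominator) of the paired differences is assumed.

```python
from statistics import mean, stdev

def limits_of_agreement(test, retest):
    """Bland-Altman 95% LoA: mean difference +/- 1.96 SD of the differences."""
    diffs = [second - first for first, second in zip(test, retest)]
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)   # sample SD (n-1 denominator), an assumption
    return mean_diff - 1.96 * sd_diff, mean_diff, mean_diff + 1.96 * sd_diff
```

In a Bland-Altman plot, each difference is plotted against the mean of the 2 measurements, with horizontal lines at the mean difference and at the 2 limits.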
For skewed data, the following formula was used to calculate the LoA46:
with σ²ER reflecting the residual-error variance.
Role of the Funding Source
The research was funded by the University Medical Center Groningen, Groningen, the Netherlands.
Results
Table 1 presents the characteristics of the 58 participants in the final sample. Seventeen participants were male, and 41 were female, with ages ranging from 70 to 92 years. No significant differences in age, sex, or the use of walking aids were found between the participants with mild cognitive impairment (MMSE=20–28) and those with moderate cognitive impairment (MMSE=10–19). However, the differences for place of residence were statistically significant.
Table 1. Characteristics of the Participants
Table 2 presents the relative and absolute reliability values for the 6 physical performance tests for the total group. The relative reliability of the F8W, the TUG, and Jamar dynamometer was excellent (ICC>.90), and good for the 6-m walk test, the CRT, and the FICSIT–4 (ICC=.75–.90). The width of the CI of the ICCs ranged between .05 and .20, with the TUG having the smallest CI and the FICSIT–4 having the largest CI. The SEMs and MDCs, which reflect the absolute reliability of the tests, were large.
Table 2. Descriptive and Reliability Measures of the Physical Performance Tests in the Study Group Based on a 1-Week Test-Retest Interval
The Figure shows the Bland-Altman plots with the 95% LoA for the 6 tests calculated for the total group.40,46 The data of the F8W, the TUG, and the Jamar dynamometer were positively skewed and heteroscedastic, with higher means yielding higher variability, as is reflected by the wider LoAs. The data of the 6-m walk test, the CRT, and the FICSIT–4 were homoscedastic and, consequently, had a constant LoA.
Figure. Bland-Altman plots showing the limits of agreement for the heteroscedastic and the homoscedastic data for the 6 tests evaluated. The 2 measurements were 1 week apart. CRT=Chair Rise Test, FICSIT–4=Frailty and Injuries: Cooperative Studies of Intervention Techniques–4, nmb=number.
Table 3 lists the test scores and reliability values as a function of cognitive functioning (assessed with the MMSE). The CRT was the only test yielding a significant group difference, with participants with milder cognitive deficits achieving better scores. We found no significant between-group difference for relative reliability, but the absolute reliability of the TUG and CRT did show a significant difference, as reflected in their elevated MDCs. The MDC of the TUG was smaller (3.96 seconds) in participants with mild cognitive impairment versus those with moderate cognitive impairment (8.07 seconds). The MDC of the CRT was larger (4.21 stands) in participants with mild cognitive impairment versus those with moderate cognitive impairment (2.30 stands).
Table 3. Baseline and Retest Outcomes (and Standard Deviations) and Reliability Values for the 6 Physical Performance Tests Stratified by Current Cognitive Functioning
Discussion
The main goal of our study was to evaluate the relative and absolute reliability of 6 physical functioning tests in older people (70–92 years) with dementia, with a focus on tests gauging gait speed, dynamic balance, functional mobility, static balance, lower-limb strength, and grip strength. Additionally, we analyzed the effects of cognitive impairment on the reliability coefficients.
Relative Reliability
The results showed that the relative reliability was excellent for the TUG, F8W, and Jamar dynamometer (ICC>.90) and good for the 6-m walk test, CRT, and FICSIT–4 (ICC=.75–.90). The differences in relative reliability between the participants with mild cognitive impairment and those with moderate cognitive impairment were nonsignificant.
The values we obtained for the F8W, Jamar dynamometer, 6-m walk test, and CRT were similar to those earlier reported for similarly aged participants with24 and without18,31,36 dementia. The values we recorded for the TUG were somewhat lower than those Ries et al reported for patients with Alzheimer disease (ICC=.985–.988).11 It is likely that this disparity was caused by differences in the characteristics of the 2 patient groups. The percentage of female participants in our sample was higher than that in the study by Ries and colleagues.11
A study solely evaluating female patients with different subtypes of dementia showed lower relative reliability scores for the TUG (ICC=.87) and the dynamometer test (ICC=.70).24 In general, men are stronger and have more endurance than women, and by excluding male participants, the group becomes more homogeneous, decreasing the relative reliability of these tests. Accordingly, when male and female participants are considered as a single group, it causes an upward bias in the reliability coefficient.
The TUG, F8W, and Jamar dynamometer values exceeded the threshold for minimal acceptable reliability (ICC=.90) and thus may be useful for individual monitoring.11,39 However, for that goal, the absolute reliability also should be considered to establish the within-subject test-retest variability, which we do in the next section.
Given their lower ICC scores, the 6-m walk test, the CRT, and the FICSIT–4 do not appear suitable for individual performance monitoring. However, because all 6 tests exceeded the threshold for group comparisons (ICC>.70),39 they do seem suitable for use in cross-sectional or controlled intervention studies.
Absolute Reliability
The absolute reliability of a test provides an estimate of the precision of its outcome scores on repeated testing.47 The SEM and the MDC are easy to interpret because they are expressed in the same units as the original measure and, as such, are very useful for clinicians to determine individual improvement.42 They conveniently allow the 95% CI (2 SEMs) to be computed for the true score and the range in which a next score, from a stable participant, would be expected. The MDC is based on the SEM, but is more conservative (∼2.7 SEMs). If a score change is larger than the MDC, this difference is not caused by a measurement error or patient variability (with a probability of 95%).11 Because the MDC and SEM are so closely linked, this discussion will focus solely on the MDC.
To interpret the MDC correctly, the variance of the data should remain constant with increasing means (homoscedastic distribution). A homoscedastic distribution was true for the 6-m walk test, FICSIT–4, and CRT. It required an improvement of 0.27 m/s and an increase of 1.52 points for the MDCs of the 6-m walk test and FICSIT–4 to be exceeded. The absolute reliability of the CRT was influenced by the participants' level of cognitive impairment. Consequently, it took an improvement of 4.21 stands (mild cognitive impairment) or 2.30 stands (moderate cognitive impairment) to exceed the MDC. It is possible that the higher absolute reliability for the participants with moderate cognitive impairment is explained by a floor effect.
For the F8W, the TUG, and Jamar dynamometer, the variance did not remain constant with incremental means (heteroscedastic distribution; see Figure). Here, the MDCs should be interpreted more cautiously. Given the heteroscedastic properties of the data, the MDC increases with an increase of the mean (as is reflected by the V-shaped lines in the Bland-Altman plots in the Figure).46 This finding indicates that the participants who attained lower scores on these 3 tests showed less variability than their peers achieving higher scores. Consequently, for the F8W, TUG, and Jamar dynamometer, clinically relevant changes might not be detected as such (for low scores), or the importance of changes might be overestimated (for high scores). These problems should be kept in mind when interpreting their respective MDCs.
For the F8W to exceed the MDC, an improvement of 17.35 seconds was required, and improvement on the dynamometer test needed to be in excess of 7.59 kg. The results of the TUG were affected by the participants' cognitive abilities, requiring an improvement of 3.96 seconds for participants with mild cognitive impairment and 8.07 seconds for those with moderate cognitive impairment. The distinction on the TUG between participants with mild and moderate cognitive impairment is in line with the findings of a study among patients with Alzheimer disease.11
Although the MDC should facilitate the appraisal of individual improvement on certain tests, the large margins of improvement the tests appeared to require (eg, 7.59 kg for grip strength) warrant discussion of their practical relevance. The first issue we will address is whether it is realistic to expect increases in performance larger than the MDC. The second issue we will address is whether performance improvements lower than the MDC have any clinical relevance (which, ideally, should not be the case).
To address the first issue, the systematic review of Blankevoort and colleagues6 shows that only 1 study out of 16 showed a postintervention improvement larger than the MDCs for the TUG, the sit-to-stand test, and gait and balance abilities measured with the Tinetti scale.35 This finding suggests that improvements exceeding the MDC are rarely achieved in practice; thus, these tests are probably unsuitable to quantify treatment effects within this specific population.
Only a limited amount of information about clinical relevance is available. In a study of frail, older adults, among whom were patients with dementia, van Iersel et al concluded that an increased walking speed of 0.21 m/s reduced the (expert-rated) risk of falling.48 This value is below the MDC computed in our study (0.27 m/s), rendering gait speed, as measured with the 6-m walk test, a less suitable measure to detect changes of this magnitude in fall risk. The more sophisticated GAITRite (CIR Systems Inc, Sparta, New Jersey) walkway system yielded a smaller MDC (0.11 m/s)11 and might be more suitable to assess clinically relevant changes in gait speed. Van Iersel and colleagues also judged an improvement of 10.1 seconds on the TUG as clinically relevant.48 As this value is larger than the MDCs computed in our study, the TUG appears suitable to detect clinically relevant improvements of this magnitude (as judged by experts). Unfortunately, we were unable to compare our MDC findings on the other tests with the literature, as we did not find similar studies reporting clinically relevant improvements in older people with dementia.11,49,50
In summary, we conclude that the MDCs obtained for the 6 physical performance tests evaluated limit their applicability to detect individual improvements in older people with mild to moderate cognitive deficits in the targeted domains, as: (1) the increases in performance need to be very large to exceed the MDC, and (2) the MDCs may be too large to allow small, but clinically relevant, changes to be detected. Future research should focus on the development of more sensitive tests to monitor physical performance and identify criteria for clinically relevant changes in this population.
Limitations
This study has several limitations. First, we were unable to retrieve the etiologies (eg, Alzheimer disease or vascular dementia) of the dementia syndromes from the patients' medical records, as diagnoses were mostly reported as “dementia” or “dementia syndrome.” Six participants had MMSE scores higher than 24 (the cutoff for mild cognitive deficit). All 6 were attending geriatric adult day care, which means that they had been diagnosed with dementia according to the DSM-IV criteria, as such a diagnosis is required for CIZ approval to participate in geriatric adult day care. More importantly, the MMSE is a global cognitive screening instrument and thus suitable to differentiate groups, but not appropriate to diagnose individuals.
Second, we modified elements of some of the original test protocols. For example, instructions were repeated if necessary, and hand use was allowed in the CRT, our equivalent of the sit-to-stand test. These adjustments may have influenced the comparative validities of the tests. Given the correlation between upper- and lower-extremity strength (r=.50), it is not likely that the use of hands had a large effect on the outcome of our CRT, although further research is necessary to determine the exact impact.
Third, our sample size was based on convenience, and a post hoc analysis showed that, for most tests, a sample of 50 individuals was required, but as 58 participants completed our test, this did not pose a problem.
Fourth, because the participants were tested at their place of residence and because examiners had to interact with the participants, the examiners could not be completely blinded to the level of cognitive functioning. The examiners did not, however, have any information regarding the MMSE scores of the participants at the moment of testing.
Finally, although the generalizability of our study appears adequate given the heterogeneity of the participants, its generalizability might be hampered by the limited geographical variability.
Conclusion
The relative reliability of the 6 physical performance tests—6-m walk test, F8W, TUG, FICSIT–4, CRT, and the Jamar dynamometer—was good to excellent. The tests are thus all applicable for cross-sectional and controlled intervention studies of older people with mild to moderate dementia. However, their MDC values were large, which seriously complicates the detection of clinically relevant changes in this population. Future research should focus on the development of more sensitive tests to assess and monitor physical performance in people with dementia and to define criteria for clinically relevant changes.
Footnotes
- All authors provided concept/idea/research design and writing. Mr Blankevoort provided data collection. Dr Scherder provided data analysis. The authors thank all of the students, participants, and institutions for their cooperation.
- The research was funded by the University Medical Center Groningen.
- Received May 20, 2011.
- Accepted September 4, 2012.
- © 2013 American Physical Therapy Association