Abstract
Background. The modified Dynamic Gait Index (mDGI) measures the capacity to adapt gait to complex tasks utilizing 8 tasks and 3 facets of performance. The measurement stability of the mDGI in specific diagnostic groups is unknown.
Objective. This study examined the psychometric properties of the mDGI in 5 diagnostic groups.
Design. This was a cross-sectional, descriptive study.
Methods. A total of 794 participants were included in the study: 140 controls, 239 with stroke, 140 with vestibular dysfunction, 100 with traumatic brain injury, 91 with gait abnormality, and 84 with Parkinson disease. Differential item functioning analysis was used to examine the comparability of scores across diagnoses. Internal consistency was computed using Cronbach alpha. Factor analysis was used to examine the factor loadings for the 3 performance facet scores. Minimal detectable change at the 95% confidence level (MDC95%) was calculated for each of the groups.
Results. Less than 5% of comparisons demonstrated moderate to large differential item functioning, suggesting that item scores had the same order of difficulty for individuals in all 5 diagnostic groups. For all 5 patient groups, 3 factors had eigenvalues >1.0 and explained 80% of the variability in scores, supporting the importance of characterizing mobility performance with respect to time, level of assistance, and gait pattern.
Limitations. There were uneven sample sizes in the 6 groups.
Conclusions. The strength of the psychometric properties of the mDGI across the 5 diagnostic groups further supports the validity and usefulness of scores for clinical and research purposes. In addition, the meaning of a score from the mDGI, regardless of whether at the task, performance facet, or total score level, was comparable across the 5 diagnostic groups, suggesting that the mDGI measured mobility function independent of medical diagnosis.
The recovery of mobility, especially walking, is an important goal in both the neurologic and geriatric populations and, therefore, an essential part of rehabilitation.1–3 Walking in daily life requires the ability to adapt gait to a variety of complex tasks and environmental demands.2 The Dynamic Gait Index (DGI) is a commonly used clinical measure that evaluates the capacity to adapt gait to complex walking tasks encountered in everyday life.4 Many studies have examined the psychometric properties of the DGI in a number of patient populations, including those with stroke,5,6 Parkinson disease (PD),7 multiple sclerosis (MS),8–10 and vestibular dysfunction.11–13 Despite its psychometric strength, the original DGI had a number of limitations, including a scoring system that combined 3 aspects of performance (gait pattern, level of assistance, and time) into a single ordinal score, as well as a ceiling effect in high-functioning populations.11,14–16
A modified version of the DGI (mDGI) was recently developed by Shumway-Cook and colleagues.17 The mDGI retains the original 8 tasks but expands the scoring system to evaluate 3 correlated but unique aspects of walking performance: gait pattern, level of assistance, and time. This initial study investigated the psychometric properties of the mDGI in a sample of 855 adults with mobility limitations associated with a variety of neurologic diagnoses and 140 control participants with no neurological impairment. The mDGI demonstrated good psychometric properties in these 2 groups, including strong evidence for reliability (internal consistency, interrater agreement, and test-retest reliability); good internal validity; and evidence for discriminant validity at the individual task, performance, and total score levels. Factor analysis supported the unique contribution of the 3 aspects of performance, and Rasch analysis showed that the expanded scoring system enabled a greater range of measurement with minimal ceiling effect.
This previous research examined the mDGI in a large and diverse sample of people with mobility limitations due to a variety of neurologic diagnoses; however, the psychometric properties of the mDGI in individual patient populations are not yet known. Effective measurement requires that items within a scale have the same rank order of difficulty across different groups, which is referred to as item invariance. Item invariance is investigated using differential item functioning (DIF) analysis, which flags items within a scale that function differently in various populations. The presence of DIF suggests that items within a scale may vary in difficulty, depending on characteristics of the person being tested. Three levels of DIF are identified: negligible, slight to moderate, and moderate to large. Linacre and Wright18 have shown that a Rasch difficulty difference of 0.64 is equivalent to a moderate to large DIF level. A study by Dye et al11 used DIF to examine whether population characteristics affected the measurement characteristics of the original DGI in 117 patients with either dizziness or imbalance. The study showed minimal DIF associated with the characteristics of fall history, age, sex, and symptom (dizziness versus imbalance), which supports the item invariance of the original DGI for these 2 populations (patients with dizziness versus imbalance). However, the item invariance (DIF) of the mDGI in a broader category of diagnoses is not known. Thus, one goal of the current study was to examine item invariance using DIF in the mDGI across 5 patient groups.
In addition to item invariance, psychometric analysis must consider construct invariance across groups. Construct invariance means that scores have the same meaning across groups and do not vary as a function of patient characteristics (eg, age, sex, medical diagnosis). Construct invariance is investigated through factor analysis to determine whether the factor structure of the scale is the same across groups. The previous study on the mDGI established 3 correlated factors for the mDGI scores in a large and diverse population of patients with mobility limitations.17 The current study extended this research by examining construct invariance using factor analysis in 5 different patient populations that were a subset of the larger sample used in Shumway-Cook et al.17 It is possible that statistical results from a large heterogeneous sample mask results for specific diagnostic groups. The initial study had a large sample size (N=995) and a wide variety of neurologic diagnoses. The current study examined the degree to which results from this large and diverse sample applied to 5 specific patient populations within the larger sample.
Thus, the overall purpose of this study was to examine the measurement properties of the mDGI in 5 diagnostic populations (stroke, PD, vestibular dysfunction, gait abnormality, and traumatic brain injury [TBI]) in order to: (1) determine the item invariance of the mDGI using DIF analysis and (2) examine construct invariance of the mDGI scores, including internal consistency and independence of performance facet scores, using factor analysis.
Method
An in-depth description of the methods used to investigate the reliability and validity of the mDGI was previously reported.17 A brief overview of the study's methods is presented here.
Recruitment
An e-mail was sent via the American Physical Therapy Association's Section of Neurology listserv to recruit potential clinical sites. Within each of these sites, participants with neurologic impairments who were currently receiving physical therapy for balance and mobility problems were evaluated using both the DGI and mDGI. A potential participant needed to be able to walk 6.1 m (20 ft) without the physical assistance of another person. The use of an assistive device was permitted. For the control cohort, a convenience sample of adults was recruited from volunteers responding to a flyer posted in the University of Washington Department of Rehabilitation Medicine and in retirement communities in the greater Seattle-Bellevue area. Inclusion criteria for the control cohort were: age between 15 and 99 years, no neurologic diagnosis, ability to walk a distance of 6.1 m without the physical assistance of another person, not currently receiving physical therapy, and ability to give informed consent. Participants provided written informed consent prior to testing.
The analyses presented in this article used a subset of data (794 participants) presented in a previous publication.17 Previous research suggests that DIF detection is difficult with sample sizes less than 100.19,20 However, it has been reported that sample sizes of 60 are adequate to identify moderate DIF.21 Therefore, 5 diagnostic groups with at least 60 participants were included in the analyses. These diagnostic groups included participants diagnosed with stroke, PD, vestibular dysfunction, TBI (including concussion and closed head injury), and gait abnormality, defined as patients referred for physical therapy under the International Classification of Diseases, ninth edition (code 781.2: gait abnormality) and who had no other neurologic diagnoses.
Modification of the DGI
All 8 items from the original DGI were retained with minor modifications made to distance ambulated (6.1 m) and 4 test items (change of pace, stepping over obstacles, pivot turn, and stairs).17 The original scoring system was modified to establish ordinal scores for 3 separate aspects of walking performance: gait pattern (0–3), level of assistance (0–2), and time, which also was converted to an ordinal scale (0–3). Performance scores at the task level were calculated by adding the scores for time, gait pattern, and level of assistance, resulting in a score ranging from 0 to 8 for each of the 8 tasks. A total score for each of the 3 aspects of performance was calculated, characterizing walking performance with respect to time (range=0–24), gait pattern (range=0–24), and level of assistance (range=0–16). A total score for the mDGI was calculated by combining the 3 performance scores, for a total score range from 0 to 64.
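The scoring arithmetic described above can be illustrated with a short Python sketch. It is not the authors' scoring software; the names `TaskScore` and `score_mdgi` are hypothetical, and only the score ranges are taken from the text.

```python
from dataclasses import dataclass

# Maximum ordinal score per performance facet, as described in the text.
FACET_MAX = {"time": 3, "gait_pattern": 3, "assistance": 2}
N_TASKS = 8


@dataclass(frozen=True)
class TaskScore:
    """Facet scores for one mDGI task (hypothetical container)."""
    time: int          # 0-3
    gait_pattern: int  # 0-3
    assistance: int    # 0-2

    def __post_init__(self):
        # Reject values outside the ordinal range of each facet.
        for facet, upper in FACET_MAX.items():
            value = getattr(self, facet)
            if not 0 <= value <= upper:
                raise ValueError(f"{facet} must be in 0..{upper}, got {value}")

    @property
    def total(self) -> int:
        """Task-level score (range 0-8)."""
        return self.time + self.gait_pattern + self.assistance


def score_mdgi(tasks):
    """Return the 3 facet totals and the mDGI total for 8 task scores."""
    if len(tasks) != N_TASKS:
        raise ValueError(f"expected {N_TASKS} tasks, got {len(tasks)}")
    facets = {f: sum(getattr(t, f) for t in tasks) for f in FACET_MAX}
    return {**facets, "total": sum(facets.values())}
```

A participant with the maximum score on every task would obtain facet totals of 24, 24, and 16, and an mDGI total of 64, matching the ranges given above.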
DIF
For the purpose of this article, item score is the term used to describe mDGI task subscores (time level, gait pattern, and level of assistance scores for each of the 8 tasks). Differential item functioning analysis was used to investigate item score invariance across groups and to determine whether the 8 mDGI tasks and the 3 performance facet scores function in the same way for the control cohort and the 5 diagnostic groups. Total task scores (range=0–8) were calculated by summing the 3 performance facet scores for each task (eg, time level score [0–3], gait pattern score [0–3], and level of assistance [0–2]). For the DIF analysis, item scores were compared between the control cohort and each of the 5 diagnostic groups, as well as between each diagnostic group and the other 4 diagnostic groups. Given the large number of DIF comparisons in this analysis (ie, 24 subscores × 2 comparisons per diagnostic group × 5 diagnostic groups, for a total of 240 comparisons), the likelihood of identifying DIF by chance was increased dramatically. If experiment-wise type 1 error rate is .05, we could expect to see 5% of the comparisons flagged for DIF by chance alone. To minimize type 1 error (the identification of DIF by chance alone), the following 2 criteria were used to identify items with moderate to large DIF: (1) the standard criterion for identification of moderate to large DIF is a contrast in Rasch item difficulty (c) ≥0.64,18 and (2) to control for experiment-wise error, we set the alpha level at P<.001.
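The comparison count and the two-part flagging rule can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the contrast and P values would come from a separate Rasch analysis that is not shown, and taking the absolute value of the contrast is an assumption, because the text gives the criterion only as c≥0.64.

```python
N_SUBSCORES = 24           # 8 tasks x 3 performance facet scores
COMPARISONS_PER_GROUP = 2  # vs control cohort, vs other diagnostic groups
N_GROUPS = 5               # diagnostic groups

DIF_CONTRAST_MIN = 0.64    # moderate to large DIF (Rasch difficulty contrast)
ALPHA = 0.001              # significance level used to limit chance findings


def total_comparisons():
    """Number of DIF comparisons described in the text."""
    return N_SUBSCORES * COMPARISONS_PER_GROUP * N_GROUPS


def is_flagged(contrast, p_value):
    """Flag a comparison only when BOTH criteria are met.

    abs() is an assumption: a contrast can favor either group.
    """
    return abs(contrast) >= DIF_CONTRAST_MIN and p_value < ALPHA


def percent_flagged(results):
    """Percentage of (contrast, p_value) pairs flagged for DIF."""
    if not results:
        raise ValueError("no comparisons supplied")
    flagged = sum(is_flagged(c, p) for c, p in results)
    return 100.0 * flagged / len(results)
```

With 240 comparisons, a 5% chance rate corresponds to about 12 flagged comparisons, which is why a large contrast and a stringent P value were both required.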
Factor Analysis
The factor analysis procedure uses correlations among scores from different variables to identify variables that cluster together.22 Each cluster represents a “latent” factor that “causes” an examinee's performance on the variables in the cluster. A latent factor is a theoretical underlying factor hypothesized to influence a number of observed variables.23 The number of latent factors in a set of data is determined using a statistic called an eigenvalue. To decide how many latent factors define the correlations among all variables, 3 criteria are generally used: eigenvalues >1, percentage of total variance >60%, and scree plots, which visually demonstrate whether additional factors contribute to the meaning of the results. Because scree plots are a plotting of the eigenvalues, using scree plots is redundant and was not done in this study. However, the first 2 criteria were used. Factor loadings are correlations between variable scores and scores from the latent factor. When researchers state that a variable “loads” on a factor, they mean that scores from the variable are highly correlated with the latent factor.
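The 2 retention criteria used in this study (eigenvalues >1 and total variance explained >60%) can be expressed in a few lines of Python. The sketch below assumes the eigenvalues of the correlation matrix have already been obtained from statistical software. The function names are hypothetical, and the eigenvalues in the usage note are made-up illustrative numbers, not study data.

```python
def retained_factors(eigenvalues, min_eigenvalue=1.0):
    """Count factors whose eigenvalue exceeds the cutoff (eigenvalue >1 rule)."""
    return sum(1 for ev in eigenvalues if ev > min_eigenvalue)


def percent_variance(eigenvalues, n_factors):
    """Percentage of total variance explained by the n largest eigenvalues.

    For a correlation matrix, the eigenvalues sum to the number of variables,
    so dividing by their sum gives the proportion of total variance.
    """
    if not 0 <= n_factors <= len(eigenvalues):
        raise ValueError("n_factors must be between 0 and len(eigenvalues)")
    ordered = sorted(eigenvalues, reverse=True)
    return 100.0 * sum(ordered[:n_factors]) / sum(ordered)


def meets_criteria(eigenvalues, min_percent=60.0):
    """Return (number of factors retained, whether they explain > min_percent)."""
    k = retained_factors(eigenvalues)
    return k, percent_variance(eigenvalues, k) > min_percent
```

For example, with illustrative eigenvalues of 2.5, 1.25, 0.5, 0.5, and 0.25 for 5 variables, 2 factors exceed 1.0 and together explain 75% of the total variance, so both criteria would be met.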
We used SPSS version 19.0 (SPSS Inc, Chicago, Illinois) to conduct factor analyses for each group with a sample size larger than 60. We conducted an exploratory factor analysis with oblique rotation as was done in the initial mDGI validity studies. Oblique rotation was applied in recognition of the inevitable intercorrelations among scores. We expected that both gait pattern and level of assistance were likely to affect the time taken for an individual to complete a task. Rather than allow the number of factors to vary, we set the factor number at 3 to assess whether the 3 factors found in the initial validity study, reported by Shumway-Cook et al,17 would generalize to the subgroups investigated in the present study. The patient population used in the study by Shumway-Cook et al17 included a larger sample size (N=855) and a wider variety of diagnostic groups. It is possible that the internal structure of scores for this large and heterogeneous sample masked a unique structure for a specific diagnostic group. Therefore, it was important to determine whether the factor structure found in the aggregate sample reported by Shumway-Cook et al17 also was found for each of the 5 diagnostic groups that were the focus of the current study. Thus, we investigated the percentage of variance explained, the correlations among factors, and the factor pattern loadings for each mDGI score in each of the 5 diagnostic groups.
Alpha Coefficients
To investigate whether mDGI scores are reliable for individuals in different diagnostic groups, we obtained alpha coefficients for each of the mDGI scores (mDGI total, facet total, and task total scores) for the 5 diagnostic groups.
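For readers unfamiliar with the statistic, the standard Cronbach alpha formula can be sketched in Python as shown below. This is not the authors' analysis code (the study does not state how alpha was computed beyond naming the statistic); the function name and the data layout are assumptions.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach alpha for a score matrix.

    rows: one list of item scores per participant (all the same length).
    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    if len(rows) < 2:
        raise ValueError("need at least 2 participants")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least 2 items")
    if any(len(r) != k for r in rows):
        raise ValueError("all participants must have the same number of items")

    columns = list(zip(*rows))                     # item-wise scores
    item_variance_sum = sum(variance(col) for col in columns)
    total_variance = variance([sum(r) for r in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)
```

The same function could be applied to the 8 task scores, to the item scores within one performance facet, or to all 24 item scores, depending on which mDGI score is being examined.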
Minimal Detectable Change
The minimal detectable change at a 95% confidence level (MDC95%) was calculated for the mDGI total score and each of the 3 performance facet scores, for the control cohort and each of the 5 diagnostic groups, using the alpha coefficient for each group as the reliability estimate. The MDC95% was calculated by first computing a classical standard error of measurement (SEM) for each score using the equation:
$$\mathrm{SEM} = s_x \sqrt{1 - \alpha_x}$$
where $s_x$ is the standard deviation for the total or facet score and $\alpha_x$ is the alpha coefficient (estimate of reliability) for the total score or facet score. The SEM can be used to identify a score band within which the true score of the examinee falls (a confidence interval). An interval of ±1.96 × the SEM × the square root of 2 is the band within which we have 95% confidence that an examinee's true score is likely to fall.24 Score changes outside of that interval are likely to be true changes.
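The calculation can be sketched in Python, assuming the classical form SEM = standard deviation × √(1 − alpha) and MDC95% = 1.96 × SEM × √2. The function names are hypothetical, and the numbers in the usage note are illustrative only, not study results.

```python
import math


def sem(sd, alpha):
    """Classical standard error of measurement: sd * sqrt(1 - reliability)."""
    if sd < 0:
        raise ValueError("standard deviation must be non-negative")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return sd * math.sqrt(1.0 - alpha)


def mdc95(sd, alpha):
    """Minimal detectable change at the 95% confidence level.

    The sqrt(2) term reflects that a change score involves 2 measurements,
    each carrying its own measurement error.
    """
    return 1.96 * sem(sd, alpha) * math.sqrt(2.0)
```

For instance, a score with a standard deviation of 10 and an alpha coefficient of .91 would have an SEM of 3.0 and an MDC95% of about 8.3.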
Finally, we compared the percentage of individuals in each group scoring the highest possible score (ceiling effect) or the lowest possible score (floor effect) in the original DGI and the mDGI to determine whether the mDGI extended the range of measurement beyond the DGI.
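The floor and ceiling comparison amounts to counting how many participants obtain the lowest or highest possible score on each test. A minimal Python sketch follows; the function name is hypothetical, and the score ranges (0 to 24 for the DGI, 0 to 64 for the mDGI) are those reported in this article.

```python
def floor_ceiling_percent(scores, min_score, max_score):
    """Percentage of participants at the lowest and highest possible score."""
    if not scores:
        raise ValueError("no scores supplied")
    n = len(scores)
    at_floor = sum(1 for s in scores if s == min_score)
    at_ceiling = sum(1 for s in scores if s == max_score)
    return {"floor": 100.0 * at_floor / n, "ceiling": 100.0 * at_ceiling / n}


# Score ranges for the 2 instruments compared in this study.
DGI_RANGE = (0, 24)
MDGI_RANGE = (0, 64)
```

Calling `floor_ceiling_percent(group_scores, *MDGI_RANGE)` for each group, and the same with `DGI_RANGE` for the original DGI scores, would give the percentages being compared.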
Role of the Funding Source
This study was supported by a grant from the Walter C. and Anita C. Stolov Research Fund, Department of Rehabilitation Medicine, University of Washington.
Results
Participants/Sociodemographics
The analyses presented in this article used a subset of data (794 participants) presented in a previous publication17 and included individuals with stroke (n=239), vestibular dysfunction (n=140), TBI (including head injury and concussion) (n=100), gait abnormality (n=91), and PD (n=84), as well as individuals with no neurological impairment (the control cohort) (n=140). Table 1 summarizes sociodemographics by diagnostic group. Mean age in the control cohort was 66 years, and mean age in the 5 diagnostic groups varied from 54 years (TBI) to 80 years (gait abnormality). Although none of the participants in the control cohort used a gait device, the percentage of participants in the 5 diagnostic groups using a gait device varied from 24% (PD) to 45% (stroke and gait abnormality). The percentage of participants reporting falls was 39% in the control cohort and varied from a low of 44% in participants with vestibular dysfunction to 68% in participants diagnosed with gait abnormality.
Table 1. Sociodemographics of Sample
DIF Analysis
Differential item functioning analysis was used to investigate whether mDGI item score difficulties were invariant across the 5 diagnostic groups. Two item difficulty comparisons were made for each diagnostic group: (1) between each diagnostic group and the control cohort and (2) between each diagnostic group and the performance of all other groups combined. The number of item scores demonstrating moderate to large DIF was <5% based on our criteria (c≥0.64 and P<.001). The percentages of scores demonstrating moderate to large DIF by diagnosis were: stroke <2%, vestibular dysfunction <3%, PD <1%, TBI 0%, and gait abnormality 0%. Not only was DIF minimal across the different comparisons, but there was no discernible pattern in the item scores that were flagged for moderate to large DIF in the comparisons within and across groups. No item score showed DIF in more than one comparison, nor was there any DIF within a diagnostic group that could be explained by characteristics of the given group. Given that DIF was <5% across all of the comparisons and that there were no discernible patterns in DIF, it was possible to assume task score invariance across groups and to proceed with the remaining analyses. (Specific DIF results are available from the authors.)
Performance Facet Analysis
Factor analysis.
Table 2 presents the factor pattern matrix from the exploratory factor analysis for the control cohort and 2 of the 5 diagnostic groups: stroke and vestibular dysfunction. For the control cohort, 2 factors had eigenvalues >1.0 and explained 84% of the total variance. For this cohort, both gait pattern and time loaded on the same factor and represented 16% of unique variance, while level of assistance was a separate factor representing 14% of unique variance. In contrast, the factor structure found for each of the 5 diagnostic groups was similar to that reported for the large and heterogeneous sample in the study by Shumway-Cook et al.17 Across these groups, 3 factors had eigenvalues >1.0: time, gait pattern, and level of assistance. The 3-factor solution explained between 74% and 79% of the total variability in scores across the 5 diagnostic groups. Correlations between the factors ranged as follows: between time and gait pattern, the range was r=.41 to r=.68; between gait pattern and level of assistance, the range was r=.29 to r=.60; and between level of assistance and time, the range was r=−.39 to r=−.63. Within each analysis, the 3 factors explained different amounts of unique variance. The unique variance explained by the time factor ranged from 11% to 13%; the unique variance explained by the gait pattern factor ranged from 9% to 12%; and the unique variance explained by the level of assistance factor ranged from 5% to 11%. These data suggest that the factor structure for the mDGI is the same for all 5 diagnostic groups. In addition, the results suggest that, although time and gait pattern could be used as proxies for one another in the control cohort, among individuals with disabilities, all 3 performance facet scores are needed to characterize mobility function.
Table 2. Comparing the Factor Pattern Matrix for mDGI Scores for Control Cohort and 2 Diagnostic Groups
Internal consistency for task, performance facet, and total scores.
As shown in Table 3, the internal consistency estimates for mDGI total score and the 3 performance facet scores were quite strong, with alpha coefficients >.90, suggesting that these scores can be reliably used in each of the 5 diagnostic groups. Alpha coefficients for individual task scores were lower but still acceptable. Lower alpha coefficients for task scores were expected because each task score was a composite of scores related to gait pattern, time, and level of assistance.
Table 3. Alpha Coefficients for Total, Performance, and Individual Task Scores Across Groups
Minimal Detectable Change
A comparison of MDC95% for the mDGI across the groups is summarized in Table 4. The average MDC95% for mDGI total score was 7.0 for all 5 diagnostic groups combined, with a range from 6.8 (PD) to 7.4 (stroke and TBI). The average MDC95% for time total score was 3.1 for all groups combined, with a range from 2.8 (gait abnormality) to 3.6 (vestibular dysfunction). The average MDC95% for gait pattern total score was 4.0 with a range from 3.7 (PD) to 4.1 (TBI). Finally, the average MDC95% for level of assistance was 2.4, with a range from 1.5 (PD) to 2.8 (stroke).
Table 4. MDC95% by Group for mDGI Total Score and Performance Facet Scores
Floor and Ceiling Effects
In 5 of 6 groups, the percentage of individuals scoring the highest possible total score was lower for the mDGI than for the DGI. Specific group differences were as follows: Forty-four percent of the control cohort scored 24 out of 24 on the DGI, while only 33% scored 64 out of 64 on the mDGI. Among participants with stroke, 4% scored 24 out of 24 on the DGI, and only 0.5% scored 64 out of 64 on the mDGI; for participants with PD, 7% scored 24 on the DGI, and 1.7% scored 64 on the mDGI. Among participants with vestibular dysfunction, 8% scored 24 on the DGI, and 3.5% scored 64 on the mDGI. Among participants with TBI, 3.3% scored 24 on the DGI, and none scored 64 on the mDGI. For these groups, the mDGI extended the range of measurement for participants who were high functioning, thus reducing the ceiling effect beyond that found in the original DGI. None of the participants in the gait abnormality group scored the highest possible score on either the DGI or mDGI, suggesting that both tests had a limited ceiling effect in this group. No participant in any group scored the lowest possible score for either test; however, an inclusion criterion for this study was the ability to walk 6.1 m without the physical assistance of another person, thus limiting the chance for a floor effect in this group of participants.
Discussion
A key validity question for any widely used assessment is whether scores have the same meaning across groups, over time, and in different conditions.25 The overall purpose of this study was to examine the psychometric properties of the mDGI for 5 diagnostic groups: stroke, PD, vestibular dysfunction, gait abnormality, and TBI. We hypothesized that the mDGI scores would have comparable meaning across populations supporting the validity of the mDGI scores for these diagnostic groups. Results from this study provide multiple lines of evidence to suggest the validity of mDGI scores for the 5 diagnostic groups targeted in this study. Lines of evidence include consistent internal structure, good reliability, and a lack of DIF.
DIF
The DIF analyses were used to investigate whether the Rasch locations of the subscores were invariant across groups. The DIF analyses, presented here, provide strong support for scale invariance across these 5 diagnostic groups. First, <5% of the item location contrasts suggested DIF. Second, there was no discernible pattern of DIF in cases where DIF results met both the criterion for moderate to large DIF and the experiment-wise alpha level. Therefore, it is likely that the few instances of moderate to large DIF found in these data represent type 1 error. The lack of significant DIF suggests that the mDGI scores were invariant across the 5 medical diagnostic groups and the control cohort. This finding means that the item scores had the same order of difficulty for individuals across groups. The results of this study are similar to those described by Dye et al,11 who reported minimal DIF associated with the characteristics of fall history, age, sex, and symptom (dizziness versus imbalance) in the original DGI. Results from both studies suggest scale stability of DGI and mDGI over diverse populations.
Factor Analysis
Results from the exploratory factor analyses suggested that the factor structure for mDGI was the same for all 5 diagnostic groups and supported the importance of using all 3 measures of performance (time, level of assistance, and gait pattern) to characterize mobility function in people with neurologic diagnoses. Results from the factor analyses suggest that for the control cohort, time and gait pattern could be used as proxies for one another; however, among individuals with mobility limitations due to neurologic impairment, all 3 performance measures were needed to fully characterize mobility performance. These findings confirm and extend those reported previously by Shumway-Cook and colleagues,17 who examined the factor structure for the mDGI in a large (N=995) and diverse sample of individuals with and without mobility limitations. Shumway-Cook et al17 reported that the mDGI measured 3 unique and moderately correlated aspects of performance, which together explained approximately 80% of the variability in scores, further adding to the evidence for validity of mDGI scores. The results of the factor analyses presented here suggest that the 3 factors reported by Shumway-Cook et al17 are relevant to specific patient populations as well. It was possible that the internal structure of scores found for the larger and more diverse sample in the study by Shumway-Cook et al17 masked a unique structure for specific diagnostic groups. Therefore, it was important to demonstrate in the current study that the factor structure found for the aggregate group also was found for each of the 5 diagnostic groups that were the focus of this study. Subsequent research should verify the results from this study with additional independent samples.
Based on the results of the current analyses, we recommend that clinicians use all 3 mDGI performance facet scores to characterize changes in 3 aspects of mobility function: the pattern of gait used for a walking task, the level of assistance needed, and the time required to complete each of the tasks. Given this recommendation, it is important to consider what changes in mobility function would be considered significant enough to represent change beyond that explained by measurement error.
Minimal Detectable Change
Minimal detectable change analysis is used to determine whether an observed change is likely to be real change rather than measurement error. The average MDC95% for the total mDGI score was 7.0 for all 5 diagnostic groups combined. The average MDC95% for the gait pattern total score was 4.0, with a range from 3.7 (PD) to 4.1 (TBI). This result is higher than the results of previous studies by Huang et al7 and Romero et al26 on the original DGI ordinal scale, which reported an MDC of 2.9 for participants with PD and for community-dwelling older adults, respectively. However, the MDC95% for the gait pattern score of the mDGI is only slightly higher than that reported by Hall and Herdman27 in patients with vestibular dysfunction (MDC95%=3.2) and slightly lower than that reported in persons with stroke (MDC=4 points).5 Until further research has been conducted with a range of diagnostic groups, we recommend that the MDC95% for scores from the mDGI be set at 7 for the total score, 4 for the gait pattern score, 2 for the level of assistance score, and 3 for the time level score. This recommendation is consistent with the previously reported MDC95% for the mDGI.17 These criteria should be re-evaluated as further research on the mDGI emerges.
Ceiling Effects
One of the original rationales for altering the scoring system for DGI was the reported ceiling effect among individuals with mobility impairments who are high functioning. This ceiling effect limited the degree to which DGI scores could detect change (both increases and decreases in mobility function) for individuals who are higher functioning. The modified scoring system for mDGI increased the range of measurement for individuals in the control group and in the 5 diagnostic groups beyond that provided by the original DGI, thus reducing the likelihood of a ceiling effect, which has been reported in previous research for individuals with vestibular dysfunction11 and with stroke,5 as well as for older adults with mobility disability.15
Limitations
The major limitation of this analysis was the varying number of cases across diagnostic groups. The research was based on a convenience sample from a variety of clinics and not on prevalence rates of these diagnoses in the general population. In addition, some groups for which data were available in the dataset (eg, individuals with MS) were not included in these analyses because the total number of participants was <60.
A second limitation was that severity and time since onset or diagnosis were not recorded, nor was the location of neural pathology in the stroke or TBI groups. Thus, it was not possible to identify subclassifications within any groups, which increased the heterogeneity of patients within each of the diagnostic groups.
Future Research
Future research will further explore the patterns of performance on the mDGI across diagnostic groups and consider the clinical implications of these patterns for the treatment of mobility impairments.
In conclusion, the results from this study provide further evidence to support the validity of the mDGI in quantifying mobility limitations for a wide variety of neurologic diagnoses. The meaning of a score from the mDGI, whether at the task, performance facet, or total score level, was comparable across 5 diagnostic groups, suggesting that the mDGI measures mobility function independent of diagnosis. The results also support the value of characterizing mobility function in terms of all 3 performance facets (time, gait pattern, and level of assistance) for individuals with mobility impairments.
Footnotes
All authors provided concept/idea/research design, writing, and data analysis. Dr Matsuda and Dr Shumway-Cook provided data collection, project management, and fund procurement.
All procedures were approved by the University of Washington Human Subjects Division.
The authors thank the testing sites, physical therapists, and all of the participants for their contributions to this study.
- Received July 9, 2013.
- Accepted February 14, 2014.
- © 2014 American Physical Therapy Association