Abstract
Background Although the principal goal of hip fracture management is a return to the pre-event functional level, most survivors fail to regain their former levels of autonomy. One of the most effective strategies to mitigate the fracture's consequences is therapeutic exercise.
Purpose The purpose of this study was to review and quantify the reported effects of an extended exercise rehabilitation program offered beyond the regular rehabilitation period on improving physical functioning for patients with hip fractures.
Sources The Cochrane libraries, PubMed, CINAHL, PEDro, and EMBASE were searched to April 2012.
Study Selection All randomized controlled trials comparing extended exercise programs with usual care for community-dwelling people after hip fracture were included in the review.
Data Extraction and Synthesis Two reviewers conducted each step independently. The data from the included studies were summarized, and pooled estimates were calculated for 11 functional outcomes.
Results Thirteen trials were included in the review and 11 in the meta-analysis. The extended exercise program showed modest effect sizes (ESs), which reached significance, under random-effects theory, for knee extension strength for the affected and nonaffected sides (ES=0.47, 95% confidence interval [CI]=0.27–0.66, and ES=0.45, 95% CI=0.16–0.74, respectively), balance (ES=0.32, 95% CI=0.15–0.49), physical performance-based tests (ES=0.53, 95% CI=0.27–0.78), Timed “Up & Go” Test (ES=0.83, 95% CI=0.28–1.4), and fast gait speed (ES=0.42, 95% CI=0.11–0.73). Effects on normal gait speed, Six-Minute Walk Test, activities of daily living and instrumental activities of daily living, and physical function subscale of the 36-Item Short-Form Health Survey (SF-36-PF) did not reach significance. Community-based programs had larger ESs compared with home-based programs.
Conclusions To the authors' knowledge, this is the first meta-analysis to provide evidence that an extended exercise rehabilitation program for patients with hip fractures has a significant impact on various functional abilities. The focus of future research should go beyond just effectiveness and study the cost-effectiveness of extended programs.
Approximately one half of elderly white women and one quarter of elderly white men will sustain an osteoporotic fracture in their lifetime.1 Of these fractures, hip fracture is the most serious, with a mortality rate of 20% to 30% during the first year after the fracture.1
Even though the principal goals of management are a return to a pre-event functional level and the prevention of recurrent fractures, 50% of survivors fail to regain their former levels of autonomy and mobility.1 Findings from longitudinal studies have indicated that the functional status of patients with hip fractures declined following reduction of rehabilitation services.2,3 Relative immobility following discharge home contributes to deterioration of balance and to muscle weakness, increasing the likelihood of subsequent fractures.
Exercise has been shown to be helpful in reducing impairments, functional limitations, and disability in elderly people who are healthy.4–6 Furthermore, a recent Cochrane review provided evidence that progressive resistance training can effectively improve physical functioning in elderly people, including improving muscle power and performance of basic activities of daily living (ADL).7
There is an increasing interest in studying the effects of a booster exercising program offered beyond the regular rehabilitation period for patients with hip fracture returning to the community. However, the effects of such a program for an extended period are still unclear. For instance, a trial conducted by Tinetti et al,8 which included patients after they had returned to their homes, showed no significant effects of a multicomponent rehabilitation program that included functional therapy and strengthening exercises for a 1-year period. On the other hand, Binder et al,9 in a 6-month trial of extended outpatient rehabilitation that included progressive resistance training, found differences in physical function, quality of life, and disability; the intervention group improved significantly compared with the control group. Resnik et al,10 in a narrative review that included 12 studies of exercise interventions offered to people after hip fracture, concluded that there was limited support for the effectiveness of exercise on performance of basic ADL tasks. Furthermore, Handoll et al,11 in their 2011 Cochrane review, concluded that there was insufficient evidence to establish a clear strategy to improve mobility after hip fracture surgery.
Given these conflicting findings, a systematic review and quantitative summary are warranted to estimate the effects of such an intervention and thereby help in guiding professional practices. The review question was: Among community-dwelling people who have had surgical repair of a hip fracture, does an extended exercise rehabilitation program offered beyond the regular rehabilitation period improve physical functioning compared with usual care? Any study of an exercise program offered beyond the standard therapy period was included. In these studies, the control group received the usual care, which could vary in nature and intensity, depending on the country where the study was carried out.
Method
Data Sources and Search Strategy for Identification of Studies
The search strategy was developed in conjunction with a health sciences librarian. The search period covered all years from inception to April 2012 and included the following databases: PubMed, EMBASE (OvidWeb), CINAHL, the Physiotherapy Evidence Database (PEDro), the Cochrane Central Register of Controlled Trials (CENTRAL), and the Cochrane Bone, Joint and Muscle Trauma Group specialized register. Searches were undertaken using MeSH headings and text words as suitable; no language restriction was applied. An example of the search strategy used in one of the databases is shown in eAppendix 1. Additionally, the Current Controlled Trials Registry,12 the ClinicalTrials.gov registry, and the Australian New Zealand Clinical Trials Register (ANZCTR) were searched (see eAppendix 2 for eligible ongoing studies).
Weekly downloads from AMEDEO13 of “fracture” articles in new issues of 15 journals were examined. Finally, the bibliographies of all eligible articles and related reviews, as well as recent conference proceedings, were checked. When the data in the selected articles were not reported in a way that allowed pooling, attempts were made to obtain these data from the authors. The PRISMA guidelines14 were followed and fulfilled in this review. A combined library of the retrieved articles was created using Reference Manager software, version 12 (Thomson Reuters, New York, New York).
Criteria for Considering Studies for This Review and Meta-analysis
Two reviewers assessed the studies independently using the following inclusion criteria:
- The study was a randomized clinical trial (RCT).
- The majority of study participants were patients who had sustained a hip fracture and were community dwelling at the time of fracture and after discharge. This criterion was needed because people in long-term care have specific needs that preferably should be studied separately.
- The intervention included an extended home- or community-based exercise rehabilitation program, offered after or extended for more than a regular rehabilitation period; the program could have started immediately after discharge or later, but was longer than the rehabilitation period that was typical for the country of origin.
- Outcomes reflecting physical function constructs were reported. As a result, feasibility or pilot studies without data on physical function outcomes were excluded if attempts to contact the authors failed (see eAppendix 2 for excluded articles and reasons of exclusion).
Data Collection and Quality Assessment
All of the steps of this systematic review were done independently by 2 reviewers (M.A. and O.E.). A structured extraction form was created and tested using 4 different studies, not included in the review, and modified as needed after each study. Initial agreement was measured using crude agreement and the kappa statistic. All discrepancies between the 2 reviewers were discussed, and if a consensus was not reached, the third author (N.E.M.) was consulted to decide.
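As an illustration only, the two agreement measures named above (crude agreement and the Cohen kappa statistic) can be computed as in the following minimal sketch. The function names and the screening decisions are hypothetical and are not data from this review.

```python
from collections import Counter

def crude_agreement(r1, r2):
    """Proportion of items on which the two raters gave the same decision."""
    if len(r1) != len(r2) or not r1:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(r1, r2)) / len(r1)

def cohen_kappa(r1, r2):
    """Cohen kappa: agreement corrected for the agreement expected by chance."""
    n = len(r1)
    p_o = crude_agreement(r1, r2)
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement from the product of each rater's marginal proportions.
    p_e = sum(c1[c] * c2[c] for c in set(c1) | set(c2)) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (p_o - p_e) / (1 - p_e)

# Hypothetical screening decisions for 10 records (illustrative only).
reviewer_a = ["include", "exclude", "exclude", "include", "exclude",
              "exclude", "include", "exclude", "exclude", "exclude"]
reviewer_b = ["include", "exclude", "exclude", "include", "exclude",
              "include", "include", "exclude", "exclude", "exclude"]

print(f"crude agreement = {crude_agreement(reviewer_a, reviewer_b):.2f}")
print(f"kappa = {cohen_kappa(reviewer_a, reviewer_b):.2f}")
```

Kappa is lower than crude agreement in this example because part of the observed agreement would be expected by chance alone.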
The methodological quality of the included articles was assessed using the PEDro scale. The PEDro scale is a valid measure of the methodological quality of clinical trials; it assesses studies' quality on a scale from 0 to 10. A study with a PEDro score of ≥6 is considered level-1 evidence (6–8: good; 9–10: excellent), and a score of ≤5 is considered level-2 evidence (4–5: fair; <4: poor).15 The scale's reliability and validity have been reported elsewhere.16–18 For this review, studies with PEDro score of ≤4 were considered to have poor methodological quality and, therefore, were excluded.
The World Health Organization's International Classification of Functioning, Disability and Health (ICF) was used as a framework for classifying outcomes used in the selected trials.19 Based on the ICF, functioning includes: (1) Body Functions and Structure and (2) Activities and Participation. Although the ICF graphic model distinguishes between activity and participation, the ICF's categories do not. In this study, all outcomes were linked to one of the ICF's major domains, and the distinction between the 2 concepts (activity and participation) was made because they are considered conceptually different.20 We used in this review the ICF definition for activity and participation, where activity is the execution of a task or action by an individual and participation is the involvement in a life situation in society.
Data Synthesis, Analysis, and Assessment of Heterogeneity
Crude agreement and the Cohen kappa statistic were used to measure agreement between the 2 reviewers. When there were repeated measures over the course of the study, the first time point after intervention was selected, particularly if the primary time point was not clear. The selection of the first time point has several advantages. It helps to differentiate between effects due to natural recovery and those due to intervention. The intrusion of the effect of confounding variables such as cointerventions is minimized, and the ability to pool all available data from all studies at this time point is enhanced.
The Hedges g effect size (ES) and 95% confidence intervals (CIs) were used to describe the effect of each intervention on the target outcomes. The Hedges g statistic was selected, as it corrects slight overestimations that may arise from small samples and measurement variability among selected studies by calculating adjusted ES.21,22 Comprehensive Meta-analysis software (version 2, Biostat, Englewood, New Jersey) was used to combine data across studies. Effect size was calculated by taking the difference in the mean change (before and after intervention) of a variable (target outcome) between an intervention group and a control group and dividing it by the postintervention pooled standard deviation of that variable.21,22 In 2 studies,23,24 the data reported were too sparse to calculate ES using this estimator and required the use of other estimators of Hedges g (see eAppendix 3).
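The calculation described above can be sketched as follows. This is a minimal illustration, not the Comprehensive Meta-analysis implementation: the small-sample correction factor J = 1 − 3/(4df − 1) and the variance approximation are the standard textbook forms for two independent groups and are assumed here, and all input values are hypothetical.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))

def hedges_g(change_int, change_ctl, sd_post_int, sd_post_ctl, n_int, n_ctl):
    """Return (g, variance of g, 95% CI) for a difference in mean change.

    The numerator is the between-group difference in mean change; the
    denominator is the postintervention pooled SD, as described in the text.
    """
    sd = pooled_sd(sd_post_int, n_int, sd_post_ctl, n_ctl)
    d = (change_int - change_ctl) / sd
    df = n_int + n_ctl - 2
    j = 1 - 3 / (4 * df - 1)          # small-sample correction (assumed form)
    g = j * d
    # Standard large-sample variance approximation (an assumption here).
    var_d = (n_int + n_ctl) / (n_int * n_ctl) + d ** 2 / (2 * (n_int + n_ctl))
    var_g = j ** 2 * var_d
    half_width = 1.96 * math.sqrt(var_g)
    return g, var_g, (g - half_width, g + half_width)

# Hypothetical trial: gait speed change of 0.15 m/s vs 0.05 m/s, SD 0.25, 30 per arm.
g, var_g, ci = hedges_g(0.15, 0.05, 0.25, 0.25, 30, 30)
print(f"g = {g:.3f}, 95% CI = {ci[0]:.3f} to {ci[1]:.3f}")
```

With small arms the correction factor shrinks the uncorrected standardized difference slightly, which is the adjustment for overestimation mentioned above.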
Heterogeneity among comparable trials was examined using both the standard Q and I2 statistics.25 A significant Q test indicates only the presence of heterogeneity among the data included, whereas the I2 index quantifies its magnitude. The latter is defined as low (I2≤33%), moderate (34%≤I2<67%), or high (I2≥67%).25 The random-effects model was used to calculate the combined effect because it is more conservative, especially for small samples. When the number of studies is small and the within-study variance is large, the heterogeneity test based on the Q statistic may have low power even if the between-study variance is significant.25 Forest plots were used to illustrate the pooled outcomes.
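A minimal sketch of how Q, I2, and a random-effects pooled estimate relate to one another is given below. The source does not name the between-study variance estimator, so the DerSimonian-Laird method-of-moments estimator is assumed here; the function names and the input values are hypothetical. The I2 labels follow the thresholds stated above.

```python
import math

def classify_i2(i2):
    """Label I2 (in percent) using the thresholds given in the text."""
    if i2 <= 33:
        return "low"
    if i2 < 67:
        return "moderate"
    return "high"

def random_effects_pool(effects, variances):
    """Pool effect sizes under a random-effects model (DerSimonian-Laird assumed)."""
    k = len(effects)
    w = [1.0 / v for v in variances]                 # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]     # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return {
        "pooled": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "q": q, "df": df, "tau2": tau2, "i2": i2,
        "heterogeneity": classify_i2(i2),
    }

# Hypothetical Hedges g values and variances from 4 trials (illustrative only).
result = random_effects_pool([0.30, 0.55, 0.42, 0.70], [0.04, 0.06, 0.05, 0.09])
print(result)
```

When the estimated between-study variance is 0, the random-effects and fixed-effects estimates coincide; as it grows, the weights become more equal and the CI widens, which is why the random-effects model is described as the more conservative choice.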
Evidence of publication bias was checked using both funnel plots and quantitative methods (Classic Fail-safe N tests, Begg and Mazumdar rank correlation test, Egger test of the intercept, and Trim and Fill test).26 The Classic Fail-safe N tests calculate the number of “null” studies that need to be located and included in order for the combined P value to exceed the significance level (ie, P>.05). The rank correlation test (Kendall tau-b), as suggested by Begg and Mazumdar, tests the significance of the inverse correlation between study size and ES. A significant correlation would suggest that bias exists. Similarly, a significant value of the intercept in the Egger test suggests that bias exists. Finally, the Trim and Fill method was used as recommended by Duval and Tweedie; it basically determines where the missing studies (if any) are likely to fall, adds them to the analysis, and then recomputes the combined effect. Because each of these quantitative methods has its limitations, all the aforementioned methods were used together.26
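To make the fail-safe N idea concrete, the following sketch uses Rosenthal's formulation, in which the combined Z of k studies plus N null studies is ΣZ/√(k + N). This is an assumed formulation; the source does not state the exact variant or the tail convention used by the software, so the two-tailed default below is an assumption, and the Z scores are hypothetical.

```python
from statistics import NormalDist

def fail_safe_n(z_scores, alpha=0.05, tails=2):
    """Number of null (Z = 0) studies needed for the combined P value to reach alpha.

    Solves sum(Z) / sqrt(k + N) = z_crit for N. The tail convention is an
    assumption; pass tails=1 for a one-tailed criterion.
    """
    k = len(z_scores)
    z_crit = NormalDist().inv_cdf(1 - alpha / tails)
    n = (sum(z_scores) / z_crit) ** 2 - k
    return max(0.0, n)

# Hypothetical per-study Z scores (illustrative only).
n_null = fail_safe_n([2.0, 2.5, 3.0])
print(f"fail-safe N is about {n_null:.1f} null studies")
```

A fail-safe N that is large relative to the number of included studies suggests that the pooled result is unlikely to be overturned by unpublished null findings, although the method ignores the size of those hypothetical studies.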
We explored the robustness of the pooled-effect estimates for each outcome by conducting a post hoc (unplanned) sensitivity analysis. This analysis was done by dropping one study at a time to test the effect on the pooled ES. Finally, an unplanned subgroup analysis was conducted to compare studies based on the setting (community-based versus home-based) where the intervention took place. This approach was intended to help identify sources of heterogeneity among the studies.
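The leave-one-out procedure described above can be sketched as follows. The pooling step assumes a DerSimonian-Laird random-effects model, which the source does not specify, and the study labels and values are hypothetical.

```python
import math

def pool_dl(effects, variances):
    """Return (pooled ES, lower, upper) under a random-effects model (DL assumed)."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se

def leave_one_out(labels, effects, variances):
    """Re-pool the effect after dropping each study in turn.

    Each row is (dropped study, pooled ES, lower, upper, significant), where
    significant means the 95% CI excludes 0.
    """
    rows = []
    for i, label in enumerate(labels):
        es = effects[:i] + effects[i + 1:]
        vs = variances[:i] + variances[i + 1:]
        pooled, lo, hi = pool_dl(es, vs)
        rows.append((label, pooled, lo, hi, lo > 0 or hi < 0))
    return rows

# Hypothetical outcome reported in 3 trials (illustrative only).
for row in leave_one_out(["Trial A", "Trial B", "Trial C"],
                         [0.40, 1.30, 0.75], [0.05, 0.08, 0.06]):
    print("without %s: ES=%.2f, 95%% CI=%.2f to %.2f, significant=%s" % row)
```

An outcome is considered robust in this sense when the significance flag does not change regardless of which study is dropped; outcomes pooled from only a few trials are the most likely to change.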
Results
Figure 1 illustrates the process used to select the 13 articles included in the systematic review and the 11 articles deemed suitable for the meta-analysis. The 2 studies27,28 excluded from the meta-analysis had poor methodological quality (level-2 evidence, PEDro scale <5) and insufficient follow-up. Interrater agreement for all stages of study selection was high (>90% crude agreement, kappa=0.78–1).
Figure 1. PRISMA flow chart of the systematic review and meta-analysis. RCT=randomized controlled trial.
Characteristics of Included Studies
The key characteristics of included studies are summarized in Table 1. Publishing dates ranged from 1997 to 2012. Twelve of the 13 articles were published in 2002 or later, which indicates a growing interest in this topic. The geographic distribution of studies is as follows: United States (5 studies), Europe (4 studies), Australia (3 studies), and Taiwan (1 study). Collectively, 1,107 people participated in the 13 studies (intervention: 614, control: 493). The number of participants in each study ranged from 25 to 180; however, 6 studies had 24 participants or fewer in each arm. The average age of participants ranged from 73 to 84 years, and 81% were women. Overall, the duration of the interventions was from 1 to 12 months.
Table 1. Characteristics of the Included Studies
Two studies29,30 had 2 intervention groups. As the goal of this review was not to compare types of exercises (eg, weight bearing versus non–weight bearing), we decided to statistically combine both intervention groups. To arrange studies in a meaningful order, articles in all tables and plots throughout this review are presented chronologically based on the date of publication.
Methodological Quality of Included Studies
The results of the methodological assessment are summarized in Table 2. The level of agreement on methodological quality generally was very high (>90% crude agreement, kappa=0.7–1). The PEDro scores for the included studies varied from 4 to 8 points. As expected, the therapists and participants were not blinded in any study. This finding corresponds with the conclusion of Boutron et al31 that patient blinding is difficult to achieve in nonpharmacological studies. Surprisingly, despite the fact that all of the studies were RCTs, only 4 reported concealment.
Table 2. PEDro Scores for Included Studies
Treatment Characteristics
Tables 3 and 4 summarize the intervention characteristics according to the setting in which the exercise training took place (community versus home). The length of exercise programs extended from 2 to 12 months for community-based programs and from 1 to 12 months for home-based programs. Community-based programs offered 16 to 80 supervised sessions, and home-based programs offered 0 to 56 home visits. Session frequency varied from 2 to 3 times per week for community-based programs to daily sessions for home-based programs. For strengthening exercises, the intensity and number of sets were similar for both locations. Intensity of strengthening exercises ranged from a fixed 1 kg, regardless of patients' abilities, to 100% of the 1-repetition maximum (1RM), and there were 2 to 3 sets for each individual muscle. Session duration was from 45 to 135 minutes for community-based programs versus 30 to 45 minutes for the home-based programs. Adherence, defined as the percentage of exercise sessions attended out of the prescribed sessions, ranged widely from 3% to 99% for community-based programs and from 45% to 98% for home-based programs.
Table 3. Intervention Characteristics of Community-Based Group
Table 4. Intervention Characteristics of Home-Based Group
Quantitative Analysis Results
Over all trials, data were pooled for 11 different functional outcome measures: knee extension strength for the affected side, knee extension strength for the nonaffected side, balance, physical performance-based tests, Six-Minute Walk Test (6MWT), Timed “Up & Go” Test (TUG), fast gait speed, normal gait speed, ADL, instrumental activities of daily living (IADL), and physical function subscale of the 36-Item Short-Form Health Survey (SF-36-PF). The pooled ESs derived from the random-effects model are presented in the forest plots (see Fig. 2, in which the combined forest plot summarizes the effects for all of the included studies, and eAppendix 4, for a single plot for each outcome with full details).
Figure 2. Combined forest plot of all pooled outcomes. ES=effect size (all ESs pooled under the random-effects theory), 95% CI=95% confidence interval, ADL=activities of daily living, IADL=instrumental activities of daily living, SF-36-PF=physical function subscale of the 36-Item Short-Form Health Survey questionnaire.
Knee muscle strength for the affected side was reported in 6 studies.9,23,24,30,32,33 The pooled ES was 0.47 (95% CI=0.27 to 0.66, P<.001). Pooled sample sizes were 224 intervention group participants and 224 control group participants. Despite some evidence for low heterogeneity as quantified by I2, it was not statistically significant (Q value=5.2, P=.38, I2=5.5).
Knee muscle strength for the nonaffected side was reported in 5 studies.9,23,30,32,33 The pooled ES was 0.45 (95% CI=0.16 to 0.74, P=.002). Pooled sample sizes were 179 intervention group participants and 138 control group participants. The heterogeneity among studies was low, and heterogeneity tests were not statistically significant (Q value=5.9, P=.2, I2=30.1).
Balance measures were reported in 7 studies9,23,30,32–35 using 4 different indexes (Berg Balance Scale, Functional Reach Test, distance test, and global balance measure for older patients). The pooled ES was small but significant (ES=0.32, 95% CI=0.15 to 0.49, P<.001). The pooled sample sizes were 373 intervention group participants and 276 control group participants. All ESs were homogenous (Q value=4.8, P=.57, I2=0.00).
Four studies reported results of physical performance-based tests9,30,33,36 using 4 indexes (Tinetti Performance-Oriented Mobility Assessment–part 2, original and modified Physical Performance Tests, and Physical Performance Mobility Examination). The full versions of these scales were reviewed, and we found that these scales and subscales examine the same construct—basic functional abilities, mainly mobility. Thus, because we were using Hedges g ES, corrected for bias and data pooled under random-effects theory, we decided to combine data from these tests. The combined ES was 0.53 (95% CI=0.27 to 0.78, P<.001). The pooled sample sizes were 152 intervention group participants and 108 control group participants. All ESs were homogenous (Q value=2.62, P=.45, I2=0.00).
Four studies reported results for the 6MWT.29,34–36 The pooled effect was not statistically significant under the random-effects theory (Hedges g ES=0.22, 95% CI=−0.12 to 0.57, P=.21), but was significant under the fixed-effects theory (Hedges g ES=0.26, 95% CI=0.03 to 0.49, P=.02). Pooled sample sizes were 228 intervention group participants and 161 control group participants. Heterogeneity was moderate but not statistically significant (Q value=5.7, P=.13, I2=47.5).
Three studies reported TUG scores24,33,34 (Hedges g ES=0.83, 95% CI=0.28 to 1.4, P=.003). Pooled sample sizes were 157 intervention group participants and 148 control group participants. The heterogeneity was significant, and its magnitude was high (Q value=8.4, P=.02, I2=76.1).
Previous research provided support to analyze fast and habitual gait speeds separately.37 Four studies reported fast gait speed as an outcome measure9,33,34,36 (Hedges g ES=0.42, 95% CI=0.11 to 0.73, P=.008). Pooled sample sizes were 172 intervention group participants and 118 control group participants. Heterogeneity was low and not statistically significant (Q value=4.36, P=.22, I2=31.24).
Normal gait speed was reported in 4 studies23,29,30,36 (Hedges g ES=0.16, 95% CI=−0.17 to 0.48, P=.35). Pooled sample sizes were 81 intervention group participants and 64 control group participants. Evidence of low heterogeneity was found, but the findings were not significant (Q value=3.3, P=.33, I2=11.2).
Measures of ADL were reported in 4 studies9,33,35,38 using 4 indexes (Barthel Index, Katz Index of Independence in Daily Living, basic ADL, and lower-extremity physical activities of daily living [LPADL]) (Hedges g ES=0.14, 95% CI=−0.07 to 0.35, P=.2). Pooled sample sizes were 211 intervention group participants and 207 control group participants. No evidence of heterogeneity was found (Q value=2.4, P=.48, I2=0.00).
Measures of IADL were reported in 4 studies9,33–35 using 4 indexes (Lawton IADL, Nottingham Extended Activities of Daily Living score [NEADL], Older Americans Resources and Services Instrument, and IADL Score) (Hedges g ES=0.2, 95% CI=−0.07 to 0.48, P=.14). However, under the fixed-effects theory, the ES was significant (Hedges g ES=0.2, 95% CI=0.007 to 0.41, P=.04). Pooled sample sizes were 249 intervention group participants and 159 control group participants. There was moderate but not significant heterogeneity (Q value=5.08, P=.16, I2=40.97).
Finally, the SF-36-PF was reported in 4 studies9,29,35,36 (Hedges g ES=0.2, 95% CI=−0.03 to 0.44, P=.09). Pooled sample sizes were 174 intervention group participants and 155 control group participants. All ESs were homogenous (Q value=1.7, P=.6, I2=0.00).
Clinical Relevance to Patients
The significant ESs under the random-effects model found in this study generally were modest to high (0.42–0.83), except for balance (0.32). Furthermore, all of the significant ESs in this review were clinically important. In a systematic review of health-related quality-of-life studies, Norman et al39 found that the minimal important difference (MID) in most studies, regardless of how it was measured, was half a standard deviation (ie, an ES equivalent to 0.5). For example, using the calculated ES of 0.42 for fast gait speed and the largest standard deviation found by Binder et al9 and reported in this review (SD=0.4), the change in fast gait speed would be calculated to be 0.17 m/s.* Palombaro et al40 reported the minimal clinically important difference (MCID) in fast gait speed as 0.1 m/s, and Alley et al41 found 0.17 to 0.26 m/s to be a substantial clinical improvement. Thus, 0.17 m/s falls within the range of the substantial improvement previously reported.41 Similar conclusions can be drawn for the other significant ESs in this review.
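The back-conversion from a standardized ES to raw units used in the fast gait speed example can be sketched as below. The ES, SD, MCID, and substantial-improvement range are the values quoted in the text; the function and variable names are hypothetical. The comparison uses the value rounded to 2 decimals, as in the text; the unrounded value (0.168 m/s) sits marginally below the lower bound of the range.

```python
def es_to_raw_change(effect_size, sd):
    """Convert a standardized effect size back to raw units: delta = ES * SD."""
    return effect_size * sd

delta = es_to_raw_change(0.42, 0.4)             # ES and SD quoted in the text
mcid = 0.10                                     # MCID for fast gait speed, m/s
substantial_low, substantial_high = 0.17, 0.26  # substantial improvement, m/s

rounded = round(delta, 2)
print(f"change = {delta:.3f} m/s (rounded: {rounded:.2f} m/s)")
print("exceeds MCID:", rounded > mcid)
print("within substantial range:", substantial_low <= rounded <= substantial_high)
```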
Publication Bias
Overall, funnel plots did not indicate signs of publication bias, nor did the quantitative tests (Classic Fail-safe N tests, Begg and Mazumdar rank correlation test, Egger test of the intercept, and Trim and Fill test).26 Because tests for funnel plot asymmetry are not recommended when there are fewer than 10 studies, only an example of a funnel plot is presented in eAppendix 5.42
Sensitivity Analysis and Subgroup Comparison
Sensitivity analysis did not result in a change of significance when any single trial was excluded, suggesting that our results were robust, except for outcomes reported with a small number of studies (ie, TUG reported in 3 studies). When Sylliaas and colleagues' study34 was removed from the analysis, the TUG ES became nonsignificant under the random-effects theory (ES=0.77, 95% CI=−0.17 to 1.7).
Analyzing subgroups separately based on where the intervention took place (community-based versus home-based) suggested additional findings worth investigating. The community-based group had larger ESs for all outcomes and narrower CIs. On the other hand, the home-based group had smaller and, most likely, nonsignificant ESs. There was almost no heterogeneity within each group. A comparison of the 2 groups' ESs is presented in eAppendix 6.
Discussion
This review supports the hypothesis that an extended exercise program has a positive impact on physical function. Previous studies1–3 have shown that people have poor functional outcomes after hip fractures, yet little has been done to help patients with hip fractures return to a prefracture functional level after they are discharged to the community. Following fracture, patients are at high risk of entering a vicious cycle in which fear of falling along with postfracture pain and muscle weakness contribute to relative immobility and lead to a deterioration of balance, more muscle weakness, and an increase in the likelihood of subsequent fractures. As regular rehabilitation programs offered postfracture have been shown to be helpful but not sufficient to restore a patient's prefracture functional level,2 an extended exercise program has been proposed as a promising strategy to improve patients' functional capacities.
Despite the differences among exercise programs across the included studies, current results from this meta-analysis demonstrate a significant positive impact on the strength of knee extensors, balance, performance-based tests, TUG scores, and fast gait speed. To our knowledge, this is the first meta-analysis of extended rehabilitation for people with hip fractures, and the results support its effectiveness. Other reviews either have been very broad, preempting a concrete conclusion, or have had different scopes,43–46 but none covered only extended rehabilitation. For instance, in their 2011 Cochrane review, Handoll et al11 studied mobilization strategies across all care settings (inpatient acute care, inpatient rehabilitation, community-based rehabilitation), making the scope of the review very broad, which contributed heterogeneity to the effects and consequently made it difficult to compare the studies included in their review or to pool the estimates for outcomes. Moreover, new trials have been published warranting the current meta-analysis.
Results similar to our findings have been published for other conditions (eg, stroke). Few reviews and meta-analyses have assessed the effect of exercise after acute and subacute periods.47,48 A 2011 meta-analysis of trials provided evidence that a variety of conventional motor rehabilitation and physical therapy interventions, if applied for 6 months after a stroke, can improve functional outcomes.47
A total of 59 targeted outcomes (including outcomes of different muscle groups) were identified from the selected 13 trials. Rehabilitation research commonly incorporates multiple outcomes in one study, and this practice makes summarizing across studies challenging because there is no common outcome (see eAppendix 7 for all outcome measures used in this review's studies). Based on the ICF classification, outcomes of body functions, body structures, and activities predominated, whereas participation-level outcomes were rarely measured. One explanation is that the measurement of participation is a relatively new field.49 Moreover, it is easier to detect the effect on function or even activity than on participation. However, participation is the outcome that may be most important to people with disabilities, their families, and society.49,50 Thus, if rehabilitation goals are to be fully achieved, rehabilitation interventions should target enhancing patients' independence and participation.
The comparisons between community-based and home-based studies included in the analyses showed that pooled ESs for the community-based studies (for individual studies and as a group) were larger and more likely to be statistically significant. Significant and larger ESs in the community-based studies could be explained by higher exercise intensity and the ability to use more sophisticated equipment. This finding corresponds to the conclusions from previous systematic reviews51,52 that higher-intensity training was associated with greater strength improvement among older populations, as opposed to low- and moderate-intensity exercising. Nevertheless, such intensive interventions might result in lack of adherence to the study and a decrease in the number of participants willing to participate.27,53 Another explanation might be that the group setting enhances social interaction among people sharing the same condition, which may enhance participation in more intensive community-based programs, reduce cost, and improve motor learning.54,55 These findings from the subgroup comparison should be interpreted with some caution, as the differences between the 2 groups could be due to confounding factors other than the setting of the intervention. Moreover, the number of studies in each subgroup was too small to yield concrete conclusions.
The clinical implications of our study's findings warrant further attention. Our findings suggest that significant functional improvement can be gained later in the recovery process than is usually believed. For example, Koot et al56 reported that a very small possibility exists of a patient with a hip fracture making further recovery after 4 months. This improvement possibly means there is no “plateau” and any observed plateau may be a consequence of less intensive therapy. If further studies corroborate our findings, it might lead to a change in existing practices and recommendations.
The advantage of a meta-analytical approach is apparent when considering that many of the individual studies were not powered to detect even moderate effects. Adequately powered randomized trials of rehabilitation interventions require considerable resources, expertise, and a sizable base population. Nevertheless, the answers for these important questions may come from carefully designing, in a standardized manner, more locally feasible studies to feed meta-analyses.57
In this review, we decided to use the PEDro scale, a scoring system, to assess the methodological quality of the included trials. The advantages of the scoring system lie in its ease of use and ability to compare scores across studies without difficulty. On the other hand, there is no gold standard to verify the validity of these scores, making it difficult to know whether they are measuring the methodological quality properly. Furthermore, these scores do not take into account the relative importance of different items included in the scales, and sometimes they incorporate items that are not related to study quality or bias assessment. Cochrane reviews now consider risk of bias tools in preference to methodological quality assessment scales.42
Finally, areas of future research should emphasize cost-effectiveness as well as effectiveness. Even modest gains in mobility and balance may translate to substantial cost savings if a second hip fracture is prevented or admission to long-term care is delayed. Overall, this review and meta-analysis are a promising step toward enhancing the recovery after hip fractures. This methodology provides the highest level of evidence to guide practice.
Study Limitations
Some limitations of this study should be noted. There were differences among studies in intervention parameters, providers, setting, time-point assessments, and outcome measures. Nevertheless, we tried to accommodate these differences. To account for using different measurement indexes, data were pooled only when the same construct was being measured using the Hedges g ES calculated under the random-effects model. Another limitation could be that the number of studies included was relatively small; nevertheless, the results for many outcomes were significant, and the publication bias assessment supported the absence of publication bias. As the evidence accumulates, it will become easier to conduct further meta-analyses with larger samples. Using a scoring system to assess trial quality rather than risk of bias tools is an additional limitation.58
Footnotes
- All of the authors contributed to research ideas, writing, and data collection. The authors acknowledge the assistance of the Life Sciences Library at McGill University and Mrs Shaima Fayiz for help in organizing the manuscript. Additionally, Mr Auais would like to thank the Canadian Institutes of Health Research (CIHR) for providing him with a travel award to present this study at the Canadian Physiotherapy Association's 2012 Congress; May 24–27, 2012; Saskatoon, Saskatchewan, Canada.
- * ES=Δ/SD, 0.42=Δ/0.4, Δ=0.168.
- Received August 29, 2011.
- Accepted July 12, 2012.
- © 2012 American Physical Therapy Association