Abstract
Background Given the prevalence of chronic nonspecific neck pain (CNSNP) internationally, attention has increasingly been paid in recent years to evaluating the efficacy of therapeutic exercise (TE) in the management of this condition.
Purpose The purpose of this study was to conduct a current review of randomized controlled trials concerning the effect of TE on pain and disability among people with CNSNP, perform a meta-analysis, and summarize current understanding.
Data Sources Data were obtained from MEDLINE, Cumulative Index to Nursing and Allied Health Literature (CINAHL), EMBASE, Physiotherapy Evidence Database (PEDro), and Cochrane Central Register of Controlled Trials (CENTRAL) databases from their inception to August 2012. Reference lists of relevant literature reviews also were tracked.
Study Selection All published randomized trials without any restriction regarding time of publication or language were considered for inclusion. Study participants had to be symptomatic adults with only CNSNP.
Data Extraction Two reviewers independently selected the studies, conducted the quality assessment, and extracted the results. Data were pooled in a meta-analysis using a random-effects model.
Data Synthesis Seven studies met the inclusion criteria. Therapeutic exercise proved to have medium and significant short-term and intermediate-term effects on pain (g=−0.53, 95% confidence interval [CI]=−0.86 to −0.20, and g=−0.45, 95% CI=−0.82 to −0.07, respectively) and medium but not significant short-term and intermediate-term effects on disability (g=−0.39, 95% CI=−0.86 to 0.07, and g=−0.46, 95% CI=−1.00 to 0.08, respectively).
Limitations Only one study investigated the effect of TE on pain and disability at follow-up longer than 6 months after intervention.
Conclusions Consistent with other reviews, the results support the use of TE in the management of CNSNP. In particular, a significant overall effect size was found supporting TE for its effect on pain in both the short and intermediate terms.
Neck pain is one of the most common musculoskeletal disorders, second only to low back pain,1 with an annual prevalence among the general and workforce populations of 30% to 50%.2 Although the natural history of this condition appears to be favorable, rates of recurrence3 and chronicity4 appear high. The course of neck pain often is characterized by exacerbations, and more than one third of patients with neck pain will develop chronic symptoms lasting more than 6 months.5 In particular, chronic nonspecific neck pain (CNSNP) (ie, chronic neck pain without any specific disease detected as the underlying cause of the complaints6) represents the vast majority of cases, contributing to substantial health care costs, work absenteeism, and loss of productivity at all levels.7,8
To decrease this social burden of disability, the use of interventions with demonstrated efficacy for specific outcomes is essential.9 Increased attention has been paid in recent years to evaluating the efficacy of various conservative therapeutic interventions used by physical therapists to manage CNSNP,10 especially therapeutic exercise (TE).11 However, few rehabilitation studies are designed with the express intention of determining effectiveness under routine clinical conditions, with study participants generally representative of a particular clinical population, rather than under the tightly controlled conditions of a randomized controlled trial (RCT).
Despite the growing number of studies assessing the efficacy of this intervention, substantial inconsistencies continue to exist, in part, due to insufficient evidence regarding optimal dose-response relationships, the best mode for delivering the service, and the differential outcomes of different types of exercise on CNSNP,12 leaving little clarity for evidence-based clinical practice. For example, 4 recent reviews present conflicting results regarding the benefit of strengthening exercises for relieving neck pain symptoms. Sarig-Bahat11 and Sihawong et al,10 in their reviews of 2003 and 2010, respectively, found relatively strong evidence supporting the efficacy of dynamic resisted strengthening exercises of the neck-shoulder musculature. In the intervening years, Kay et al12 concluded in 2009 that the evidence of efficacy for strengthening exercises was unclear, and Ylinen,13 in 2007, found moderate evidence supporting the efficacy of dynamic and isometric-resisted strengthening exercises.
One limitation of previous reviews has been the tendency to aggregate results pertaining not only to CNSNP but also to different and heterogeneous conditions (eg, whiplash-associated disorder, myofascial neck pain, degenerative changes, cervicobrachialgia, back and shoulder pain) while simply referring to them as “chronic mechanical neck disorders.” Inconsistencies among the reviews also are likely due to differences in search dates, characteristics of interventions, mixing of neck disorder durations, and incompatibility in the analysis of results obtained from comparison versus placebo-controlled trials.12,14 In addition, RCTs published in the past decade often have lacked sufficient power to draw clear and definitive conclusions.15 These persistent methodological inconsistencies justified the need for a study that explicitly targeted its population of interest, characteristics of RCTs, and duration of follow-up as inclusion criteria in order to determine a more accurate estimate of the efficacy of TE and its impact on pain and disability outcomes in patients with CNSNP, as a first step in unraveling the tangle of inconclusive evidence to date.
Method
Data Sources and Searches
Our literature search was aimed at identifying all available studies that evaluated the effect of TE in relieving pain and improving function and disability outcomes in people with CNSNP. Records were identified by searching multiple literature databases, including MEDLINE, Cumulative Index to Nursing and Allied Health Literature (CINAHL), EMBASE, Physiotherapy Evidence Database (PEDro), and Cochrane Central Register of Controlled Trials (CENTRAL), from their inception to August 2012. The key word “neck pain” was used at the first level of inquiry to ensure that our search began as broadly as possible. Queries were limited to RCTs as the publication type and to studies involving human adult participants (18 years or older). Additional records were sought through other sources to complement the database findings: reference lists of relevant literature reviews and indexes of peer-reviewed journals were searched manually.
Study Selection
Types of studies.
Several criteria were used to select eligible studies. We included published RCTs without any restrictions on publication date or language. Quasi-RCTs and nonrandomized controlled trials were excluded. Among RCTs, only trials with a control or comparison group were considered for inclusion in the study. These comparison trials included: (1) intervention versus placebo or sham intervention, (2) intervention versus no-exercise intervention or comparator (eg, self-care, advice, continuing with ordinary or recreational activities), and (3) intervention versus standard practice (eg, wait list, usual care). Our criterion for designating a study as a “comparison” trial required that the investigators compare TE plus another intervention versus this same intervention (eg, exercise and electrotherapy versus electrotherapy only) in a comparably matched group. Furthermore, the study intervention had to be performed with identical treatment parameters in all study arms.
Types of participants.
The participants had to be symptomatic adults aged 18 years or older, with a diagnosis of CNSNP or chronic neck muscle pain, also called trapezius myalgia. Because our initial review used “neck pain” as the key phrase to ensure the broadest sweep of the literature, we implemented additional criteria in our further review. Neck pain was considered chronic when it emerged from the text that participants reported neck pain of more than 3 months' duration16 or, in the absence of this explicit description, when the authors themselves designated the pain as “chronic.” Trapezius myalgia generally accounts for a vast proportion of nonspecific neck pain17; therefore, studies using this term to describe participants were included.
Trials were excluded if any of the participants received a specific diagnosis such as radiculopathy, myelopathy, fracture, infection, dystonia, tumor, inflammatory disease, or osteoporosis.15 Similarly, trials were excluded if some or all of the participants had whiplash-associated disorder, myofascial neck pain, neck pain associated with trauma, degenerative changes, fibromyalgia, or cervicobrachialgia. The trials investigating mixed populations such as people with neck and back pain, neck and arm pain, neck pain and headache, and neck and upper-limb pain were all excluded, with the exception of those investigating neck and shoulder pain, provided that neck pain could be considered a primary complaint.
Types of interventions.
Among all types of conservative interventions used by physical therapists for the management of chronic neck pain, only TE was considered in our study. Any other interventions such as education, manual therapy, traction, physical agents and modalities, cognitive-behavioral therapy, and multidisciplinary rehabilitation were excluded. Also, exercise used in combination with other passive interventions was excluded. Finally, trials were excluded if the prevention of neck pain was the main clinical purpose of the study intervention.
Types of outcome measures.
To be eligible for inclusion, a study had to assess pain by a visual analog scale, a numerical pain rating scale, or patient self-report as a primary outcome measure. Disability was assessed as a primary outcome measure if the chosen instrument measured the impact of chronic neck pain on everyday life, beyond work or leisure-time activities. If more than one measure of an outcome of interest was reported within the same study, only one was considered. We chose the measure that would most likely provide the most conservative estimate of the effect of TE on the outcome due to the magnitude of the pain or disability. For example, in the case of pain, we selected the measure that most nearly corresponded to the question “What is your worst pain?” to be used in our analysis. Trials investigating the effect of TE on pressure pain threshold or pressure pain tolerance, electromyographic signals, range of motion, or strength or endurance of cervical muscles were excluded. Similarly, health-related quality of life, patient satisfaction, global perceived effect, work-related measures, depression, and other psychosocial measures were not considered in our analyses. When possible, we extracted study findings at baseline (before intervention), after intervention, and at every reported follow-up within 12 months.
Adopting the categorization proposed by Chow and colleagues18 in their systematic review and meta-analysis on the efficacy of low-level laser therapy in the management of neck pain, duration of follow-up was defined as short term (0–1 month), intermediate term (1–6 months), and long term (>6 months).
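The follow-up categories above can be expressed as a small classification rule. The sketch below is illustrative only: the function name is hypothetical, and assigning exactly 1 month to the short term and exactly 6 months to the intermediate term is an assumption, because the stated intervals share their boundary values.

```python
def follow_up_category(months_after_intervention: float) -> str:
    """Classify a follow-up time point, in months after the intervention ended.

    Assumption (not stated in the review): boundary values belong to the
    earlier category, so 1 month is short term and 6 months is intermediate term.
    """
    if months_after_intervention < 0:
        raise ValueError("follow-up time cannot be negative")
    if months_after_intervention <= 1:
        return "short term"
    if months_after_intervention <= 6:
        return "intermediate term"
    return "long term"


# Hypothetical follow-up times, in months
for months in (0.5, 3, 12):
    print(months, follow_up_category(months))
```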
Data Extraction and Quality Assessment
Two review authors (I.G., F.T.) independently conducted study selection and data extraction. A third author (P.P.) was consulted in the case of persisting disagreement. Reviewers were not blinded to information regarding the authors, journal of origin, or outcomes for each reviewed article. Using a standardized form, data extraction addressed participants, types of intervention, follow-up times, clinical outcome measures, and findings that were reported. These data are detailed in Table 1. Methodological quality of studies was assessed using the PEDro scale, which has been shown to be reliable19 and valid20 for rating the quality of RCTs. Two independent assessors (I.G., F.T.) obtained or extracted from the PEDro database the score for each trial when available. Trials were not excluded on the basis of quality.
Table 1. Characteristics of Included Studies
Data Synthesis and Analysis
Data were synthesized using a meta-analytic method based on a random-effects model due to the significant heterogeneity and because this method accounts for both within-study and between-study variance; this approach weights studies by the inverse of the variance and incorporates heterogeneity into the model.21 All effect sizes were pooled using the Hedges g statistic because it incorporates a small sample bias correction.22 Comprehensive Meta-Analysis V.2.2 software (Biostat, Englewood Cliffs, New Jersey)23 was used for the statistical analyses. Standardized mean differences (SMDs) with 95% confidence intervals (95% CIs) were calculated for continuous data. Standardized mean differences were used because different measures were adopted by each study to address the same clinical outcome. To interpret effect size calculated with SMD, we used the method described by Cohen24 as a guide to identify small (0.20), medium (0.50), or large (0.80) effects. Calculation of effect size was based only on the best possible data (ie, final means, standard deviations, and sample sizes of intervention and control groups). Selected studies for which these crucial parameters were not directly reported, or obtainable by contacting authors, were not included in the meta-analysis. In cases where different articles covered results from the same study population, data from only one article were pooled. When a trial was designed to compare more than 2 treatments (ie, comparison trial), we broke up the control group into several parts so that the total numbers would add up to the original size of the group in order not to count the control group patients twice.25
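As an illustration of the effect size calculation described above, the following sketch computes Hedges g from final means, standard deviations, and sample sizes, and splits a shared control group across several exercise arms. It uses the standard textbook formulas and is not taken from the Comprehensive Meta-Analysis software; the function names and the input values are hypothetical. With lower scores indicating less pain, a negative g favors the exercise group, matching the sign convention of the results reported here.

```python
import math


def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, variance of g) for a treatment group versus a control group."""
    df = n_t + n_c - 2
    # Pooled within-group standard deviation
    sd_pooled = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / sd_pooled  # uncorrected standardized mean difference
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    j = 1 - 3 / (4 * df - 1)  # small-sample bias correction factor
    return j * d, j ** 2 * var_d


def split_control(n_control, n_arms):
    """Split a shared control group into n_arms parts whose sizes sum to n_control."""
    base, extra = divmod(n_control, n_arms)
    return [base + 1 if i < extra else base for i in range(n_arms)]


# Hypothetical trial: pain scores after the intervention
g, var_g = hedges_g(mean_t=3.0, sd_t=2.0, n_t=20, mean_c=4.0, sd_c=2.0, n_c=20)
print(round(g, 3), round(var_g, 3))

# Hypothetical 3-arm trial sharing one control group of 40 participants
print(split_control(40, 3))
```

Splitting the control group keeps each control participant from being counted more than once when several exercise arms of the same trial enter the pooled analysis.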
The Q and I-square statistics were used to assess heterogeneity among studies. A significant Q value indicates a lack of homogeneity among study findings. However, the Q statistic has low power as a comprehensive test of heterogeneity,26 especially when the number of studies is small (ie, most meta-analyses), and too much power when the number of studies is large.27 The I-square statistic was therefore used to quantify heterogeneity; following the approach of Higgins and Thompson,27 heterogeneity was classified as low (25%–50%), moderate (50%–75%), or high (≥75%). Potential publication bias was assessed using the Egger t test.
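The relationship between inverse-variance weighting, the Q statistic, I-square, and the random-effects pooled estimate can be sketched as follows. This is a minimal example that assumes the DerSimonian-Laird estimator of between-study variance, a common choice that the text does not explicitly confirm for the software used; the function name and the input effect sizes are hypothetical.

```python
import math


def random_effects_pool(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model (assumed estimator)."""
    k = len(effects)
    w = [1 / v for v in variances]  # fixed-effect inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effect estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # I-square, in percent
    # Random-effects weights add tau2 to each within-study variance
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {
        "g": pooled,
        "ci": (pooled - 1.96 * se, pooled + 1.96 * se),
        "Q": q,
        "I2": i2,
        "tau2": tau2,
    }


# Hypothetical Hedges g values and variances from two trials
result = random_effects_pool([-0.8, -0.2], [0.04, 0.04])
print(result)
```

Because the between-study variance is added to every study's variance, the random-effects weights are more nearly equal than fixed-effect weights, and the pooled confidence interval widens as heterogeneity grows.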
Results
We identified 2,574 studies through database searching. No additional eligible studies were identified through other sources. After removing duplicates and screening titles and abstracts of all remaining unique articles, 55 full-text articles needed to be assessed to verify their eligibility for the inclusion in the present study. Ultimately, 46 of them were excluded for various reasons (Fig. 1), resulting in 9 studies28–36 included in the qualitative synthesis, 7 of which were eligible for quantitative synthesis by pooling their data for meta-analysis. Overall, the 9 included studies, conducted in Europe, Australia, and Asia, were published from 1999 to 2012, with 7 of them being published in the last decade. Specifically considering only the 7 pooled studies, the number of patients who were enrolled and completed baseline assessments ranged from 20 to 265, with a mean sample size of 92 participants. The mean age of the study participants was approximately 39 years (range=29–45). The majority of the participants were female (90%).
Figure 1. Flowchart of the selection of the studies for the present meta-analysis.
Quality Assessment
Trial quality was generally medium, with 5 out of 9 trials scoring at least 5 on the PEDro scale28–30,33,34 (Tab. 1). The quality criteria related to blinding were never met. However, it should be noted that blinding patients or therapists is not feasible in trials involving exercise as the intervention. Another quality criterion that was commonly unmet (only 2 out of 9 studies) was the requirement that at least one key outcome was obtained from more than 85% of the participants initially allocated to groups.29,34
Outcomes of Treatment
Table 2 presents the follow-up study findings for pain and disability with respect to the pooled effect size for intervention outcomes, 95% CI values, assessment of heterogeneity across studies (Q and I-square statistics), and Egger t test for potential publication bias. Forest plots for each outcome are shown in Figures 2 and 3. Forest plots depict the effect size calculated for each study by outcome as well as the overall effect size obtained for the outcome across studies at each time interval. Forest plots also indicate whether the effects obtained in each study and across studies favor the control group or the intervention group. When more than one form of TE was explicitly analyzed in the same study, each form was assigned a letter in alphabetical order.
Table 2. Pooled Effect Sizes of Outcomes for People With Chronic Nonspecific Neck Pain
Figure 2. Standardized difference in means and 95% confidence intervals (95% CI) for effect of the therapeutic exercise on pain at short-term and intermediate-term follow-ups compared with control. Superscript letters a, b, and c represent the different arms of a single study following the order as reported in Table 1.
Figure 3. Standardized difference in means and 95% confidence intervals (95% CI) for effect of the therapeutic exercise on disability at short-term and intermediate-term follow-ups compared with control.
Pain.
Nearly all studies (n=6/7) assessed this outcome in the short term, 5 studies had intermediate-term follow-up, and only 1 study had long-term follow-up. Because 2 studies had more than one experimental arm, these RCTs had 9 intervention protocols to analyze for short-term effect. There were 7 treatment arms in the 5 studies that reported intermediate-term follow-up. Among the 6 studies30–35 that assessed pain during the first month after the intervention, the overall effect size of TE was medium and significant (g=−0.53, 95% CI=−0.86 to −0.20). In the 5 studies31–34,36 that assessed pain between 1 and 6 months after the intervention, the overall effect size of TE was medium and significant (g=−0.45, 95% CI=−0.82 to −0.07). Only 1 study34 assessed pain between 6 and 12 months after the intervention, and its effect size was very small and not significant (g=−0.04). A moderate heterogeneity of findings appeared for 2 outcomes: short-term pain30–35 and intermediate-term pain31–34,36 (P<.05). A significant and positive Egger t test appeared for one outcome (ie, short-term pain)30–35 (P<.05).
Disability.
The majority of studies (n=4/7) assessed this outcome in the short term, 3 studies had intermediate-term follow-up, and only 1 study had long-term follow-up. For the 4 studies30,31,33,34 that assessed disability during the first month after the intervention, the overall effect size of TE was medium but not significant (g=−0.39). In the 3 studies31,33,34 that assessed disability between 1 and 6 months after the intervention, the overall effect size was medium but not significant (g=−0.46). Only 1 study34 assessed disability between 6 and 12 months after the intervention, and the overall effect size was very small and not significant (g=−0.14). A high heterogeneity of findings appeared for 2 outcomes: short-term disability30,31,33,34 and intermediate-term disability.31,33,34 No significant and positive Egger t test was found for any of the 3 outcomes.30,31,33,34
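The link between a reported 95% CI and statistical significance can be checked with a short calculation. The sketch below recovers an approximate standard error, z value, and two-sided P value from a point estimate and its CI, assuming a normal approximation; the function name is hypothetical, and the software used for the analysis may compute these quantities differently. The inputs are the short-term pain and short-term disability estimates reported in the abstract.

```python
import math


def z_and_p_from_ci(estimate, ci_low, ci_high, z_crit=1.96):
    """Approximate SE, z, and two-sided P value from a 95% CI (normal approximation)."""
    se = (ci_high - ci_low) / (2 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return se, z, p


# Short-term pain: g=-0.53, 95% CI=-0.86 to -0.20 (CI excludes 0)
print(z_and_p_from_ci(-0.53, -0.86, -0.20))

# Short-term disability: g=-0.39, 95% CI=-0.86 to 0.07 (CI includes 0)
print(z_and_p_from_ci(-0.39, -0.86, 0.07))
```

Under these assumptions, the pain estimate yields a P value below .05 and the disability estimate a P value above .05, which agrees with the reading that a CI excluding zero marks a significant effect.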
Discussion
This updated systematic review and meta-analysis aimed to determine a more accurate estimate of the effect of TE on pain and disability outcomes in people with CNSNP. We found 9 studies28–36 investigating the efficacy of TE that met our inclusion criteria, of which 7 were deemed appropriate for a meta-analysis. The most important finding we obtained by pooling these 7 studies was a medium and significant overall effect size for TE in reducing pain in the short term (<1 month) and intermediate term (1–6 months) and a medium but not significant overall effect size in reducing disability in the short term and intermediate term. It was not possible to calculate an overall effect size for TE at long-term follow-up (6–12 months) due to the lack of studies examining this endpoint.
From a qualitative point of view, our results are in line with those presented by most of the literature in recent years10–14,37 that has supported the benefit of TE in the management of chronic neck pain. One of the earliest complete systematic overviews and meta-analyses on conservative management of mechanical neck pain, published by Aker et al in 1996,38 only cautiously recommended manual treatments in combination with other treatments, among which TE would be included. More recently, Hurwitz and colleagues from the US Bone and Joint Initiative39 have suggested that therapies involving exercise are more effective than alternate strategies for management of neck pain. Our analysis specifically contributes to highlighting the efficacy of TE alone for the management of CNSNP, particularly given that we found a significant overall effect size supporting this kind of intervention for reducing pain in the short term and intermediate term, which does not appear to have been reported in the literature.
From a quantitative point of view, these findings are different from those obtained by 2 other recent systematic reviews and meta-analyses on this topic.14,37 Gross et al,37 in 2007, concluded that exercise alone demonstrated intermediate-term and long-term benefits in reducing both pain and disability, whereas Leaver et al,14 in 2010, found specific exercises able to produce only a significant short-term effect on pain reduction. The discordance between these 2 conclusions was one of the reasons for undertaking the present study. Our intent was to extrapolate a more accurate estimate of the overall potential efficacy of TE in the management of CNSNP by addressing some methodological issues that had not previously been taken into account (ie, isolating studies dealing specifically with adults with CNSNP of at least 3 months' duration as the population of interest and specifically TE as the intervention).
Study Limitations
The most important limitation of the present work is the limited number of available studies that prevented us from making additional analyses and resolving other methodological issues. As a consequence, we were not able to explain our data heterogeneity by conducting subgroup analyses or to detect the presence of some potential mediating factors (eg, type, duration, intensity, and frequency of training regimens or particular population characteristics).
Another limitation is the quality of the included studies, which was generally medium to low. The requirements for at least one key outcome to be obtained from more than 85% of the participants initially allocated to groups and for an analysis by “intention to treat” were typically not met. The blinding criteria of the PEDro scale lower the methodological quality score of exercise-related trials even though blinding all patients and therapists may not be feasible.10,11 Publication bias is another potential limitation of our review. A strong publication bias, however, is unlikely because studies in all languages and for any year of publication were included and authors of included studies were contacted for any unpublished data. Furthermore, although the Egger t test turned out to be significant for one outcome, it is known that the meaningfulness of such a test suffers from the small number of studies and small samples and from the heterogeneity and different quality of the studies. Using only clinical trials may have influenced the potential publication bias, but also allowed us to derive our conclusions from higher-quality studies.
Clinical Implications
Taken together, the results of our meta-analysis support the use of TE in the management of pain associated with CNSNP. In particular, the overall effect size of TE derived from the pooled studies supports the use of exercise programs for reducing pain in the short term (<1 month) and intermediate term (1–6 months). It was not possible to evaluate the efficacy of TE at long-term follow-up (6–12 months) due to the lack of studies examining this endpoint.
Future Research
Future studies are needed to clarify the efficacy of different forms of TE, specifically in different subgroups of people with CNSNP who may have different etiologies or prognoses that help to explain outcomes.40 The possibility of spontaneous relief of chronic symptoms, as reported in control groups of several RCTs,33,34,36 as well as the baseline presence of negative prognostic factors could greatly change final results, independently of the true efficacy of the TE under investigation. It will be imperative, therefore, to grow the body of evidence in favor of TE by conducting well-designed RCTs with higher quality scores and to describe more precisely the population studied and the exercise regimen used. Future studies also should account for the time required for tissue adaptations as a result of TE when determining an appropriate time frame for follow-up.10,13 Only then can we begin to understand the effectiveness of TE for this condition in routine clinical practice.
Footnotes
Dr Bertozzi, Dr Gardenghi, Dr Villafañe, Dr Capra, Dr Guccione, and Dr Pillastrini provided concept/idea/research design. Dr Bertozzi, Dr Gardenghi, Dr Turoni, Dr Capra, Dr Guccione, and Dr Pillastrini provided writing. Dr Gardenghi, Dr Turoni, and Dr Capra provided data collection. Dr Bertozzi, Dr Capra, and Dr Guccione provided data analysis. Dr Capra provided project management. Dr Villafañe and Dr Pillastrini provided consultation (including review of manuscript before submission).
- Received October 4, 2012.
- Accepted April 1, 2013.
- © 2013 American Physical Therapy Association