Abstract
Background: Recovery from low back pain (LBP) is multidimensional and requires the use of multiple response (outcome) measures to fully reflect these many dimensions. Prognostic variables that are present or stable in all or most predictive models that use different outcome measures could be considered “universal” prognostic variables.
Objective: The aim of this study was to explore the potential of universal prognostic variables in predictive models for 4 different outcome measures in patients with mechanical LBP.
Design: Predictive modeling was performed using data extracted from a randomized controlled trial. Four prognostic models were created using backward stepwise deletion logistic, Poisson, and linear regression.
Methods: Data were collected from 16 outpatient physical therapy facilities in 10 states. All 149 patients with LBP were treated with manual therapy and spine strengthening exercises until discharge. Four different measures of response were used: Oswestry Disability Index and Numeric Pain Rating Scale change scores, total visits, and report of rate of recovery.
Results: The set of statistically significant predictors was dependent on the definition of response. All regression models were significant. Within both forms of the 4 models, meeting the clinical prediction rule for manipulation at baseline was present in all 4 models, whereas no irritability at baseline and diagnosis of sprains and strains were present in 2 of 4 of the predictive models.
Limitations: The primary limitation is that this study evaluated only 4 of the multiple outcome measures that are pertinent for patients with LBP.
Conclusions: Meeting the clinical prediction rule was prognostic for all outcome measures and should be considered a universal prognostic predictor. Other predictive variables were dependent on the outcome measure used in the predictive model.
Prognostic studies have been identified as a research priority as health care providers attempt to differentiate between patients with a more favorable prognosis and those with a poor prognosis.1 Prognostic studies involve the identification of baseline characteristics that are associated with a specific outcome at a given time point. As an example, researchers recently have recognized that patients with low back pain (LBP) who have selected characteristics tend to have a good prognosis regardless of treatment provided.2 Using this information, more intensive care can be provided to patients with a poor general prognosis, whereas less intensive care can be provided to those who are inclined to improve regardless of intervention. Use of prognosis in this respect has the potential to decrease overall health care costs associated with selected treatment interventions for specific conditions and allow further exploration for others.3
It is important to recognize the differences between prognostic studies and prescriptive studies. Prognostic studies look only toward whether outcome is favorable in patients who have dedicated baseline characteristics, whereas prescriptive studies are designed to identify targeted interventions based on a collection of baseline findings to improve the potential for a successful outcome. A number of limitations exist for prescriptive studies, including the use of single-arm designs, which cannot provide information on treatment effects; small sample sizes; overfitting the model; spectrum bias; report of wide confidence intervals (CIs), resulting in imprecise predictor variables; and lack of validation studies.4–7 Much of the supposed improvement in many prescriptive studies is likely associated with the identification of individuals with dedicated baseline findings who are inclined to improve regardless of the intervention provided.8 The identification of patients with predictive characteristics who are inclined to improve regardless of a targeted intervention is the hallmark of a prognostic study.
A potentially overlooked limitation in prognostic studies is the fact that the strength of identified or reported predictor variables is dependent on the definition of response (the outcome variable and the definition of a successful outcome for that variable). This limitation raises concern when, during treatment decision making, clinicians rely solely on selected predictive models that were derived from a single outcome measure. In essence, prognostic variables based on a single outcome measure may not fully represent all aspects of recovery from multidimensional conditions such as LBP.
Because recovery from LBP is complex and involves multiple dimensions, Deyo and colleagues9 suggested broadening the contextual scope and including a battery of outcome variables that reflect the presence of symptoms, function, general well-being, work disability, and satisfaction with care. Although not mentioned by Deyo and associates,9 the measurement of pain also is considered an important outcome measure associated with LBP recovery.10 In addition, cost-effectiveness as an outcome measure has been used in previous studies.11,12 At present, specifically in terms of recovery, there is no fixed outcome that is considered appropriate for all contexts of LBP.13 To our knowledge, there are no cases in which the multiplicity of dimensions is considered when creating predictive models for patients with LBP. Consequently, it is likely that different outcome measures could generate very dissimilar predictive models, depending on the instrument used to capture the construct.
A previous study by Weigl et al14 identified such limitations with reported predictor variables across different definitions of response (outcome measures). In their study of identified predictors for response to rehabilitation in patients with hip or knee osteoarthritis, the authors were able to report on a set of what they termed “stable predictors” across 3 different predictive models with different response definitions (outcome measures). They reported that predictors that are not stable across different definitions of response can weaken the variables' credibility as predictors, whereas prognostic variables that are present or stable in all or most models, regardless of the outcome measure, could be considered “universal prognostic variables” and warrant further exploration. If a prognostic variable is truly robust, it is likely to manifest itself across all outcome measures used.15 The objective of the current study was to explore the potential of universal prognostic variables across 4 different outcome measures, both when the outcome was dichotomized and when it was preserved as a continuous measure, using data from a manual therapy (manipulation versus mobilization) trial for LBP. We hypothesized that selected prognostic variables would be “universal” prognostic variables, regardless of the outcome measures used within each model.
Method
Design
This study was a secondary database analysis of a randomized controlled trial (RCT). The RCT compared thrust and nonthrust manipulation in the management of LBP; thus, both groups in the study received a dedicated manual therapy approach.
Participants
The RCT enrolled 149 patients with LBP. All patients were from 16 distinct outpatient physical therapy practices in the United States. Study inclusion required an age of 18 years or older with mechanically producible LBP, whereas exclusion criteria included the presence of a tumor, metabolic diseases, rheumatoid arthritis, osteoporosis, prolonged history of steroid use, or signs consistent with nerve root compression (any of the following: reproduction of LBP or leg pain with a straight leg raise of <45°, muscle weakness involving a major muscle group of the lower extremity, diminished lower-extremity muscle stretch reflex, or diminished or absent sensation to pinprick in any lower-extremity dermatome). Individuals with a prior surgical history of the lumbar spine and current pregnancy also were excluded. Prior to inclusion in the study, all participants signed an informed consent statement.
Intervention
The intervention was a comprehensive rehabilitation intervention that included either thrust or nonthrust manipulation for the first 2 visits only, followed by physical therapist–directed care after the initial 2 visits. The details of the intervention have been described elsewhere.16 All patients in the RCT were treated by 1 of 17 highly trained physical therapists from 1 of 10 states in the United States. Clinicians had undergone extensive manual therapy training or certification in orthopedic manual therapy, or were manual therapy fellows of the American Academy of Orthopaedic Manual Physical Therapists.
Outcome Measures Used in the Predictive Models
Four outcome measures were used for constructing the predictive models and were selected based on their variability of constructs. The 4 outcome measures represented: (1) disability, (2) pain perception, (3) total visits, and (4) perception of extent of recovery. The Oswestry Disability Index (ODI)17 was used to measure disability and is a scale that consists of 10 questions, each scored from 0 to 5, with higher scores indicating greater disability. The Numeric Pain Rating Scale (NPRS)18 was used to measure the patient's level of pain. For the NPRS, participants were asked to indicate the intensity of their current back pain using an 11-point ordinal scale ranging from 0 (“no pain”) to 10 (“worst pain imaginable”). The variable of total visits was used to measure duration of treatment. Self-report of extent of recovery (0%–100%) was used to measure perception of recovery. Participants were asked the question, “What percent, 0% (meaning not at all) to 100% (meaning totally recovered), do you feel that you have recovered at this point?” The self-report of extent of recovery is a variant of the Single Assessment Numeric Evaluation (SANE),19 which has been used previously with patients with shoulder pain19 and LBP.20 Data for the 4 outcome measures were obtained at baseline, at the end of the second visit, and at discharge. Clinicians discharged each patient when they judged that the patient had reached maximum recovery from treatment. The mean duration of care (evaluation date to discharge date) was 35.7 days (SD=29.9).
Prognostic Variables Used in the Predictive Model
Ten prognostic variables were selected based on prior representation in the published literature, or based on expectations derived from clinical experience. The variables of body mass index (BMI),21,22 NPRS at baseline,23 ODI at baseline,24 Fear-Avoidance Beliefs Questionnaire work subscale (FABQ-W) at baseline,25 whether an individual met the clinical prediction rule (CPR) for spinal manipulation,26 duration of symptoms (weeks), and age23 are well founded in the literature and have been acknowledged as prognostic variables. The FABQ-W27 is a 7-item questionnaire that examines an individual's beliefs about the relationship of work and pain. Fear-avoidance beliefs have been associated with current and future disability and with work loss in patients with acute and chronic LBP. The CPR for manipulation28,29 is a predictive model that involves 5 variables (no pain below knee, symptoms of <16 days' duration, FABQ-W score of <19, 1+ hips with internal rotation range of motion of >35°, and 1+ hypomobile lumbar segment) and has been proposed to be both prescriptive and prognostic.30 The variables were coded as present or not present during the initial baseline visit, with an operational definition of meeting the rule set of at least 4 out of 5 variables being present.
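The operational definition of meeting the CPR (at least 4 of the 5 criteria present at baseline) can be illustrated with a minimal sketch. The function and parameter names below are hypothetical and chosen only for illustration; they do not come from the study's data set or analysis files.

```python
def count_cpr_criteria(pain_below_knee: bool,
                       symptom_duration_days: float,
                       fabq_w_score: float,
                       max_hip_internal_rotation_deg: float,
                       hypomobile_lumbar_segments: int) -> int:
    """Count how many of the 5 CPR criteria are present at baseline.

    Parameter names are illustrative assumptions, not study variable names.
    """
    criteria = [
        not pain_below_knee,                 # no pain below the knee
        symptom_duration_days < 16,          # symptoms of <16 days' duration
        fabq_w_score < 19,                   # FABQ-W score of <19
        max_hip_internal_rotation_deg > 35,  # at least 1 hip with internal rotation >35 degrees
        hypomobile_lumbar_segments >= 1,     # at least 1 hypomobile lumbar segment
    ]
    return sum(criteria)


def meets_cpr(*args, **kwargs) -> bool:
    """Apply the rule as operationalized in the text: at least 4 of 5 criteria."""
    return count_cpr_criteria(*args, **kwargs) >= 4
```

For example, a patient with no pain below the knee, 10 days of symptoms, an FABQ-W score of 25, 40° of hip internal rotation, and 1 hypomobile segment satisfies 4 criteria and would be coded as meeting the rule.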
The variables of irritability30,31 and medical diagnosis (using the International Classification of Diseases, ninth edition [ICD-9 code]),32 were selected based on clinical experience. Irritability was a concept espoused by Maitland33 and includes 3 primary situational identifiers: (1) the vigor of activity required to provoke a patient's symptoms, (2) the severity of those symptoms, and (3) the time it takes for the symptoms to subside once aggravated (ie, pain persistence). The variable was dichotomously coded, as recommended by Maitland,33 as present or not present, with present qualified as any one or more excessive findings recognized on the 3 identifiers. Medical diagnosis codes included: (1) Lumbar Sprain/Strain (ICD-9 code 847.2), (2) Disc Displacement (ICD-9 code 722.1), (3) Lumbago (ICD-9 code 724.2), (4) Lumbosacral Joint Sprain/Strain (ICD-9 code 846.0), (5) Lumbosacral Instability (ICD-9 code 724.6), and (6) “other,” which represented a number of ICD-9 codes, including those associated with osteoarthritis and degenerative conditions. We created a dichotomous variable titled “diagnosis” by combining all strains and sprains (ICD-9 codes 847.2 and 846.0) into one group and combining all other remaining codes (722.1, 724.2, 724.6, and “other”) into another category.
Because the data that were used were part of an RCT, and despite the fact that the study found no difference between allocation of thrust or nonthrust as a technique, group allocation (thrust versus nonthrust assignment) also was included in the modeling to estimate its overall contribution to the outcomes. Group allocation was included in all 4 models as a predictor.
Data Analysis
All analyses were performed using Statistical Package for the Social Sciences, version 18.0 (SPSS Inc, Chicago, Illinois). Baseline characteristics, including means, standard deviations, and frequencies, were reported.
Logistic Regression Modeling
After dichotomizing the outcome variables, logistic regression modeling with a backward stepwise deletion (0.05 enter and 0.10 exit)34 was used to create predictive models. The ODI was dichotomized at a 50% improvement,35 whereas the NPRS was dichotomized at 2.5 points, which has been recommended as a clinically meaningful change for patients with nonspecific LBP.36 The 50% improvement on the ODI was calculated as follows: [(baseline ODI score−final ODI score)/(baseline ODI score)] × 100. We selected a self-reported extent of recovery of ≥75% a priori because there are no values in the literature that provide a meaningful cutoff. In addition, we selected a cutoff of 6 or fewer total visits for LBP because we felt that it more accurately represents both good comprehensive care and a cost-effective recovery. As stated previously, the 10 predictors used for modeling were BMI, baseline NPRS score, baseline ODI score, baseline FABQ-W score, group allocation (whether the participant received mobilization or manipulation), whether the participant met the CPR for manipulation, duration of symptoms (weeks), medical diagnosis, irritability status, and age.
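The dichotomization rules described above can be sketched as follows. This is an illustration only: the class and field names are hypothetical, and treating the ODI cutoff as inclusive (≥50%) is an assumption, since the text states only that the ODI was dichotomized "at a 50% improvement."

```python
from dataclasses import dataclass


@dataclass
class PatientOutcome:
    # Field names are illustrative assumptions, not study variable names.
    baseline_odi: float
    final_odi: float
    baseline_nprs: float
    final_nprs: float
    recovery_percent: float  # self-reported extent of recovery, 0-100
    total_visits: int


def odi_percent_improvement(baseline: float, final: float) -> float:
    """[(baseline ODI - final ODI) / baseline ODI] x 100, as given in the text."""
    if baseline == 0:
        raise ValueError("percent improvement is undefined for a baseline ODI of 0")
    return (baseline - final) / baseline * 100


def dichotomize(p: PatientOutcome) -> dict:
    """Return the 4 binary responses used as logistic regression outcomes."""
    return {
        # Assumption: the 50% cutoff is inclusive.
        "odi_success": odi_percent_improvement(p.baseline_odi, p.final_odi) >= 50,
        "nprs_success": (p.baseline_nprs - p.final_nprs) >= 2.5,
        "recovery_success": p.recovery_percent >= 75,
        "visits_success": p.total_visits <= 6,
    }
```

Changing any one of these cutoffs changes which participants fall in the index group, which is why the selected predictors can differ between dichotomized and continuous versions of the same outcome.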
Linear Regression Modeling (Poisson Regression)
Prognostic models were created using backward linear stepwise regression (0.05 enter and 0.10 exit),34 which targeted the final strongest model that was significant and had the highest F value (effect size) and the greatest explanatory power for the outcome variable (R2) for the ODI change scores, NPRS change scores, and self-report of extent of recovery. Because the variable of total visits was count data, we used a Poisson regression analysis, which expresses the log outcome rate as a linear function of a set of predictors. We targeted the z scores and R2 for the Poisson model. The same 10 predictors used for logistic regression modeling were used for linear regression modeling. A test of normal distribution of the model for linear regression was performed using a normal probability plot and a plot of the residuals, which used case-wise diagnostics and removal of outliers that were 3 standard deviations outside the mean.
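To make the Poisson formulation concrete, the sketch below shows how a model that is linear on the log scale translates into expected visit counts and rate ratios. The intercept and coefficient values are hypothetical and are not taken from Table 3; the function names are likewise illustrative.

```python
import math


def expected_count(intercept: float, coefficients: dict, predictors: dict) -> float:
    """Expected count under a Poisson model: exp(b0 + sum of b_i * x_i)."""
    linear_predictor = intercept + sum(
        coefficients[name] * predictors[name] for name in coefficients
    )
    return math.exp(linear_predictor)


def rate_ratio(coefficient: float) -> float:
    """Multiplicative change in the expected count per 1-unit increase in a predictor."""
    return math.exp(coefficient)


# Hypothetical values for illustration only (not study estimates).
intercept = math.log(8.0)             # 8 expected visits when all predictors are 0
coefficients = {"meets_cpr": -0.3}    # a negative coefficient implies fewer expected visits

visits_without_cpr = expected_count(intercept, coefficients, {"meets_cpr": 0})
visits_with_cpr = expected_count(intercept, coefficients, {"meets_cpr": 1})
```

Under these made-up values, the ratio of `visits_with_cpr` to `visits_without_cpr` equals `rate_ratio(-0.3)`, which is how a Poisson coefficient is usually interpreted.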
To assess collinearity in the modeling, variance inflation factor (VIF) and tolerance values were calculated for each covariate. A mean VIF close to 1 represents little collinearity, whereas a value of 10 or greater is very poor and reflects high collinearity.37 For all regression calculations, a P value of ≤.05 was considered significant.
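As a rough illustration of what the VIF measures, the sketch below computes VIFs as the diagonal of the inverse of the predictors' correlation matrix, a standard identity equivalent to 1/(1 − R²) from regressing each predictor on the others. It is a plain-Python sketch with hypothetical function names, not the SPSS procedure used in the study; tolerance is simply 1/VIF.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length predictor columns.

    Assumes every column has nonzero variance.
    """
    n = len(columns[0])
    scaled = []
    for col in columns:
        mean = sum(col) / n
        deviations = [v - mean for v in col]
        norm = sum(d * d for d in deviations) ** 0.5
        scaled.append([d / norm for d in deviations])
    k = len(columns)
    return [[sum(a * b for a, b in zip(scaled[i], scaled[j])) for j in range(k)]
            for i in range(k)]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def variance_inflation_factors(columns):
    """VIF for each predictor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


def tolerances(columns):
    """Tolerance is the reciprocal of the VIF."""
    return [1.0 / v for v in variance_inflation_factors(columns)]
```

Uncorrelated predictors give VIFs of 1, and nearly duplicated predictors give VIFs well above 10, which matches the interpretation thresholds cited in the text.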
Results
Table 1 outlines the descriptive statistics of the sample. The age range of the sample was diverse and included individuals 18 to 88 years of age (X̅=48.2, SD=14.9). Most participants (73.2%) did not display irritability at baseline, and the majority (91.3%) were white. Body mass index* ranged from 18.7 to 46.7 kg/m², total visits ranged from 3 to 28, and total days of care ranged from 3 to 150. Just over half of the participants (50.3%) demonstrated acute LBP, with the total sample exhibiting an average duration of symptoms of 33.9 weeks (SD=98.9). The baseline means for the ODI and the NPRS indicated a moderately debilitated group (X̅=30.6, SD=15.7 and X̅=5.2, SD=2.1, respectively).
Table 1. Descriptive Statistics of the Sample (N=149)
There was little to no collinearity among the variables in each of the 4 models. Our VIFs were 1.04 for the ODI, 1.04 for the NPRS, 1.02 for total visits, and 1.01 for extent of recovery.
For the logistic regression modeling, all 4 models were significant. After the backward stepwise deletion, 3 variables were significantly associated with a 50% reduction in ODI scores, 3 variables were significantly associated with an NPRS change score of ≥2.5, 3 variables were significantly associated with 6 or fewer visits, and 2 variables were significantly associated with an extent of recovery score of 75% or higher. Meeting the CPR at baseline was significant for all 4 predictive models, followed by no presence of irritability at baseline (2 of 4 models), diagnosis (2 of 4 models), and younger age (1 of 4 models). Group allocation (whether the participants received mobilization or manipulation) was not associated with the outcome for any of the 4 predictive models. Participants who met the CPR were 4.8 (95% CI=1.8, 10.4) times as likely to improve as those who did not meet the CPR, for a change score of ≥2.5 on the NPRS, and 4.0 (95% CI=1.6, 9.8) times as likely to improve as those who did not meet the CPR, for an improvement in rate of recovery of 75% or greater. The logistic regression modeling results are presented in Table 2.
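The notion of a "universal" predictor amounts to counting how often each variable is retained across the models. The sketch below illustrates that tally. The model memberships are pieced together from statements in the Results and Discussion text rather than copied from Table 2, the NPRS entry is deliberately incomplete because the text does not name its other retained variables, and all labels and function names are illustrative.

```python
from collections import Counter

# Retained predictors per logistic model, as far as the narrative text states them.
# Assumption: this is a partial reconstruction, not a transcription of Table 2.
retained_predictors = {
    "ODI >=50% improvement": {"met CPR", "diagnosis", "younger age"},
    "NPRS change >=2.5": {"met CPR"},  # other retained variables not named in the text
    "6 or fewer visits": {"met CPR", "no irritability", "diagnosis"},
    "recovery >=75%": {"met CPR", "no irritability"},
}


def predictor_counts(models: dict) -> Counter:
    """Number of models in which each predictor was retained."""
    return Counter(p for predictors in models.values() for p in predictors)


def universal_predictors(models: dict) -> list:
    """Predictors retained in every model."""
    counts = predictor_counts(models)
    return sorted(p for p, c in counts.items() if c == len(models))
```

With these inputs, only meeting the CPR is retained in all 4 models, while irritability and diagnosis each appear in 2 and younger age in 1, consistent with the counts reported above.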
Table 2. Backward Stepwise Logistic Regression Modeling, Including Measured Variables, Individual P Values for Each Model Variable, and Odds Ratios and 95% Confidence Intervals
For the linear regression modeling, all 3 of the predictive models demonstrated normality and were significant (P<.01), with the NPRS change score model exhibiting the highest F value (57.4) and the strongest R2 value (61.9). Five predictor variables were associated with the ODI change score, 4 variables were associated with the NPRS change score, and 2 variables were associated with the extent of recovery. For the Poisson regression analysis, 3 variables were associated with total visits. Individual prognostic variables within the models included: (1) meeting the CPR at baseline (4 of 4 models), (2) lower ODI score at baseline (2 of 4 models), (3) NPRS score at baseline (1 of 4 models), (4) diagnosis (2 of 4 models), (5) no presence of irritability at baseline (2 of 4 models), (6) duration of symptoms at baseline (2 of 4 models), and (7) younger age (1 of 4 models). Within the 4 models, meeting the CPR for manipulation at baseline was the only variable present in all 4 of the models. Table 3 presents the results of the backward linear regression modeling (and Poisson modeling for total visits), including captured variables, individual P values for each prognostic variable, beta coefficient values, and CIs (or z scores) for each prognostic variable, model adjusted R2, and model significance.
Table 3. Backward Linear Regression Modeling, Including Measured Variables, Individual P Values for Each Model Variable, Coefficient β Values for Each Variable (and 95% Confidence Intervals), Model F Values, Model Adjusted R2 Values, and Model Significance; and Poisson Regression Analysis, Including Captured Variables, Individual P Values for Each Model Variable, Coefficient Values for Each Variable (and 95% Confidence Intervals), Model χ2 Values, Model Adjusted Pseudo R2 Values, and Model Significance
Discussion
Our study endeavored to determine which variables were associated with prognosis in a sample of patients who received a manual therapy approach in the management of LBP. Our objective was to explore the potential of universal prognostic variables across 4 different outcome measures, both when the outcome was dichotomized and when it was preserved as a continuous measure. Other authors15 have suggested that predictive variables that are significant across all regression models are more robust and stable. We elected to examine 4 unique outcome measures representing disability, pain, total visits, and self-perceived extent of recovery, and we opted both to dichotomize the outcome variables and to preserve them as continuous measures (or count variables when appropriate). These outcome variables were selected to represent the multidimensional aspects of recovery that are purported to be relevant in patients with LBP. In addition, looking at multiple outcomes and universal predictors across all the outcomes reduces the risk of finding predictors that occur by chance alone. Meeting the CPR for manipulation was identified as a universal prognostic variable. Other variables, such as diagnosis, shorter duration of symptoms, ODI score at baseline, and irritability at baseline, were present in 2 of 4 models in the linear (and Poisson) regression analysis, whereas irritability and diagnosis were present in 2 of 4 models in the logistic regression analysis.
The CPR for manipulation was the only prognostic variable represented in all 4 of the prognostic predictive models (for different outcomes); thus, it is the only universal predictor of all of the prognostic variables assessed. It is worth noting that it was present in all 4 of the models for both the linear and logistic regressions. Thus, in this study, individuals with LBP who received both mobilization and manipulation approaches with a multimodal treatment and who met the CPR for manipulation were likely to respond favorably (across several definitions) compared with those who did not meet the CPR. These findings support the suggestions of Kent and colleagues,26 who, after use of a novel formula, identified that the CPR for lumbar manipulation was both prognostic and prescriptive of a positive response to a specific treatment. Indeed, the method used in most current CPR derivation studies has included single-arm trials, which has been recognized as an erroneous mechanism for determining prescriptive predictors.5,8 Single-arm trials likely capture prognostic variables, in addition to or in lieu of prescriptive variables, and fail to differentiate a true treatment effect. It is worth noting that the original validation study for the CPR for lumbar manipulation29 was validated using a proper methodological design, which included an RCT that had 4 separate arms.
No irritability at baseline was associated with a positive outcome in 2 of 4 of the outcome measures (total visits and extent of recovery) for both forms of regression. Irritability is considered a composite construct, which is likely why past reliability estimates have been reported as fair (prevalence-adjusted bias-adjusted kappa=0.43; 95% CI=0.23, 0.62),31 and reliability for irritability does not appear to be related to acuity or chronicity. Irritability judgments typically are used by physical therapists to modify clinical decision-making processes.30 In most cases, when a patient is judged to be irritable, clinicians reduce the vigor of their examination and treatment; thus, there is a chance that less intense manual therapy and exercises were provided to those who were irritable. Why no irritability at baseline was related to extent of recovery and total visits, and not to ODI or pain, is unknown. Its presence in 2 of 4 models suggests that the concept of irritability warrants further exploration.
We dichotomized diagnostic ICD-9 codes into those that reflected lumbosacral sprains and strains and those that consisted of “other” conditions, most frequently, lumbago. The “other” category was selected as the reference variable. The finding that diagnosis was associated with prognosis and the outcome measures of ODI change score and 6 or fewer visits in the logistic and linear models was an interesting and unexpected finding. This finding suggests that a specific diagnosis can influence outcome in patients with LBP, a concept that is slightly different from the clinical practice guidelines suggested in a previous publication.38 Indeed, those participants diagnosed with sprains and strains exhibited significantly poorer overall outcomes compared with individuals with other ICD-9 codes, especially lumbago. The exclusion criteria in this study eliminated individuals with lumbar radiculopathy and other conditions that encompass 2 of the 3 categories in Chou and colleagues'38 low back recommendations; thus, it is unlikely that those comparative diagnoses included conditions with “red flags.” This finding also warrants further exploration.
Chronicity of symptoms does appear to matter for most outcome measures, as shorter duration of symptoms was present in 1 of 4 predictive logistic models and in 2 of 4 linear models and has been recognized as a feature in many prognostic studies.23 A shorter duration of symptoms is 1 of the 5 predictor variables in the CPR model,28,29 but was still uniquely represented in all of the backward stepwise regression models in which CPR also was identified as a prognostic factor, with low scores of collinearity. This variable is likely prognostic in that the negative psychosocial and debilitative consequences of chronic LBP are not overtly present yet and, as such, spontaneous recovery is likely in many cases.39
As previously stated, younger age and report of pain at baseline have been recognized previously as prognostic variables.23 Younger age was significantly associated with a 50% reduction in ODI scores, which, to our knowledge, is a novel finding. Although a number of studies have suggested that the FABQ40,41 or the FABQ-W42 was associated with a poor prognosis, we did not find an association with any of the 4 measures in our study. Cleland and colleagues43 also were unable to find a relationship between outcome and FABQ score in their prognostic study and suggested that the FABQ and its subscales are not an effective screen for poor recovery in people with non–work-related pain, a population similar to the participants in our study. The ODI score at baseline was a predictor for the linear regression models and was associated with ODI change scores and NPRS change scores.
It is worth noting that for the logistic regression analysis, we selected cutoffs that were advocated in the literature (when available), and we selected reasonable cutoffs when more than one recommendation was available as in the instance of a pain change score.37 We realize that our selection of cutoff scores for the outcome variables will significantly affect the prognostic relationships with the independent predictors. During research of an outcome, a cutoff value has to be chosen a priori to judge which changes were worthwhile.37 At this time, a universally accepted meaningful cut-point for recovery from LBP does not exist in the literature.13 Certainly, the cut-points are likely to be different based on the severity of symptoms within the population, the condition of interest, and other factors.
Limitations
There are limitations to this study. Only 4 outcome measure constructs were used in this study, and representation from measures that are associated with patient satisfaction and with presence and breadth of symptoms may more fully complement the multidimensional nature of LBP. A larger sample and better prognostic predictors would likely have improved the precision of the regression-based estimates, which resulted in CIs that were very wide in some cases. An RCT is not the optimal study design for prognostic studies. It is likely that the sample is less generalizable than a cohort study, a more appropriate form of design. The assessment of the prognostic outcomes was not blinded to prognostic findings in the study. In addition, dichotomization of continuous outcome measures, which is used during logistic regression, may misclassify some study participants into the referent or index groups. Thus, any time there are changes to the cut-point in a logistic regression, they potentially may change the proportions in the groups and may fail to find the same predictors within a dedicated model.
Conclusion
Our secondary database analysis shows that the CPR was prognostic for all outcome measures and should be considered a universal prognostic predictor. Different outcome measures used in separate predictor models are populated by different predictors. Variables that are not prognostic across several definitions of response may account for only a small part of the recovery of a patient, and we suggest that investigators in future prognostic studies consider reporting only those variables that remain constant across several definitions of response (outcomes). Future studies should investigate the universal prognostic variables for LBP recovery, or universal prescriptive variables for dedicated LBP treatment approaches before fully advocating single-outcome measures-oriented predictive models.
Footnotes
- Dr Cook, Dr Learman, Mr O'Halloran, and Mr Showalter provided concept/idea/research design. All authors provided writing. Dr Learman, Mr O'Halloran, Mr Showalter, and Mr Kabbaz provided data collection. Dr Cook and Dr Goode provided data analysis. Mr Showalter provided project management and institutional liaisons. Dr Learman, Mr O'Halloran, and Mr Kabbaz provided study participants and facilities/equipment. Dr Learman, Mr Showalter, Mr Kabbaz, Dr Goode, and Dr Wright provided consultation (including review of manuscript before submission).
- The study was approved by the Walsh University Human Ethics Review Board.
- The study was a secondary database analysis of a randomized controlled trial registered at ClinicalTrials.gov: Identifier NCT01438203.
- * Body mass index was calculated as: Weight (lb)/(Height [in])² × 703, where 1 lb=0.4536 kg and 1 in=2.54 cm.
- Received May 24, 2012.
- Accepted August 2, 2012.
- © 2013 American Physical Therapy Association