Abstract
Background Minimal clinically important improvement (MCII) is the smallest outcome measure change important to patients. Research suggests that MCII is dependent on patients' baseline functional status measures.
Objective The purposes of this study were: (1) to confirm whether MCII is dependent on patients' admission scores and (2) to test whether MCII is dependent on selected demographic characteristics.
Study Design and Setting This was a prospective, longitudinal, observational cohort study of 6,651 patients with orthopedic knee impairments treated in 332 outpatient rehabilitation clinics in 27 states in the United States.
Outcome Measures Patient self-reports of functional status (FS) from the Lower Extremity Functional Scale were assessed using a computerized adaptive testing application (0–100 scale).
Methods An anchor-based longitudinal method, with a 15-point Likert-type scale (−7 to +7), was used to provide a global rating of change (GROC). Meaningful improvement on the GROC was defined as a cut-score of +3 or greater, and the MCII was determined using nonparametric receiver operating characteristic curve analysis for each of the following variables: sex, symptom acuity, age group, and quartile of baseline FS scores.
Results The results showed that MCII was dependent on patient baseline and demographic characteristics. Patients who were male, were younger, had more-acute symptoms, or had lower FS scores at admission required more FS change to report meaningful change.
Limitations Because this study was a secondary analysis, it was unclear how the length of treatment mediated the relationship between the independent and dependent variables.
Conclusions Although a single MCII index may provide a standard cut-score defining the smallest FS change that is meaningful to patients, researchers and clinicians should be aware that MCII is context specific and not a fixed attribute. Current results may help researchers, clinicians, and policy makers to interpret FS change related to the importance of the change to the patient.
Measurement of patient-reported outcomes (PROs) in health care continues to evolve, with health status questionnaires increasingly being used in medical research and clinical practice.1–5 As recommended by the US Food and Drug Administration,6–8 regardless of whether the primary endpoint for a clinical trial is based on laboratory tests or clinical examinations, it is useful to display individual responses using PROs from the patient's perspective. An individual patient's pretreatment-to-posttreatment change in PRO score usually is interpreted as a treatment benefit. Moreover, clinical effectiveness studies must show whether the score changes have exceeded minimal clinically important change (ie, an a priori responder definition). Patients whose score changes reach this threshold are considered responders.6,9 The proportion of responders among all patients represents the likelihood of patients responding favorably to the treatment.9 Comparisons of interventions then can be assessed as differences in the percentage of responders, a number that is more meaningful to both patients and physicians.10
One approach to establishing the empiric evidence for any responder definition is to use anchor-based methods to derive the minimal clinically important difference (MCID),11,12 the smallest change score meaningful to the patient for an outcome measure. The MCID is the minimal amount of change on a scale required to be considered a clinically important change by the patient.13 Because MCID determined via an anchor-based method is independent of sample size and is based on patients' judgments of whether the amount of improvement after the intervention or therapy is meaningful to them, investigators are encouraged to develop MCID estimates of patient-reported outcome measures to support the empiric evidence for responder definition6 in clinical trials.
Because the MCID can refer to change scores for both improvement and deterioration and thus can carry conflated meanings, Hart and colleagues14–17 preferred the term “minimal clinically important improvement” (MCII) to denote the smallest improvement score important to patients. Recent work has implied that the MCID varies depending upon the range of scores at baseline,12,18,19 symptom acuity (ie, acute versus chronic),4,20,21 and pain location20 and has suggested that the MCID index is context specific.20 However, sample sizes of previous studies were small (n=60,19 143,18 191,20 226,12 and 44221), and only one factor per study was investigated. The phenomenon of MCID dependency across different patient characteristics therefore remains unclear, and the entire distribution of patient responses to the external anchor (eg, the 15-point global rating of change [GROC]) is unknown. Because the anchor-based MCID approach has become a popular way to define responders, a comprehensive understanding of this dependency is needed but has not yet been established, which warrants further study.
The current study built on previous work where we developed, simulated, and applied body part–specific computerized adaptive testing (CAT) applications14–16,22–24 for patients with a variety of impairments seeking rehabilitation in outpatient therapy clinics. The primary outcome measure was patient self-report of physical functional status (FS) from the Lower Extremity Functional Scale (LEFS) assessed using a CAT application (0–100 scale). Previous studies have shown that functional status measures estimated by the knee CAT are reliable, valid, responsive, sensitive to change (although dependent upon intake FS), usable, and clinically interpretable.14,22,25,26 In the current study, we examined whether MCII depends upon patients' scores at admission and selected demographic characteristics. We estimated MCII: (1) using all patients regardless of intake FS measure and (2) using patients grouped by sex, symptom acuity, age, and quartile of baseline FS scores.
Method
Data Collection
Data were collected from a prospective, longitudinal, observational cohort of patients with orthopedic knee impairments seeking rehabilitation in outpatient therapy clinics participating with Focus On Therapeutic Outcomes, Inc (FOTO),27,28,* a medical rehabilitation outcomes company. Patients seeking rehabilitation entered demographic data and completed self-report surveys using Patient Inquiry computer software developed by FOTO27,28 prior to initial evaluation and therapy. Clinical staff entered demographic data at intake. The CAT was administered again at the conclusion of rehabilitation. Data from these 2 administrations were labeled “intake” and “discharge,” respectively. The functional status change score was calculated by subtracting the FS score at intake from the FS score at discharge.
In addition, participants responded to a single question about degree of perceived improvement on a 15-point Likert-type scale (−7 to +7) to provide a GROC score11 at discharge as an anchor to define the magnitude of change perceived by the patient. Examples of “better” response options, compared with no change (0) or getting worse (−1 to −7), are: (1) “hardly any better at all,” (2) “a little better,” (3) “somewhat better,” (4) “moderately better,” (5) “a good deal better,” (6) “a great deal better,” and (7) “a very great deal better.”
Data were selected from the CAT database if patients: (1) were 18 years of age or older, (2) were managed for an orthopedic impairment of the knee, (3) received outpatient physical therapy, (4) completed the knee CAT between January 2005 and October 2009, and (5) reported GROC at discharge. Data represent a sample of convenience.
Knee CAT
Computerized adaptive testing is a form of computer-based test administration in which each patient takes a customized test where a computer administers items tailored to the current estimate of the patient's ability (eg, FS).29 Briefly, the adaptive test started by administering the most informative item30 at median-level difficulty (ie, walking 2 blocks) (see Appendix for the flowchart of how the LEFS was administered). After each response, the patient's FS ability with associated standard error was estimated.31 There were 2 stopping rules: (1) the standard error for the provisional ability was less than 4 out of 100 FS units, and (2) each change in provisional ability estimates for the last 3 administered items was less than 1 out of 100.22 If a stopping rule was not met, the computer selected the most informative item given the current FS estimate. The computer continued to administer items until a stopping rule was satisfied, at which time a final estimate of ability and its standard error were calculated. The final FS score represented a point estimate for each patient's lower-extremity functional status on a 0 to 100 scale, with higher scores representing higher functioning. Using the CAT algorithm, patients may receive a different set of items when taking a CAT at different time points, depending upon the change in their level of FS, but the FS scores were placed on the same metric, and thus scores were mathematically comparable. Functional status, as assessed using LEFS items, was operationally defined as the patient's perception of his or her ability to perform functional tasks described in the FS items, which represents the “activity” dimension of the World Health Organization's International Classification of Functioning, Disability and Health.32
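To make the adaptive algorithm concrete, the following minimal sketch illustrates the loop described above: select the most informative item for the current ability estimate, re-estimate ability and its standard error after each response, and stop when either stopping rule is met. It is written in Python and, for brevity, uses a simplified dichotomous Rasch model with hypothetical item calibrations and a hypothetical linear rescaling to the 0–100 metric; the actual knee CAT used the Andrich rating scale model with 5 response categories, so this is an illustration of the logic rather than the FOTO implementation.

```python
# Illustrative CAT loop (not FOTO's implementation). Uses a dichotomous Rasch
# model and a grid-search ability estimate; calibrations and scaling are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
item_difficulty = np.linspace(-2.5, 2.5, 18)   # hypothetical item calibrations (logits)
true_theta = 0.8                                # simulated patient ability

def prob_endorse(theta, b):
    """Rasch probability of reporting no difficulty on an item of difficulty b."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def information(theta, b):
    p = prob_endorse(theta, b)
    return p * (1.0 - p)                        # Fisher information under the Rasch model

def estimate_theta(responses, items, grid=np.linspace(-4, 4, 401)):
    """Grid-search maximum likelihood estimate of ability and its standard error."""
    loglik = np.zeros_like(grid)
    for x, b in zip(responses, item_difficulty[items]):
        p = prob_endorse(grid, b)
        loglik += x * np.log(p) + (1 - x) * np.log(1 - p)
    theta_hat = grid[np.argmax(loglik)]
    test_info = sum(information(theta_hat, b) for b in item_difficulty[items])
    return theta_hat, 1.0 / np.sqrt(test_info)

def to_fs_scale(theta, lo=-4.0, hi=4.0):
    """Rescale a logit estimate to the 0-100 FS metric (linear, illustrative)."""
    return 100.0 * (np.clip(theta, lo, hi) - lo) / (hi - lo)

administered, responses, history = [], [], []
theta_hat = 0.0                                 # start near median-level difficulty
while True:
    remaining = [i for i in range(len(item_difficulty)) if i not in administered]
    next_item = max(remaining, key=lambda i: information(theta_hat, item_difficulty[i]))
    administered.append(next_item)
    responses.append(int(rng.random() < prob_endorse(true_theta, item_difficulty[next_item])))
    theta_hat, se = estimate_theta(responses, administered)
    history.append(to_fs_scale(theta_hat))
    se_fs = se * 100.0 / 8.0                    # SE expressed on the 0-100 metric
    small_changes = (len(history) >= 4 and
                     all(abs(history[-k] - history[-k - 1]) < 1.0 for k in range(1, 4)))
    # Stopping rules: SE < 4 FS units, last 3 changes < 1 FS unit, or item bank exhausted.
    if se_fs < 4.0 or small_changes or len(administered) == len(item_difficulty):
        break

print(f"FS estimate: {history[-1]:.1f} (SE {se_fs:.1f}) after {len(administered)} items")
```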
Development,22 simulation,22 and use14,22,25,26,33 of the LEFS CAT have been described previously. Briefly, the item bank for the CAT was developed using items from the LEFS,13 a scale with strong psychometric properties13,34–36 and broad clinical and research acceptance.37 The LEFS has been widely used in clinical studies of patients with varied lower-extremity musculoskeletal dysfunction,38,39 involving hip,15,40–43 knee,14,25,44,45 and foot and ankle impairments.46–50 During routine administration in the clinic, the LEFS CAT items were presented by asking the participant, “Today, do you or would you have any difficulty at all with:” followed by activities such as “performing heavy activities around your home.” The computer selected items from the 18-item LEFS item bank and used the 5 original LEFS response categories: (1) “extreme difficulty,” (2) “quite a bit of difficulty,” (3) “moderate difficulty,” (4) “a little bit of difficulty,” and (5) “no difficulty.”
Unidimensionality and local independence of the 18-item bank from the LEFS were supported.22 The LEFS CAT items were fitted to the Andrich51 rating scale 1-parameter item response theory model. Item location parameters supported a clinically logical hierarchical structure of the item bank. The LEFS items demonstrated differential item functioning (DIF)52 by lower-extremity body part affected (hip, knee, or ankle and foot), where patients with different body part impairments had different probabilities of passing or performing a specific item or task even if they had similar FS estimates.53 Therefore, the LEFS CAT was developed with items calibrated using data from patients with specific body part impairments (ie, hip, knee, or ankle and foot), which makes the knee CAT a body part–specific or condition-specific CAT.14,22,25 Results from our previous MCII analysis14 indicated that 9 or more FS change units overall represented clinically meaningful improvement; 67% of patients with discharge data reported FS change equal to or greater than the MCII, and the MCII was dependent upon intake FS, with patients perceiving improvement with fewer FS units of change as intake FS scores increased.
Data Analysis
The MCII of an instrument can be determined via distribution-based or anchor-based methods.54,55 Distribution-based methods are based on the statistical characteristics of scores of the obtained sample, including the effect size, the standardized response mean, and the responsiveness statistic.56 Anchor-based methods use clinically relevant external criteria as anchors (or reference standards) for determining the MCII, including the patient GROC11,13,56,57 and the diagnostic test method.58 As suggested by de Vet et al4 and Hays et al,59 anchor-based methods estimate whether group change is big enough to be regarded as clinically important, and the concept of “minimal importance” is explicitly defined and incorporated into these methods.4 Therefore, anchor-based methods provide more clinically relevant results.
In this study, we used the GROC scale described by Jaeschke et al11 as the comparison standard. Participants were asked about their perceived improvement (ie, GROC) at discharge. Using receiver operating characteristic (ROC)60 analyses, we identified the MCII. We equated important change with a global rating score of ≥3. Participants were dichotomized by their GROC scores as those who did not improve (ie, GROC scores <3) versus those who improved (ie, GROC scores ≥3). We estimated MCII: (1) using all participants regardless of intake FS measure and (2) using participants grouped by sex, symptom acuity, age, and quartile of baseline FS scores. The area under the ROC curve (AUC), standard error, and 95% confidence interval were used to describe the ROC results. Generally, a random classifier has an area of 0.5, and an ideal classifier has an area of 1. The AUC values using ROC analysis are classified in several levels, where 0.50 to 0.75 is considered fair, 0.75 to 0.92 is good, 0.92 to 0.97 is very good, and 0.97 to 1.00 is excellent.61
Because equating important change with a different criterion may lead to different results, we also equated important change with a global rating score of ≥4 for exploratory purposes. In this case, participants were dichotomized by their GROC scores as those who did not improve (ie, GROC scores <4) versus those who improved (ie, GROC scores ≥4), and we compared these results with the results obtained using GROC scores of ≥3.
Because the ROC of a classifier shows its performance as a trade-off between sensitivity (true positive rate) and specificity (true negative rate), the MCII cut-score for each analysis was identified by selecting the FS change score with the largest (sensitivity + specificity)/2.
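As a concrete illustration of this anchor-based procedure, the sketch below (Python with scikit-learn) dichotomizes a GROC variable at +3, computes the AUC, and selects the FS change cut-score that maximizes (sensitivity + specificity)/2. The fs_change and groc arrays are synthetic placeholders generated for the example, not study data.

```python
# Minimal sketch of the anchor-based MCII procedure using synthetic data.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(1)
n = 1000
groc = rng.integers(-7, 8, size=n)                        # 15-point global rating of change
fs_change = 6 * (groc >= 3) + rng.normal(8, 10, size=n)   # FS change score (discharge - intake)

improved = (groc >= 3).astype(int)                        # responder definition: GROC >= +3

auc = roc_auc_score(improved, fs_change)
fpr, tpr, thresholds = roc_curve(improved, fs_change)

# MCII cut-score: FS change threshold with the largest (sensitivity + specificity) / 2,
# where sensitivity = tpr and specificity = 1 - fpr.
balanced = (tpr + (1 - fpr)) / 2
mcii = thresholds[np.argmax(balanced)]

print(f"AUC = {auc:.2f}, MCII cut-score = {mcii:.1f} FS units")
```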
For variables used in the ROC curve analysis, sex was categorized as male and female. Symptom acuity, which we operationally defined as the number of calendar days from the date of onset of the condition being treated in therapy to the date of initial therapy evaluation, was categorized as acute (<22 days), subacute (22–90 days), and chronic (>90 days). Age was categorized as 18 to 44, 45 to 64, and 65 years and older. All of these category definitions have been used in previous development and validation studies14–17,22–25,43,49,62–64 and have been shown to be good indicators when assessing known group construct validity.14–16
The above MCII cut-scores derived using ROC analyses represented point estimates on ROC curves. To perform statistical comparisons among groups, we took advantage of the bootstrapping approach, a statistical method for estimating the sampling distribution of an estimator by sampling with replacement from the original sample.65 For each variable (ie, sex categories, symptom acuity categories, age groups, and 4 quartiles of baseline FS scores, for a total of 12 groups), we randomly selected 80% of the original sample 30 times, generating 30 × 12 subsets of the original data. For each data set, we followed the same procedure and performed the ROC curve analysis. As a result, we obtained the sampling distribution of the MCII estimates (ie, mean with standard deviation). We then performed statistical comparisons of MCII estimates among sex, symptom acuity, age, and quartile of baseline FS scores using analysis of variance. To summarize the ROC analysis results, we provided descriptive statistics and plotted the distribution of FS change scores per GROC category by sex, symptom acuity, age, and quartile of baseline FS scores.
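The resampling and comparison steps might be sketched as follows. The function names (mcii_from_roc, subsample_mcii) and the group data are hypothetical, and the draws are 80% subsamples without replacement, mirroring the procedure described above rather than a classic with-replacement bootstrap; the sampling distributions of MCII estimates are then compared with a one-way analysis of variance.

```python
# Sketch of the resampling procedure: 30 random 80% subsamples per group,
# MCII re-estimated in each, followed by a one-way ANOVA across groups.
import numpy as np
from sklearn.metrics import roc_curve
from scipy import stats

def mcii_from_roc(fs_change, improved):
    """MCII cut-score: FS change threshold maximizing (sensitivity + specificity) / 2."""
    fpr, tpr, thresholds = roc_curve(improved, fs_change)
    return thresholds[np.argmax((tpr + (1 - fpr)) / 2)]

def subsample_mcii(fs_change, groc, n_draws=30, frac=0.8, seed=2):
    """Sampling distribution of MCII estimates from repeated 80% subsamples."""
    rng = np.random.default_rng(seed)
    improved = (groc >= 3).astype(int)
    n = len(fs_change)
    estimates = []
    for _ in range(n_draws):
        idx = rng.choice(n, size=int(frac * n), replace=False)  # 80% subsample
        estimates.append(mcii_from_roc(fs_change[idx], improved[idx]))
    return np.array(estimates)

# Example: compare MCII sampling distributions between two hypothetical groups
# (eg, male vs female) using a one-way ANOVA, as in the analysis described.
rng = np.random.default_rng(3)
groups = {}
for name, shift in [("male", 6.0), ("female", 0.0)]:
    groc = rng.integers(-7, 8, size=800)
    fs_change = shift + 6 * (groc >= 3) + rng.normal(8, 10, size=800)
    groups[name] = subsample_mcii(fs_change, groc)

f_stat, p_value = stats.f_oneway(*groups.values())
for name, est in groups.items():
    print(f"{name}: MCII mean {est.mean():.1f} (SD {est.std(ddof=1):.1f})")
print(f"ANOVA: F = {f_stat:.1f}, P = {p_value:.3g}")
```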
Results
Data from 6,651 patients with orthopedic knee impairments who received outpatient physical therapy in 332 outpatient clinics in 27 states (United States) were analyzed (Tab. 1). The participants' mean age was 53 years (SD=17, range=18–94). Fifty-six percent of the participants were female. Most of the participants reported their symptoms as chronic (41%) or subacute (27%) versus acute (23%) (missing values=9%). The mean intake FS value was 43 (SD=14). The mean discharge FS value was 61 (SD=17). On average, FS scores improved 18 points (SD=17) at discharge. Identification of medical or surgical diagnoses was optional in the data collection, but among the 75% of participants with medical or surgical codes, the most common conditions were postsurgical conditions (33%), soft tissue disorders (27%), arthropathies (4%), and sprains and strains (4%).
All participants had both GROC and FS change data. Of these, 124 (1.9%) reported moderate to severe deterioration (ie, GROC scores −7 to ≤−3), 135 (2.0%) reported mild deterioration (ie, GROC scores >−3 to ≤0), 1,054 (15.8%) reported mild improvement (ie, GROC scores >0 to ≤3), and 5,338 (80.3%) reported moderate to large improvement (ie, GROC scores >3 to 7). For the entire sample, ROC analyses (Tab. 2) supported that an FS change score of 12 or more units represented clinically meaningful improvement. Figure 1 displays the GROC (x-axis) and mean FS change score (y-axis) based on the entire sample (blue line) and related frequency count associated with each GROC category (green bar).
Sensitivity to Change Estimated Using Receiver Operating Characteristic Analyses for the Entire Sample and Intake Functional Status (FS) Scoresa
Relationships between global rating of change (GROC) and mean functional status (FS) change score (N=6,651). FS change score=discharge FS score − intake FS score, count=frequency count of the number of participants.
When participants were grouped by baseline FS measures and 4 ROC analyses were run (1 per quartile of FS intake measures), the ROC analyses (Tab. 2) supported that FS change scores of 14 or more units, 12 or more units, 5 or more units, and 5 or more units represented clinically meaningful improvement for participants in the first quartile (intake FS score 0–33 units), second quartile (intake FS score >33–42 units), third quartile (intake FS score >42–51 units), and fourth quartile (intake FS score >51–100 units) of FS intake measures, which are similar to previously reported values.14 The results suggested that the overall AUCs were good (range=0.75–0.79), and MCII was dependent upon intake FS score, with participants perceiving improvement with fewer FS units as intake FS scores increased.
When we equated important change with a global rating score of ≥4 for exploratory purposes (N=6,651), 95 participants (1.4%) reported moderate to severe deterioration (ie, GROC scores −7 to ≤−4), 164 (2.5%) reported mild deterioration (ie, GROC scores >−4 to ≤0), 2,032 (30.6%) reported mild improvement (ie, GROC scores >0 to ≤4), and 4,360 (65.6%) reported moderate to large improvement (ie, GROC scores >4 to 7). For the entire sample, ROC analyses supported that an FS change score of 12 or more units represented clinically meaningful improvement. When participants were grouped by baseline FS measures and 4 ROC analyses were run per quartile of FS intake measures, the ROC analyses supported that an FS change score of 17 or more units, 12 or more units, 8 or more units, and 5 or more units represented clinically meaningful improvement for participants in the first, second, third, and fourth quartiles of FS intake measures, which are similar to previous findings, with slightly greater values of MCII for the first and third quartiles of intake FS scores.
When participants were grouped by sex, the ROC analyses (Tab. 3) supported that FS change scores of 14 or more and 6 or more units represented clinically meaningful improvement for male and female participants, respectively. When patients were grouped by symptom acuity, the ROC analyses supported that FS change scores of 17 or more units, 6 or more units, and 9 or more units represented clinically meaningful improvement for participants with acute, subacute, and chronic symptoms, respectively. When participants were grouped by age group, the ROC analyses supported that FS change scores of 8 or more units, 12 or more units, and 5 or more units represented clinically meaningful improvement for participants who were aged 18 to 44, 45 to 64, and 65 years and older, respectively. Results of statistical comparisons supported that the MCII estimates differed significantly among groups (all P<.001) (Tab. 4).
Sensitivity to Change Estimated Using Receiver Operating Characteristic Analyses for Sex, Symptom Acuity, and Agea
Statistical Comparisons of the Minimal Clinically Important Improvement (MCII) Estimates Among Groupsa
Overall, the ROC analyses supported that participants who were male, were younger (<65 years of age), or had more-acute symptoms required more FS change to report meaningful change.
Figure 2 displays the GROC (x-axis) and mean FS change score (y-axis) using participants grouped by quartile of baseline FS scores. Figure 3 displays the GROC (x-axis) and mean FS change score (y-axis) using participants grouped by sex, symptom acuity, and age.
Global rating of change (GROC) of participants with different intake functional status (FS) scores (0–100 scale). FS change score=discharge FS score − intake FS score, Q1=quartile 1 intake FS score (0–33), Q2=quartile 2 intake FS score (>33–42), Q3=quartile 3 intake FS score (>42–51), Q4=quartile 4 intake FS score (>51). The GROC is a 15-point scale (−7 to +7). The vertical line indicates the point on the x-axis where GROC=+3.
Global rating of change (GROC) of participants with varied demographic characteristics (ie, sex, symptom acuity, and age). FS change score=discharge FS score − intake FS score. The GROC is a 15-point scale (−7 to +7). The vertical line indicates the point on the x-axis where GROC=+3.
Discussion
The purpose of our study was to examine whether MCII depends upon individuals' scores at admission and selected demographic characteristics, based on patient self-report of functional status from the LEFS using a CAT application. The results support the conclusions of previous studies12,14,18,21 that the MCII was dependent upon baseline state and some patient characteristics.
The results of this study were based on ROC analyses that identified MCII estimates determined by global rating scales reported by participants who provided the highest average sensitivity plus specificity for change. Therefore, the MCII estimates represent sensitivity to change indexes at the individual patient level. We did not calculate estimates of important group-level change, which would be expected to be smaller than the reported patient-level MCII estimates.66 Because MCII estimates were based on participants' judgments of whether the amount of improvement was meaningful to them, such estimates might be more clinically relevant than other indexes estimated from group-level analyses, such as effect size,67 standardized response mean,4 or Guyatt Responsiveness Index score.54 The results may assist researchers, clinicians, and policy makers in interpreting clinical versus statistical importance of specific studies or interpreting the change related to responders in clinical trials relevant to physical therapy.
There is debate regarding use of the GROC for determining MCII. First, different cut-scores of +2 to +3,56 +1 to +3,11 and +3 or greater14–16,68 have been used to represent minimal clinically important change on a 15-point Likert scale. In our study, we chose a cut-score of +3 or greater (+3=“somewhat better”) to define the MCII, based on findings in previous studies14–16 that supported this level of change as an adequate estimate of important improvement. When we performed a sensitivity analysis and equated important change with a global rating score of ≥4 for exploratory purposes, the same or slightly higher numbers of FS change units represented clinically meaningful improvement for some quartiles of baseline FS scores. The results were similar, but different levels of GROC yielded different MCII values, as expected. Therefore, researchers need to select and report the GROC level used so that results can be compared across studies. Second, the validity of the retrospective GROC approach has been criticized69 because the GROC relies on retrospective judgments of change experienced over weeks or months, so the ratings are subject to recall bias and may be influenced by status at discharge. Nonetheless, many researchers have proposed the retrospective GROC approach as one external anchor to capture a patient's perception of important improvement, along with ROC analyses as a valid sensitivity-to-change method.11,14–16,54,56,68 These considerations suggest that estimating clinically important change is complex and warrants multiple methods of estimation; future studies should therefore examine different ways of assessing important change.
The underlying reason for the observed MCII dependency phenomenon is still unknown. There are several possible explanations. First, when baseline scores (intake FS) are negatively correlated with change (FS change score), regression to the mean is common.70 Second, it is logical that patients with lower intake FS scores have the opportunity to improve more and patients with higher FS scores have less potential to change, which would imply the potential for differences in MCII related to intake FS scores. Third, there may be differences in perceptions of improvement when a patient moves from a dysfunctional state to a more functional state. Fourth, patients may equate important change to crossing over from a dysfunctional state to a functional state. In any event, further research on the relationship between intake FS scores and MCII is recommended.
There has been discussion regarding whether clinically important change is the same for improvement and deterioration.4 Given that only 11% of the sample had decreased functional status (ie, FS change score <0) and 2% experienced deterioration during the course of therapy (ie, GROC <0), as displayed in Figures 1, 2, and 3, it might not be easy to obtain precise estimates of change for deterioration due to a small sample size, but future studies should examine this issue.
Fortunately, our sample was relatively large, so we were able to adequately describe the distribution of FS change scores per GROC category. Prior to the analysis, we hypothesized that: (1) the mean FS change score at a GROC of +3 is a good predictor of the MCII threshold and (2) the mean FS change score would increase as the GROC increased. The ROC analysis supported that inferences made from the mean FS change score were similar to results based on the ROC curve analysis (ie, patients who were male, were younger, had more-acute symptoms, or had lower FS scores at admission required higher MCII thresholds) (Tabs. 2 and 3). As shown in Figures 1, 2, and 3, the relationship between the GROC and the mean FS change score was not a linear, monotonic curve. There appear to be 2 areas of the curves: a monotonically increasing side represented by positive GROC values and a variable side represented by negative GROC values. We do not have an explanation for these data, but 2 hypotheses might be that participants misinterpreted the 15-point Likert scale or that the 15-point categories represent too fine a scale for participants to discriminate among. In addition, in our data, small frequency counts for GROC scores of <0 hindered calculation of stable parameter estimates for deterioration. We adopted Jaeschke and colleagues'11 15-point GROC scale, but we did not ask the preparatory question of their approach. The results suggest that the original 2-step approach proposed by Jaeschke et al,11 in which patients first indicate whether they felt worse, about the same, or better and then complete the 7-point GROC scale, may be a solution to avoid this confusion. Future studies are recommended to clarify these relationships and hypotheses.
Results from our previous ROC analyses14 supported that 9 or more FS change units represented clinically meaningful improvement based on the entire sample, whereas our current study supported that 12 or more FS change units represented clinically meaningful improvement. The inconsistency of the results may be due to different study samples or the instability of ROC estimates. We suggest MCII estimates should be validated in large samples and cross-validated to check stability. In addition, although an MCII estimate based on a larger sample size should represent a more stable and accurate result, there are currently no recommendations as to the sample size sufficient to obtain a stable MCII estimate using the ROC curve analysis.
Because the MCII is essentially an estimate, researchers may argue that one value would be sufficient for the entire population. Nonetheless, MCID or MCII estimates have been used to define the “responders”6,9 or “practical significance” (versus statistical significance) of dependent variables such as FS change. If a single MCII value is used for an instrument where MCII is dependent upon intake FS or other variables, researchers may overestimate or underestimate the percentage of responders or overinterpret the practical significance of FS change scores, given samples of individuals with more-severe impairments or with higher levels of functioning. Therefore, our data suggest that multiple levels of MCII may be important for more-accurate analyses.
This study was based on a sample of patients who reported GROC at discharge. To investigate potential selection bias, we randomly selected another 6,651 patients with orthopedic knee impairments who completed the CAT between January 2005 and October 2009 but did not have GROC data. Because the sample size was large (N=13,302), we set the significance level at .001. Overall, patients who had GROC data, compared with patients who did not respond to the GROC item, were older (mean age=52.8 years [SD=17] versus 51.0 years [SD=17], t=6.2, df=12,182, P<.001). Groups were not different based on sex (χ2=0.4, df=1, P=.506), symptom acuity (χ2=5.1, df=2, P=.077), or intake FS scores (mean=43.0 [SD=14] versus 43.7 [SD=15], t=2.8, df=13,300, P=.006). These results suggested our analyses of intake FS, age (although the difference appeared to be of negligible clinical importance), sex, and symptom acuity data were not affected by patient selection bias.
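A hedged sketch of this kind of selection-bias check is shown below. All arrays and counts are hypothetical placeholders rather than study data, and a Welch (unequal-variance) t test is assumed for the age comparison.

```python
# Sketch of a selection-bias check: compare patients with and without GROC data
# on age, intake FS, and sex, using a stricter alpha for the large sample.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 6651
age_with, age_without = rng.normal(52.8, 17, n), rng.normal(51.0, 17, n)
fs_with, fs_without = rng.normal(43.0, 14, n), rng.normal(43.7, 15, n)
sex_table = np.array([[2900, 3751],    # with GROC: male, female (illustrative counts)
                      [2920, 3731]])   # without GROC

t_age, p_age = stats.ttest_ind(age_with, age_without, equal_var=False)  # Welch t test
t_fs, p_fs = stats.ttest_ind(fs_with, fs_without)
chi2, p_sex, dof, _ = stats.chi2_contingency(sex_table)

alpha = 0.001                          # stricter threshold given the large sample
print(f"age: t={t_age:.1f}, P={p_age:.3g} ({'differs' if p_age < alpha else 'ns'})")
print(f"intake FS: t={t_fs:.1f}, P={p_fs:.3g}")
print(f"sex: chi2={chi2:.1f}, df={dof}, P={p_sex:.3g}")
```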
Because this study was a secondary analysis of prospectively collected data via a proprietary database management company (ie, FOTO), the researchers were not in control of the data collection procedure and did not have a specific timetable for assessment of patients. Patients responded to the GROC question at discharge, after varied numbers of visits to the clinic. As a result, it was unclear how the length of treatment mediated the relationship between the independent and dependent variables, although our data showed a very low correlation between FS change score and treatment duration (Pearson r=.06), as well as between GROC score and treatment duration (Pearson r=−.06). Generalizability of results may be limited because there was the potential for patient selection bias related to which patients were asked to take the CAT. There may be differences between participating clinics compared with clinics that do not collect data using FOTO. However, use of proprietary database management companies offers the opportunity for studies related to practical application and assessment of psychometric properties of various measures in large samples that would not be available under routine, extramurally funded projects.
Conclusion
The results supported the baseline state and patient characteristics dependency of MCII estimates. Although a single MCII value may provide a standard cut-score defining the smallest meaningful change from the perspective of the patient for outcome measures, researchers and clinicians should be aware that the MCII is context specific and not a fixed attribute.
Estimates of MCIDs for deterioration appeared to represent a different pattern compared with improvement, warranting future studies.
Appendix.
Lower Extremity Functional Scale (LEFS) Computerized Adaptive Testing (CAT) Algorithm: An Example
Footnotes
- Dr Wang, Dr Hart, and Mr Stratford provided concept/idea/research design. Dr Wang and Dr Hart provided writing. Dr Hart and Mr Mioduski provided data collection. Dr Wang provided data analysis. Mr Mioduski provided project management and participants. Dr Hart and Mr Stratford provided consultation (including review of manuscript before submission).
- This project was approved by the Institutional Review Board for the Protection of Human Subjects of Focus On Therapeutic Outcomes, Inc.
- * Focus On Therapeutic Outcomes, Inc, PO Box 11444, Knoxville, TN 37939-1444.
- Received July 13, 2010.
- Accepted January 10, 2011.
- © 2011 American Physical Therapy Association