Abstract
Background. In the McKenzie system of mechanical diagnosis and therapy (MDT), a reliable system for the management of spinal problems, classifications are used to guide management strategies. For the classification of extremity disorders, interexaminer agreement has not been investigated with patients.
Objective. The study objective was to investigate interexaminer agreement for provisional MDT extremity classification with patients.
Design. This was a reliability study with examiner masking.
Methods. A therapist with an MDT credential observed the assessments made by 2 therapists with MDT diplomas, who successively performed MDT assessments for 33 patients with extremity pain on the same day. Immediately after each evaluation, all 3 therapists assigned the most appropriate MDT classification from 15 categories; they were unaware of each other's selection. The observed agreement and the Cohen kappa were calculated for the MDT classifications.
Results. The observed agreement for the 15 MDT categories of classification between the therapist with an MDT credential and the first therapist with an MDT diploma was 78.8%. The Cohen kappa was .72 (95% confidence interval=.54, .89), indicating good agreement. However, the observed agreement between the 2 therapists with MDT diplomas when the patient was assessed separately was 42.4%. The Cohen kappa was .21 (95% confidence interval=.01, .41), indicating poor agreement.
Limitations. Study limitations included convenience sampling of patients, the small number of examiners, and the limited extremity experience of the therapists with MDT diplomas.
Conclusions. Interexaminer agreement for provisional MDT extremity classification was good when the examiners were seeing the same patient concurrently but poor when the patient was seen successively. Further studies are needed to establish which factors, including study method, are responsible for the divergent results of the MDT assessments of extremity disorders.
The McKenzie system of mechanical diagnosis and therapy (MDT)1–3 is a conservative method for assessing, classifying, and treating musculoskeletal disorders. There is an increasing body of research about the application of MDT to extremity problems.4–11 A clinician applying MDT uses information from a detailed history and mechanical loading strategies, including testing of repeated movement and static loading, to determine a classification for guiding management strategies. A biopsychosocial perspective is applied in the clinical reasoning process.12
There are 6 primary classifications for extremity problems in MDT: derangement, articular dysfunction, contractile dysfunction, postural, spinal, and other. Features of the 6 classifications are detailed elsewhere.13,14 The “other” classification has subgroups, such as serious pathology, trauma, and soft tissue disease process. Each subgroup has specific criteria identified from features in the history and findings from the examination (Appendix). Once identified, each subgroup has a corresponding management strategy. A provisional classification is chosen at the initial session, and a final subgroup classification is determined through several sessions by confirming the provisional classification or choosing a correct classification.
Standardized training has been established for MDT; such training would be expected to facilitate interexaminer agreement for assigning a classification (classification agreement). The training is currently being conducted in 32 countries. There are 2 training levels: (1) therapist with an MDT credential, which confers a minimal level of knowledge and skills in the use of MDT, and (2) therapist with an MDT diploma, which includes 660 hours of training and confers an advanced level of knowledge and skills in the use of MDT.
Agreement in the classification of extremity disorders between a therapist with an MDT credential and a therapist with an MDT diploma has been reported (kappa=.78–.90).13–15 However, a qualitative study showed that most therapists with an MDT credential did not feel confident in using MDT for extremity problems.14 Reliability studies of the extremities so far have used patient vignettes.13–15 It is possible that the lack of confidence in using MDT for extremity disorders14 is associated with the difficulty of data collection through the interactive aspect of the assessments, which could only be identified with real patients. Furthermore, it is important to include therapists with diplomas in clinical research to fully understand MDT.16 Therefore, investigating classification agreement for therapists with MDT diplomas and real patients is justified. In the literature, reliability studies with real patients have had 2 methods: (1) simultaneous assessments, in which one examiner performs the assessment and the other observes,17–19 and (2) successive assessments, in which the examiners perform the assessments separately, usually with a short time period between the assessments.20–22
The purpose of this study was to investigate the agreement of MDT provisional classifications for extremity disorders with real patients during simultaneous and successive assessments. The hypothesis was that the agreement of MDT classifications would be acceptable under 2 conditions: (1) a therapist with an MDT credential and a therapist with an MDT diploma performing simultaneous assessments and (2) 2 therapists with MDT diplomas performing successive assessments.
Method
Design Overview
For this study, 2 of 3 therapists with MDT diplomas separately performed, on the same day, successive MDT assessments of real patients with musculoskeletal disorders in the extremities. A therapist with an MDT credential observed the assessments performed by the 2 therapists with MDT diplomas. Immediately after the assessment, each therapist with an MDT diploma chose the most appropriate MDT provisional classification, and a notation of the decision was kept in a sealed envelope. Each therapist with a diploma waited in a separate room and was not allowed to contact the other therapist with a diploma until both therapists with diplomas finished selecting the classification to maintain masking. The therapist with an MDT credential also chose a classification after the initial MDT assessment, and a notation of the decision was kept in a sealed envelope. The examiners were not allowed to discuss the classification with the other examiners until all of the examiners finished selecting the classification to maintain masking. The time between the first and second MDT assessments ranged from 15 minutes to 3 hours. A research assistant opened the envelopes and entered data in a spreadsheet to complete masking. Each patient and assessor provided written informed consent before data collection.
Patients
Convenience sampling via advertising in the local community was conducted from October 2014 to January 2015. Inclusion criteria were as follows: (1) age of 18 to 65 years, (2) pain in an extremity hindering daily activities, (3) no self-reporting of spinal pain or loss of spinal movement, and (4) diagnosis of musculoskeletal extremity disorders by an orthopedic surgeon. Patients were excluded in the following 2 circumstances: (1) symptomatic and mechanical responses to mechanical loading strategies had been changed so much by the first therapist with an MDT diploma that the second therapist with an MDT diploma could not detect the same symptomatic and mechanical responses to mechanical loading strategies and (2) patients provided to the second examiner diagnostic information found by the first examiner.
Examiners
This study included 3 therapists with MDT diplomas in Japan. All of these therapists see and treat patients with musculoskeletal pain and use MDT for all patients with extremity disorders, and all of them are MDT instructors. The author, a therapist with an MDT credential, observed all assessments of patients to identify any who should be excluded.
Measures
Demographic details, symptom duration, and pain intensity for patients were assessed with the 4-item pain intensity measure (P4)23 and the Medical Outcomes Study 36-Item Short-Form Health Survey, version 2. The P4 is a reliable and valid questionnaire with 4 numerical rating scales, each rated from 0 to 10, which are summed to a total score ranging from 0 (no pain) to 40 (highest possible pain level).24 The Medical Outcomes Study 36-Item Short-Form Health Survey, version 2, is an established measure of health status.25 Eight health status variables can be evaluated (physical function, role–physical, bodily pain, general health, vitality, social function, role–emotional, and mental health).26 A value of 50 is the Japanese standard value, and higher values indicate better health status.27 The assessors provided information about their experience with MDT.
The examiners assigned the most appropriate MDT classification from 15 categories, including derangement, articular dysfunction, contractile dysfunction, postural, spinal, and 10 others (serious pathology, trauma/recovering trauma, inflammatory, chronic pain syndrome, post-surgery, mechanically inconclusive, peripheral nerve entrapment, structurally compromised, soft tissue disease process, and vascular) (Appendix).28 Furthermore, the 2 therapists with MDT diplomas provided a provisional mechanical loading strategy to perform as home exercises to confirm or further explore the most appropriate MDT classification, that is, the final subgroup classification.
Data Analysis
Descriptive analysis was used. The Cohen kappa and observed agreement, expressed as a percentage, were calculated for MDT classification agreement between the therapist with an MDT credential and the first therapist with an MDT diploma and between the 2 therapists with MDT diplomas. The Cohen kappa was the primary outcome measure, and observed agreement was the secondary outcome measure. Evaluation criteria for the kappa value were as follows: ≤.40=poor, .41 to .60=moderate, .61 to .80=good, and .81 to 1.00=very good.29 IBM SPSS version 21.0 (IBM Corp, Armonk, New York) was used for statistical analysis. The level of statistical significance was set at 5%.
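As an illustration of the two agreement statistics named above, the following is a minimal Python sketch of observed agreement and the unweighted Cohen kappa, together with the interpretation bands stated in this section. It is not the SPSS procedure used in the study; the function names and the 10-patient ratings are hypothetical and are not study data.

```python
from collections import Counter


def observed_agreement(rater_a, rater_b):
    """Proportion of patients given the same classification by both raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    p_o = observed_agreement(rater_a, rater_b)
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    # Chance agreement: sum over categories of the product of the
    # two raters' marginal proportions for that category.
    categories = count_a.keys() | count_b.keys()
    p_e = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_o - p_e) / (1.0 - p_e)


def interpret_kappa(kappa):
    """Bands from the Data Analysis section; values between the printed
    band edges (for example, .405) are assigned to the next band up."""
    if kappa <= 0.40:
        return "poor"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Hypothetical ratings for 10 patients (NOT study data).
first = ["derangement"] * 4 + ["spinal"] * 3 + ["other"] * 3
second = ["derangement", "derangement", "derangement", "spinal",
          "spinal", "spinal", "derangement",
          "other", "other", "derangement"]

p_o = observed_agreement(first, second)
kappa = cohen_kappa(first, second)
print(f"observed agreement = {p_o:.1%}, kappa = {kappa:.2f} ({interpret_kappa(kappa)})")
```

In this hypothetical example the raters agree on 7 of 10 patients, and chance agreement is .35, so kappa is (.70 − .35)/(1 − .35), or about .54. The gap between raw agreement and kappa shows why kappa, rather than observed agreement, was treated as the primary outcome measure.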
Sample size estimation was conducted with PASS 14 Power Analysis and Sample Size Software 2015 (NCSS, LLC, Kaysville, Utah). The estimated kappa value was set at .78 on the basis of a previous study,14 and the prevalence of each subgroup was extracted from a previous study,10 in which the proportion of people with the “spinal” classification was arbitrarily set at 30%. In a test for agreement between 2 raters with the Cohen kappa, a sample size of 31 patients was necessary to detect a true kappa value of .78 (α=.05; β=.1). Consequently, 33 patients were included. This number was chosen to account for any patients who would be excluded when presentations classified as derangement were fully resolved by the first therapist with an MDT diploma or when the symptomatic and mechanical responses to mechanical loading testing were changed so much that the second therapist with an MDT diploma could no longer detect the same symptomatic and mechanical responses to mechanical loading testing.
Role of the Funding Source
This study was supported by a Saitama Prefectural University Research Grant.
Results
Thirty-three patients were included, and none were excluded (Tab. 1). Demographic information, including symptom information, is summarized in Table 1. Information about each assessor is presented in Table 2.
Table 1. Demographics of 33 Patients With Extremity Problems
Table 2. Information About Therapists
The observed agreement for the 15 MDT categories of provisional classification between the therapist with an MDT credential and the first therapist with an MDT diploma was 78.8%. The Cohen kappa was .72 (95% confidence interval [CI]=.54, .89), indicating good agreement. The observed agreement between the 2 therapists with MDT diplomas was 42.4% when successive assessments were used. The Cohen kappa was .21 (95% CI=.01, .41), indicating poor agreement. The Cohen kappa and observed agreement between the examiners are shown in Table 3. The classifications assigned by the therapist with an MDT credential after the first observation and by each therapist with an MDT diploma are shown in Table 4. The most common discrepancies in subgroups chosen by the 2 therapists with MDT diplomas involved the “derangement” and “spinal” classifications (5 patients) and the “derangement” and “mechanically inconclusive” classifications (4 patients). In 7 of the 9 patients, the “spinal” or “mechanically inconclusive” classification was selected by the first therapist with an MDT diploma, and the “derangement” classification was selected by the second therapist with an MDT diploma. Ten patients (30%) were categorized into the “mechanically inconclusive” classification by at least one therapist with an MDT diploma. For 6 of the 10 patients, the provisional loading strategies used by both therapists with MDT diplomas were the same.
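The reported percentages can be related back to whole numbers of patients with simple arithmetic. The sketch below assumes only the sample size of 33 and the two percentages given above; the helper name `implied_count` and the dictionary keys are hypothetical, and the resulting counts are implied by the rounding rather than stated in the article.

```python
N_PATIENTS = 33  # sample size reported in the Results


def implied_count(percent, n=N_PATIENTS, tolerance=0.05):
    """Find the whole-number count k such that 100*k/n rounds to `percent`
    at one decimal place."""
    matches = [k for k in range(n + 1)
               if abs(100.0 * k / n - percent) < tolerance]
    if len(matches) != 1:
        raise ValueError(f"no unique count reproduces {percent}% of {n}")
    return matches[0]


# Observed agreement values reported for the 15-category analysis.
reported = {
    "credential vs first diploma": 78.8,
    "diploma vs diploma": 42.4,
}

counts = {pair: implied_count(p) for pair, p in reported.items()}
for pair, k in counts.items():
    print(f"{pair}: {k}/{N_PATIENTS} = {100.0 * k / N_PATIENTS:.1f}%")
```

Under these assumptions, 78.8% corresponds to agreement on 26 of 33 patients and 42.4% to agreement on 14 of 33 patients.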
Table 3. Cohen Kappa and Observed Agreement Between Examiners for 15 MDT Categories of Provisional Classification
Table 4. Subgroups Chosen by Therapists After First Observation
Post hoc, the “other” subgroups were consolidated into one category of classification, and the analysis was replicated. Consequently, the observed agreement between the therapist with an MDT credential and the first therapist with an MDT diploma was 84.8%. The Cohen kappa was .78 (95% CI=.60, .95), indicating good agreement. However, the observed agreement between the 2 therapists with MDT diplomas was 48.5%. The Cohen kappa was .26 (95% CI=.02, .49), indicating poor agreement.
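The consolidation step can be sketched as a relabeling applied to both raters' classifications before agreement is recalculated. The following Python sketch assumes the 5 primary classifications and the 10 "other" subgroups listed in the Measures section; the function names and the 4-patient example are hypothetical and are not study data.

```python
PRIMARY = (
    "derangement", "articular dysfunction", "contractile dysfunction",
    "postural", "spinal",
)

OTHER_SUBGROUPS = (
    "serious pathology", "trauma/recovering trauma", "inflammatory",
    "chronic pain syndrome", "post-surgery", "mechanically inconclusive",
    "peripheral nerve entrapment", "structurally compromised",
    "soft tissue disease process", "vascular",
)


def consolidate(label):
    """Map any of the 10 'other' subgroups to the single category 'other'."""
    if label in OTHER_SUBGROUPS or label == "other":
        return "other"
    if label in PRIMARY:
        return label
    raise ValueError(f"unknown classification: {label!r}")


def observed_agreement(rater_a, rater_b):
    """Proportion of patients given the same classification by both raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


# Hypothetical ratings for 4 patients (NOT study data).
rater_a = ["derangement", "mechanically inconclusive",
           "trauma/recovering trauma", "spinal"]
rater_b = ["derangement", "inflammatory",
           "trauma/recovering trauma", "derangement"]

before = observed_agreement(rater_a, rater_b)
after = observed_agreement([consolidate(x) for x in rater_a],
                           [consolidate(x) for x in rater_b])
print(f"15 categories: {before:.0%}; 6 categories: {after:.0%}")
```

Because merging categories can turn a disagreement between two "other" subgroups into an agreement but can never do the reverse, observed agreement after consolidation is always at least as high as before. This is consistent with the modest increases reported for both examiner pairs.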
Discussion
To my knowledge, this is the first study assessing the agreement of MDT classifications for extremity disorders with real patients. The interexaminer agreement for provisional MDT classifications was good during simultaneous assessments, as in previous studies in which patient vignettes were used.13–15 This finding suggests that trained MDT practitioners have achieved standardized clinical reasoning skills through the MDT educational curriculum and can reliably assign a classification for musculoskeletal extremity disorders.
The finding of poor interexaminer agreement for provisional MDT classifications during successive assessments could be a reflection of certain features of MDT and suggests the need for careful interpretation in future studies of MDT classification agreement. Common patterns of classification discrepancies between the therapists with MDT diplomas were identified. The most frequent combination of classifications that accounted for poor agreement involved the “derangement” and “mechanically inconclusive” classifications. Compared with previous data,10 it appears that the prevalence of the “mechanically inconclusive” classification has increased. Ten patients (30%) were categorized into the “mechanically inconclusive” classification by at least one therapist with an MDT diploma; this percentage is considerably higher than that reported for a final subgroup classification in a larger cohort.10 For 6 of the 10 patients, the provisional loading strategies used by both therapists with MDT diplomas were the same. Examiners might have detected enough clear symptomatic responses, mechanical responses, or both at a subsequent session to identify a final subgroup classification, except for the “mechanically inconclusive” classification. Therefore, it is possible that the interexaminer agreement for a final subgroup classification may be enhanced in comparison with the agreement for the provisional subgroup classification reported in the present study.
Another common pattern of classification discrepancy involved the “derangement” and “spinal” classifications. Isolated extremity symptoms have responded to mechanical loading of the spine in some situations. Menon and May30 reported that for a patient with only shoulder pain, treatment with purely mechanical loading of the cervical spine led to complete resolution. Hirokado and Hashimoto31 reported that 67.2% of patients with a medical diagnosis of painful knee osteoarthritis and no symptoms in the lumbar spine had reduction of their knee symptoms by mechanical loading of the lumbar spine. In MDT, spinal contributions to extremity problems must be excluded before the extremities are tested. However, examiners can use discretion in deciding when spinal contributions to extremity problems have been adequately excluded.
Importantly, in MDT, the classification of a subgroup at the initial session is provisional and can be changed on the basis of a patient's symptomatic responses, mechanical responses, or both to a loading strategy prescribed as home exercises at the initial session. Abady et al32 reported that an alteration of subgroup classifications occurred in 36.6% of patients with shoulder pain over the course of MDT treatment in a cohort study. In MDT, it is important to achieve the most effective management strategy for each patient even if several sessions are required. For a full understanding of the interexaminer MDT classification agreement reflecting clinical practice, a robust methodology for investigating the interexaminer agreement of the final subgroup classification is needed.
At least 4 possible explanations should be considered for the poor interexaminer classification agreement during successive assessments by the therapists with MDT diplomas. The first is associated with the nature of successive assessments. The MDT assessment involves the use of repeated or sustained loading strategies to elicit rapid and lasting symptomatic and mechanical changes within a session.1–3 In the present study, the therapist with an MDT credential confirmed, through observation, that no symptomatic and mechanical responses to mechanical loading strategies had been changed so much by the first therapist with an MDT diploma that the second therapist with an MDT diploma could no longer detect the same symptomatic and mechanical responses to mechanical loading strategies. However, it is possible that minimal variations in the patient's response influenced the examiners' decision making regarding physical assessments and classifications. The “spinal” or “mechanically inconclusive” classification was selected by the first therapist with an MDT diploma, and the “derangement” classification was selected by the second therapist with an MDT diploma for 7 of 9 patients; however, a larger sample size is needed to determine whether such a high prevalence occurred by chance. With regard to the design of reliability studies, there is an ongoing debate about the issue of a change in a patient's presentation during repeated physical assessments and the effect on an examiner's interpretation of symptoms.19,33
The second possible explanation is associated with symptom duration. In the present study, most of the patients (78.8%) had prolonged symptoms. Clinically, it often takes time for patients with prolonged symptoms to have apparent symptomatic and mechanical responses to mechanical loading strategies. Therefore, it is possible that the low interexaminer agreement was due, in some degree, to a high prevalence of patients with prolonged symptoms.
The third possible explanation is caseload mix. The therapists with MDT diplomas had predominantly treated patients with spinal problems in the preceding 3 months; the proportion of patients with extremity problems ranged from 5% to 30% (Tab. 2). The most common extremity disorders in the study patients were those affecting the shoulder and knee, making up more than 50% of the disorders seen. The average proportions of patients with shoulder and knee problems seen in the preceding 3 months by the therapists with MDT diplomas were 6% and 9%, respectively. One of the therapists had seen no patients with shoulder problems in the preceding 3 months. Substantial training is necessary to use MDT22; therefore, it is possible that the low interexaminer MDT classification agreement was due, in some degree, to a lack of experience in the treatment of extremity problems.
The fourth possible explanation may simply reflect the difficulty, even for therapists with MDT diplomas, of data collection through the interactive aspect of the assessments, which could only be identified in a study with real patients. If it is difficult even for a therapist with an MDT diploma to collect adequate data through the assessments, then it is not surprising that most therapists with MDT credentials lack confidence in using MDT with extremity disorders.14 Perhaps further training in physical assessments other than MDT, including hands-on evaluations, would enhance MDT assessment skills and result in better MDT classification agreement. This hypothesis will need to be investigated by comparing the classification skills of practitioners studying MDT only and those of practitioners studying MDT and other forms of manual therapy.
The limitations of the present study include the convenience sampling of patients, the small number of therapists practicing MDT, and the therapists' lack of experience in treating extremity problems. The prevalence of the “articular dysfunction” and “contractile dysfunction” classifications (3%–4%) in the present study was lower than those reported in larger cohorts,10,34 although the prevalence of the “derangement” classification (27%–42%) was similar to that reported in larger cohorts.10,34 The lack of experience of the therapists with MDT diplomas in treating extremity problems limits the generalizability of the results to therapists who see higher proportions of patients with extremity disorders, despite similar levels of training. The point estimate of the Cohen kappa for the 15 MDT categories of provisional classification between the therapist with an MDT credential and the first therapist with an MDT diploma was good, but the lower limit was .54, indicating moderate agreement. Further studies with different and larger cohorts and examiners are needed to further generalize the agreement of provisional MDT classifications at the initial assessment.
In conclusion, in the present study, the interexaminer agreement for provisional MDT classifications of musculoskeletal extremity disorders was good with concurrent patient assessments but poor with successive assessments of real patients. Further studies are needed to establish which factors, including study method, were responsible for the divergent results for MDT assessments of extremity disorders.
Appendix.
McKenzie Classification: Extremity OTHERa
a Reprinted with permission from The McKenzie Institute International, PO Box 2026, Raumati Beach 5255, New Zealand. Copyright 2015, The McKenzie Institute International. RA=rheumatoid arthritis, OA=osteoarthritis, ROM=range of motion.
Footnotes
The author acknowledges The McKenzie Institute International, Japanese Branch, for assistance with this study and Dr Toby Hall and Mr Richard Rosedale for reviewing the manuscript before submission.
Ethical approval for the study was gained from the Human Medical Ethics Committee in Saitama Prefectural University.
- Received November 13, 2015.
- Accepted April 14, 2016.
- © 2016 American Physical Therapy Association