Abstract
Background The Patient-Rated Tennis Elbow Evaluation (PRTEE) questionnaire is a tool designed for self-assessment of forearm pain and disability in patients with lateral elbow tendinopathy (LET). However, an Italian version of this questionnaire has not been available.
Objective The aims of this study were: (1) to translate and cross-culturally adapt the PRTEE questionnaire into Italian and (2) to evaluate its measurement properties.
Design This was a longitudinal, observational measurement study.
Methods The PRTEE questionnaire was cross-culturally adapted to Italian according to established guidelines. Ninety-five individuals (41 women, 54 men) with unilateral, imaging-confirmed, chronic LET were selected consecutively to assess the measurement properties of the PRTEE questionnaire. Internal consistency, test-retest reliability, construct validity, and responsiveness were estimated.
Results The Italian version of the PRTEE displayed a high degree of internal consistency, with a Cronbach alpha of .95. The test-retest reliability was high for both short-term and medium-term, with intraclass correlation coefficients (2,1) of .95 and .93, respectively. The PRTEE exhibited a strong correlation (r=.77–.91, P<.0001) with the Disabilities of the Arm, Shoulder and Hand (DASH) at the baseline and a moderate correlation (r=.58–.74, P<.0001) at discharge. The responsiveness was higher for the PRTEE than for the DASH.
Limitations A methodological limitation of the study is that due to the small sample size, a factor analysis was not performed to assess convergent validity.
Conclusions The Italian version of the PRTEE questionnaire is internally consistent, demonstrates expected correlations with other measures, and is more responsive than the DASH in Italian patients with chronic LET.
Chronic lateral elbow tendinopathy (LET) caused by a failed healing response of the tendon of the extensor carpi radialis brevis muscle1 is a common cause of arm pain in sporting and working populations. The importance of monitoring the effectiveness of treatment is widely recognized, as is the need for evidence-based health care. Several instruments have been developed to determine the outcome of elbow conditions.2–9 However, the success or failure of treatment for LET is open to interpretation, given the lack of consensus on how to measure treatment outcome in a standardized fashion.3,4
Before being used in different regions of the world, outcome measures need to be translated, culturally adapted, and retested to ensure the validity of the revised instruments.10–12 The cross-cultural adaptation guidelines described by Guillemin et al10 are widely accepted and used for the translation and adaptation of outcome measures. The term “cross-cultural adaptation” is used to describe a process that takes into account both language (translation) and cultural adaptation issues in the preparation of an outcome measure for use in another setting.10
An Italian version of a pain and functional status questionnaire for people with chronic LET has not been available. The aims of this study, therefore, were: (1) to perform a cross-cultural adaptation of the original English version of Patient-Rated Tennis Elbow Evaluation (PRTEE) questionnaire into Italian and (2) to evaluate the measurement properties of the Italian version of the PRTEE in patients with imaging-confirmed chronic LET.
Materials and Method
The data were collected between March 2005 and September 2008 at our outpatient rehabilitation center at the Department of Physical Medicine and Rehabilitation, School of Medicine, “La Sapienza” University of Rome. Informed consent was obtained from all patients prior to participation in the study.
The Cross-Cultural Adaptation Process
Guidelines developed by Guillemin and colleagues10,11 and Beaton et al12 were used for the cross-cultural adaptation and validation of the Italian version of the PRTEE. The English version of the PRTEE6 was independently translated into Italian by 2 non–medical professional translators and one physician whose native language was Italian. The 3 different Italian translations were analyzed by a health care committee (2 physiatrists, 2 epidemiologists, 1 orthopedist, 1 physical therapist), who first ensured that the translations took Italian cultural characteristics into consideration, and then selected a consensus version (version 1) of these translations. Discrepancies were resolved by consensus to achieve conceptual equivalence. This consensus version was translated back into English by 2 other non–medical professional translators whose native language was English. Neither of these translators was aware of the concepts being investigated or had a medical background. At the end of this phase, a new consensus version (version 2) was obtained and, when compared with the original version of the PRTEE,6 was found to be semantically and grammatically equivalent.
At this stage, a meeting was held with the health care committee to finalize the Italian version of the PRTEE. After the committee had confirmed the equivalence of the original PRTEE and the Italian version, we commenced a pilot test on 10 patients (5 women, 5 men; mean age=40.2 years, range=18–72) with chronic LET and on 10 sex- and age-matched individuals who were healthy. The main aim of this phase was to determine whether the participants understood the questions. After they had completed the questionnaire, each participant was asked whether there were any sentences that were difficult to understand. The participants were asked what they thought each question meant. The meaning of the items and tasks and the selected response were discussed. This process ensured that the pre-final version retained adequate equivalence in purpose. All of the questions were considered to be easy to understand by all of the participants who filled out the questionnaire. The reliability, validity, and responsiveness of the final Italian version of the PRTEE (eAppendix) then were evaluated by means of psychometric tests.
Participants
Ninety-five people (41 women, 54 men) with unilateral (66 right, 29 left), imaging-confirmed, chronic LET were consecutively enrolled for the purposes of this study. The participants' mean age was 38.8 years (SD=15.7, range=18–75). At the beginning of the study, the mean elapsed time since onset of LET was 23 months (SD=9, range=8–43).
The inclusion criteria were: clinical diagnosis of chronic LET (ie, persistent or recurrent local pain and muscle weakness that did not respond to conservative measures), confirmed by an imaging evaluation (eg, ultrasound, magnetic resonance imaging), and a pain score of ≥3 cm on a visual analog scale, induced by 2 or more of the following tests: (1) palpation of the lateral epicondyle, (2) resisted wrist extension (Thomsen test), (3) resisted extension of the middle finger, and (4) a chair test, in which the participant was asked to lift a 3.5-kg chair. The exclusion criteria were: age below 18 years; inflammatory or neoplastic disorders; concomitant pathologies in the shoulder or wrist; cervical radiculopathy or thoracic outlet syndrome; history of fracture or dislocation at the elbow; history of elbow surgery; treatment with corticosteroid injections in the previous 6 months; and inability to complete a questionnaire due to cognitive impairment or language difficulties.
Questionnaires
PRTEE.
The PRTEE questionnaire,6 which is an updated version of the Patient-Rated Forearm Evaluation Questionnaire (PRFEQ),7,13 is a 15-item questionnaire specifically designed for patients with LET. The items investigate pain (5 items) and the degree of difficulty in performing various activities (6 specific and 4 usual activity items) due to the elbow problem over the preceding week. Each item is rated on a single numeric scale from 0 to 10 (for the activity items, 0=no difficulty, 10=unable to perform). The scores for the various items are used to calculate an overall scale score ranging from 0 (best score) to 100 (worst score). The worst score is 100 points, and not 150 points, because the specific and usual activity item scores are first summed and then divided by 2, which means they account for a maximum of 50 points, as opposed to 100 points, in the final score. The PRTEE questionnaire, which provides a very quick (it takes 5 minutes to complete), easy, and standardized quantitative description of pain and functional disability in patients with LET, was recently validated as a reliable means of assessing LET.14
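The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the official scoring software; the function name `prtee_total` and the example responses are hypothetical.

```python
def prtee_total(pain, specific, usual):
    """Return (pain subscale, function subscale, total) for one PRTEE form.

    pain: 5 item scores; specific: 6 item scores; usual: 4 item scores.
    Each item score is a number from 0 to 10.
    """
    if (len(pain), len(specific), len(usual)) != (5, 6, 4):
        raise ValueError("expected 5 pain, 6 specific, and 4 usual activity items")
    for item in (*pain, *specific, *usual):
        if not 0 <= item <= 10:
            raise ValueError("item scores must lie between 0 and 10")
    pain_score = sum(pain)                             # ranges from 0 to 50
    function_score = (sum(specific) + sum(usual)) / 2  # 0-100 halved to 0-50
    return pain_score, function_score, pain_score + function_score


# Hypothetical responses, for illustration only.
print(prtee_total([4, 6, 7, 3, 5], [5, 6, 4, 7, 5, 3], [4, 5, 6, 5]))
# → (25, 25.0, 50.0)
```

Halving the 10 activity items gives pain and function equal weight (50 points each) in the total.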
Disabilities of the Arm, Shoulder and Hand (DASH).
The DASH questionnaire is an upper-extremity–specific outcome measure,8 which has been shown to be reliable and valid in people with elbow disorders.15 The core of the DASH (part B) is a 30-item disability/symptom scale concerning an individual's upper extremity during the preceding week. The items investigate the degree of difficulty in performing different physical activities because of arm, shoulder, or hand problems (items 1–21); the severity of each of the symptoms of pain, activity-related pain, tingling, weakness, and stiffness (items 24–28); and the effect the symptoms have on social activities, work, and sleep (items 22, 23, and 29) and their psychological impact (item 30). Each item has 5 response options, ranging from “no difficulty or no symptom” to “unable to perform activity or very severe symptom,” and is scored on a 5-point scale. The scores for all the items are used to calculate a scale score ranging from 0 (no disability) to 100 (severest disability). For the purposes of this study, we used the cross-culturally adapted and validated Italian version of the DASH.16
Global rating of change.
At the discharge assessment, 6 weeks after initial assessment, the physician and the participant independently completed a 7-point global rating of change form. “How is the patient today compared with his/her first visit?” and “How are you today compared with your first visit?” were the questions answered by the physiatrist and the patient, respectively. The participant and physiatrist were unaware of each other's responses. The 7 response options were: (1) “very much worse,” (2) “much worse,” (3) “little worse,” (4) “no change,” (5) “little improved,” (6) “much improved,” and (7) “very much improved.” The physician's and the participant's global rating of change scores were averaged to give an overall change score, which was used in this study as the criterion standard of change. This measure of change was used as our external criterion, in the absence of a “gold standard,” for the evaluation of responsiveness.17,18 For this purpose, we chose global rating of change scores of 3 or lower to classify a worsened participant, a score of 4 to classify a stable participant, and scores of 5 or higher to classify an improved participant.
Procedure
At baseline, participants were asked to complete the PRTEE and the DASH questionnaires together in a comfortable room. All participants then underwent the same shock-wave treatment. Because this was not an intervention study, the shock-wave treatment is summarized briefly: the shock-wave treatment was provided by a radial shock-wave generator. Radial shock-wave therapy was administered in 4 sessions, at the rate of 1 session per week. At each session, 2,500 shocks with a pressure of 4 bars (equal to an energy flux density of approximately 0.18 mJ/mm2) and a frequency of 8 shocks per second were applied. Upon discharge at the end of the shock-wave treatment, 6 weeks after the first administration of the PRTEE and DASH questionnaires, participants were asked to complete these questionnaires again.
Data Analysis
Parametric tests were used after a Kolmogorov-Smirnov test had confirmed that the data were normally distributed. The level of statistical significance was set at P<.05. All analyses were conducted using MedCalc, version 11.1.1.0 for Windows (MedCalc Software, Mariakerke, Belgium), GraphPad InStat, version 3.05 for Windows (GraphPad Software Inc, San Diego, California), and STATA software, version 8.2 (Stata Corp, College Station, Texas).
Psychometric Properties
Reliability.
“Reliability” is a generic term used to indicate both the homogeneity (internal consistency) of a scale and the reproducibility (test-retest reliability) of scores.17
Internal consistency of the PRTEE was assessed using Cronbach alpha with 95% confidence intervals (95% CIs), using the data from the baseline questionnaire, and was considered acceptable when Cronbach alpha exceeded .70.19
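For readers who wish to reproduce this type of analysis, the following sketch computes Cronbach alpha from an item-by-respondent matrix, together with the alpha obtained when each item is removed in turn. It uses the standard textbook formula and sample variances; the function names and the data are hypothetical, and the study itself used commercial statistical packages.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent, all of equal length."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(rows):
    """Alpha recomputed with each item left out in turn (needs >= 3 items)."""
    k = len(rows[0])
    return [cronbach_alpha([row[:i] + row[i + 1:] for row in rows])
            for i in range(k)]


# Hypothetical scores for 4 respondents on 3 items, for illustration only.
data = [[2, 3, 3], [4, 4, 5], [6, 5, 6], [8, 9, 8]]
print(round(cronbach_alpha(data), 3))
print([round(a, 3) for a in alpha_if_item_deleted(data)])
```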
To assess the test-retest reliability of the PRTEE, the intraclass correlation coefficient (ICC) and 95% CIs were calculated on the basis of a 2-way random-effects analysis of variance.17,20,21 Additionally, the standard error of measurement (SEM) was calculated.
To assess the short-term (first test-retest) reproducibility of the PRTEE, the participants were asked to complete this questionnaire again 3 days after the first administration at baseline. To minimize the risk of short-term clinical changes, participants did not receive any treatment during this 3-day interval. An interval of 3 days was chosen for 3 reasons: (1) it minimized the time elapsed between enrollment and the start of the rehabilitation program, (2) participants were unlikely to remember what they had answered 3 days before, and (3) in the absence of any intervention, it was assumed that the participants' clinical situation would remain stable over 3 days.
To assess the reproducibility of the PRTEE in an interval that more closely resembles that used for evaluating individuals in a clinical study, the participants who were classified as stable (global rating of change score of 4) were asked to complete the PRTEE at the end of the shock-wave treatment, 6 weeks after the first administration (second retest). Based on global rating of change scores, this second retest was performed by only 38 participants.
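The ICC (2,1) used for both retests can be illustrated with the following sketch, which derives the mean squares of a 2-way analysis of variance and applies the Shrout and Fleiss single-measure formula. The function name `icc_2_1` and the example scores are hypothetical; confidence intervals are omitted for brevity.

```python
def icc_2_1(scores):
    """ICC(2,1): 2-way random effects, absolute agreement, single measure.

    scores: one row per participant, one column per administration.
    """
    n = len(scores)
    k = len(scores[0])
    grand = sum(map(sum, scores)) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    bms = ss_rows / (n - 1)               # between-participants mean square
    jms = ss_cols / (k - 1)               # between-administrations mean square
    ems = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Hypothetical test and retest totals for 5 participants.
pairs = [[42, 44], [61, 58], [35, 37], [70, 73], [52, 50]]
print(round(icc_2_1(pairs), 3))
```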
Construct validity.
Construct validity was tested by determining the relationship between the PRTEE questionnaire scores and the scores of the DASH questionnaire at both the baseline and discharge assessments. Pearson correlation coefficients (r values) with 95% CIs were calculated to examine the construct validity. The r values were interpreted as follows: .00 to .19=very weak correlation, .20 to .39=weak correlation, .40 to .69=moderate correlation, .70 to .89=strong correlation, and .90 to 1.00=very strong correlation.22 Because the correlation results in the German version of the PRTEE14 are given only as a coefficient of determination (r2), we also calculated this coefficient to make a direct comparison of our results with those of the German version.
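A short sketch of this step follows: it computes a Pearson coefficient, attaches a 95% CI, and maps the coefficient onto the interpretation bands given above. The CI uses the Fisher z transformation, which is an assumption here because the text does not state how the intervals were obtained; all function names are hypothetical.

```python
import math


def pearson_r(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for r via the Fisher z transformation (assumed)."""
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


def interpret(r):
    """Bands as listed in the text, applied to the absolute value of r."""
    a = abs(r)
    if a < 0.20:
        return "very weak"
    if a < 0.40:
        return "weak"
    if a < 0.70:
        return "moderate"
    if a < 0.90:
        return "strong"
    return "very strong"


r = 0.84  # example coefficient
print(interpret(r), [round(v, 2) for v in fisher_ci(r, 95)], round(r ** 2, 2))
```

The last value printed is the coefficient of determination, which is simply the square of r.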
Responsiveness.
Floor and ceiling effects are considered important for the analysis of responsiveness because they indicate limits to the range of detectable change. Floor and ceiling effects were determined by calculating the number of participants who had the best or worst scores possible at both the baseline and discharge assessments in all of the questionnaires. This number indicates the proportion of patients whose condition could not significantly improve or deteriorate because they were already at one end of the range. Floor or ceiling effects are considered to be present if more than 15% of respondents achieve the lowest or highest possible score, respectively.23
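The 15% rule can be checked as in the following sketch. The function name `floor_ceiling` is hypothetical; the score limits of 0 and 100 match the scale ranges of the PRTEE and DASH described earlier.

```python
def floor_ceiling(scores, lowest=0, highest=100, threshold=0.15):
    """Share of respondents at each end of the scale and the 15% flags."""
    n = len(scores)
    floor_share = sum(s == lowest for s in scores) / n
    ceiling_share = sum(s == highest for s in scores) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        "floor_effect": floor_share > threshold,
        "ceiling_effect": ceiling_share > threshold,
    }


# Hypothetical total scores, for illustration only.
print(floor_ceiling([0, 0, 12, 35, 48, 51, 60, 72, 88, 95]))
```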
Although there is no consensus on the most suitable statistical analysis to assess responsiveness, we decided to use 3 distribution-based methods to assess the responsiveness of the PRTEE and DASH questionnaires—the effect size (ES),24 the standardized response mean (SRM),25 and the Guyatt responsiveness ratio (GRR)26—together with an anchor-based method (ie, the receiver operating characteristic [ROC] curve).17,18 Changes in the PRTEE and DASH measurements following shock-wave treatment in comparison with the baseline measurements were assessed using a paired t test.
The values of the 3 distribution-based methods were based on the data of the participants (n=49) classified as improved according to the consensus judgment of both participants and physicians.
The ES was calculated as the mean difference between the baseline and follow-up scores (ie, mean change scores) divided by the standard deviation of the baseline scores.24 The SRM was calculated as the mean change score divided by the standard deviation of the change scores.25
The ES and SRM scores were interpreted as follows: 0.2=small, 0.5=moderate, and 0.8 or higher=large.25,27
The GRR was calculated as the mean change score of the PRTEE or DASH in participants clinically identified as improved divided by the standard deviation of the change scores of participants clinically identified as unchanged, based on the global rating of change.26 If the GRR is larger than 1, the mean change score in clinically improved individuals exceeds the measurement error, and the instrument may be considered to be responsive to an extent that is proportional to the magnitude of the responsiveness ratio.26
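The 3 distribution-based indexes defined above can be sketched as follows. Two choices are assumptions made for illustration: change is taken as baseline minus follow-up, so that improvement on a disability scale is positive, and sample standard deviations (n−1 denominator) are used. Function names and data are hypothetical.

```python
from statistics import mean, stdev


def change_scores(baseline, follow_up):
    # Assumed sign convention: positive change means a lower (better) score.
    return [b - f for b, f in zip(baseline, follow_up)]


def effect_size(baseline, follow_up):
    """ES: mean change divided by the SD of the baseline scores."""
    return mean(change_scores(baseline, follow_up)) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """SRM: mean change divided by the SD of the change scores."""
    changes = change_scores(baseline, follow_up)
    return mean(changes) / stdev(changes)


def guyatt_ratio(improved_changes, stable_changes):
    """GRR: mean change in improved participants divided by the SD of
    change in participants rated as unchanged."""
    return mean(improved_changes) / stdev(stable_changes)


# Hypothetical scores for 4 improved and 4 stable participants.
base, follow = [60, 50, 70, 40], [30, 30, 40, 20]
print(round(effect_size(base, follow), 2))
print(round(standardized_response_mean(base, follow), 2))
print(round(guyatt_ratio(change_scores(base, follow), [2, -2, 4, -4]), 2))
```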
The sensitivity (true positive rate) and specificity (true negative rate) of the PRTEE and DASH questionnaires were examined using the ROC curve method. The ROC curve was constructed by plotting the sensitivity values on the y-axis and 1 minus the specificity values on the x-axis for the different change score values. The ROC curve was calculated on the basis of the questionnaire change score and the global rating of change score (obtained by averaging the participant's and physician's global rating of change scores). When plotting the ROC curve, the global rating of change, used as the external criterion, was dichotomized to identify those participants who experienced a clinically meaningful reduction in symptoms.18,28 We chose global rating of change scores of 5 or higher to represent important change and scores of 4 or lower to represent no change. Generally, the area under the ROC curve (AUC) is a measure of the ability of a questionnaire to distinguish between individuals who have and have not changed, according to an external criterion (ie, global rating of change score).28 In this study, because the participants classified as stable were pooled with those who worsened for the ROC curve calculation, the AUC assessed the ability of the PRTEE and the DASH to distinguish participants who improved from those who did not improve. The small number of participants who worsened did not allow construction of an ROC curve to distinguish participants who worsened from those who did not worsen.
An AUC of 1.0 indicates perfect discrimination between these 2 health states. A questionnaire that does not discriminate more effectively than chance will have an AUC of 0.5. As a general rule, AUC values between 0.7 and 0.8 are considered to have acceptable discrimination, those from 0.8 to 0.9 are considered to have excellent discrimination, and those above 0.9 are considered to have outstanding discrimination.28
The point on the ROC curve nearest the upper left-hand corner was identified as the optimal cutoff change score and was used to estimate the minimal clinically important difference (MCID),29,30 although the baseline entry score may affect it.31,32 The MCID represents the point with equally balanced sensitivity (probability of the measure correctly classifying individuals who demonstrate change on the global rating of change) and specificity (probability of the measure correctly classifying individuals who have changed minimally or not at all on the global rating of change) in the ROC curve.33
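The anchor-based analysis can be illustrated with a small sketch. The AUC is computed as the proportion of improved/not-improved pairs in which the improved participant shows the larger change score (ties count one half), and the cutoff is the change score whose ROC point lies closest to the upper left-hand corner. The rule "change at or above the cutoff counts as improved" is an assumption, and the function names and data are hypothetical.

```python
def auc(improved, not_improved):
    """Probability that an improved participant has the larger change score."""
    wins = sum((i > u) + 0.5 * (i == u) for i in improved for u in not_improved)
    return wins / (len(improved) * len(not_improved))


def best_cutoff(improved, not_improved):
    """Return (cutoff, sensitivity, specificity) nearest the ROC corner (0, 1)."""
    best = None
    for c in sorted(set(improved) | set(not_improved)):
        sens = sum(i >= c for i in improved) / len(improved)
        spec = sum(u < c for u in not_improved) / len(not_improved)
        dist = (1 - sens) ** 2 + (1 - spec) ** 2  # squared distance to corner
        if best is None or dist < best[0]:
            best = (dist, c, sens, spec)
    return best[1:]


# Hypothetical change scores, for illustration only.
improved = [12, 15, 9, 20, 8]
not_improved = [2, 5, 7, 9, 1]
print(auc(improved, not_improved))          # → 0.94
print(best_cutoff(improved, not_improved))  # → (8, 1.0, 0.8)
```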
Results
The Cross-Cultural Adaptation Process
The PRTEE for Italian patients was adapted using a systematic, standardized approach.10–12 No difficulties were encountered in translating the questionnaire, and the back translation corresponded very well to the original version. The only real, albeit minor, problem we encountered was in the first and third questions in the functional disability subscale (specific activities). The first question in the English version was: “Turn a doorknob or key.” As “doorknobs” are not widely used in Italy, this term was translated as “door handle.” The third question in the English version was: “Lift a full coffee cup…to your mouth.” Because most Italians drink espresso coffee, which comes in a small cup, we preferred to translate “Lift a full coffee cup” as “Lift a full cappuccino cup.” However, conceptual equivalence was verified by checking the original PRTEE and the back-translated questionnaires for all equivalences.
The prefinal version performed well in the pilot test. The participants stated that the items were clear and that the majority were relevant to their chronic LET. The average time taken by the participants to answer all the items was approximately 5 minutes.
No items were missing from the PRTEE and DASH scores at either the baseline or the discharge assessment. No individual scored the worst or best possible score (no floor or ceiling effects) in either the PRTEE questionnaire or the DASH questionnaire. On the basis of global rating of change scores, 49 participants improved, 38 remained stable, and 8 worsened.
Reliability
Internal consistency reached a Cronbach alpha of .95 (95% CI=.93–.98) (N=95) for the 15 items. When the alpha coefficient was calculated for the overall scale by eliminating each of the 15 items one at a time, the range was .89 to .98; no single item was found to change the internal consistency substantially.
The test-retest reliability yielded an ICC (2,1) of .95 (95% CI=.90–.97), with a SEM of 2.68 (95% CI=2.64–2.73) in the short term (3 days, 95 participants), and an ICC (2,1) of .93 (95% CI=.89–.96), with a SEM of 3.25 (95% CI=3.12–3.47), in the medium term (6 weeks, 38 participants).
Construct Validity
In the evaluation of the correlation between the PRTEE and DASH questionnaires, we considered the overall scores of the 2 questionnaires as well as the pain subscale scores (questions 1–5) and functional activity subscale scores (questions 6–15) for the PRTEE questionnaire and the symptoms subscale scores (questions 24–29) and function subscale scores (questions 1–21) for the DASH questionnaire.
Correlations between the PRTEE overall and subscale scores and the DASH overall and subscale scores at the baseline and discharge assessments are summarized in Tables 1 and 2, respectively. The overall PRTEE and DASH scores for the whole group of participants at baseline (N=95) were strongly correlated with one another (r=.84, 95% CI=.75–.89, P<.0001) (Fig. 1). The overall PRTEE and DASH scores for the group of participants who underwent the shock-wave therapy (N=95) at discharge were moderately, albeit still significantly, correlated with one another (r=.50, 95% CI=.30–.65, P<.001) (Fig. 2). The pretreatment-posttreatment change scores for the PRTEE and DASH also were moderately, albeit significantly, correlated with one another (r=.64, 95% CI=.48–.75, P<.001). As regards the coefficient of determination (r2), our results showed an r2 of .7 (P<.0001) (Fig. 1) for the baseline data and an r2 of .4 (P<.001) for the pretreatment-posttreatment change scores.
Table 1. Correlations Between the Patient-Rated Tennis Elbow Evaluation (PRTEE) Questionnaire and Disabilities of the Arm, Shoulder and Hand (DASH) Questionnaire Scores at the Baseline Assessment (N=95)
Table 2. Correlations Between the Patient-Rated Tennis Elbow Evaluation (PRTEE) Questionnaire and Disabilities of the Arm, Shoulder and Hand (DASH) Questionnaire Scores at the Discharge Assessment (N=95)
Figure 1. Relationship between the Patient-Rated Tennis Elbow Evaluation (PRTEE) questionnaire and Disabilities of the Arm, Shoulder and Hand (DASH) questionnaire scores at baseline for all participants. Regression plot, with 95% confidence interval for the mean and the slope.
Figure 2. Relationship between the Patient-Rated Tennis Elbow Evaluation (PRTEE) questionnaire and Disabilities of the Arm, Shoulder and Hand (DASH) questionnaire scores at discharge for participants who underwent shock-wave therapy. Regression plot, with 95% confidence interval for the mean and the slope.
Responsiveness
The t tests showed statistically significant changes from baseline to discharge for the PRTEE (t=10.66, P<.0001) and the DASH (t=6.49, P<.0001). The mean baseline and discharge scores, as well as the magnitude of changes expressed by the ES, SRM, and GRR for the improved participants (n=49), are shown in Table 3. According to the interpretation of Liang et al,25 both the PRTEE and the DASH yield large ES and SRM values (PRTEE: ES=2.0, SRM=2.3; DASH: ES=1.4, SRM=1.5). The GRR values yielded by both the PRTEE (2.9) and the DASH (2.3) also were large, according to the interpretation of Guyatt et al.26
Table 3. Baseline and Discharge Scores of the Patient-Rated Tennis Elbow Evaluation (PRTEE) Questionnaire and Disabilities of the Arm, Shoulder and Hand (DASH) Questionnaire in the Overall Study Sample (N=95) and in Participants Who Improved (n=49), Were Stable (n=38), and Worsened (n=8) and the Magnitude of the Changes After Shock-Wave Therapy in Participants Who Improved (n=49)
The ROC curve analysis revealed AUC values of .89 (95% CI=.80–.95) for the PRTEE and .79 (95% CI=.68–.87) for the DASH (Fig. 3). The SEM values were .03 for the PRTEE and .05 for the DASH. The AUC for both the PRTEE (P<.0001) and the DASH (P<.0001) far exceeded 0.5. These findings indicate that the change scores yielded by the PRTEE and the DASH were significantly better than chance in identifying an improved individual from randomly selected pairs of improved and unimproved individuals. The difference between the PRTEE and DASH AUC values was 0.10 (SEM=0.04, z score=2.1, P=.03). This finding indicates that the discriminative ability of the PRTEE is better than that of the DASH in this sample of outpatients with chronic LET treated with shock-wave therapy. The ROC curve also was used to provide an estimate of the MCID, taken as the point on the ROC curve nearest the upper left-hand corner of the graph (cutoff score), which most effectively discriminates between individuals who have improved and those whose condition is unchanged. Assuming equivalent importance for sensitivity and specificity, the best cutoff scores (MCID) for predicting global outcome (“improved”/“not improved”) were 8 points for the PRTEE and 7.5 points for the DASH.
Figure 3. Receiver operating characteristic curves illustrating the relationship between sensitivity and complement of specificity (1 − specificity) for the Patient-Rated Tennis Elbow Evaluation (PRTEE) questionnaire and the Disabilities of the Arm, Shoulder and Hand (DASH) questionnaire.
The sensitivity and specificity values associated with the PRTEE cutoff point of 8 were 0.94 (95% CI=0.83–0.98) and 0.78 (95% CI=0.58–0.91), respectively, and the positive and negative likelihood ratios were 4.2 (95% CI=3.4–5.2) and 0.08 (95% CI=0.02–0.3), respectively. The sensitivity and specificity values associated with the DASH cutoff point of 7.5 were 0.77 (95% CI=0.63–0.88) and 0.74 (95% CI=0.54–0.89), respectively, and the positive and negative likelihood ratios were 2.9 (95% CI=2.3–3.9) and 0.3 (95% CI=0.1–0.6), respectively.
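The likelihood ratios follow directly from sensitivity and specificity, as the sketch below shows using the rounded PRTEE values reported above. Because the inputs are rounded, the output can differ slightly in the last digit from the published ratios; the function name is hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios from sensitivity/specificity."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


lr_pos, lr_neg = likelihood_ratios(0.94, 0.78)  # PRTEE cutoff of 8 points
print(round(lr_pos, 2), round(lr_neg, 2))
```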
Discussion
This study cross-culturally adapted the PRTEE questionnaire using a systematic, standardized approach10–12 and determined the measurement properties of the Italian version in individuals with chronic LET. With the exception of 2 terms in 2 different questions, no difficulties were encountered in translating the questionnaire, and the back translation corresponded very well to the original English version.
The Cronbach alpha for the Italian version of the PRTEE was .95, which indicates excellent internal consistency, and exceeded .90, which is the recommended threshold when a questionnaire is used in a clinical setting.34 The Cronbach alpha for the Italian version of the PRTEE was equivalent to those of the German (.94)14 and Swedish (.94)35 versions of the PRTEE.
The Italian version of the PRTEE showed high reliability for both short-term (3 days, 95 patients) and medium-term (6 weeks, 38 patients) test-retest assessments, with ICCs of .95 and .93, respectively. These values were higher than that of the original PRTEE (ICC=.89)7 and similar to that of the Swedish version.35 The test-retest reliability of the German version was .87.14 However, because test-retest reliability in the German version was assessed using the Pearson correlation coefficient, a direct comparison with our results is not possible. Our ICC values were similar to those reported by Newcomer et al36 for the English version of the PRFEQ, although slightly lower than those of the Hong Kong Chinese version of the PRFEQ.37
Pearson correlation coefficients between the PRTEE and DASH displayed a strong correlation at the baseline assessment, whereas the correlation at discharge was moderate. The strongest correlation was found between the PRTEE functional ability subscale score and the DASH function subscale score at baseline (r=.89). The weakest correlation was found between the PRTEE functional ability subscale score and the DASH symptoms subscale score at discharge (r=.43).
Our Pearson correlation coefficients at baseline were close to those of the Swedish version35 but higher than those reported by Newcomer et al.36 Conversely, our Pearson correlation coefficients at discharge were lower than those of the Swedish version35 and those reported by Newcomer et al.36 A Pearson correlation coefficient (.56) comparable to ours (.58) was observed between the DASH overall score and the pain subscale score in the English version of the PRFEQ. With regard to the coefficient of determination (r2), our results yielded an r2 value that was slightly lower than that reported by Rompe et al14 for the baseline data (.70 versus .75) and lower (.40 versus .66) for the pretreatment-posttreatment change values.
Although the quality of measurement questionnaires usually has been evaluated by considering their reliability and validity, it has been suggested that responsiveness should be another criterion in the choice of a measurement questionnaire.39 Rompe et al14 were the first to determine the responsiveness of the PRTEE, but they used only a distribution-based method (ie, the SRM) for this purpose. Newcomer et al36 also analyzed the responsiveness of the PRFEQ using distribution-based methods alone (ie, ES and SRM). To our knowledge, this is the first study that has used all of the recommended statistical methods, including the ES, SRM, and GRR (distribution-based methods) and the ROC curve (anchor-based method), to determine the responsiveness of the PRTEE questionnaire.
Our results demonstrated a high degree of responsiveness for both the PRTEE and the DASH. However, both the distribution-based methods and anchor-based method showed that the PRTEE was clearly more responsive than the DASH in our sample of outpatients with chronic LET treated with shock-wave therapy.
We compared our ES and SRM results with those of the 2 previous studies that evaluated the responsiveness of the PRTEE and PRFEQ. We observed that our SRM value (2.3) was slightly higher than that reported by Rompe et al (2.0)14 and higher than that reported after 6 weeks (2.3 versus 1.0) and slightly higher than that reported after 12 weeks (2.3 versus 1.9) of treatment by Newcomer et al.36 Our ES value was higher than that reported by Newcomer et al36 both after 6 weeks (2.0 versus 1.0) and after 12 weeks (2.0 versus 1.6) of treatment.
The MCID, defined as the magnitude of change that best distinguishes between patients who have improved and those whose condition remains unchanged, was calculated using the ROC curve analysis. The MCID was approximately 8 points for the PRTEE and approximately 7.5 points for the DASH. A comparison of the data yielded by the analysis of the ROC curve in our study with those of other studies is not possible because no other studies, to our knowledge, have appraised responsiveness of the PRTEE through the ROC curve.
A methodological limitation of our study is that due to a small sample size, we did not perform a factor analysis to assess convergent validity. Our sample of patients, however, may be considered representative of the general population that is normally referred to an outpatient rehabilitation center.
The evidence presented in this article indicates that the PRTEE, a patient-rated, disease-specific questionnaire, was a valid and reliable means of measuring change in pain and function over time in our sample of outpatients with chronic LET treated with shock-wave therapy and that it was significantly more responsive than the DASH, a patient-rated, generic questionnaire. This observation is in keeping with those reported in other studies that showed patient-rated, disease-specific questionnaires were more responsive to the target condition than patient-rated, generic questionnaires.40–42 As the PRTEE is, unlike the DASH, a disease-specific questionnaire, we suggest that the PRTEE can be used as a standard outcome measure in Italian outpatients who undergo therapy for chronic LET.
Our data indicate that the Italian version of the PRTEE questionnaire is a valid, reliable, and responsive tool that can be used to quantitatively measure outcome in Italian patients with chronic LET, in both clinical and research settings. Further research is warranted to determine the measurement properties of the Italian version of the PRTEE in people with acute LET and other elbow diseases, as well as to compare the measurement properties of the Italian version of the PRTEE with other disease- and organ-specific questionnaires.
Footnotes
- Prof Cacchio, Prof Necozione, Prof Rompe, Prof di Orio, Prof Santilli, and Prof Paoloni provided concept/idea/research design. Prof Cacchio, Prof MacDermid, Prof Maffulli, and Prof Paoloni provided writing. Prof Cacchio, Prof Necozione, Prof Maffulli, Prof di Orio, Prof Santilli, and Prof Paoloni provided data collection. All authors provided data analysis. Prof Cacchio and Prof Paoloni provided project management. Prof Cacchio, Prof Necozione, Prof MacDermid, Prof Maffulli, Prof di Orio, Prof Santilli, and Prof Paoloni provided consultation (including review of manuscript before submission).
- Prof MacDermid is the developer of the Patient-Rated Tennis Elbow Evaluation (PRTEE).
- This study was approved by the local ethics committee and complied with the Declaration of Helsinki.
- Received November 14, 2011.
- Accepted April 30, 2012.
- © 2012 American Physical Therapy Association