Abstract
Background Exercise interventions are often incompletely described in reports of clinical trials, hampering evaluation of results, replication, and implementation into practice.
Objective The aim of this study was to develop a standardized method for reporting exercise programs in clinical trials: the Consensus on Exercise Reporting Template (CERT).
Design and Methods Using the EQUATOR Network's methodological framework, 137 exercise experts were invited to participate in a Delphi consensus study. A list of 41 items was identified from a meta-epidemiologic study of 73 systematic reviews of exercise. For each item, participants indicated agreement on an 11-point rating scale. Consensus for item inclusion was defined a priori as greater than 70% agreement of respondents rating an item 7 or above. Three sequential rounds of anonymous online questionnaires and a Delphi workshop were used.
Results There were 57 (response rate=42%), 54 (response rate=95%), and 49 (response rate=91%) respondents to rounds 1 through 3, respectively, from 11 countries and a range of disciplines. In round 1, 2 items were excluded; 24 items reached consensus for inclusion (8 items accepted in original format), and 16 items were revised in response to participant suggestions. Of 14 items in round 2, 3 were excluded, 11 reached consensus for inclusion (4 items accepted in original format), and 7 were reworded. Sixteen items were included in round 3, and all items reached greater than 70% consensus for inclusion.
Limitations The views of included Delphi panelists may differ from those of experts who declined participation and may not fully represent the views of all exercise experts.
Conclusions The CERT, a 16-item checklist developed by an international panel of exercise experts, is designed to improve the reporting of exercise programs in all evaluative study designs and contains 7 categories: materials, provider, delivery, location, dosage, tailoring, and compliance. The CERT will encourage transparency, improve trial interpretation and replication, and facilitate implementation of effective exercise interventions into practice.
Chronic diseases are an emerging global issue that substantially contributes to disability and health care costs. The burden of these conditions is increasing with the aging population, and there is an urgent need to identify effective management strategies to reduce disability and associated health care costs.1,2 Supported by multiple systematic reviews,5–7 clinical practice guidelines,8–13 and position statements,14–16 exercise programs are recommended as part of the management for many chronic conditions, including, but not limited to, back and neck pain, osteoarthritis, osteoporosis, type 2 diabetes, cardiovascular and respiratory disease, cancer, human immunodeficiency virus/acquired immunodeficiency syndrome, and depression.
However, exercise has many dimensions and varies in type, intensity, duration, and frequency. Without explicit descriptions of exercise programs, it is not possible to explore why different trials report heterogeneous results, to accurately replicate exercise protocols in other studies, or to implement the programs in clinical practice. A 2012 meta-epidemiologic study that included 73 systematic reviews of exercise trials for people with chronic health conditions showed that exercise programs were often incompletely reported.17,18 In particular, important domains such as type of exercise, dosage, intensity, progression rules, supervision, or whether the exercise was delivered to individuals or groups were not consistently reported. These findings reflect the generally poor quality of descriptions of complex interventions in the peer-reviewed literature.19,20 Interpretation of clinical trials, efficient use of research resources (eg, time, funding), and uptake of effective exercise programs into routine care would be facilitated if exercise programs were reported in a standardized and comprehensive manner.
The authors of the Template for Intervention Description and Replication (TIDieR), an extension of the Consolidated Standards of Reporting Trials (CONSORT) Statement, have made general recommendations for the explicit reporting of complex interventions in clinical trials.19–22 However, additional details, such as exercise type, dosage, intensity, frequency, supervision, progression, and individualization, are needed to fully appreciate exercise-specific interventions.17 Here, we describe the development of the Consensus on Exercise Reporting Template (CERT), which is intended to be used as a further extension of the CONSORT Statement and the TIDieR for the explicit reporting of exercise programs across all evaluative study designs for exercise research.
Materials and Methods
Design
We followed the methodological framework for developing reporting guidelines recommended by the EQUATOR (Enhancing the QUAlity and Transparency Of health Research) Network (http://www.equator-network.org).23 The CERT was registered on the EQUATOR Network as a reporting guideline under development (http://www.equator-network.org/library/reporting-guidelines-under-development/).
The CERT study protocol has been published.24 In brief, we used a modified Delphi method, a survey-based approach to consensus building that is based on fundamental principles of purposive sampling of experts in the field of interest, panelist anonymity, iterative questionnaire presentation, and feedback of statistical analysis.25,26 The study was designed, implemented, and coordinated by an international steering committee (S.C.S., C.E.D., M.U., and R.B.) that determined questionnaire development, data analysis, and a priori criteria for item consensus and survey termination.24
Steering Committee
The international steering committee (S.C.S., C.E.D., M.U., and R.B.) comprised expertise across a range of disciplines (epidemiology, general medical practice, physical therapy, and rheumatology), geographical areas (Australia, United Kingdom, and Canada), and research expertise (qualitative, quantitative, and Delphi methods).
Participants—Selection and Recruitment
An international panel of exercise experts was identified from exercise systematic review authorship, established national and international profiles in exercise research and practice, and peer recommendations. An expert was defined as an individual who has demonstrated expertise in the conduct and evaluation of exercise interventions. In identifying panel members, attention was given to obtaining wide geographical and professional coverage. Participants were provided with an explanatory statement that informed them of the study objectives, how much input would be expected of them, and how their contribution would be used. We also provided a summary of the evidence and the proposed exercise reporting grid from the 2012 meta-epidemiological study.17
Ethics
The Cabrini Institute Ethics Committee approved the project (HREC 02-07-04-14). Potential participants were informed that by responding to the questionnaire, they were deemed to have consented to participate in the study and to have their de-identified responses included in any analyses. All named participants also provided consent to be acknowledged in this article.
Survey Tool
We used the results of the 2012 meta-epidemiological study that identified 43 key exercise descriptors and items recommended in the American College of Sports Medicine models for exercise prescription as the initial draft item set.16,17 After removal of irrelevant or duplicate items and pilot testing, 41 items were included in the first survey (Appendix 1). For each item, participants were asked to indicate their level of agreement on an 11-point numerical rating scale (ranging from 0=strongly disagree to 10=strongly agree; 5=neither agree nor disagree) that the item is essential to include in a checklist of reporting requirements for exercise programs in clinical trials. Each item also included a free-text field to encourage feedback and suggestions, and a final question invited any additional comments or suggestions.
Survey Process and A Priori Decisions
Survey Monkey (http://www.surveymonkey.com) software was used to produce and conduct the survey. Identified experts were invited to participate in June–July 2014 via an email that included an explanatory statement and offer of coauthorship for participants completing all Delphi rounds. Survey rounds were conducted until consensus was achieved and no new issues or items emerged.
There were 3 sequential rounds of anonymous online surveys. Each Delphi round was conducted over a 14-day period with approximately 8 weeks between rounds to allow for analysis, item refinement, and pilot testing. Each Delphi round took approximately 30 minutes to complete, could be completed over multiple computer sessions, and could be reviewed prior to submission. Reminders were emailed to nonresponders approximately 10 days after the initial mailing in each round, with additional reminders at 2-week intervals after the requested submission date. Only participants who completed a survey round were included in the subsequent round. The results for each item in each round were displayed graphically together with a narrative summary and a thematic analysis of qualitative data (free-text responses). The feedback document included a full description of the results for each item, including whether they fulfilled criteria for inclusion or exclusion or consensus had not been reached, and a summary of participant comments. These data were emailed to participants just prior to rounds 2 and 3.
Consensus for inclusion of an item into the CERT was defined a priori as greater than 70% of respondents rating an item as 7 or above on the 0 to 10 scale. Items were excluded if greater than 70% of respondents rated an item as 3 or below. We assumed that items were unclear if they were rated 4, 5, or 6 by greater than 30% of respondents or generated more than 10 comments. Suggestions or comments for modifications of concept or wording were considered by the steering committee (eg, where there was ambiguous wording, similarity to another item, and so on). Using data from the qualitative content analysis, the steering committee reworded or combined items that were deemed unclear from earlier rounds for inclusion in subsequent rounds.
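The a priori decision rules described above can be expressed as a simple classification procedure. The following is an illustrative Python sketch, not part of the study's actual methods; the function name and example ratings are hypothetical, and in the study an item that reached consensus for inclusion could still be reworded based on participant comments:

```python
# Illustrative sketch of the a priori Delphi decision rules applied to
# hypothetical 0-10 ratings. Thresholds follow the text: >70% rating an
# item 7 or above => include; >70% rating 3 or below => exclude; >30%
# rating 4-6 or more than 10 comments => item assumed unclear.

def classify_item(ratings, n_comments=0):
    """Classify a Delphi item as include, exclude, unclear/revise, or no consensus."""
    n = len(ratings)
    agree = sum(r >= 7 for r in ratings) / n      # proportion rating 7 or above
    disagree = sum(r <= 3 for r in ratings) / n   # proportion rating 3 or below
    middling = sum(4 <= r <= 6 for r in ratings) / n

    if agree > 0.70:
        return "include"          # consensus for inclusion
    if disagree > 0.70:
        return "exclude"          # consensus for exclusion
    if middling > 0.30 or n_comments > 10:
        return "unclear/revise"   # reword or combine for the next round
    return "no consensus"

# Hypothetical example: 8 of 10 respondents rate the item 7 or above,
# so the item reaches the >70% threshold for inclusion.
print(classify_item([10, 9, 8, 7, 7, 8, 9, 10, 5, 2]))  # → include
```

Items classified as unclear were not discarded; as described above, the steering committee used the qualitative content analysis to reword or combine them for subsequent rounds.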
Round 1 was conducted in June–July 2014, and round 2 was conducted in September–October 2014. The results of rounds 1 and 2 were presented at a workshop at the XIII International Low Back Forum in October 2014, which was attended by 30 researchers and clinicians with expertise in low back pain and musculoskeletal conditions (http://www.lbpforum.com.br), 8 of whom were participating in the Delphi survey. The purpose of the presentation was to invite comments about the process of development of the CERT and whether the CERT had broad applicability to low back pain exercise trials. We also invited comments about the wording of items, but not about whether they should be included in the CERT. The workshop was audio-recorded with informed consent, transcribed, and analyzed qualitatively with content analysis methods, and the findings were used to inform the third Delphi round.
Round 3 was conducted in December 2014–January 2015. For this round, we included all items that had reached consensus for inclusion in rounds 1 and 2 in their original format, items that reached consensus for inclusion in round 2 but required further clarification, and any remaining items for which no consensus had been reached. Feedback from comments received in round 2 informed rewording of all items. We also rearranged and categorized the items to be consistent with the framework and domain categories of the CONSORT Statement and TIDieR.19,21,22
Role of the Funding Source
This research project was funded by Arthritis Australia (Philip Benjamin, Grant No: 2014GIA03). Professor Buchbinder is funded by an Australian National Health and Medical Research Council (NHMRC) Senior Principal Research Fellowship.
Results
Participants
Of 137 invited experts, 57 participants (response rate=42%) completed round 1, 54 completed round 2 (response rate=95%), and 49 completed round 3 (response rate=91%). The respondents came from 11 countries (Australia [n=11], Brazil [n=2], Canada [n=9], Denmark [n=8], France [n=1], Germany [n=1], the Netherlands [n=8], New Zealand [n=2], Norway [n=2], United Kingdom [n=9], and United States [n=4]) and represented the following disciplines: biostatistics (n=2), chiropractic (n=5), epidemiology (n=4), exercise physiology (n=6), general and specialist medical practice (n=5), occupational therapy (n=1), physical therapy (n=28), psychology (n=1), sports science (n=1), and surgery (n=3). Five of the participants reported having more than one discipline: chiropractor/physical therapist (n=1), specialist medical practitioner/epidemiologist (n=1), biostatistician/specialist medical practitioner (n=1), physical therapist/epidemiologist (n=1), and psychologist/specialist medical practitioner (n=1). Across participants, there was expertise in exercise across a range of health conditions, including cardiovascular, respiratory, stroke and other neurologic conditions, musculoskeletal, depression and anxiety, diabetes, cancer, and urinary incontinence.
Results of Delphi Process
Figure 1 summarizes the results of individual rounds of the study and the flow of items through the study. In round 1, not all participants rated every item: all 57 participants (100%) rated 8 items, 56 of 57 (98%) rated 8 items, 55 of 57 (96%) rated 18 items, and 54 of 57 (95%) rated 7 items. Of the 41 items included in round 1, 24 items reached consensus for inclusion, 2 reached consensus for exclusion, and no consensus was reached for 15 items (Figs. 1 and 2, Appendix 1 [round 1]). The 2 excluded items were the context of the qualifications of the exercise instructor and the participants' pre-existing fitness levels. Items with the greatest consensus for inclusion were: duration of the exercise program (97% scored it 7 or above, and 72% scored it 10); type of exercise equipment used (95% scored it 7 or above, and 61% scored it 10); whether the exercises were supervised or unsupervised (94.6% scored it 7 or above, and 71% scored it 10); whether there were measures of exercise adherence (89% scored it 7 or above, and 62% scored it 10); and specification of the number of exercise sessions per week (82% scored it 7 or above, and 72% scored it 10). Additionally, 512 comments were generated. Based on these comments, the wording of 16 of the 24 included items required revision. These 16 items, together with the 15 items that failed to reach consensus, were reformulated (reworded or combined according to participant feedback) by the steering committee into 14 items for round 2 (Fig. 1, Appendix 1 [round 2]).
Flowchart of Consensus on Exercise Reporting Template (CERT) items through the Delphi study. Q=question.
Round 1 items presented in order of greatest consensus (percentage of respondents who scored an item 7 or more) (n=57). Items 12, 13, 16, 21, 24, and 28 were completed by 54 respondents; items 9, 14, 17, 25–27, and 30–41 were completed by 55 respondents; items 10, 11, 15, 18–20, 22, and 23 were completed by 56 respondents; and items 1–8 were completed by 57 respondents.
In round 2, level of agreement was indicated by 53/54 participants (98%) for 4 items and all participants for the remaining 10 items. Eight items reached consensus for inclusion, 3 items reached consensus for exclusion, and no consensus was reached for 3 items (Figs. 1 and 3, Appendix 1 [round 2]). The 3 excluded items were: number of years of instructor experience, whether there were warm-up or cool-down activities, and whether the speed of the exercises was described. Items with the greatest consensus for inclusion were: whether there were measures of exercise adherence (98% scored it 7 or above, and 57% scored it 10), whether exercises were tailored to the individual or “one size fits all” (96% scored it 7 or above, and 64% scored it 10), and whether the exercise dosage (eg, number of exercise repetitions, sets, and sessions) was described (89% scored it 7 or above, and 65% scored it 10). Comments were provided for all items, with 180 comments overall. Based on this feedback, we reformulated all accepted items (8 items from round 1 and 8 items from round 2), together with the 3 items that failed to reach consensus, into 16 items for round 3 (Fig. 1, Appendix 1 [round 3]).
Round 2 items presented in order of greatest consensus (percentage of respondents who scored an item 7 or more) (n=54). Items 3, 5, 10, and 14 were completed by 53 respondents; items 1, 2, 4, 6–9, and 11–13 were completed by 54 respondents.
All of the items included in round 3 reached consensus for inclusion (Fig. 4), and no new issues were raised in the 133 comments that were received. In round 3, level of agreement was indicated by 47/49 participants (96%) for one item, by 48 participants (98%) for 2 items, and by all participants for the remaining 13 items. Items with the greatest consensus for inclusion were: whether the exercises were performed individually or in a group (84% scored it 7 or above, and 53% scored it 10); whether nonexercise components were included (92% scored it 7 or above, and 55% scored it 10); specification of the explicit details of the program dosage, such as the number of exercise repetitions and sets (90% scored it 7 or above and 58% scored it a 10); whether there were measures of exercise adherence (96% scored it 7 or above, and 59% scored it 10); and whether adverse events that occurred during exercise were described (88% scored it 7 or above, and 59% scored it 10).
Round 3 items presented in order of greatest consensus (percentage of respondents who scored an item 7 or more) (n=49). Item 10 was completed by 47 respondents; items 13 and 16 were completed by 48 respondents, and items 1–9, 11, 12, 14, and 15 were completed by 49 respondents.
In summary, round 3 included 16 items (8 items from round 1, 4 items from round 2, and 4 revised items).
The final 16-item CERT checklist is shown in abbreviated form in the Table and is modeled on the TIDieR domains and headings. It consists of the following 7 categories consistent with the TIDieR: (1) What–materials: item 1 (the equipment that is used for the exercise intervention), (2) Who–provider: item 2 (the characteristics and expertise of the exercise instructor), (3) How–delivery: items 3 through 11 (the way in which the exercises are delivered to the participant), (4) Where–location: item 12 (the setting in which the exercises are performed), (5) When, how much–dosage: item 13 (a detailed description of how the exercises are performed), (6) Tailoring–what, how: items 14 and 15 (the way in which the exercises are prescribed and progressed), and (7) How well–compliance/planned or actual: item 16 (whether the exercises are delivered and performed as intended).
Final Consensus on Exercise Reporting Template (CERT) With 16 Abbreviated Items
Discussion
International exercise experts reached a high level of consensus on a set of key items that they considered to be necessary for reporting replicable exercise programs. The need for an exercise-specific reporting guideline became evident from the results of a meta-epidemiological study.17,18 The CERT, summarized in the Table, will encourage transparency, improve the ability to interpret and replicate trial findings, and facilitate the implementation of effective exercise interventions into clinical practice.
We followed the 18-step checklist, recommended by Moher et al23 for developing a health research reporting guideline, and harmonized the CERT with the CONSORT Statement and the TIDieR for consistency. The CERT is complementary to other more generalist tools and research reporting guidelines and is designed specifically for the reporting of exercise interventions in clinical trials. Although some items, such as study setting, provider, adverse events, and adherence, are already included in the CONSORT and the TIDieR, the study participants indicated that further clarification in the exercise-specific domain was needed.
The CERT is intended to be generalizable across all types of exercise interventions for many conditions and provides a structure to inform the development and implementation of exercise interventions and the production of implementation manuals. The final checklist of 16 items represents the minimum data set considered necessary to report in clinical trials of exercise interventions, and it received a high degree of consensus among a wide range of international exercise experts from different disciplines. This does not preclude authors from providing additional information and descriptors where they consider it necessary for the replication of an intervention.
Our study is aligned with the recommended quality indicators for a Delphi study: reproducible participant criteria, stated number of rounds, clear criteria for excluding or dropping items, and other termination criteria.25,26 Conducting the study by using an Internet platform facilitated participants' responses by allowing anonymity and accessibility and electronic dissemination of information from previous rounds. Anonymity is a strength of the Delphi process because participants are free to say what they want without fear of judgment by colleagues.
We included international exercise experts from 11 countries, many of whom are multilingual, thus maximizing the potential for cross-cultural adaptation. It is, however, currently a limitation that the items are published only in English. It will also be important to develop and publish standard translations and adaptations.
The views of included Delphi panelists also may differ from those of experts who declined participation and may not fully represent the views of all exercise experts. To try to minimize this limitation, a comprehensive search was conducted to identify experts, supplemented by a snowballing technique of peer recommendation, to ensure a final respondent sample that represented a range of international researchers and clinicians. Our participant group included a multidisciplinary range of participants who had expertise in exercise trials across a range of health conditions. It is likely, therefore, that our results will be generalizable across exercise interventions regardless of the health condition under study.
There is debate over who constitutes an expert in the Delphi process. We support a suggestion by Fink et al that “[a]n expert should be a representative of their professional group with sufficient expertise not to be disputed or the power required to instigate the findings.”27(p982) In our Delphi study, all participants appeared to fulfill this definition.
In summary, the CERT checklist evolved through several iterations and followed the EQUATOR Network recommendations. The process began with a preliminary checklist of 41 items derived from a meta-epidemiologic study of systematic reviews of exercise trials for chronic health conditions. The checklist was refined by international exercise experts in 3 iterative Delphi consensus survey rounds and a Delphi workshop, and the panelists agreed on the final 16 core items.
The CERT can be endorsed by journals to encourage explicit reporting and can be used by authors to structure reports of their exercise interventions, by reviewers and editors to assess completeness of descriptions, and by researchers and clinicians who want to use the published information. To overcome journal word limits for manuscript publication, we recommend that the completed CERT items be included as online appendixes. The CERT wording mirrors applicable items from CONSORT 2010, TIDieR, and Standard Protocol Items Recommendations for Interventional Trials (SPIRIT) statements, and consistent wording and structure for items common to these checklists will facilitate complete reporting for exercise interventions.19,21,22,28 An associated Explanation and Elaboration Statement, currently under development, will provide the rationale and supporting evidence for each checklist item, along with a manual for guidance and model examples from actual exercise interventions.
Appendix 1.
Iteration of Consensus on Exercise Reporting Template (CERT) Items
Appendix 2.
Consensus on Exercise Reporting Template (CERT) Delphi Panel
Footnotes
Dr Slade, Professor Dionne, Professor Underwood, and Professor Buchbinder designed the study and survey tool, drafted the manuscript with input from all other authors, and performed data analysis. Dr Slade was responsible for implementing the survey. All authors read and approved the final manuscript.
The authors thank Dr Bianca Bendermacher (the Netherlands), Professor Jill Cook (Australia), Associate Professor Kjartan Fersum (Norway), Dr Lora Giangregorio (Canada), Professor Jan Hartvigan (Denmark), Dr Melanie Holden (United Kingdom), Associate Professor Per Kjaer (Denmark), Professor Donna MacIntyre (Canada), Dr Nathan Meier (United States), Professor Nicholas Taylor (Australia), and Dr Flavia Vital (Brazil) for their contributions to the study.
Peer Reviewers: Belinda Beck (Australia), Kim Bennell (Australia), Lucie Brosseau (Canada), Jill Cook (Australia), Leonardo Costa (Brazil), Fiona Cramp (United Kingdom), Edith Cup (United Kingdom), Lynne Feehan (Canada), Manuela Ferreira (Australia), Scott Forbes (Canada), Paul Glasziou (Australia), Bas Habets (the Netherlands), Susan Harris (Canada), Jan Hartvigan (Denmark), Jean Hay-Smith (New Zealand), Susan Hillier (Australia), Rana Hinman (Australia), Ann Holland (Australia), Maria Hondras (Denmark), George Kelly (United States), Peter Kent (Denmark), Per Kjaer (Denmark), Gert-Jan Lauret (the Netherlands), Audrey Long (Canada), Chris Maher (Australia), Lars Morso (Denmark), Nina Osteras (Norway), Tom Peterson (Denmark), Ros Quinlivan (United Kingdom), Karen Rees (United Kingdom), Jean-Philippe Regnaux (France), Marc Rietberg (the Netherlands), Dave Saunders (United Kingdom), Nicole Skoetz (Denmark), Karen Sogaard (Denmark), Tim Takken (the Netherlands), Nicholas Taylor (Australia), Maurits van Tulder (the Netherlands), Nicoline Voet (the Netherlands), Lesley Ward (New Zealand), Claire White (United Kingdom).
This research project was funded by Arthritis Australia (Philip Benjamin, Grant No: 2014GIA03). Professor Buchbinder is funded by an Australian National Health and Medical Research Council (NHMRC) Senior Principal Research Fellowship.
- Received December 10, 2015.
- Accepted April 28, 2016.
- © 2016 American Physical Therapy Association