Abstract
Background. In order to make inferences about strength related to development or treatment interventions, it is important to use measurement instruments that have sound clinimetric properties.
Purpose. The objective of this review is to systematically evaluate the level of evidence of the clinimetric properties of instruments for measuring upper extremity muscle strength at the “body functions & structures” level of the International Classification of Functioning, Disability and Health for Children and Youth (ICF-CY) for children with cerebral palsy (CP).
Data Sources. A systematic search of the PubMed, EMBASE, OTseeker, CINAHL, PEDro, and MEDLINE databases up to November 2012 was performed.
Study Selection. Two independent raters identified and examined studies that reported the use of upper extremity strength measurement instruments and methods for children and adolescents with CP aged 0 to 18 years.
Data Extraction. The COSMIN (COnsensus-based Standards for the selection of health status Measurement INstruments) checklist with 4-point rating scale was used by 2 independent raters to evaluate the methodological quality of the included studies. Best evidence synthesis was performed using COSMIN outcomes and the quality of the clinimetric properties.
Data Synthesis. Six different measurement instruments or methods were identified. Test-retest, interrater, and intrarater reliability were investigated. Two test-retest reliability studies were rated as “fair” for the level of evidence. All other studies were rated as “unknown” for the level of evidence.
Limitations. The paucity of literature describing clinimetric properties, especially other than reliability, of upper limb strength measurement instruments for children with CP was a limitation of the study.
Conclusions. For measuring grip strength, the Jamar dynamometer is recommended. For other muscle groups, handheld dynamometry is recommended. Manual muscle testing (MMT) can be used in case of limited (below MMT grade 4) wrist strength or for total upper limb muscle strength. Because information regarding other clinimetric properties is lacking, caution is advised regarding interpretation of the results.
The term “cerebral palsy” (CP) describes a group of disorders of the development of movement and posture, causing activity limitations that are attributed to nonprogressive impairments, that occur in the developing fetal or infant brain. Motor disorders in people with CP are often accompanied by disturbances of sensation, cognition, communication, perception, or behavior or a seizure disorder, or both.1 Abnormal motor behavior (reflecting abnormal motor control) is the core feature of CP. It is characterized by various abnormal patterns of movement and posture related to defective coordination of movements or regulation of muscle tone.2
One of the effects of abnormal motor behavior is the loss of muscle strength. Children with CP have less strength in their affected side or sides compared with their peers who are developing typically.3–5 Although some studies have focused on the loss of muscle strength in the lower extremities and evolving impairments of related activities,3,6–8 the decrease in muscle strength of the upper extremities also may lead to limitations in activities of daily living, as grip strength is found to be a good predictor of use of the affected arm in bimanual performance in children with CP.9,10 To determine whether strength is a limiting factor in the performance of activities of daily living, it is important to measure strength accurately.
Muscle strength is assessed at the body functions level of the International Classification of Functioning, Disability and Health for Children and Youth (ICF-CY)11 and can be measured in 3 different ways: isometric, isotonic, and isokinetic.12 In order to make inferences about strength, either in clinical practice or in research, strength has to be measured with an instrument that has sound clinimetric properties. Reliability, for example, is a very important property, among others such as validity and responsiveness. One needs to know the degree to which variations in results between repeated measurements occur. This so-called measurement error can arise from several sources: the measurement instrument itself, the person or people performing the measurement, the patient undergoing the measurement, and the circumstances under which the measurement is performed.13 The more studies of good methodological quality that report consistent clinimetric findings, the greater or stronger the level of evidence of the investigated clinimetric property is considered to be.14
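As a rough illustration of how agreement between repeated measurements can be quantified, the sketch below computes one common form of the intraclass correlation coefficient (two-way random effects, absolute agreement, single measurement). The choice of ICC form, the function name `icc_2_1`, and the grip strength values are assumptions made for this example; they are not taken from the included studies.

```python
from statistics import mean

def icc_2_1(scores):
    """Two-way random-effects, absolute-agreement, single-measurement ICC.

    `scores` holds one row per child and one column per session or rater
    (a hypothetical layout chosen for this sketch).
    """
    n = len(scores)       # number of children
    k = len(scores[0])    # number of repeated measurements per child
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Partition the total sum of squares into subjects, sessions, and error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Made-up test-retest data (kg): every child scores 4 kg higher at retest.
retest = [[10, 14], [14, 18], [18, 22], [22, 26], [26, 30]]
print(round(icc_2_1(retest), 2))  # 0.83
```

In this invented data set the two sessions are perfectly correlated (a constant shift leaves the Pearson coefficient at 1), yet the absolute-agreement ICC is lower because the systematic difference between sessions counts as measurement error.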
Several studies have examined clinimetric properties of upper extremity strength measurement instruments for children who are developing typically. In most of these studies, test-retest, intrarater, and interrater reliability of handheld dynamometers measuring isometric muscle and grip strength in the upper extremities in children revealed excellent intraclass correlation coefficients (ICCs).15–21 Moreover, evaluations of the validity of measurement instruments of isometric upper extremity muscle strength in children demonstrated excellent scores.15,16,22 Studies that examined the clinimetric properties of lower extremity strength measurement instruments in children with CP revealed moderate to excellent intrarater and interrater reliability.21,23–27 Studies that examined strength measurement instruments for adults with brain damage showed excellent intrasession and intersession, test-retest, and intrarater reliability scores for the paretic side.28–30 The nonparetic side showed moderate to excellent intrasession and intersession reliability scores.28
During isotonic and isokinetic muscle strength testing, the patient needs to be able to cooperate with the examiner and perform a maximum contraction of one muscle group. This is a task that many children with CP find very difficult to perform due to co-contraction of antagonists or agonists or cognitive limitations, or both.31,32 Furthermore, compared with their healthy peers, children with CP: (1) are slower in the application of force,4,33–35 (2) show sequential force generation,33,36 (3) have a reduced ability to adjust grasping forces to the object's physical properties,37–40 and (4) have impaired motor planning.33,34,41 Children with CP also have impairments in the spatial and temporal aspects of bimanual coordination.42 Due to the unique characteristics of children with CP, the clinimetric properties of strength measurement instruments used for these children should be studied specifically in this group.
To our knowledge, no systematic review has been published regarding the different clinimetric properties of upper extremity strength measurement instruments for children with CP. The purposes of this article are: (1) to systematically review the clinimetric properties of instruments that measure upper extremity muscle strength at the “body functions & structures” level of the ICF-CY for children with CP and (2) to systematically assess the methodological quality of the clinimetric studies and the strength of the evidence provided regarding the clinimetric properties.
Method
Data Sources and Searches
Electronic searches were conducted in the PubMed, EMBASE, OTseeker, CINAHL, PEDro, and MEDLINE databases from the inception of these databases until November 2012. The COSMIN (COnsensus-based Standards for the selection of health status Measurement INstruments) protocol for the systematic review of measurement properties was used to search the PubMed database. According to this protocol, the search strategy consisted of collections of search terms for the following characteristics: construct of interest, target population, instrument search, and psychometric properties.43 For construct of interest, the following terms were used: Power OR Muscle strength OR Resistance OR Strength OR Contraction OR Lift OR “Isometric contraction” OR “Isotonic contraction” OR “Isokinetic contraction” OR Grip OR Pinch OR Grasp OR Functional OR Function OR Exercise OR Physical fitness OR Endurance OR Tolerance.
Target population was defined as: Human AND Child AND (“Cerebral palsy” OR “Muscle spasticity” OR Diplegic OR Diplegia OR Monoplegic OR Monoplegia OR Quadriplegic OR Quadriplegia OR Spastic OR “Spastic Cerebral Palsies” OR “Unilateral Cerebral Palsy” OR Ataxia OR Atactic OR Distonia OR Distonic OR Hemiplegic OR Hemiplegia) AND (“Upper limb” OR Arm OR Forearm OR “Upper extremity” OR Shoulder OR Elbow OR Hand OR Wrist OR Finger OR Thumb OR Manual). Because this study did not focus on one specific measurement instrument but on all instruments that are used to measure upper extremity muscle strength, the instrument search was not defined. All search terms were combined with the filter for measurement properties.43 Finally, the exclusion filter (stroke OR animals) was added. For the other databases, the above-mentioned words were combined.
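The way the term collections are combined can be sketched as follows. This is only an illustration of the Boolean structure: the term lists are shortened excerpts, the helper name `any_of` is hypothetical, the measurement-properties filter is represented by a placeholder rather than reproduced, and applying the exclusion filter with `NOT` is an assumption.

```python
def any_of(terms):
    """Combine alternative search terms into one parenthesized OR group."""
    return "(" + " OR ".join(terms) + ")"

# Excerpts only; the full term lists are given in the text above.
construct = any_of(["Power", "Muscle strength", '"Isometric contraction"', "Grip"])
diagnosis = any_of(['"Cerebral palsy"', '"Muscle spasticity"', "Hemiplegia"])
region = any_of(['"Upper limb"', "Arm", "Hand", "Wrist"])
population = f"Human AND Child AND {diagnosis} AND {region}"

# Placeholder: the published measurement-properties filter is not reproduced here.
PROPERTIES_FILTER = "(<measurement properties filter>)"

# Assumption: the exclusion filter (stroke OR animals) is applied with NOT.
query = (
    f"{construct} AND {population} AND {PROPERTIES_FILTER} "
    "NOT (stroke OR animals)"
)
print(query)
```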
Study Selection
Studies of any design that evaluated reliability, validity, or responsiveness were eligible for inclusion. Other inclusion criteria were: (1) the study participants were children and adolescents (0–18 years of age) with CP, and (2) the study examined a measurement instrument or measurement method for upper extremity muscle strength (shoulder/elbow/wrist/grip) at the “body functions & structures” level of the ICF-CY.11 No language restrictions were applied. Studies were excluded if adult patients or children without CP were included in the study sample.
After performing the search strategies (K.D.), 2 reviewers (K.D. and E.R.) independently screened titles and abstracts for relevance. In cases of no consensus, the opinion of a third reviewer (Y.J.) was decisive. Additionally, related articles and the references of the included articles were checked by one reviewer (K.D.) for relevance and potential inclusion. These potentially eligible articles were then independently screened by the 2 reviewers.
After consensus was reached, full-text reports of the included studies were retrieved and read by the 2 reviewers independently. They searched the articles for a clinimetric property of the instrument used to measure upper extremity muscle strength in children with CP.
Data Extraction and Quality Assessment
The extraction and assessment consisted of several steps. First, the descriptive characteristics of the sample used in the studies, the procedures used, and the statistical outcomes reported in each study were extracted. Second, the methodological quality of the studies was assessed. Third, the quality of the clinimetric properties of the measurement instrument was evaluated. Finally, a best evidence synthesis was performed.
Rating of Methodological Quality of Individual Studies
Two reviewers (K.D. and E.R.) independently assessed the methodological quality of the included studies using the COSMIN protocol. In case of disagreement, discussion with the third reviewer (Y.J.) followed until consensus was reached.
To assess methodological quality, the reviewers used the COSMIN checklist with the 4-point rating scale, which is recommended for use in systematic reviews of clinimetric properties (www.cosmin.nl).44 This standardized and validated scoring system was developed based on discussions among experts.44 This scoring system allows the overall methodological quality of one clinimetric property per study to be calculated. The checklist consists of 9 boxes that each describe a measurement property (ie, internal consistency, reliability, measurement error, content validity, structural validity, hypothesis testing, cross-cultural validity, criterion validity, and responsiveness) and 2 subchecklists to determine the interpretability and generalizability of the study. Each box contains between 5 and 18 items detailing how each specific clinimetric property should be assessed (see Appendix for the example of the reliability box). Each item is scored on a 4-point rating scale (ie, “poor,” “fair,” “good,” or “excellent”).44 A methodological quality score is obtained per box by taking the lowest rating of any item in that box (“worst score counts”). For our study, in accordance with the COSMIN protocol, only the boxes that corresponded to the investigated clinimetric properties were completed. Relevant items in the “Interpretability” box and the “Generalizability” box were used as a guide for extracting other relevant data from the included studies.
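The lowest-rating rule for a box can be sketched in a few lines. The function name `box_score` and the item ratings in the example are hypothetical; only the 4-point scale and the rule itself come from the description above.

```python
# The 4-point scale, ordered from lowest to highest rating.
RATING_ORDER = ["poor", "fair", "good", "excellent"]

def box_score(item_ratings):
    """Return the lowest item rating in a box, which becomes the box score."""
    return min(item_ratings, key=RATING_ORDER.index)

# Hypothetical reliability box: one "fair" item caps the whole box at "fair".
print(box_score(["excellent", "good", "fair", "excellent"]))  # fair
```

Under this rule a single low-rated item, such as a small sample size, determines the score of the whole box regardless of how well the study performs on the remaining items.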
Rating of Statistical Findings for Individual Studies
One reviewer (K.D.) assessed the quality of the clinimetric properties of the measurement instrument in each study by applying widely accepted quality assessment criteria to the statistical outcomes (Tab. 1).14 The overall ratings are “good” (+), “negative” (−), and “indeterminate” (?).14
Table 1. Rating System for the Statistical Findings for Individual Studies14
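Because the table itself is not reproduced here, the following is only a sketch of how such a rating could be applied to a reliability outcome. It assumes the frequently cited cutoff of .70 for an ICC or weighted kappa and treats any other statistic as indeterminate; the cutoff, the function name `rate_reliability`, and the labels for the statistics are assumptions and should be checked against the cited criteria.

```python
def rate_reliability(statistic, value, threshold=0.70):
    """Rate one reliability finding as '+', '-', or '?'.

    Assumptions of this sketch: only an ICC or a weighted kappa counts as
    an adequate statistic, and 0.70 is the cutoff for a good rating.
    """
    if statistic not in ("ICC", "weighted kappa"):
        return "?"  # e.g. a Pearson correlation or an unexplained total score
    return "+" if value >= threshold else "-"

print(rate_reliability("ICC", 0.95))        # +
print(rate_reliability("ICC", 0.60))        # -
print(rate_reliability("Pearson r", 0.97))  # ?
```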
Data Synthesis
One reviewer (K.D.) combined the results of the rating of the methodological quality and the rating of the statistical findings for the individual studies to determine the overall level of evidence for the quality of the clinimetric properties of the identified measurement instruments of upper limb muscle strength. This method of synthesizing evidence is similar to the method that is used to synthesize evidence from clinical trials.45 The possible levels of evidence are: (1) strong, (2) moderate, (3) limited, (4) conflicting, and (5) unknown (Tab. 2).46
Table 2. Synthesis of Study Quality and Findings46
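The synthesis step can be sketched as a small decision function. The table is not reproduced here, so the rules below follow a commonly used best-evidence scheme and are an assumption of this sketch: studies of poor quality and indeterminate findings are set aside, mixed findings give "conflicting," and the remaining level depends on the number and quality of studies. The function name `level_of_evidence` and the tuple layout are hypothetical.

```python
def level_of_evidence(studies):
    """Combine (quality, rating) pairs for one property of one instrument.

    quality: "poor", "fair", "good", or "excellent"
    rating:  "+", "-", or "?"
    Returns a (level, rating) tuple. The rules are an assumed scheme and
    should be checked against the synthesis table cited in the text.
    """
    usable = [(q, r) for q, r in studies if q != "poor" and r in ("+", "-")]
    if not usable:
        return ("unknown", "?")

    ratings = {r for _, r in usable}
    if len(ratings) > 1:
        return ("conflicting", "+/-")

    rating = ratings.pop()
    qualities = [q for q, _ in usable]
    n_good = sum(q in ("good", "excellent") for q in qualities)

    if "excellent" in qualities or n_good >= 2:
        return ("strong", rating)
    if n_good == 1 or qualities.count("fair") >= 2:
        return ("moderate", rating)
    return ("limited", rating)  # a single study of fair quality

print(level_of_evidence([("fair", "+")]))                 # ('limited', '+')
print(level_of_evidence([("poor", "+"), ("poor", "+")]))  # ('unknown', '?')
```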
Results
Study Selection
The selection procedures are summarized in the Figure. Seven eligible studies were identified, and 3 types of reliability (ie, intrarater, interrater, and test-retest) in 6 different measurement instruments were studied. The measurement instruments and methods used were: (1) manual muscle testing (MMT), (2) the Jamar dynamometer, (3) a handheld dynamometer (HHD), (4) an instrument based on muscle strength-torque sensors, (5) a computerized measurement tool using a strain gauge, and (6) a modified sphygmomanometer. A more detailed description of the included studies is given in Table 3.
Figure. Flowchart of the search strategy and selection of articles. ICF-CY=International Classification of Functioning, Disability and Health for Children and Youth, CP=cerebral palsy.
Table 3. Characteristics of Included Studies
MMT.
Klingels et al47 investigated interrater reliability and test-retest reliability of MMT of shoulder flexion, abduction, and adduction; elbow flexion and extension; forearm supination and pronation; and wrist flexion and extension in children with CP. For test-retest reliability, ICC values between .88 and .96 were found. For interrater reliability, ICC values varied between .60 and .91.
Jamar dynamometer.
Klingels et al47 investigated test-retest and interrater reliability of isometric grip strength of the upper extremity in children with CP using the Jamar dynamometer. Intraclass correlation coefficients of .96 and .95 were found for test-retest and interrater reliability, respectively.
HHD.
Crowner and Racette48 investigated test-retest reliability of the MicroFET HHD (Hoggan Health Industries Inc, Draper, Utah) for testing muscle strength of the shoulder and elbow muscles and the Baseline hydraulic HHD (FEI, Irvington, New York) for assessing grip strength. Vaz et al5 used a MicroFET2 HHD for investigating the test-retest reliability of muscle strength testing of the wrist flexors and extensors. In the study by Crowner and Racette,48 a total score of .99 was mentioned without indicating what the score means or how it was calculated. In the study by Vaz et al,5 ICC values between .93 and .98 were found for test-retest reliability of measurements of wrist flexion and extension.
Muscle strength-torque sensors.
In the study by Bleyenheuft et al,49 test-retest reliability of isometric fingertip grip muscle strength and load muscle strength was investigated. No significant difference was found between the first and second measurements (P=.935).
Strain gauge technology.
Rameckers et al50 investigated test-retest reliability of maximum voluntary contraction (MVC) of the index flexor muscles. Intraclass correlation coefficient values of .99 were found for the finger and wrist flexors.
Sphygmomanometry.
Glazier et al51 investigated test-retest reliability and Xu et al52 investigated intrarater reliability of grip strength measurements using a modified sphygmomanometer. Glazier et al51 found a Pearson correlation coefficient of .97 for test-retest reliability and Xu et al52 reported an ICC value of .919 for intrarater reliability of grip strength measurements.
Rating of Methodological Quality of Individual Studies
For all 6 instruments, the COSMIN box for reliability could be completed to rate the methodological quality of the corresponding studies. Most studies were rated as “poor” for methodological quality. The studies of interrater reliability of MMT47 and Jamar dynamometer47 measurements were rated as “fair” for methodological quality (Tab. 4). According to the guidelines in the COSMIN manual,53 the other boxes could not be completed.
Table 4. Statistical Findings of Included Studies, Including Methodological Quality Scores
Rating of Statistical Findings for Individual Studies
The quality of the clinimetric properties was assessed for all 6 instruments (Tab. 4).
Most clinimetric properties were rated as “good.” Interrater reliability of shoulder and elbow measurements obtained with MMT was rated as “negative.” Test-retest reliability of total upper extremity HHD measurements and of measurements of grip strength obtained with the modified sphygmomanometer was scored as “indeterminate” because no justified statistical method was used.
Data Synthesis
The results of the methodological quality assessment and the quality assessment of the clinimetric properties were combined and are presented in Table 5. For most of the instruments, the level of evidence was rated as “unknown.” For upper extremity/wrist strength measurements with MMT and grip strength measurements with the Jamar dynamometer, the level of evidence was rated as “limited,” with a positive rating of the clinimetric property. For shoulder/elbow muscle strength measurements with MMT, the level of evidence was rated as “limited,” with a negative rating of the clinimetric property.
Table 5. Levels of Evidence of Upper Extremity Strength Measurement Instruments
Discussion
The purpose of this systematic review was to study the clinimetric properties of upper extremity strength measurement instruments used for children with CP. This review clearly exposes the lack of adequate studies investigating clinimetric properties of upper extremity strength measurement instruments for children with CP.
In the few studies using measurement instruments of upper extremity strength, only test-retest, intrarater, and interrater reliability were investigated in a select group of age ranges, Manual Ability Classification System (MACS) levels,54 and Gross Motor Function Classification System (GMFCS) levels.55 No conclusions can be made regarding the possibility of determining changes over time (responsiveness), the smallest detectable change (SDC), or the standard error of measurement (SEM). Furthermore, it is not clear whether all of the measurement instruments specifically measure muscle strength, as validity has not been investigated. Therefore, more research on the other clinimetric properties must be done for all of the instruments before they are used in clinical practice or further studies.
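To make the two missing quantities concrete, the sketch below uses the standard relationships SEM = SD × √(1 − ICC) and SDC = 1.96 × √2 × SEM. The between-subject standard deviation of 5 kg is invented for the example and the function names are hypothetical; the result is an illustration of the arithmetic, not a finding for any of the reviewed instruments.

```python
from math import sqrt

def sem_from_icc(sd, icc):
    """Standard error of measurement from the sample SD and an ICC."""
    return sd * sqrt(1 - icc)

def sdc95(sem):
    """Smallest detectable change for one individual at the 95% level."""
    return 1.96 * sqrt(2) * sem

# Hypothetical input: an SD of 5 kg combined with an ICC of .96.
sem = sem_from_icc(sd=5.0, icc=0.96)
print(round(sem, 2))         # 1.0
print(round(sdc95(sem), 2))  # 2.77
```

With these made-up inputs, a change in grip strength of less than about 2.8 kg in an individual child could not be distinguished from measurement error, which shows why a high ICC alone does not establish that an instrument can detect small changes.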
Only 2 of the studies47,51 were specifically designed to assess the clinimetric properties of the measurement instruments. All of the other studies were intervention studies, necessitating a reliability study of the outcome measurement. These findings may explain why only test-retest, interrater, and intrarater reliability are investigated in mostly small groups of children.
None of the measurement instruments were rated as “strong” or “moderate” for the level of evidence. According to the COSMIN standards, only the studies that reported on the interrater reliability of the MMT47 and Jamar dynamometer47 were rated as “fair” for methodological quality; therefore, the level of evidence was rated as “limited.”
In MMT, interrater reliability of muscle strength measurements of the shoulder and elbow had poor statistical outcomes. Manual muscle testing, therefore, is not recommended for measuring muscle strength in these muscle groups. Only the total upper extremity MMT score and the score of the wrist muscles had good interrater reliability. Although MMT is commonly used in clinical practice, its use is discouraged in other populations described in the literature,56,57 despite the findings of Klingels et al.47 The studies by Noreau and Vachon56 and Schwartz et al57 showed that there is wide variability in grading values with MMT grades 4 and 5. Therefore, it is recommended that MMT be used in the positive-rated muscle groups (upper extremity total, wrist) in children with less muscle strength (≤ grade 3).
The Jamar dynamometer had good statistical outcomes and, therefore, is recommended for measuring grip strength in children with CP. The positive characteristics of the Jamar dynamometer are that it is a small device (handheld, lightweight [1.4 kg]) that is relatively inexpensive (retail price=$300) and easy to use. The negative characteristics of the Jamar dynamometer are that it can only be used to measure handgrip strength, and it cannot be used by children with very small hands. Moreover, the range (0–90 kg) and incremental 2-kg steps may not be suitable to measure minimal changes, especially for young or small children or for children with very poor muscle strength. Based on these results, the Jamar dynamometer appears to have good potential as a reliable and clinically useful instrument for measuring handgrip strength in children with CP. However, specific assessment of its clinimetric properties in children with CP is warranted.
According to the COSMIN standard, all of the other studies had poor methodological quality; therefore, the levels of evidence of all other studies were rated “unknown.” The poor methodological quality is partly due to the fact that all of the studies used rather small sample sizes to investigate the clinimetric properties of the strength measurement instruments. Sample sizes varied between 2 and 30 people. The COSMIN manual53 recommends a minimum sample size of 50, although a sample size of 100 would be better. Pooling data to achieve these sample sizes was not possible. Because of the unknown level of evidence, the outcomes of these clinimetric properties must be interpreted with extra care.
Although the levels of evidence for the other measurement instruments and methods (ie, HHD, muscle strength-torque sensors, strain gauge technology, and modified sphygmomanometer) were rated as “unknown,” the clinimetric properties were rated as “good” or “indeterminate.” Therefore, for some of these instruments, a sufficient level of evidence can be reached when the clinimetric properties are researched in studies of good or excellent methodological quality. In order to consider their clinical applicability, the positive and negative characteristics of these instruments will be described.
The positive characteristics of the MicroFET2 dynamometer are that it is a small device (handheld, weighs less than 0.5 kg) that is relatively inexpensive (retail price=$1,095) and easy to use. Moreover, its ability to detect small changes might be good because of the small incremental steps of 1 N·m. The negative characteristics of the MicroFET2 dynamometer are that the assessor can have difficulty stabilizing the patient while using the device, the opposing strength of the examiner potentially contributes to the measured force, and inaccurate readings can be made when the force is not applied in a precise, perpendicular direction.58 In addition, different protocols are used worldwide, and various articles have already identified the need for further research and development of standardized handheld dynamometry procedures in children with CP.23–25 Studies researching the reliability of handheld dynamometry of the lower extremities in children with CP showed intrarater reliability (ICC) scores between .38 and .96 for the lower leg muscles in a sample of 10 to 25 children with CP.21,23–27 Interrater reliability of handheld dynamometry of the lower extremities in children with CP varies between .39 and .94, depending on the muscle group and method of measuring.21,23 The results of these studies are similar to those found in the studies on the upper extremities: low sample sizes and mild to good reliability. When combined, these findings indicate that the MicroFET2 dynamometer has potential as a reliable instrument for measuring upper limb muscle strength in children with CP. Future research should focus on all clinimetric properties of the HHD with regard to measuring the strength of the upper extremities of children with CP.
The positive characteristics of muscle strength-torque sensors and strain gauge technology are that the outcomes are computerized and show small incremental steps (because high-quality strain levers were used). Therefore, they can be very accurate. In addition, errors caused by inaccurate reading of the display by the examiner can be prevented by storing the outcome digitally, which can improve the reliability. Because of the small increments shown on the display, the outcomes of the modified sphygmomanometer are very accurate. A disadvantage of these instruments is that they are specially designed for research purposes. These instruments are not commercially or broadly available and, therefore, are more difficult to implement in daily clinical practice.
Limitations
The COSMIN method is a strict method with stringent rules, and it sets high standards for methodological design of clinimetric studies and reporting. The COSMIN standards were originally developed for evaluating questionnaire-based measurement instruments. One of the stringent rules is a minimum sample size of 50 participants to achieve good methodological quality. For most studies that focus on clinimetric properties of questionnaire-based measurement tools, it is easier to adhere to this standard compared with studies that focus on clinimetric properties of performance-based measurement tools. Therefore, it is possible that the COSMIN standards have limitations when evaluating measurement tools that are performance-based.
In most studies included in the present review, the lack of information on their design and other important items of the COSMIN checklist is striking. Information about the COSMIN items of “missing data” (the percentage of missing data), “how missing items were handled,” and “independent administrations” (assessors blinded) often was absent. Also, the sample size included in the analysis often did not meet the COSMIN standard. Because of this missing information and the small sample sizes, subitems automatically were given a low score in the COSMIN box. Furthermore, because of the limited number of studies that described clinimetric properties, it was not possible to compare studies and provide an overall conclusion of the best measurement instrument.
Conclusion and Recommendations
Although several instruments for measuring upper extremity strength in children with CP are available, research on the clinimetric properties of these instruments is rarely done. To measure grip strength, it is recommended to use the Jamar dynamometer. For measuring other upper extremity muscle groups, it is recommended to use the HHD. Manual muscle testing can be used when measuring total upper extremity or wrist strength in children with CP who have very limited muscle strength (below grade 4). However, caution with interpretation of the test results is warranted because no information is available regarding the possibility of determining changes over time (responsiveness), the SDC, the SEM, and the validity of these instruments. Future studies should be designed according to the COSMIN criteria; should go beyond interrater, intrarater, and test-retest reliability; and should be performed on children with CP from different age groups and all MACS levels,54 according to a well-described protocol.
Appendix.
Example of COSMIN (COnsensus-based Standards for the selection of health status Measurement INstruments) Measurement Property Box57
Footnotes
All authors provided concept/idea/research design, writing, and data analysis. Mr Dekkers, Dr Rameckers, and Dr Janssen-Potten provided data collection. Dr Smeets provided project management, fund procurement, facilities/equipment, and institutional liaisons. Dr Smeets and Dr Janssen-Potten provided consultation (including review of manuscript before submission). The authors are grateful to Krys Galama and Maria Kamphuis for correcting the English text.
- Received April 25, 2013.
- Accepted January 3, 2014.
- © 2014 American Physical Therapy Association