
Interrater and Intrarater Reliability of Common Clinical Standing Balance Tests for People With Hip Osteoarthritis

Yik Ming Choi, Fiona Dobson, Joel Martin, Kim L. Bennell, Rana S. Hinman
DOI: 10.2522/ptj.20130266 Published 1 May 2014
Yik Ming Choi
Y.M. Choi, DClinPhysio, Centre for Health, Exercise and Sports Medicine, Department of Physiotherapy, School of Health Sciences, The University of Melbourne, Carlton, Victoria, Australia, and Department of Rehabilitative Services, Changi General Hospital, Singapore.
Fiona Dobson
F. Dobson, PhD, Centre for Health, Exercise and Sports Medicine, Department of Physiotherapy, School of Health Sciences, The University of Melbourne.
Joel Martin
J. Martin, BAppSc, Centre for Health, Exercise and Sports Medicine, Department of Physiotherapy, School of Health Sciences, The University of Melbourne.
Kim L. Bennell
K.L. Bennell, PhD, Centre for Health, Exercise and Sports Medicine, Department of Physiotherapy, School of Health Sciences, The University of Melbourne.
Rana S. Hinman
R.S. Hinman, PhD, Centre for Health, Exercise and Sports Medicine, Department of Physiotherapy, Melbourne School of Health Sciences, The University of Melbourne, Alan Gilbert Building, 161 Barry St, Carlton, Victoria, 3053, Australia.

Abstract

Background Hip osteoarthritis (OA) is a common musculoskeletal condition affecting older individuals. Clinical balance tests are frequently used to assess standing balance in these people. There is insufficient information regarding the reliability of these tests.

Objective The aim of this study was to estimate reliability and measurement error of 4 common clinical standing balance tests in people with hip OA.

Design A prospective study was conducted with repeated measures between 2 independent raters within 1 session and within 1 rater over a 1-week interval.

Methods Thirty people with hip OA were evaluated. Reliability was estimated for the Four-Square Step Test, Step Test, Functional Reach Test, and Timed Single-Leg Stance Test using intraclass correlation coefficients (ICC [2,1]). Measurement error was expressed as standard error of measurement and minimal detectable change.

Results The Four-Square Step Test, Step Test, and Timed Single-Leg Stance Test were sufficiently reliable between raters (ICC=.85–.94, lower 1-sided 95% confidence interval [95% CI]=.71–.89), whereas the Step Test (standing on study limb) and Timed Single-Leg Stance Test (standing on nonstudy limb) were sufficiently reliable within a rater over a 1-week interval (ICC=.91, lower 1-sided 95% CI=.80–.83). The Step Test (standing on study limb) and Timed Single-Leg Stance Test (standing on nonstudy limb) achieved optimal levels of reliability (ICC >.90, lower 1-sided 95% CI >.70), with acceptable measurement error (<10%) for clinical outcome measures. The Functional Reach Test was not sufficiently reliable. A ceiling effect was detected for the Timed Single-Leg Stance Test.

Limitations Reliability was assessed only between 2 raters during a single session and within 1 rater over a 1-week interval, which limits generalizability.

Conclusions The Step Test (standing on study limb) is recommended as a highly reliable test with acceptable measurement error for assessing standing balance in people with hip OA.

Osteoarthritis (OA) is a common musculoskeletal condition affecting many individuals, especially older people. It typically causes joint pain and a decrease in physical function, thus limiting individual participation in society and leading to a reduction in quality of life.1,2 In the United States, it has been estimated that nearly 27 million adults aged 25 years and older have symptoms and clinical findings of OA.3 The hip is one of the most common joints affected by OA. Epidemiological studies show that hip OA affects 7% to 25% of the population aged over 55 years, and this prevalence is expected to increase gradually as the whole population ages.1,4

Standing balance is essential for many daily activities such as lower body dressing, ambulating, and stair climbing. Control of balance depends upon sensory input, central processing of afferent input, and coordinated neuromuscular responses to ensure the center of mass remains within the base of support when balance is challenged.5,6 A variety of symptoms and physical impairments associated with hip OA, including joint pain, muscle weakness, joint stiffness, and sensory dysfunction, can affect balance.7–9 Not surprisingly, impaired standing balance has been reported in people with hip OA compared with age-matched participants who were healthy10–13 and is frequently observed by clinicians treating people with hip OA. Importantly, impaired balance is recognized as a risk factor for falls in the older population,14,15 and falls are frequently reported in people with hip OA,16 with the majority of falls occurring during ambulation and stair ascent and descent. Thus, assessment of standing balance is an integral component of hip OA management.

Balance may be measured using complex and sophisticated equipment, such as force platforms or posturography systems11,17,18; however, such equipment is expensive and impractical for regular use in most clinical settings and in many research settings. For many clinicians and researchers, simple clinical tests are the most practical methods of measuring standing balance in people with hip OA.19,20 To ensure judicious use of clinical standing balance tests, it is essential to confirm that these tests are reliable, as well as understand the measurement error associated with their use, in the population of interest.21 However, to date, there is insufficient evidence regarding the clinimetric properties of clinical standing balance tests in people with hip OA.22 Our recent systematic review, which synthesized evidence on clinimetric properties of observer-rated impairment tests (including balance tests) in people with hip and groin problems,22 failed to identify a single study investigating the reliability (or any clinimetric property) of balance tests for hip OA. This remarkable dearth of literature evaluating measurement properties of balance tests in people with hip OA is concerning, given that such tests are frequently used in the clinical setting and to assess treatment outcomes in clinical trials.20,23,24

The primary aim of this study was to estimate the reliability of 4 common clinical balance tests in people with hip OA: Four-Square Step Test, Step Test, Functional Reach Test (FRT), and Timed Single-Leg Stance Test. A secondary aim was to estimate the amount of measurement error associated with each test.

Method

In this study, between-rater reliability refers to repeated measures between 2 independent raters within a session, and within-rater reliability refers to repeated measures by a rater over a 1-week interval. As such, both designs also include an element of test-retest reliability.

Participants

Volunteers were recruited from a community database of research volunteers maintained by the Centre for Health, Exercise and Sports Medicine, Department of Physiotherapy, The University of Melbourne. To be eligible, participants were required to fulfill the following inclusion criteria, based on the clinical diagnostic criteria for hip OA established by the American College of Rheumatology25: (1) age >50 years; (2) hip pain on most days of the previous month; and (3) at least one of the following radiological or clinical presentations: joint space narrowing and osteophytes on hip radiographs taken in the previous year; hip internal rotation of <15 degrees and hip flexion of ≤115 degrees; or hip internal rotation of ≥15 degrees in the presence of pain and morning stiffness of the hip for ≤60 minutes. Participants also were required to be able to ambulate independently in the community and to read and follow instructions in English. Participants were not eligible if they: (1) had a previous hip or knee joint replacement; (2) had any hip surgery in the previous 6 months; (3) had other muscular, joint, or neurological conditions causing pain and dysfunction of the lower limbs; or (4) used any form of walking aid. All participants provided written informed consent.

Procedure

Participants were tested on 2 occasions (approximately 1 week apart). At the first test session, participants performed the balance tests with 2 independent raters (rater A and rater B) to examine between-rater reliability. The testing order of both the raters and the balance tests was randomized using a computerized random number generator. Participants were given 5 minutes' rest between each rater's independent assessments. At the second test session, participants repeated the balance tests with the more experienced rater A (who was blinded to the results from session 1) to examine within-rater reliability. A 1-week test interval was used to provide sufficient time to limit recall of test scores, but it was short enough to limit potential real change in clinical status. At session 2, participants completed a self-report global rating of change. This measure was used as a reference standard for stability and determined whether any substantial change in the participant's hip condition had occurred between test sessions.
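The randomization of rater order and test order described above could be sketched as follows. This is a minimal illustration only: the text states that a computerized random number generator was used but does not say which one, so the function name `randomize_session_order` and the optional `seed` parameter are assumptions made for the example.

```python
import random

# Rater and test names come from the text; everything else is illustrative.
RATERS = ["rater A", "rater B"]
TESTS = [
    "Four-Square Step Test",
    "Step Test",
    "Functional Reach Test",
    "Timed Single-Leg Stance Test",
]


def randomize_session_order(seed=None):
    """Return (rater_order, test_order) as independently shuffled copies.

    `seed` is a hypothetical convenience so an allocation can be reproduced.
    """
    rng = random.Random(seed)
    rater_order = rng.sample(RATERS, k=len(RATERS))
    test_order = rng.sample(TESTS, k=len(TESTS))
    return rater_order, test_order
```

Drawing one order per participant keeps any fatigue or learning effect from being tied to a particular rater or test.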

Assessment of hip OA symptoms.

As both lower limbs were assessed during the balance testing, the most painful hip was defined as the study limb, and the least painful (for bilateral disease) or nonpainful hip was defined as the nonstudy limb. A visual analog scale (VAS) was used to assess the average level of hip pain over the previous week. Participants were asked to mark an “X” on a 100-mm line, anchored with “no pain” on the left and “worst pain possible” on the right. The distance (in millimeters) from the left anchor to the X mark was then measured, with higher VAS scores indicating more severe pain.26 The VAS has demonstrated reliability in people with OA.27

The Hip Dysfunction and Osteoarthritis Outcome Score (HOOS) was used to assess patient-reported symptoms and disability related to hip OA.28 It consists of 40 items over 5 subscales: pain (10 items), other symptoms (5 items), function in daily living (17 items), function in sports and recreation (4 items), and hip-related quality of life (4 items).29,30 All items are answered on a 5-point Likert scale, and a total score is calculated, ranging from 0 (“no disability”) to 100 (“extreme disability”).29,30 The HOOS has demonstrated reliability in people with hip OA.30

A global change scale (GCS) was used to assess self-reported change in hip pain and physical function across the 2 testing sessions. The GCS was measured on a 5-point adjectival scale (“much worse,” “slightly worse,” “no change,” “slightly better,” and “much better”). Participants who recorded “much better” or “much worse” were excluded from the within-rater analyses. Some studies have previously used these scales to determine changes in participants' conditions, where “minimal or slight changes” were defined as nonmeaningful change.31–33 The GCS has been shown to be highly reliable in people with musculoskeletal dysfunction.34,35

Assessment of balance.

Participants were tested barefooted on each of the 4 clinical balance tests.

In the Four-Square Step Test,36 4 walking sticks were placed on the floor at right angles with handles outward to form 4 squares. Participants started in square 1, facing square 2, and remained facing this direction for the duration of the test. Participants then stepped forward with both feet as quickly as possible into square 2, then sideways to the right into square 3, then backward into square 4, and finally sideways to the left back into square 1. They then reversed the sequence back to the starting position. A demonstration was provided, and an initial practice was performed, immediately followed by 2 test trials. According to original published instructions for the test, the faster of the 2 trials was recorded to the nearest 10th of a second.

For the Step Test,37 a 15-cm height step was used with a 5-cm-wide cardboard template positioned on the floor along the edge of the step to provide a standardized starting position. The test was performed standing on the study leg the entire time, while the other leg was moved back and forth from the step to the floor (eg, the stepping foot was placed flat up onto the step, then back down flat onto the ground) as many times as possible in 15 seconds without overbalancing (moving the stance leg from the start position). A demonstration was provided, and 3 or 4 practice steps were performed, immediately followed by 1 test trial standing on each leg. The number of whole steps (up and back down to a flat position on the floor) performed in 15 seconds was recorded for each standing leg. If participants overbalanced, the test was concluded, and the number of completed steps and the time taken were recorded.

The FRT consisted of 2 types of tests: (1) forward reach and (2) lateral reach. In the forward reach test,38 participants started in a normal relaxed stance with their dominant arm facing side-on, but not touching, a wall. A leveled measuring tape was then mounted on the wall at the acromion height. Participants made a fist with the dominant hand and elevated the arm to 90 degrees (ie, shoulder level). The position of the third knuckle (metacarpophalangeal joint) along the tape was recorded as the starting point. Keeping the contralateral arm by the side and both heels on the floor, participants reached as far forward as possible to maintain a maximal reach position for 3 seconds without losing balance (such as taking a step, leaning on the wall, or needing to be assisted by the rater). The final reach position of the third knuckle along the tape was recorded as the finishing point. A demonstration was provided, immediately followed by 3 test trials. According to original published instructions for the test, the mean difference between the starting and finishing points across the 3 trials was recorded to the nearest millimeter as the test score.

In the lateral reach test,39 participants started in a normal relaxed stance with their back facing, but not touching, a wall. A leveled measuring tape was then mounted on the wall at the acromion height. Participants abducted 1 arm to 90 degrees (ie, shoulder level) with all fingers extended. The position of the tip of the third finger along the tape was recorded as the starting point. Keeping the contralateral arm by the side and both heels on the floor, participants reached as far sideways as possible to maintain a maximal reach position for 3 seconds without losing their balance, taking a step, or leaning on the wall. Knee flexion and trunk flexion and rotation were not permitted, and participants were instructed accordingly. If bending at the knees or trunk occurred during testing, the trial was stopped immediately, the participant was corrected, and the trial was repeated. The final position of the tip of the third finger along the tape was recorded as the finishing point. A demonstration was provided, immediately followed by 3 test trials on each side. The mean difference between the starting and finishing points across the trials for each side was recorded to the nearest millimeter as the test score. A reach in the direction of the study hip was defined as the ipsilateral reach, and a reach away from the study hip was defined as the contralateral reach.

Participants started the Timed Single-Leg Stance Test40 with their hands on their hips and stood on 1 leg for as long as possible up to a maximum of 30 seconds. The nonstance hip remained in a neutral position with the knee flexed so that the foot was positioned behind and was not permitted to touch the stance leg. Participants were encouraged to look at a nonmoving target 1 to 3 m ahead. The test was stopped if participants moved their hands off their hips, touched the nonstance foot down on the floor, or touched the stance leg with the nonstance leg. A demonstration was provided, followed immediately by 2 test trials on each leg (based on original published instructions). The longest time, up to a maximum of 30 seconds, of the 2 trials on each leg was recorded to the nearest 10th of a second as the test score for each leg.
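The scoring rules described for the timed and reach tests can be summarized in a short sketch. The function names and input shapes below are hypothetical, chosen only to illustrate the rules stated in the text (faster of 2 trials, mean start-to-finish difference, longest trial capped at 30 seconds); the Step Test is omitted because its score is simply a count of whole steps.

```python
MAX_STANCE_SECONDS = 30.0  # cap stated in the text


def four_square_score(trial_times_s):
    """Faster of the test trials, to the nearest tenth of a second."""
    return round(min(trial_times_s), 1)


def reach_score_mm(start_finish_pairs_mm):
    """Mean (finish - start) distance across trials, to the nearest millimeter.

    Each element is a (start, finish) tape reading in millimeters.
    """
    diffs = [finish - start for start, finish in start_finish_pairs_mm]
    return round(sum(diffs) / len(diffs))


def single_leg_stance_score(trial_times_s):
    """Longest trial, capped at 30 seconds, to the nearest tenth of a second."""
    return round(min(max(trial_times_s), MAX_STANCE_SECONDS), 1)
```

For example, `reach_score_mm([(500, 760), (500, 770), (500, 780)])` averages reaches of 260, 270, and 280 mm.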

Data Analysis

Data analyses were performed using the IBM SPSS 21 statistical package for Windows (IBM Corp, Armonk, New York). Data were checked for normality and for systematic differences between test sessions. Descriptive analyses were conducted across raters and sessions, including means, standard deviations, and ranges of scores. Percentages of maximal scores (ceiling effects) also were calculated for the Timed Single-Leg Stance Test because the score for this test is capped at 30 seconds.
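As a rough illustration of the ceiling-effect calculation, the percentage of maximal scores can be computed as below. The analyses in the study were run in SPSS; this Python function and its name are assumptions used only to show the arithmetic.

```python
def ceiling_percentage(scores_s, max_score_s=30.0):
    """Percentage of scores at the maximum possible value (30 s by default)."""
    at_ceiling = sum(1 for s in scores_s if s >= max_score_s)
    return 100.0 * at_ceiling / len(scores_s)
```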

Within-rater and between-rater reliability were each calculated using intraclass correlation coefficients (ICC [2,1]) with 95% confidence intervals (95% CIs) for a 2-way random effects model and absolute agreement. Interpretation of ICC values was based on published recommendations,21 where values higher than .75 indicate sufficient reliability and values higher than .90 indicate optimal reliability.21,41 Furthermore, 95% CI values were inspected to ensure that lower 1-sided 95% CI values met a recommended minimum acceptable level, which was set at .70.41–43
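For readers who want to see the arithmetic, a point estimate of ICC (2,1) can be sketched from the 2-way analysis of variance mean squares, using the standard Shrout and Fleiss form for a 2-way random effects model with absolute agreement and a single measure. This is not the SPSS procedure used in the study, and it does not compute the confidence interval; the function names are hypothetical, and treating a lower bound of exactly .70 as acceptable is an assumption.

```python
def icc_2_1(scores):
    """ICC(2,1): 2-way random effects, absolute agreement, single measure.

    `scores` is a list of rows, one per participant, each row holding one
    score per rater (or per session).
    """
    n = len(scores)       # participants
    k = len(scores[0])    # raters or sessions
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                   # between participants
    ms_cols = ss_cols / (k - 1)                   # between raters/sessions
    ms_error = ss_error / ((n - 1) * (k - 1))     # residual

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def interpret_icc(icc, lower_bound):
    """Apply the thresholds from the text; `lower_bound` is supplied by the caller.

    Assumption: a lower 1-sided 95% CI value of exactly .70 counts as acceptable.
    """
    if lower_bound < 0.70 or icc <= 0.75:
        return "not sufficient"
    return "optimal" if icc > 0.90 else "sufficient"
```

Because the absolute-agreement form includes the between-rater mean square, a systematic difference between raters or sessions lowers the estimate even when participants are ranked identically.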

Measurement error was expressed as the standard error of measurement (SEM) and minimal detectable change (MDC). The SEM was calculated as the square root of the mean square error term from the analysis of variance. The MDC at the 90% confidence level (MDC90) was calculated as SEM × 1.65 (z score of 90% interval) × √2. For both the SEM and MDC90, 95% CIs were calculated according to recommended methods.44
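The two formulas in this paragraph translate directly into code. The sketch below assumes the mean square error term has already been obtained from the analysis of variance; the function names are illustrative, and the confidence intervals mentioned in the text are not computed here.

```python
import math


def sem_from_mse(ms_error):
    """SEM as the square root of the ANOVA mean square error term."""
    return math.sqrt(ms_error)


def mdc90(sem):
    """MDC at the 90% confidence level: SEM x 1.65 x sqrt(2)."""
    return sem * 1.65 * math.sqrt(2)
```

For instance, a mean square error of 4.0 gives an SEM of 2.0 and an MDC90 of about 4.67, in the units of the test.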

As the units of measurement for the 4 balance tests varied, SEM and MDC90 also were expressed as SEM percentage (SEM%) and MDC percentage (MDC%) to assist with interpretation of the results. These values were defined as the SEM and MDC divided by the mean of all testing scores on the 2 test sessions and were calculated as SEM% = (SEM/mean) × 100 and MDC% = (MDC90/mean) × 100.42,45,46
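The percentage forms can be sketched with a single helper, since SEM% and MDC% differ only in the numerator. The name `percent_of_mean` is an assumption; `all_scores` stands for every test score recorded across the 2 test occasions being compared.

```python
def percent_of_mean(value, all_scores):
    """Express an SEM or MDC90 value as a percentage of the mean test score."""
    mean_score = sum(all_scores) / len(all_scores)
    return 100.0 * value / mean_score
```

With a mean score of 15 steps, an SEM of 1.5 steps corresponds to an SEM% of 10.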

Sample Size

Sample size calculations were based on a priori set levels of optimal and minimal acceptable limits of reliability for clinical measurement. As such, a minimum of 19 participants were required to achieve an optimal ICC of .90 and a minimal acceptable lower 1-sided 95% CI of .70 at a power of 80%.47 In this study, 30 participants were recruited to allow for any potential dropouts and the exclusion of data from participants who reported a meaningful change in their condition across sessions.

Results

Thirty people with hip OA (18 female [60%], 12 male [40%]; mean age=63.3 years, SD=5.71, range=50–75) participated. Descriptive characteristics of the participants are summarized in Table 1. In this cohort of participants, there were more women than men, and most of the participants were overweight (body mass index >25 kg/m2). One-third reported bilateral symptoms. Most had not sustained a fall in the previous 12 months. In addition, most participants reported a moderate level of hip pain and disability according to VAS and HOOS scores.

Table 1.

Participant Characteristics (N=30)a

Within-rater reliability was based on data from 27 participants, as 2 participants were unable to return for session 2 and a further participant reported substantial change in hip pain (“much worse”) at session 2 and was excluded from further analysis. The within-rater reliability test interval was 7 days for most participants (25/27) and was 6 days and 8 days for the remaining 2 participants. There were no missing data, and no adverse events occurred at any testing occasion. The majority of data were normally distributed. There were systematic differences for the Four-Square Step Test and Step Test within rater A over the 1-week interval and for the forward reach part of the FRT between raters A and B within the single session (P<.05).

Between-Rater Reliability on 2 Test Occasions Within a Single Session

Balance test scores between raters for all 30 participants at session 1, along with the percentages of maximal scores for the Timed Single-Leg Stance Test and ICCs, are presented in Table 2. The Four-Square Step Test, Step Test, and Timed Single-Leg Stance Test were sufficiently reliable between raters (ICC=.85–.94, lower 1-sided 95% CI=.71–.89). Further inspection of the point estimates and confidence limits demonstrated that the Step Test (study limb) and the Timed Single-Leg Stance Test also met the optimal level of reliability (ICC >.90, lower 1-sided 95% CI >.70).

Table 2.

Between-Rater Reliability: Balance Test Scores, Intraclass Correlation Coefficients (ICCs), Standard Errors of Measurement (SEMs), and Minimal Detectable Change at the 90% Level of Confidence (MDC90) Across the 2 Raters at Session 1 (N=30)a

Within-Rater Reliability of Repeated Measures Over a 1-Week Interval

Balance test scores for 27 participants assessed by rater A during session 1 and session 2, along with the percentages of maximal scores for the Timed Single-Leg Stance Test and ICCs, are presented in Table 3. The Step Test (study limb) and Timed Single-Leg Stance Test (nonstudy limb) were sufficiently reliable within 1 rater over a 1-week interval and met the optimal levels of reliability (ICC=.91, lower 1-sided 95% CI=.80–.83).

Table 3.

Within-Rater Reliability: Balance Test Scores, Intraclass Correlation Coefficients (ICCs), Standard Errors of Measurement (SEMs), and Minimal Detectable Change at the 90% Level of Confidence (MDC90) Across the 2 Test Sessions (n=27)a

Ceiling and Floor Effects

Inspection of minimum and maximum scores (Tables 2 and 3) showed a consistent ceiling effect for the Timed Single-Leg Stance Test. Approximately half of the participants (44%–57%) were able to hold the Timed Single-Leg Stance Test for the maximum of 30 seconds at each test occasion.

Measurement Error

The SEM, SEM%, MDC90, and MDC% between raters at session 1 and within 1 rater over a 1-week interval are provided in Tables 2 and 3, respectively. The SEM of the tests between raters varied between 7.4% and 16.1% of the test score, whereas it varied between 9.0% and 21.2% of the test score when repeatedly measured by 1 rater over a 1-week interval. The Step Test (study limb) and Four-Square Step Test had sufficiently low measurement error (<10% of the test score) for both situations, whereas the Timed Single-Leg Stance Test showed the largest measurement error (>14%) in both situations.

Discussion

In this study, we aimed to estimate the reliability and measurement error associated with 4 clinical standing balance tests in a cohort of people with symptomatic hip OA. We found that the Four-Square Step Test, Step Test, and Timed Single-Leg Stance Test were sufficiently reliable between raters within a session, whereas the Step Test (study limb) and Timed Single-Leg Stance Test (nonstudy limb) were sufficiently reliable within 1 rater over a 1-week interval. The Step Test (study limb) and Timed Single-Leg Stance Test (nonstudy limb) achieved optimal levels of reliability in both situations, but only the Step Test (study limb) also had sufficiently low measurement error to be confident of a measured value in the clinical situation. In view of the larger amount of measurement error and our observed ceiling effect for the Timed Single-Leg Stance Test, this test may be a less useful measure of standing balance for people with hip OA, despite being a reliable test. Furthermore, the FRT subtests were not sufficiently reliable either between or within raters, and the larger amount of measurement error associated with these tests limits the confidence in a measured value and the usefulness of these tests in the clinical setting. Thus, our findings suggest that the Step Test (standing on most affected limb) is the most useful clinical test of standing balance in hip OA, as it is highly reliable with sufficiently low measurement error.

Because of the paucity of earlier research in this area, and because this is the first study, to our knowledge, to estimate reliability of balance tests in hip OA, it is difficult to discuss our findings in relation to previous research. However, our findings are generally in agreement with those of a study that evaluated the reliability of balance measurements in patients with hip fracture.48 In that study, Sherrington and Lord48 found good test-retest reliability for the Step Test, with similar levels of reliability (ICC=.85–.92) and lower 95% CI values (.71–.83) compared with those found in the current study. In contrast, our findings are quite different from those of an earlier study that evaluated interrater reliability of a battery of tests, including the Timed Single-Leg Stance Test, in patients following surgically fixed hip fractures.49 In that study, the Timed Single-Leg Stance Test was one of the least reliable tests, and reliability estimates were much lower (kappa=.14–.63) than those found in the current study. To our knowledge, no reliability estimates for the FRT or the Four-Square Step Test in a comparable group have been reported.

Measurement errors associated with the 4 balance tests, which have not previously been reported, also were estimated in the current study. This information assists with the interpretation of and confidence in an obtained measure. For a measure to be clinically useful, it must have a sufficiently high ICC and sufficiently low SEM. We also calculated the SEM% and MDC% so that tests could be compared, given that the units of measurement varied across the tests. In the current study, the Step Test and Four-Square Step Test were found to have lower SEM% and MDC% values than the FRT and Timed Single-Leg Stance Test. This finding means that, compared with the FRT and Timed Single-Leg Stance Test, smaller amounts of change are required on the Step Test and Four-Square Step Test to be confident that a real change in balance has occurred. To be confident of real change in balance when applying these tests in individuals with hip OA, clinicians and researchers should aim to see a change of 3 steps on the Step Test (standing on the affected side), 2 seconds on the Four-Square Step Test, 9.9 cm on the forward reach component of the FRT (5.0 and 5.2 cm for ipsilateral and contralateral functional reach, respectively), and 10.8 seconds on the Timed Single-Leg Stance Test.
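One way to apply the change thresholds quoted in this paragraph is a simple lookup, sketched below. The dictionary keys and the function name are hypothetical, the values are the rounded figures given in the text, and treating a change equal to the threshold as real change is an assumption about how "a change of 3 steps" should be read.

```python
# Rounded change thresholds quoted in the text; key names are illustrative.
MDC90_THRESHOLDS = {
    "step_test_study_limb_steps": 3,
    "four_square_step_test_s": 2.0,
    "forward_reach_cm": 9.9,
    "ipsilateral_reach_cm": 5.0,
    "contralateral_reach_cm": 5.2,
    "timed_single_leg_stance_s": 10.8,
}


def exceeds_mdc(test, baseline, follow_up):
    """True when the absolute change reaches the quoted threshold for `test`."""
    return abs(follow_up - baseline) >= MDC90_THRESHOLDS[test]
```

A change that passes this check is larger than measurement error alone would explain at the 90% confidence level; it says nothing about whether the change is clinically important.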

Our study had a number of strengths, including the robust sample size that was adequately powered to detect our a priori optimal level of reliability, inclusion of a range of commonly used clinical balance tests, and exclusion of participants with a change in clinical state from the within-rater analysis. Importantly, we also determined the measurement error associated with the balance tests, which will enable clinicians and researchers to interpret change in balance scores across time with respect to real change.

There were some limitations to the current study. Because a participant's global rating of change and balance performance may not be independent, there is potential for correlated error, and our estimates of reliability may have been somewhat inflated. Results might have been different if participants with a change in their clinical condition were included in the analyses. As both our between-rater and within-rater analyses also included a component of test-retest reliability, the additional source of error resulting from potential differences in participants' performance across the repeated measures may have increased the measurement error estimates for these clinical tests. Indeed, as systematic differences for the Four-Square Step Test and Step Test were found over the 1-week interval, it is possible these errors were not only due to rater error but also represent altered performance by the participant between sessions.

Only 2 raters were used for evaluating between-rater reliability, which may limit the generalizability of our findings to a wider pool of raters with different abilities and clinical backgrounds. However, we did choose raters from different professional backgrounds (rater A was a clinical physical therapist, and rater B was a researcher with a human movement science background) and with different levels of experience in assessing older patients with pathology, which helps to increase the generalizability of our findings. Additionally, only 1 rater was used for evaluating within-rater reliability. Although this rater was a physical therapist, which improves the generalizability of the findings to clinicians, inclusion of additional raters would have strengthened the study. Because our participants with hip OA were all recruited from the community and represented, at most, a moderate level of disease severity based on symptomatic data, it is not clear whether the present findings apply to participants who are not community-dwelling or to patients with end-stage disease awaiting arthroplasty.

Future research is needed to provide comprehensive data about the clinimetric properties of clinical balance tests in people with hip OA. In particular, evaluations of the validity and responsiveness of these tests are needed. Information about the minimal clinically important difference is needed so that researchers and clinicians can determine what amount of change in the balance tests is required with interventions in order to achieve meaningful clinical improvements in health status for the patient. Although we have determined the MDC, which tells clinicians and researchers the amount of change needed to be sure of a real change beyond that associated with measurement error, it is not necessarily the same as the minimal clinically important difference. Although a third of our participants in this study had bilateral hip OA, a subgroup analysis of these participants was not performed because the study was not powered sufficiently for such an analysis. However, as two-thirds (n=20) of the participants had unilateral hip OA, a post hoc subanalysis with sufficient power revealed that reliability estimates for unilateral hip OA were approximately the same as those for the entire sample. Furthermore, interpretation of these values based on a priori criteria was no different from the interpretation of the values of the group as a whole. As estimates may differ for those with bilateral disease, we recommend that future research examine the reliability of balance tests within this subgroup.

In conclusion, this study provides estimates of reliability and measurement error of 4 clinical standing balance tests in a cohort of 30 participants with hip OA. Only the Step Test (standing on the affected side) and the Timed Single-Leg Stance Test demonstrated optimal levels of reliability for clinical measurement tests. When measurement error and ceiling effects also are considered, our data suggest the Step Test (standing on the affected side) is the most useful clinical measure of standing balance for people with hip OA. Further research is needed to determine the responsiveness and, in particular, the minimal clinically important difference, for these tests.

Footnotes

  • Dr Choi, Dr Dobson, Dr Bennell, and Dr Hinman provided concept/idea/research design and writing. Dr Choi and Mr Martin provided data collection. Dr Choi, Dr Dobson, and Dr Hinman provided data analysis and project management. Dr Dobson, Mr Martin, and Dr Bennell provided consultation (including review of manuscript before submission).

  • This prospective reliability study received ethics approval from The University of Melbourne Ethics Committee.

  • This research was funded by National Health and Medical Research Council Program Grant 631717. Dr Bennell was partly funded by an Australian Research Council Future Fellowship. Dr Choi was funded by the Singapore Ministry of Health Reinvestment Fund.

  • Received June 25, 2013.
  • Accepted February 13, 2014.
  • © 2014 American Physical Therapy Association

References

  1. Dagenais S, Garbedian S, Wai EK. Systematic review of the prevalence of radiographic primary hip osteoarthritis. Clin Orthop Relat Res. 2009;467:623–637.
  2. Salaffi F, Carotti M, Stancati A, Grassi W. Health-related quality of life in older adults with symptomatic hip and knee osteoarthritis: a comparison with matched healthy controls. Aging Clin Exp Res. 2005;17:255–263.
  3. Lawrence RC, Felson DT, Helmick CG, et al. Estimates of the prevalence of arthritis and other rheumatic conditions in the United States, part II. Arthritis Rheum. 2008;58:26–35.
  4. Zhang Y, Jordan JM. Epidemiology of osteoarthritis. Rheum Dis Clin North Am. 2008;34:515–529.
  5. Horak FB, Shupert CL, Mirka A. Components of postural dyscontrol in the elderly: a review. Neurobiol Aging. 1989;10:727–738.
  6. Massion J. Postural control system. Curr Opin Neurobiol. 1994;4:877–887.
  7. Kosek E, Ordeberg G. Abnormalities of somatosensory perception in patients with painful osteoarthritis normalize following successful treatment. Eur J Pain. 2000;4:229–238.
  8. Loureiro A, Mills PM, Barrett RS. Muscle weakness in hip osteoarthritis: a systematic review. Arthritis Care Res (Hoboken). 2013;65:340–352.
  9. Bijlsma JW, Berenbaum F, Lafeber FP. Osteoarthritis: an update with relevance for clinical practice. Lancet. 2011;377:2115–2126.
  10. Kiss R. Effect of the degree of hip osteoarthritis on equilibrium ability after sudden changes in direction. J Electromyogr Kinesiol. 2010;20:1052–1057.
  11. Giemza C, Ostrowska B, Matczak-Giemza M. The effect of physiotherapy training programme on postural stability in men with hip osteoarthritis. Aging Male. 2007;10:67–70.
  12. Nantel J, Termoz N, Centomo H, et al. Postural balance during quiet standing in patients with total hip arthroplasty and surface replacement arthroplasty. Clin Biomech (Bristol, Avon). 2008;23:402–407.
  13. Tateuchi H, Ichihashi N, Shinya M, Oda S. Anticipatory postural adjustments during lateral step motion in patients with hip osteoarthritis. J Appl Biomech. 2011;27:32–39.
  14. Robbins AS, Rubenstein LZ, Josephson KR, et al. Predictors of falls among elderly people: results of two population-based studies. Arch Intern Med. 1989;149:1628–1633.
  15. Stalenhoef PA, Diederiks JP, Knottnerus JA, et al. A risk model for the prediction of recurrent falls in community-dwelling elderly: a prospective cohort study. J Clin Epidemiol. 2002;55:1088–1094.
  16. Arnold CM, Faulkner RA. The history of falls and the association of the timed up and go test to falls and near-falls in older adults with hip osteoarthritis. BMC Geriatr. 2007;7:17.
  17. Rasch A, Dalen N, Berg HE. Muscle strength, gait, and balance in 20 patients with hip osteoarthritis followed for 2 years after THA. Acta Orthop. 2010;81:183–188.
  18. Arokoski JP, Leinonen V, Arokoski MH, et al. Postural control in male patients with hip osteoarthritis. Gait Posture. 2006;23:45–50.
  19. Arnold CM, Faulkner RA. The effect of aquatic exercise and education on lowering fall risk in older adults with hip osteoarthritis. J Aging Phys Act. 2010;18:245–260.
  20. Hinman RS, Heywood SE, Day AR. Aquatic physical therapy for hip and knee osteoarthritis: results of a single-blind randomized controlled trial. Phys Ther. 2007;87:32–43.
  21. Portney LG, Watkins MP. Foundations of Clinical Research: Applications to Practice. 3rd ed. Upper Saddle River, NJ: Pearson/Prentice Hall; 2009.
  22. Dobson F, Choi YM, Hall M, Hinman RS. Clinimetric properties of observer-assessed impairment tests used to evaluate hip and groin impairments: a systematic review. Arthritis Care Res (Hoboken). 2012;64:1565–1575.
  23. Hale LA, Waters D, Herbison P. A randomized controlled trial to investigate the effects of water-based exercise to improve falls risk and physical function in older adults with lower-extremity osteoarthritis. Arch Phys Med Rehabil. 2012;93:27–34.
  24. Bennell KL, Egerton T, Pua YH, et al. Efficacy of a multimodal physiotherapy treatment program for hip osteoarthritis: a randomised placebo-controlled trial protocol. BMC Musculoskelet Disord. 2010;11:238.
  25. Altman RD, Alarcon G, Appelrouth D, et al. The American College of Rheumatology criteria for the classification and reporting of osteoarthritis of the hip. Arthritis Rheum. 1991;34:505–514.
  26. Kahl C, Cleland JA. Visual analogue scale, numeric pain rating scale and the McGill Pain Questionnaire: an overview of psychometric properties. Phys Ther Rev. 2005;10:123–128.
  27. Bellamy N. Osteoarthritis clinical trials: candidate variables and clinimetric properties. J Rheumatol. 1997;24:768–778.
  28. Thorborg K, Roos EM, Bartels EM, et al. Validity, reliability and responsiveness of patient-reported outcome questionnaires when assessing hip and groin disability: a systematic review. Br J Sports Med. 2010;44:1186–1196.
  29. Nilsdotter AK, Lohmander LS, Klassbo M, Roos EM. Hip Disability and Osteoarthritis Outcome Score (HOOS): validity and responsiveness in total hip replacement. BMC Musculoskelet Disord. 2003;4:10.
  30. Klassbo M, Larsson E, Mannevik E. Hip disability and osteoarthritis outcome score: an extension of the Western Ontario and McMaster Universities Osteoarthritis Index. Scand J Rheumatol. 2003;32:46–51.
  31. Cleland JA, Childs JD, Whitman JM. Psychometric properties of the Neck Disability Index and Numeric Pain Rating Scale in patients with mechanical neck pain. Arch Phys Med Rehabil. 2008;89:69–74.
  32. Perera S, Mody SH, Woodman RC, Studenski SA. Meaningful change and responsiveness in common physical performance measures in older adults. J Am Geriatr Soc. 2006;54:743–749.
  33. Wright AA, Cook CE, Baxter GD, et al. A comparison of 3 methodological approaches to defining major clinically important improvement of 4 performance measures in patients with hip osteoarthritis. J Orthop Sports Phys Ther. 2011;41:319–327.
  34. Costa LO, Maher CG, Latimer J, et al. Clinimetric testing of three self-report outcome measures for low back pain patients in Brazil: which one is the best? Spine. 2008;33:2459–2463.
  35. Kamper SJ, Ostelo RW, Knol DL, et al. Global perceived effect scales provided reliable assessments of health transition in people with musculoskeletal disorders, but ratings are strongly influenced by current status. J Clin Epidemiol. 2010;63:760–766, e761.
  36. Dite W, Temple VA. A clinical test of stepping and change of direction to identify multiple falling older adults. Arch Phys Med Rehabil. 2002;83:1566–1571.
  37. Hill KD, Bernhardt J, McGann AM, et al. A new test of dynamic standing balance for stroke patients: reliability, validity and comparison with healthy elderly. Physiother Can. 1996;48:257–262.
  38. Duncan PW, Weiner DK, Chandler J, Studenski SA. Functional reach: a new clinical measure of balance. J Gerontol. 1990;45:M192–M197.
  39. Brauer S, Burns Y, Galley P. Lateral reach: a clinical measure of medio-lateral postural stability. Physiother Res Int. 1999;4:81–88.
  40. Bohannon RW, Larkin PA, Cook AC, et al. Decrease in timed balance test scores with aging. Phys Ther. 1984;64:1067–1070.
  41. de Vet HC, Terwee CB, Mokkink LB, Knol DL. Measurement in Medicine: A Practical Guide. New York, NY: Cambridge University Press; 2011.
  42. Goldberg A, Casby A, Wasielewski M. Minimum detectable change for single-leg-stance-time in older adults. Gait Posture. 2011;33:737–739.
  43. Scholtes VA, Terwee CB, Poolman RW. What makes a measurement instrument valid and reliable? Injury. 2011;42:236–240.
  44. Stratford PW, Goldsmith CH. Use of the standard error as a reliability index of interest: an applied example using elbow flexor strength data. Phys Ther. 1997;77:745–750.
  45. Flansbjer UB, Holmback AM, Downham D, et al. Reliability of gait performance tests in men and women with hemiparesis after stroke. J Rehabil Med. 2005;37:75–82.
  46. Huang SL, Hsieh CL, Wu RM, et al. Minimal detectable change of the Timed “Up & Go” Test and the Dynamic Gait Index in people with Parkinson disease. Phys Ther. 2011;91:114–121.
  47. Walter SD, Eliasziw M, Donner A. Sample size and optimal designs for reliability studies. Stat Med. 1998;17:101–110.
  48. Sherrington C, Lord SR. Reliability of simple portable tests of physical performance in older people after hip fracture. Clin Rehabil. 2005;19:496–504.
  49. Fox KM, Felsenthal G, Hebel JR, et al. A portable neuromuscular function assessment for studying recovery from hip fracture. Arch Phys Med Rehabil. 1996;77:171–176.
Interrater and Intrarater Reliability of Common Clinical Standing Balance Tests for People With Hip Osteoarthritis
Yik Ming Choi, Fiona Dobson, Joel Martin, Kim L. Bennell, Rana S. Hinman
Physical Therapy May 2014, 94 (5) 696-704; DOI: 10.2522/ptj.20130266
