Abstract
Background Functional outcome measurement tools exist for individual diagnoses (eg, stroke), but no prospectively validated mobility measure is available for physical therapists' use across the breadth of acute hospital inpatients. The modified Iowa Level of Assistance Scale (mILOA), a scale measuring assistance required to achieve functional tasks, has demonstrated functional change in inpatients with orthopedic conditions and trauma, although its psychometric properties are unknown.
Objective The aim of this study was to assess interrater reliability, known-groups validity, and responsiveness of the mILOA in acute hospital inpatients.
Design This was a cohort, measurement-focused study.
Methods Patients at a large teaching hospital in Melbourne, Australia, were recruited. One hundred fifty-two inpatients who were functionally stable across 5 clinical groups had an mILOA score calculated during 2 independent physical therapy sessions to assess interrater reliability. Known-groups validity (“ready for discharge”/“not ready for discharge”) and responsiveness also were assessed.
Results The mean age of participants in the reliability phase of the study was 62.5 years (SD=17.7). The interrater reliability was excellent (intraclass correlation coefficient [2,1]=.975; 95% confidence interval=.965, .982), with a mean difference between scores of −.270 and limits of agreement of ±5.64. The mILOA score displayed a mean difference between 2 known groups of 15.3 points. Responsiveness was demonstrated with a minimal detectable change of 5.8 points.
Limitations Participants were included in the study if able to give consent for themselves, thereby limiting generalizability. Construct validity was not assessed due to the lack of a gold standard.
Conclusions The mILOA has excellent interrater reliability and good known-groups validity and responsiveness to functional change across acute hospital inpatients with a variety of diagnoses. It may provide opportunities for physical therapists to collect a functional outcome measure to demonstrate the effectiveness of inpatient therapy and allow for benchmarking across institutions.
Outcome measures are integral to assessing the effectiveness of treatments, with the aim of improving patient and hospital outcomes.1 Physical therapists are continually striving for excellence in patient care, although demonstrating achievement of such excellence is difficult given the paucity of information regarding outcomes of physical therapy in the acute hospital setting.2 Studies regularly report length of stay and hospital complications as surrogate outcome measures for effectiveness of inpatient care, but these measures do not demonstrate the functional changes associated with physical therapy intervention from admission to acute hospital discharge.2
An outcome measures working party was established in a large tertiary teaching hospital in Melbourne, Australia, to review existing measurement tools and analyze their ability to be used across surgical and medical inpatients of all ages in the acute hospital setting. A variety of outcome measures have been used in specific populations of patients, such as the Barthel Index in patients with stroke3; the de Morton Mobility Index (DEMMI),4 which measures mobility from bridging to jumping in older patients; the Physical Function ICU Test (P-FIT)5; the Acute Care Index of Function,2 which measures mobility in patients with critical illness; and the Functional Independence Measure (FIM),6 which measures 13 motor tasks and 5 cognitive tasks in various populations. These outcome measures have all been shown to be valid and reliable in hospital subgroups, but limitations exist. The DEMMI4 does not consider the younger patient; both the FIM6 and the Barthel Index3 include nonmobility domains such as feeding, bathing, and activities of daily living; and the P-FIT5 and Acute Care Index of Function2 are designed to measure low-level activities only. Recently, the Activity Measure for Post-Acute Care (AM-PAC) “6-Clicks” tool7 was found to be valid (using the FIM as the gold standard) and responsive across a broad range of acute hospital patients based on a clinical database rather than a prospective study. However, no outcome measures that have had their clinimetric properties prospectively examined or that adequately measure and monitor the effect of physical therapy interventions across the breadth of acute hospital inpatients are currently available.
The Iowa Level of Assistance Scale (ILOA) was developed by Shields et al8 and is a 6-item, 36-point tool used in the total hip and total knee arthroplasty population. The ILOA was modified by Oldmeadow et al9 for use in patients with a fractured femoral neck. The modification included converting amount of time walked to distance walked, as this measure was deemed to be more easily performed in clinical practice. The modified scale was chosen as a functional outcome measure for our inpatient population because it is quick and easy to use, involves only mobility tasks, is free to administer, and can be easily incorporated into usual physical therapy assessment and treatment. Measuring effectiveness of treatment is essential and may have human resource implications, especially given the shift toward outcome-based reimbursement. The inclusion of gait aid and distance walked was considered imperative in measuring functional outcomes in our patient population and, as such, this tool was deemed more suitable than the “6-Clicks”7 or the Acute Care Index of Function.2
The aim of our study was to assess the interrater reliability, known-groups validity, and responsiveness of the modified Iowa Level of Assistance Scale (mILOA) in acute hospital inpatients.
Method
Setting and Participants
Patients were included in the reliability and validity study if they met the following criteria: admitted to The Alfred Hospital from October 2013 to February 2014, aged 18 years or older, able to provide their own written consent, and deemed medically stable by the treating team to participate in 2 physical therapy mobility assessments within 24 hours. The criterion for medical stability was patient dependent and involved a discussion between the therapist and the treating medical team.
Patients were excluded if they were non-English speaking or were unable to participate in active therapy secondary to cognitive impairment (including severe head injuries or premorbid diagnosis of dementia). Patients with a diagnosis of cystic fibrosis were not included in this study, as the primary goal of treatment for this patient group is cardiorespiratory, not mobility. Patients also were excluded if they were in the intensive care unit of the hospital, as concurrent outcome measures research is being undertaken involving patients in that unit. The sample recruited for the responsiveness phase of the study was a convenience sample comprising patients from all clinical areas treated by a physical therapist during February 2013.
Therapists
There were 33 therapists involved in the study, all of whom were employed in the acute hospital. Their ages ranged from 22 to 47 years, with clinical experience in the acute hospital ranging from 1 to 22 years. All clinicians were involved in the 3 phases of the trial.
Procedure
Patients were identified by the treating physical therapist as suitable to participate in 2 physical therapy sessions in a 24-hour period. First-party written informed consent was obtained from all participants by an independent research assistant.
Outcome Measure
The mILOA is a 6-item functional outcome measure that assesses the amount of assistance required for a person to move from a supine to a sitting position on the edge of the bed and from a sitting to a standing position, walking, negotiation of one step, walking distance, and assistive device used.8 For each item of the mILOA, a score from 0 to 6 is given for the amount of assistance required (Tab. 1). The mILOA total score is the sum of all 6 items (total score out of 36), with a higher score representing more assistance and, therefore, more disability. These measures were deemed to be the most important functional measures when assessing a patient's readiness for discharge home.8 Scoring of the mILOA consists of adding 6 numbers together based on the physical therapy assessment performed, and time taken is dependent on the physical therapy session rather than the scoring itself.
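The scoring rule described above (6 items, each scored 0 to 6, summed to a total out of 36) can be sketched as follows. This is a minimal illustration only; the function name and the example scores are hypothetical and are not taken from the study's data collection forms.

```python
from typing import Sequence

N_ITEMS = 6                 # the scale has 6 items
ITEM_MIN, ITEM_MAX = 0, 6   # each item is scored from 0 to 6


def miloa_total(item_scores: Sequence[int]) -> int:
    """Sum the 6 item scores into a total out of 36.

    A higher total represents more assistance required.
    """
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores, got {len(item_scores)}")
    for score in item_scores:
        if not ITEM_MIN <= score <= ITEM_MAX:
            raise ValueError(f"item score {score} is outside {ITEM_MIN}-{ITEM_MAX}")
    return sum(item_scores)


# Hypothetical patient: scores for the 6 items in the order listed in the text.
example_scores = [1, 1, 2, 6, 3, 2]
print(miloa_total(example_scores))  # 15
```

The range check is a design choice for the sketch: it guards against an item being entered outside the 0 to 6 range before the total is computed.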
Table 1. Modified Iowa Level of Assistance Scale
Participants were divided into 5 groups based on their admission diagnosis: inpatients with a surgical diagnosis (general/cardiac/transplant), inpatients aged ≥65 years with a medical diagnosis, inpatients aged <65 years with a medical diagnosis, inpatients with a trauma diagnosis, and patients admitted for elective surgical procedures (musculoskeletal).
These groups were chosen because we hypothesized that their functional outcomes and progression throughout their hospital stay would be clinically different between groups and because these groups were representative of the breadth of acute hospital inpatients at our institution.
Statistical Measurement
Interrater reliability.
Two physical therapists, with varying levels of experience, independently assessed each participant and rated the participant's mILOA score within 24 hours of each other. Participants who were considered by the first physical therapist to be in a functionally stable state (unlikely to physically change between treatments) were recruited in order to score interrater reliability rather than measuring a change in their mobility. We chose consecutive patients from the 5 subgroups of interest who met the inclusion criteria. Every effort was made to ensure the participant had adequate analgesia and rest time between assessments.
The first mILOA score was completed within the usual physical therapy treatment session. The second physical therapist saw the same participant within 24 hours of the first assessment and was aware of the individual's medical condition but blinded to the first mobility assessment (and, therefore, the mILOA score). Interrater reliability was assessed using intraclass correlation coefficients (ICC [2,1]) with 95% confidence intervals.10 This form of the ICC was chosen because we considered that the raters were representative of a larger population of similar raters and that the results would be generalizable to this wider population of therapists. An ICC is commonly considered poor if less than .40, fair for values between .40 and .59, good for values between .60 and .74, and excellent for values between .75 and 1.00.11 The standard error of the measurement (SEM), which represents the amount of variability that can be attributed to measurement error and reports reliability in the same units as the original measurement, was calculated along with its confidence interval according to the methods of Stratford and Goldsmith.12 Bland-Altman plots also were used to determine if any systematic differences across the range of values occurred between the 2 assessments.13 Normality of data was evaluated using visual inspection and skewness and kurtosis statistics. The limits of agreement were calculated as the mean difference ± 1.96 times the standard deviation of the differences between the 2 assessments.13 Sample size estimates for the reliability phase of the study were based on the methods of Walter et al,14 which indicated that 27 participants per subgroup would be required for a study with 2 observations per participant, assuming an expected ICC of .6, a null ICC of .2, a type I error of .05, and a type II error of .2. To allow for potential noncompletion, we aimed to recruit 30 participants per subgroup.14
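For readers unfamiliar with the ICC (2,1), the following sketch shows one common way to compute it from a two-way analysis of variance (the single-rater, absolute-agreement form described by Shrout and Fleiss), together with the interpretation bands quoted above. It is an illustration under stated assumptions, not the SPSS procedure used in this study; the function names and example scores are hypothetical.

```python
from statistics import mean


def icc_2_1(ratings):
    """ICC(2,1) from a subjects x raters table (list of equal-length rows)."""
    n = len(ratings)       # number of participants
    k = len(ratings[0])    # number of raters per participant
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                # between-participants mean square
    ms_cols = ss_cols / (k - 1)                # between-raters mean square
    ms_error = ss_error / ((n - 1) * (k - 1))  # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def interpret_icc(icc):
    """Apply the bands quoted in the text (poor, fair, good, excellent)."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


# Hypothetical paired scores: [first rater, second rater] for 5 participants.
pairs = [[4, 5], [12, 12], [20, 18], [0, 0], [30, 31]]
value = icc_2_1(pairs)
print(round(value, 3), interpret_icc(value))
```

Because the between-raters mean square appears in the denominator, a systematic offset between the 2 raters lowers this coefficient, which is why this form suits a design in which raters are treated as a sample from a wider population of therapists.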
Validity.
There are 3 main types of validity that can be measured: content, construct, and criterion.15 Content validity is described as the measurement tool including factors relevant to the domain being measured. The inclusion of gait aid and distance walked improves the validity of this tool compared with other measurement tools.2,7 Criterion validity would involve comparing the measurement tool with a gold standard, which, in the current study, was not possible given that a perfect tool for this population does not currently exist. Construct validity of the mILOA, therefore, was chosen and performed using known-groups validity,16 which involves administering the measurement instrument to groups expected to differ due to known characteristics. The mILOA score was rated within 48 hours of hospital admission (when the patient is not ready for discharge), and, in a separate group of participants, the mILOA score was calculated on the day that they were deemed physically ready for discharge home by the treating physical therapist. The mILOA was scored by an independent clinician who had not been informed of the participants' discharge status. Because data for participants who were assessed as ready for discharge were not normally distributed, the Mann-Whitney U test was used to compare differences in scores between groups. We aimed to detect a difference of at least 7 points between those participants who were and were not ready for discharge, as this represents the minimal clinically important difference that has previously been reported for the ILOA.8 Power calculations indicated that to detect a difference of 7 points with 80% probability, assuming a standard deviation of 11 points, 80 participants would be required.
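The power calculation above can be approximated with the standard normal-approximation formula for comparing 2 independent means. The sketch below assumes a 2-sided type I error of .05 and equal group sizes, neither of which is stated explicitly for this calculation, so it should be read as a plausibility check rather than the authors' method.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(diff, sd, alpha=0.05, power=0.80):
    """Participants per group needed to detect `diff` between 2 means.

    Normal approximation: n = 2 * ((z_alpha + z_beta) * sd / diff) ** 2
    (assumes equal group sizes and a 2-sided test).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / diff) ** 2)


n = n_per_group(diff=7, sd=11)
print(n, 2 * n)  # 39 per group, 78 in total
```

Under these assumptions the total is close to the 80 participants stated in the text; the small gap may reflect rounding or a different formula in the original calculation.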
Responsiveness.
Responsiveness is the ability of a tool to detect change if a real clinical change has indeed occurred.17–19 Modified ILOA scores were collected at the first and last physical therapy treatments for all participants treated by a physical therapist during the month of February 2013. Responsiveness was measured in 3 ways: effect size (ES),18 SEM,20 and minimal detectable change at the 95% confidence level (MDC95).21 The ES (δ) was calculated using the formula: δ = (μ1 − μ2)/σ1, where μ1 is the mean score at baseline, μ2 is the mean score when physically ready for discharge, and σ1 is the standard deviation of the scores at baseline.22 The ES represents the magnitude of the effect that is detected by the measure; a moderate-to-large ES would be expected following an intervention that aims to achieve physical readiness for discharge, such as an acute hospital admission.23 The ES was interpreted using guidelines from Cohen,24 where 0.2 was specified as small, 0.5 to 0.6 as moderate, and 0.8 to 1.00 as large. The SEM represents the amount of variability that can be attributed to measurement error and was calculated by multiplying σ1 by √(1 − r), where r is the test-retest reliability of the measure or the ICC,10 which was obtained from the reliability experiment detailed above. The SEM was calculated in the reliability data set from which the ICC was obtained. The MDC95 measures the minimum amount of change in a person's score that ensures the change is not a result of measurement error, with 95% confidence. The MDC95 is calculated as: (1.96 × SEM × √2).19
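The 3 formulas in this paragraph translate directly into code. In the sketch below, the ES example uses the admission and discharge means and the baseline standard deviation reported in the Results, whereas the SEM and MDC95 example uses hypothetical inputs chosen only to illustrate the arithmetic. The function names are illustrative.

```python
from math import sqrt


def effect_size(mean_baseline, mean_followup, sd_baseline):
    """ES = (mu1 - mu2) / sigma1."""
    return (mean_baseline - mean_followup) / sd_baseline


def standard_error_of_measurement(sd_baseline, reliability):
    """SEM = sigma1 * sqrt(1 - r), where r is the reliability coefficient (ICC)."""
    return sd_baseline * sqrt(1 - reliability)


def mdc95(sem):
    """MDC95 = 1.96 * SEM * sqrt(2)."""
    return 1.96 * sem * sqrt(2)


# ES from the group-level values reported in the Results.
print(round(effect_size(20.9, 9.7, 9.3), 2))  # about 1.2

# Hypothetical inputs (not study data): SD of 10 points, reliability of .96.
sem = standard_error_of_measurement(sd_baseline=10.0, reliability=0.96)
print(round(sem, 2), round(mdc95(sem), 2))
```

The factor of √2 in the MDC95 reflects that a change score combines the measurement error of 2 separate assessments.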
Statistical analysis was undertaken using IBM SPSS Statistics 22.0 (IBM Corp, Armonk, New York), and statistical significance was set at P<.05.
Role of the Funding Source
This study was supported by a physical therapy research grant from Alfred Hospital.
Results
One hundred fifty-two patients gave informed consent to participate in the reliability phase of the study, and 82 patients participated in the known-groups validity analysis. These patients were evenly distributed across the 5 clinical groups for the reliability phase of the study. Participants in the validity phase of the study also came from all clinical groups, with the smallest proportion in the surgical group (n=10) and the largest proportion being medical patients aged ≥65 years (n=21). The demographic data are presented in Table 2.
Table 2. Demographic Data Relating to the 3 Study Phases
Reliability
The average time between assessments was 7.1 hours (SD=8.7). The scores collected ranged between 0 and 35, with an average score of 9 points. The results for interrater reliability of the mILOA across the whole sample and for the 5 clinical groups are shown in Table 3, with the associated Bland-Altman plots presented in the Figure. The mean difference between the raters was 0.27 (SD=2.87; 95% CI=−0.192, 0.731). When considering the difference between the scores of the 2 assessors, 14 scores (9%) fell outside the 95% limits of agreement, and there was no systematic variation across the range of scores (Figure).
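The limits of agreement follow from the mean difference and standard deviation reported above. The sketch below reproduces that arithmetic from the summary values and also shows how the same quantities could be derived from paired scores; the paired scores in the test data are hypothetical, and the function names are illustrative.

```python
from statistics import mean, stdev

Z_95 = 1.96  # multiplier for 95% limits of agreement


def limits_of_agreement(mean_diff, sd_diff):
    """Return (lower, upper) = mean difference -/+ 1.96 x SD of the differences."""
    half_width = Z_95 * sd_diff
    return mean_diff - half_width, mean_diff + half_width


def bland_altman(first_scores, second_scores):
    """Summarize agreement between 2 sets of paired scores."""
    diffs = [a - b for a, b in zip(first_scores, second_scores)]
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample SD of the paired differences
    lower, upper = limits_of_agreement(mean_diff, sd_diff)
    n_outside = sum(1 for d in diffs if d < lower or d > upper)
    return mean_diff, sd_diff, lower, upper, n_outside


# Summary values reported in this section: mean difference 0.27, SD 2.87.
lower, upper = limits_of_agreement(0.27, 2.87)
print(round(lower, 2), round(upper, 2))  # roughly -5.36 and 5.90
```

The half-width from these rounded inputs is about 5.63 points, close to the ±5.64 given in the Abstract; the small difference is consistent with rounding of the reported standard deviation.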
Table 3. Reliability of the Modified Iowa Level of Assistance Scale
Figure. Bland-Altman plot showing the reliability of modified Iowa Level of Assistance Scale (mILOA) scores for the entire sample.
Known-Groups Validity
Data were collected on 82 participants (58 who were physically ready for discharge, 24 who were not). The participants who were physically ready for discharge were younger (mean age=59.7 years [SD=17.7] versus 72.3 [SD=9.1], P<.001) and had fewer physical therapy treatments prior to assessment (X̅=0.9 [SD=1.2] versus 3.3 [SD=5.3], P=.005) compared with those who were not ready for discharge. There was a statistically significant difference of 17 points in median scores between the 2 groups (P<.001) (not ready for discharge: median=17, 1st and 3rd quartiles=12–23; ready for discharge: median=0, 1st and 3rd quartiles=0–4.25; Mann-Whitney U=70.00).
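As an illustration of the statistic reported above, the Mann-Whitney U can be computed from the pooled ranks of the 2 groups. The sketch below uses midranks for tied scores and returns the smaller of the 2 U values, which is a common reporting convention and is assumed here; it does not compute a P value, and the example scores are hypothetical rather than study data.

```python
def mann_whitney_u(group_a, group_b):
    """Return the smaller Mann-Whitney U statistic for 2 independent groups."""
    # Pool the scores, tagging each with its group (0 = group_a, 1 = group_b).
    pooled = sorted([(v, 0) for v in group_a] + [(v, 1) for v in group_b])

    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of ranks i+1 .. j
        rank_sum_a += midrank * sum(1 for _, g in pooled[i:j] if g == 0)
        i = j

    n_a, n_b = len(group_a), len(group_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


# Hypothetical mILOA scores for 2 small groups.
not_ready = [17, 12, 23, 20, 15]
ready = [0, 0, 4, 2, 13]
print(mann_whitney_u(not_ready, ready))
```

A small U relative to the product of the group sizes indicates little overlap between the 2 distributions of scores.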
Responsiveness
The responsiveness of the mILOA was calculated using 198 participants with both admission and discharge data collected in February 2013. The mean mILOA score on admission was 20.9 (SD=9.3) and on discharge was 9.7 (SD=10.3). The ES was 1.202, representing a very large effect, with a 95% confidence interval of 0.29 to 2.12. The SEM was 1.46 points in the reliability data set, which indicates that changes of greater than 2.92 points (2 × SEM) in either direction are likely to be beyond the bounds of measurement error on 96% of occasions. The MDC95 was 5.77 points, indicating that 95% of participants who were truly stable would have a difference in mILOA scores of less than 6 points between testing occasions.
Discussion
This study is the first to assess the psychometric properties of a functional tool across the broad range of acute hospital inpatients. The mILOA has been shown to be a reliable, valid, and responsive tool for measuring the mobility status of patients from admission to discharge in this setting.
No single physical therapy functional outcome measure has been found to be valid and reliable across the diverse range of patients admitted to an acute hospital. This fact was highlighted in a recent systematic review on functional measures for the older adult, which detailed 178 outcome measures and showed that none accurately measured and monitored the function of acute hospital inpatients.4 Jette et al7 have since found the AM-PAC “6-Clicks” inpatient daily activity and basic mobility short forms to be valid and responsive using registry data. This tool includes basic mobility tasks, but its face validity is reduced, as the maximum distance walked is “around the room,” and there is no progression of gait aid in the score. Given that 22% of our participants used a gait aid, we felt that this tool was not the most appropriate for our sample. Similar issues are present in the Acute Care Index of Function.2 The limitations of the currently available measures include their specificity to one condition or disease (eg, Mobility Scale for Acute Stroke25), their cost (eg, FIM), the time involved in recording the score, and the inclusion of nonmobility domains (eg, Barthel Index). The mILOA was chosen for this study because it can be easily incorporated into the care of the patient, with the 6 mobility items considered part of standard physical therapy mobility assessment. The original ILOA was already shown to be valid and reliable in an acute hospital population (following total hip and knee arthroplasty),8 giving confidence in the ability of this tool to be used more widely in a tertiary hospital. The mILOA also has been used as the primary outcome measure in patients following hip surgery9 and in acute hospital inpatients with trauma.26 A potential limitation of the mILOA is a ceiling effect, with 26% of participants scoring the best possible score at hospital discharge. Although this is not a limitation in the acute setting, it may diminish the utility of the mILOA for longer-term follow-up of mobility outcomes of patients after acute hospital discharge.
The interrater reliability was excellent, as measured by an ICC (2,1) value of .975, with the lowest ICC (2,1) value per group being .954, which is still considered excellent reliability. The Bland-Altman plot (Figure) shows no apparent systematic bias across the range of values. The mILOA demonstrated excellent reliability across all 5 medical and surgical subgroups assessed.
When considering the difference between the scores of the 2 raters, 14 scores fell outside the limits of agreement. In 6 (43%) of these cases, the participant attempted a step in one of the assessments and not in the other. This finding is consistent with the diversity of clinical practice and clinical judgment when dealing with individual patients, with some therapists deciding to encourage their patients to attempt step climbing regardless of need for stair safety for discharge home, whereas others may deem it unsafe (or merely unnecessary) to perform a step if home discharge does not warrant it. It is important to note that scoring for a step makes up one-sixth of the mILOA. In a further 4 cases, the participant changed clinically between treatment sessions, causing a change in the score rather than a true reflection of decreased interrater reliability. In 2 cases, the participant reported feeling “more tired” than in the preceding session, and in 2 cases, the participant had a vasovagal episode in the second session, resulting in a cessation of the assessment and a shorter distance walked and no step completed. The participants were not excluded from the study, as these findings were felt to be reflective of usual variations in patient presentation and an important measure in the acute hospital inpatient. These incidents were all documented on the assessment form used for the research.
The mILOA was responsive to change over the course of an acute admission, with large changes in scores evident from admission to discharge. This finding suggests that the mILOA may be a useful tool to assess changes in mobility in acute hospital inpatients and may assist in quantifying the effects of physical therapy interventions for both research and clinical practice. The MDC95 was 5.77 points, indicating that changes of this magnitude should be considered outside the bounds of measurement error on 95% of occasions and probably represent a change in the mobility status of the patient. This value is similar to the minimal clinically important difference of 7 points that has previously been reported for the ILOA.8
Encouragingly, the MDC95 calculated in the responsiveness phase of the study was similar to the limits of agreement calculated by the Bland-Altman analysis performed in a different data set for the reliability phase of the study, which further increases our confidence in this threshold for measuring real clinical change. It is possible that both the MDC and the SEM would be smaller if a single rater had performed both ratings.
The limitations of our study include the exclusion of patients who were not able to provide their own consent to participate, which limits our ability to generalize to this important group of patients who are often seen by physical therapists in the acute setting. However, patients across the entire range of mILOA scores were included, which indicates that our results are applicable across the patient mobility spectrum. The lack of a gold standard outcome measure in the acute setting meant that a more robust construct validity study for the mILOA was unable to be undertaken, and only known-groups validity was tested. Another limitation of our study was that the order of assessments was not varied, with the treating physical therapist always performing the first mILOA score. Additionally, given that the participants were seen at 2 different times in a 24-hour period, although efforts were made to ensure they were at a similar functional level for the 2 assessments, external factors such as pain, dizziness, and natural improvement may have altered the way that they performed the test, thus reflecting a real change in score rather than the reliability between therapists. These findings illustrate the complexities of undertaking physical therapy research in the acute setting. Similarly, having therapists across all areas of the hospital involved in this research may have led to the calculation errors found in 4 participants (2.6%). These errors involved counting the score in one of the domains as 0 points instead of 6 points (to represent failure to complete the task), possibly representing a lack of experience in use of the tool.
The strength of this study was that the interrater reliability testing was performed by 2 separate physical therapists rather than by 2 physical therapists witnessing the same assessment. Although this approach means that the reliability estimates in this study reflect elements of both test-retest reliability and interrater reliability, we believe that this design enhances the external validity of results. The mILOA score is dependent on the way the physical therapist performs the assessment and the way the patient performs, as well as the physical therapist's scoring of the assessment. We derived a clinically meaningful estimate of reliability for situations where the physical therapist both performs and scores the mILOA, as would normally occur in clinical practice. Physical therapists with varying levels of experience were involved in this study, showing the reliability across the workforce. We also divided patients into 5 clinically separate groups to demonstrate reliability and validity within the specific patient groups, as well as more generically across the acute hospital system. The variety of patients studied makes this tool widely applicable for physical therapists in the acute hospital setting across many patient populations.
The mILOA has excellent interrater reliability and known-groups validity and is responsive to functional change in acute hospital inpatients. It is easy to use and can be scored during regular physical therapy functional assessments of the patient across the continuum of care to measure the extent of gait and transfer impairments, without need for additional equipment or costly resources. This tool could form an integral part of routine data collection for physical therapists in the acute hospital setting and provide opportunities for benchmarking between physical therapy departments. The strong psychometric properties of the mILOA in acute hospital inpatients provide new opportunities to embed outcome assessment into routine physical therapist practice.
Footnotes
All authors provided concept/idea/research design, writing, data analysis, and consultation (including review of manuscript before submission). Ms Kimmel and Ms Elliott managed the data collection and provided project management. Associate Professor Holland provided statistical support. The authors thank the members of the Alfred Hospital Physiotherapy Department who were involved in the data collection.
Ethics approval for this study was obtained from the Alfred Health Human Research Ethics Committee (HREC).
- Received June 4, 2014.
- Accepted May 21, 2015.
- © 2016 American Physical Therapy Association