Abstract
Background Fecal incontinence and constipation affect men and women of all ages.
Objective The purpose of this study was to psychometrically analyze the Fecal Incontinence and Constipation Questionnaire (FICQ) in patients seeking outpatient rehabilitation services due to pelvic-floor dysfunction (PFD).
Design This was a retrospective analysis of cross-sectional data from 644 patients (mean age=52 years, SD=16, range=18–91) being treated for PFD in 64 outpatient rehabilitation clinics in 20 states (United States).
Methods We assessed the 20-item FICQ for unidimensionality and local independence, differential item functioning (DIF), item fit, item hierarchical structure, and test precision using an item response theory model.
Results Factor analyses supported the 2-factor subscales as originally defined: items related to severity of leakage and items related to severity of constipation. Removal of 2 leakage items improved unidimensionality and local independence of the leakage scale. Among the remaining items, 2 items required adjustment for DIF, 1 by age group and 1 by number of PFD comorbid conditions. Item difficulties were suitable for patients with PFD, with no ceiling or floor effect. Mean item difficulty parameters for leakage and constipation subscales ranged from 38.8 to 62.3 and 28.1 to 63.3 (0–100 scale), respectively. Endorsed leakage items representing highest difficulty levels were related to delaying defecation and confidence in controlling bowel leakage. Endorsed constipation items representing highest difficulty levels were related to the need to strain during a bowel movement and the frequency of bowel movements.
Limitations A limitation of this study was the lack of medical diagnostic criteria to classify patients.
Conclusions After removing 2 items and adjusting for DIF, the results supported sound psychometric properties of the FICQ items and its initial use for patients with PFD in outpatient rehabilitation services.
Pelvic-floor dysfunction (PFD) affects a substantial proportion of individuals.1,2 A healthy pelvic floor is associated with normal placement of pelvic structures and normal muscle, bladder, and anus functioning. Weakness of or injury to any of the pelvic-floor structures increases the probability of urinary dysfunction, bowel dysfunction, or both. Fecal incontinence (FI) and fecal constipation (FC) are 2 common subtypes of bowel dysfunction that affect men and women of all ages.3,4
Fecal incontinence is the involuntary loss of liquid or solid feces or mucus.5 The estimated prevalence of FI in the general population ranges from 1.4% (community-dwelling adults aged 40 years and older) to 68% (long-term care hospital residents with moderate mental impairment).6–13 The wide variation may be attributed to different sampling strategies (eg, women versus adults of both sexes, community-dwelling older adults versus nursing home residents, varying age samples) and the definition of FI. Overall, the prevalence of FI increases with age.6,9 Some authors have reported that cognitive impairment, limitations in daily activities, and prolonged institutionalization in nursing homes were associated with a higher risk of incontinence.13,14 Other authors have reported that greater frequency of FI was associated with decreased quality of life.7,15 Studies documenting the psychological impact of FI showed increased anxiety, depression, shame, and frustration,10,16 and that patients were less mobile in the community and were confined to their homes.15 The sex-specific prevalence is not clear from the literature. Traditionally, FI is believed to be more prevalent in women7–10; other authors have reported that prevalence is similar in women and men.6,12
Fecal constipation is a decreased frequency of bowel actions (<1 every 3 days) that also may be associated with excessive straining during the passage of stool.17 The reported prevalence of FC ranged widely from 0.7% (children less than 12 years old) to 45% (homebound elderly people), with a median value of 16%.18 The prevalence of FC increases with age and is more frequent in women than in men (the mean female-to-male ratio of individuals with FC is estimated to be 2:1).18–21 Low income, poor education, immobility, and less self-reported physical activity were associated with increased FC prevalence.18,22 Symptoms can be burdensome, leading to a reduction in patient quality of life.23,24
Many patients seek rehabilitation to reduce symptoms related to FI and FC and consequently improve functional status (FS).25,26 There is an increasing demand for patient-reported outcomes (PROs) to be applied in this patient population during routine clinical practice and research in order to assist in clinical care planning and outcomes assessment in patients with PFD seeking outpatient rehabilitation services.27,28
The current study builds on previous work where we developed,29,30 simulated,31–33 and applied and validated body part–specific computerized adaptive testing (CAT) applications34–42 for patients seeking rehabilitation for a variety of impairments in outpatient rehabilitation clinics. The purpose of this study was to evaluate psychometric properties of a new self-report Fecal Incontinence and Constipation Questionnaire (FICQ) in patients with PFD seeking outpatient rehabilitation services.
Method
FICQ
The FICQ was developed by Focus on Therapeutic Outcomes, Inc (FOTO) (Knoxville, Tennessee) in collaboration with an experienced physical therapist who specializes in treating patients with PFD. The FICQ aims to evaluate how bowel dysfunction affects patient-perceived functional status related to PFD.43 Questions (items) were designed to address issues of greatest concern to patients with FI or FC seeking outpatient rehabilitation therapy. Items were worded to represent tasks that have different levels of difficulty to enable the development of an item response theory (IRT)–based item bank suitable for CAT application for this patient population. The FICQ consists of 20 items: 15 items related to bowel leakage problems and 5 items related to constipation problems. Each item has its own Likert rating scale structure and operational definition (Appendix). Face validity was established by collecting feedback on the initial item bank (item description and rating categories) from a small group of physical therapists who specialized in treating patients with PFD.
Data Collection
The platform used for electronic PROs data collection has been described previously.44 Briefly, patients with PFD were managed in outpatient rehabilitation clinics participating in FOTO, which is an international medical rehabilitation outcomes database management company.45,46 During admission to therapy and initial evaluation (intake), patients entered demographic data and completed self-report surveys using Patient Inquiry, a computer program developed by FOTO.45,46 Demographic variables included age, sex, symptom acuity, surgical history, number of comorbid conditions, exercise history, and payer source. Data were collected for age as a continuous variable and categorized as 18–44, 45–64, or 65 years or older. Participants were categorized as either female or male. Symptom acuity, which we operationally defined as the number of calendar days from the date of onset of the condition being treated in therapy to the date of initial therapy evaluation, was categorized as acute (<22 days), subacute (22–90 days), or chronic (>90 days). Surgical history was categorized as none, 1, 2, 3, or 4 or more surgeries related to the condition being treated. Number of comorbid conditions was assessed using a list of 30 conditions common to patients entering an outpatient rehabilitation clinic (eg, arthritis, asthma, diabetes, heart attack, AIDS, sleep disturbance, overweight, cancer).47,48 Overweight comorbidity was defined as having a body mass index greater than 30 kg/m². Exercise history prior to receiving therapy was categorized as exercising 3 times a week or more, exercising 1 to 2 times a week, or exercising seldom or never. Last, 15 payer sources (eg, preferred provider organization, Medicare) were included.
When “pelvic floor” was selected as the primary reason for treatment, PFD-related questions were administered to the patient. To reduce survey administration burden (ie, to reduce the number of items administered), a branching algorithm was used to direct patients to select 1 or more specific PFDs that may apply to them (ie, urinary, bowel, pelvic pain, and pelvic organ prolapse). For any selected disorder, subsequent subtypes were presented. If patients selected “bowel problems,” a more specific subtype selection was presented, including (1) leakage and (2) constipation or straining. Patients could skip any item without explanation. Based on the selected subtype, only items relevant to that subtype were presented. Additionally, a few selected items were programmed to be skipped if patients reported that they never experienced bowel leakage under a certain circumstance. For example, if patients answered “never” to the item “How often does your bowel leak when you are physically active, including coughing or sneezing?” the computer software would skip the following item: “Describe the level of activity that causes your bowel to leak.”
Inclusion Criteria
Data were selected from the database if patients: (1) were 18 years old or older, (2) were managed for any PFD problems, (3) received outpatient rehabilitation services, and (4) responded to FOTO Patient Inquiry computer-based FICQ items at intake between May 2007 and January 2011.
Analytical Procedure
Data were analyzed in 2 stages. First, we assessed the FICQ for unidimensionality, local independence, and differential item functioning (DIF) to determine how well the IRT assumptions of unidimensionality and local independence were met and whether a single set of item parameters for all groups combined was sufficient (ie, no practical DIF). Second, we analyzed item fit, item hierarchical structure, and test precision using the IRT approach. Based on results found in the first stage, if multiple dimensions were found, data were analyzed for each dimension separately. If items were found to exhibit DIF, a different set of item parameters was computed based on group membership (described below).
Data management.
Prior to data analysis, responses from all items except item 14 (Appendix) were recoded, with higher (more positive) responses representing higher functioning.
Unidimensionality and local independence.
A unidimensional scale includes items that represent only 1 construct.49 To test for unidimensionality, we analyzed: (1) the factor loadings and (2) the variances explained by each factor. As suggested by Nunnally,50 items were considered to be removed if factor loadings were below 0.40. To assess IRT assumptions of unidimensionality and local independence, we conducted exploratory factor analyses (EFAs) of latent trait variables followed by confirmatory factor analyses (CFAs) utilizing Mplus (Muthén & Muthén, Los Angeles, California)51 on all items.
Local independence requires that, after taking into account patient ability, the items in a test be independent of each other. Therefore, a patient's response to 1 item should not lead to, or be correlated with, the response on another item.49 To test for local independence, we analyzed: (1) the residual correlation matrix, (2) the magnitude of the standardized coefficients, and (3) the percentage of absolute residual correlations >.10, suggesting local dependency. Model fit was evaluated using the comparative fit index (CFI), the Tucker-Lewis index (TLI), and the root-mean-square error of approximation (RMSEA). The TLI and CFI range from 0 (poor fit) to 1 (good fit). Values of CFI and TLI greater than .90 are indicative of good model fit; RMSEA values less than .08 suggest adequate fit.52 To our knowledge, there is no empirically substantiated standard for the cutoff of residual correlation. We eliminated 1 item in each pair of items with a residual correlation of .20 or more.53 Items that had a higher number of residual correlations (>.10) with other items were inspected and removed, if necessary, to improve the model fit.
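The residual screening described above can be sketched as follows. This is a minimal illustration, not the Mplus procedure used in the study: the function name `screen_residuals`, the zero-based item indices, and the 4-item matrix are all hypothetical, and only the .20 and .10 cutoffs come from the text.

```python
from itertools import combinations

def screen_residuals(resid, flag=0.20, watch=0.10):
    """Screen a symmetric residual correlation matrix (list of lists).

    Returns the item pairs whose absolute residual correlation is at
    least `flag`, and the percentage of pairs exceeding `watch`.
    """
    pairs = list(combinations(range(len(resid)), 2))
    flagged = [(i, j) for i, j in pairs if abs(resid[i][j]) >= flag]
    n_watch = sum(1 for i, j in pairs if abs(resid[i][j]) > watch)
    pct_watch = 100.0 * n_watch / len(pairs)
    return flagged, pct_watch

# Hypothetical residual correlations for 4 items (illustrative values only).
resid = [
    [0.00, 0.05, 0.25, -0.02],
    [0.05, 0.00, 0.12, 0.03],
    [0.25, 0.12, 0.00, -0.21],
    [-0.02, 0.03, -0.21, 0.00],
]
flagged, pct_watch = screen_residuals(resid)
print(flagged, pct_watch)
```

Which item of a flagged pair to drop remains a judgment call; the sketch only identifies candidates.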
DIF.
All patients at a given level of ability should have an equal probability of scoring positively on each item regardless of their group membership (eg, age group).54 Items are flagged as “significant DIF” when this requirement does not hold. Measuring DIF was 1 of 10 recommendations for advancing patient-centered outcomes measurement because if items in a health assessment instrument are biased by patient groupings, detection rates can be overestimated or underestimated.55
For the purposes of DIF detection, we followed a method developed by Crane et al56 and described in detail by Hart et al.57 Specifically, item responses were calibrated based on Samejima's 2-parameter graded response model (GRM)58 using PARSCALE software (Scientific Software International Inc, Lincolnwood, Illinois)59 and difwithpar software (University of Washington, Seattle, Washington).60 The difwithpar software examines 3 ordinal logistic regression (OLR) models for each item and each demographic category selected for analysis: sex (female and male), age group (18–44, 45–64, and >65 years), symptom acuity (acute, subacute, and chronic), and number of PFD comorbid conditions (1=patient reported only 1 bowel problem, 2=patient reported bowel problem and 1 other problem, 3=patient reported urinary, bowel, and pelvic pain problems). As described by Crane et al,56 items were examined for the presence of: (1) uniform DIF (ie, the interference related to demographic groups between ability measures and item responses is the same across the entire range measured by the test) by examining the relative difference between beta coefficients in the regression models (ie, a 10% difference) and (2) nonuniform DIF (ie, the interference varies at different levels of ability measures) by comparing the −2 log likelihoods of 2 of the regression models.
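The two DIF criteria can be illustrated with a short sketch. It assumes that the coefficient compared for uniform DIF is the ability coefficient estimated with and without the group term, and that nonuniform DIF is tested with a likelihood-ratio chi-square; neither detail is spelled out in the text, and the function names and numeric inputs are hypothetical. It does not reproduce difwithpar.

```python
import math

def uniform_dif(beta_without_group, beta_with_group, threshold=0.10):
    """Relative change in the ability coefficient when the group term is added."""
    rel = abs(beta_with_group - beta_without_group) / abs(beta_without_group)
    return rel, rel > threshold

def lr_statistic(neg2ll_reduced, neg2ll_full):
    """Difference in -2 log likelihood between two nested OLR models."""
    return neg2ll_reduced - neg2ll_full

def chi2_sf(x, df):
    """Upper-tail chi-square probability (closed forms for df=1 and df=2 only)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df=1 or df=2 is handled in this sketch")

# Hypothetical coefficients and log likelihoods (illustrative values only).
rel_change, has_uniform_dif = uniform_dif(1.00, 0.85)
p_nonuniform = chi2_sf(lr_statistic(812.4, 809.9), df=2)
print(rel_change, has_uniform_dif, p_nonuniform)
```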
Item fit.
To assess the item fit, item hierarchical structure, and test precision, we analyzed the data using Masters' 1-parameter partial credit model (PCM)61 and WINSTEPS software (Winsteps, Chicago, Illinois).62 Masters' PCM was selected because it is a latent structure model for polytomous responses to a set of test items and because it allows each item to have a unique rating scale structure, which is the format of the FICQ items. The 1-parameter model was selected because it requires a smaller sample size to obtain useful and stable item parameter estimates than 2-parameter models.63
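For readers unfamiliar with the PCM, the category probabilities for a single item can be sketched as below. This is a textbook form of the model under the usual convention that the lowest category has a cumulative sum of zero; it is not the WINSTEPS estimation routine, and the function name and step thresholds are hypothetical.

```python
import math

def pcm_probs(theta, thresholds):
    """Category probabilities for one item under the partial credit model.

    theta: person ability in logits.
    thresholds: step difficulties (logits) for categories 1..m;
    category 0 has a cumulative sum of 0 by convention.
    """
    cum = [0.0]
    for step in thresholds:
        cum.append(cum[-1] + (theta - step))
    top = max(cum)  # subtract the maximum for numerical stability
    weights = [math.exp(c - top) for c in cum]
    total = sum(weights)
    return [w / total for w in weights]

# A hypothetical 3-category item with step difficulties of -1 and 1 logits.
probs = pcm_probs(0.0, [-1.0, 1.0])
print(probs)
```

Because each item carries its own list of thresholds, items with different numbers of rating categories can be modeled together, which is the property the text cites for choosing the PCM.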
Fit statistics were computed to investigate whether the response patterns fit the PCM measurement model. A fit statistic index calculates the ratio of the observed variance divided by the expected variance, with an expected value of 1 and a range from 0 to positive infinity. Based on Linacre64 and Smith et al,65 we used an item mean-square (MNSQ) fit statistic criterion of 1.2, given that we had a sample size of approximately 600. An MNSQ fit statistic higher than 1.2 indicates that the item response pattern has more variance than the model expected. There are 2 kinds of fit statistics: (1) the infit statistic, a weighted index that is more sensitive to the response pattern of items targeted to the person's estimated ability level, and (2) the outfit statistic, an unweighted index that computes the overall misfit of personal responses.66
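The two mean-square statistics can be sketched from their standard definitions, assuming that the model-expected score and model variance for each person on the item are already available. The function name and the 4 illustrative observations are hypothetical; WINSTEPS output is not reproduced here.

```python
def item_fit(observed, expected, variance):
    """Infit and outfit mean-square statistics for one item.

    observed: observed item scores, one per person.
    expected: model-expected scores for the same persons.
    variance: model variances of those scores.
    """
    squared = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Infit: information-weighted, so well-targeted persons dominate.
    infit = sum(squared) / sum(variance)
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(s / w for s, w in zip(squared, variance)) / len(squared)
    return infit, outfit

# Hypothetical values for 4 persons (illustrative only).
infit, outfit = item_fit(
    observed=[1, 0, 2, 1],
    expected=[1.2, 0.4, 1.5, 0.9],
    variance=[0.5, 0.3, 0.6, 0.5],
)
print(infit, outfit)
```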
Item hierarchical structure.
The empirical item difficulty hierarchical order supports construct validity of the instrument. Item difficulty hierarchical order was inspected via estimated item difficulty parameters. The item difficulty parameters were expressed in logits and then transformed to a scale of 0 to 100, with higher positive values indicating a more challenging task that usually is successfully accomplished or endorsed by patients with higher functioning abilities.
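The logit-to-100 transformation is a linear rescaling. The anchoring constants are not reported in the text, so the sketch below assumes a hypothetical logit range of -5 to 5 purely for illustration.

```python
def rescale_logit(logit, lo=-5.0, hi=5.0):
    """Linearly map a logit onto a 0-100 scale.

    lo and hi are ASSUMED logit values mapped to 0 and 100; the
    constants actually used in the study are not reported.
    """
    return 100.0 * (logit - lo) / (hi - lo)

# With the assumed range, a difficulty of 0 logits maps to the scale midpoint.
print(rescale_logit(0.0))
```

Because the transformation is linear and increasing, the item difficulty order is the same on both scales.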
In addition, average measures for rating scale categories were inspected via estimated average measure for each category (ie, category parameter). Linacre67 recommended that average measures be advanced monotonically within each rating scale category. Category disorders (ie, the average measure did not increase with category value as expected) may result from disorder in category definitions or less frequently observed intermediate categories representing narrow intervals on the latent variable,68 in which case a collapsing of categories needs to be considered.
Test precision.
We assessed the test precision using the test information function (TIF) and standard error (SE). The TIF49,69 indicates the level of information or score precision provided by the scale over the range of the construct's continuum and is the sum of the item information functions (IIF) at each patient ability level along the construct's continuum being measured (ie, bowel function). The amount of information provided by a scale at each person ability level is inversely related to the error with which functional status is estimated at that level of person ability.69 We plotted the TIF, generated using data from the FICQ items. The shape of TIF provides a visual comparison of the level of test precision for FICQ items. To quantify measure precision at each person ability level, we plotted averaged SEs of functional status estimates from the FICQ items, superposed with the TIF.
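The relationship between information and error can be sketched for a Rasch-family model, where the information an item provides at a given ability equals the model variance of its score and the SE is the reciprocal square root of the summed information. The function names and item thresholds are hypothetical, and values are in logits rather than on the 0–100 scale.

```python
import math

def pcm_probs(theta, thresholds):
    """Partial credit model category probabilities for one item."""
    cum = [0.0]
    for step in thresholds:
        cum.append(cum[-1] + (theta - step))
    top = max(cum)
    weights = [math.exp(c - top) for c in cum]
    total = sum(weights)
    return [w / total for w in weights]

def item_information(theta, thresholds):
    """Item information: variance of the item score at ability theta."""
    probs = pcm_probs(theta, thresholds)
    mean = sum(k * p for k, p in enumerate(probs))
    return sum((k - mean) ** 2 * p for k, p in enumerate(probs))

def scale_information(theta, items):
    """Test information function: sum of the item information functions."""
    return sum(item_information(theta, t) for t in items)

def standard_error(theta, items):
    """Standard error of the ability estimate (logits) at theta."""
    return 1.0 / math.sqrt(scale_information(theta, items))

# Hypothetical 3-item scale (step difficulties in logits).
items = [[-1.0, 1.0], [0.0], [-0.5, 0.5, 1.5]]
for theta in (-4.0, 0.0, 4.0):
    print(theta, scale_information(theta, items), standard_error(theta, items))
```

In this sketch the SE grows as ability moves away from the item thresholds, which is the general pattern the TIF and SE plots are meant to show.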
Results
Participants
Data from 644 patients with PFD symptoms related to bowel syndromes who were receiving outpatient rehabilitation in 64 clinics in 20 states (United States) were analyzed (Tab. 1). Patients were primarily female (91% female), with 77% being under 65 years of age (mean age=52, SD=16, range=18–91) and 76% having chronic bowel syndromes. Of the 644 patients analyzed, 24% had solely bowel problems, 34% had both urinary and bowel problems, 10% had both bowel and pelvic pain problems, and 32% had urinary and bowel problems as well as pelvic pain. Most patients (53%) had FC only. Twenty-eight percent of the patients had FI only, and 17% of the patients had both FC and FI. The percentage of missing data was 2%.
Table 1. Patient Characteristics at Rehabilitation Intake (N=644)
Unidimensionality and Local Independence
The EFA suggested that the 20 FICQ items contained 2 or 3 dominant dimensions or factors, with the first 3 factors explaining 34%, 13%, and 10% of the total variance. After inspecting the patterns, we found that the 2-factor structure offered better interpretability, splitting the 20 FICQ items into 2 subscales: items related to FI severity (items 1–15) and items related to FC severity (items 16–20) (see Appendix for item description). Items 4 (FI-PROBLEMSLEEP) and 12 (FI-SEX) loaded equally on the first 2 factors. Because both items were asked under the context of bowel leakage, they were kept within the FI subscale. Items 8 (FI-ACTIVITY), 18 (FC-LAXATIVE), and 19 (FC-STRAIN) had low factor loadings of 0.22, 0.32, and 0.32, respectively. We decided to remove the item with the lowest factor loading: item 8.
Preliminary analysis showed that several item pairs among the 15 FI items had correlation residuals of .20 or more, but the 5 FC items were free from local dependency. After inspecting the patterns, we found that item 4 (FI-PROBLEMSLEEP) had high positive correlation residuals (>.20) with items 9, 10, 14, and 15 and negative correlation residuals (<−.20) with items 1, 3, and 7. Because item 4 appeared to be redundant, we removed it.
The remaining 13 FI items and 5 FC items were reanalyzed. Remaining items met the majority of evaluation criteria. In the 13 FI items, the first 3 factors explained 50%, 11%, and 10% of data variance. Only 5 item pairs (5%) (out of 15 × (15−1)/2=105 item pairs) had absolute correlation residuals higher than desired (>.20). Fit statistics for 1-, 2-, and 3-factor models were CFI=0.94, 0.97, and 0.99; TLI=0.96, 0.97, and 0.99; and RMSEA=0.14, 0.11, and 0.08, respectively, supporting adequate unidimensionality. However, because the RMSEA for the 1-factor model (0.14) exceeded the criterion of 0.08, the results suggest that further purification of the items is needed to improve unidimensionality. In the 5 FC items, the first 3 factors explained 46%, 17%, and 16% of data variance. All item pairs (out of 5 × (5−1)/2=10 item pairs) had absolute correlation residuals less than .10. Fit statistics for 1- and 2-factor models were CFI=1.00 and 1.00, TLI=1.11 and 1.15, and RMSEA=0.00 and 0.00, respectively, supporting unidimensionality.
DIF
After removing items 4 and 8, results of DIF analysis using the 13 FI items and 5 FC items were suggestive of no DIF by sex, age group, acuity, and number of PFD comorbid conditions, except for 2 FI items. Item 12 (FI-SEX) showed uniform DIF by age group for relating to the extent of FI affecting sex life (the relative difference in beta coefficients in estimate was equal to 0.47, which was greater than 0.10), and item 15 (FI-DELAY) showed uniform DIF by number of PFD comorbid conditions (the relative difference in beta coefficients in estimate was equal to 0.15, which was greater than 0.10).
To account for the DIF effect, item 12 was split into 3 new items by age group: item 12_1 (age 18–44 years), item 12_2 (age 45–64 years), and item 12_3 (age 65 years or older). By the term “splitting,” we mean that for each new item (eg, item 12_1), responses for that item's age group (eg, 18–44 years) were as coded in the original dataset, whereas all other responses were set to missing. Similarly, item 15 was split into 3 new items by number of PFD comorbid conditions: item 15_1 (patient reported only 1 bowel problem), item 15_2 (patient reported 1 bowel problem and 1 other problem), and item 15_3 (patient reported urinary, bowel, and pelvic pain problems).
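The splitting operation can be sketched in a few lines. The function name, group labels, and response values are hypothetical; `None` stands in for a missing response.

```python
def split_item(responses, groups, levels):
    """Split one item into group-specific items.

    For each level, responses from persons in that group are kept as
    coded and all other responses are set to missing (None).
    """
    return {
        level: [r if g == level else None for r, g in zip(responses, groups)]
        for level in levels
    }

# Hypothetical responses to item 12 from 5 patients, with their age groups.
item12 = [3, 1, 4, 2, 5]
age_group = ["18-44", "45-64", "65+", "18-44", "65+"]
split12 = split_item(item12, age_group, ["18-44", "45-64", "65+"])
print(split12)
```

Each person contributes a response to exactly one of the new items, so the split items have fewer observations per response category than the original item.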
Item Fit
After removing items 4 and 8 and splitting items 12 and 15, each into 3 new items, we analyzed the 17 FI items and 5 FC items separately using Masters' 1-parameter PCM61 and WINSTEPS software.62 Tables 2 and 3 present the item characteristics of the FI items and FC items, respectively.
Table 2. Item Characteristics of the Fecal Incontinence and Constipation Questionnaire (FICQ): Fecal Incontinence (Leakage) Items
Table 3. Item Characteristics of the Fecal Incontinence and Constipation Questionnaire (FICQ): Fecal Constipation Items
Results showed that several FI items (items 15_3, 15_2, 15_1, 14, 12_3, and 12_2) and none of the FC items had infit statistics greater than the criterion of 1.2. Several FI items (items 15_3, 15_2, 14, 15_1, 5, 12_2, 10, 3, and 12_3) and none of the FC items showed high outfit statistics (>1.2). We hypothesized that those high fit statistics may have been due to low frequency counts on response categories after splitting. Although items 5 and 14 had high fit statistics, the magnitudes were small (<1.4). Item 3 had an outfit statistic of 1.46, indicating that an unexpected response pattern may exist toward reporting the impact of bowel leak during sleeping.
Item Hierarchical Structure
In Tables 2 and 3, items were ranked based on the item difficulty parameter, with more difficult items on the top. For FI items, mean item difficulty parameters ranged from 38.80 to 62.29 (0–100 scale). Items representing more difficult tasks to be endorsed by patients with high functioning abilities were related to delaying defecation after first feeling the urge for patients with 2 or 3 PFD comorbid conditions (items 15_2 and 15_3), confidence to control the bowel leakage problem (items 13 and 14), and impact of bowel leakage on life (item 11). Items representing easier tasks endorsed by patients with low functioning abilities were related to the impact of bowel leakage on sex life in patients over 65 years of age (item 12_3), frequency of bowel leakage when asleep (item 3), and frequency of bowel leakage when physically active (item 7).
For FC items, mean item difficulty parameters ranged from 28.07 to 63.30. Items representing more difficult tasks were related to straining during a bowel movement (item 19) and frequency of bowel movements (item 16). Easier items were related to the use of enemas per month (item 17) and the need to assist manually to have a bowel movement (item 20).
When the patient ability distributions were compared, both the FI and FC items were at the sample's overall ability level. Using the FI items, the mean sample ability level estimated was 55.44 (SD=11.53) (0–100 scale), which matched well with mean item difficulty of the FI items of 48.57 (0–100 scale). Nine patients (3%) obtained the maximum measure, and 1 patient (0%) obtained the minimum measure. Using the FC items, the mean sample ability level estimated was 55.61 (SD=13.55), slightly higher than the mean item difficulty of the FC items of 44.78, with 9 patients (2%) obtaining the maximum measure and none (0%) obtaining the minimum measure.
Disordered categories are denoted with an asterisk in Tables 2 and 3. Of all category parameters, 14 category parameters in the FI subscale were disordered. The majority of disordered category parameters occurred at response category 2 (ie, “once or less per week” in items 1, 3, 5, 6, and 7) and response category 4 (ie, “once a day” in items 5 and 7 and “a great deal” in item 12). In the FC subscale, 2 category parameters were disordered. Both disordered category parameters occurred at response category 2 (ie, “once a month” in items 17 and 18).
Test Precision
The Figure illustrates a bell-shaped TIF curve with 1 peak located at the middle ability level. The SE values were small in the middle range of patient ability measures, but they increased as ability measures (0–100 scale) became extreme. On average, for patients with ability measures between 20 and 80, the SE values were 2.5 for the FI subscale and 6.4 for the FC subscale.
Figure. Test information function (TIF) and standard error (SE) for fecal incontinence items (top panel) and fecal constipation items (bottom panel).
Discussion
The purpose of this study was to perform a preliminary psychometric analysis of the FICQ in patients seeking outpatient rehabilitation services due to PFD. Overall, results supported that the final FICQ items met IRT assumptions, were free from DIF for the variables assessed, were suitable for patients with PFD with no obvious ceiling or floor effect, and produced reliable and precise measurements of functional status related to bowel function. The data fit the PCM measurement model well. Findings from this study can be used to develop an initial pelvic-floor, body part–specific CAT application to be used in outpatient rehabilitation services.
To our knowledge, this is the first study designed to develop an IRT-based item bank suitable for CAT application for patients with PFD related to bowel dysfunction seeking outpatient rehabilitation therapy. Our results suggest that the FICQ represents an adequate first step in the development of multiple CATs (eg, urinary CAT, fecal CAT) for the PFD population particularly. To our knowledge, only a few studies70–72 have used IRT methods to examine the psychometric properties of bowel incontinence and constipation related items. Although the Imperforate Anus Psychosocial Questionnaire70 measures psychosocial domains (eg, emotion, cognition, self-determination, social relationships and school, physical function, and experiences of care) in children with imperforate anus, the FICQ emphasizes the bowel urgency and frequency as well as severity of the bowel symptoms in adult patients with PFD, and thus the IRT analysis results were not comparable.
In a final report to the Department of Health and Aging (Australian government), Sansoni et al71 analyzed 10 items (5 items from the Wexner scale and 5 from other fecal incontinence questions) using the IRT PCM. Six items were removed from an iterative analytical procedure because they did not meet IRT psychometric criteria. Among the remaining 4 items, the most difficult item was “Does stool leak so that you have to change your underwear?” and the easiest item was “Do you leak stool if you don't get to a toilet in time?” Hashimoto et al72 evaluated a proposed modification to the Fecal Incontinence Quality of Life Instrument (FIQL) using the IRT PCM. The FIQL consists of 29 questions in 4 categories: lifestyle, coping/behavior, depression/self-perception, and embarrassment. Because the FIQL items measure social and behavioral components, results were not comparable with our study.
To examine whether items function differently across different group memberships, we used a method developed by Crane et al56 for DIF detection by sex, age group, acuity, and number of PFD comorbid conditions. The results supported a clinically relevant age difference for sex life (item 12): bowel leakage problems affect the sex life of younger patients more than that of older patients. The results also supported that delaying defecation after first feeling the urge was more challenging for patients who had more PFD symptoms (eg, having urinary, bowel, and pelvic pain problems) than for patients who had fewer PFD symptoms (eg, having a solely bowel problem), which is clinically logical, supporting construct validity of the FICQ.
We used Masters' 1-parameter PCM to perform the initial examination of the psychometric properties of the FICQ items because it is a latent structure model for polytomous responses to a set of test items with a unique rating scale structure for individual items and because it requires a smaller sample size to obtain useful and stable item parameter estimates. Some may argue that the 2-parameter IRT model is preferred because the 1-parameter model is less general than the 2-parameter model in that it does not allow the discrimination to vary across items. In a follow-up analysis (available upon request), we analyzed the same dataset using Samejima's 2-parameter GRM58 using PARSCALE software.59 We found that most results were similar. The item hierarchical structure in the 13 FI items remained; however, item 1 became a more challenging item and item 2 became an easier item, compared with the estimate using the 1-parameter PCM. The item hierarchical structure in the 5 FC items remained.
Because the GRM allows items to have different slopes (ie, discrimination parameters), results from the GRM further showed that FI items 11, 2, and 13 and FC item 17 had larger positive slopes, indicating these items were able to discriminate among patients with different abilities (ie, high and low bowel function) better than other items. However, because data splitting to adjust for DIF resulted in low frequency counts on response categories in item 12 (after adjusting for age group) and item 15 (after adjusting for number of PFD comorbid conditions), estimated parameters were unable to achieve convergence for these splitting items under the 2-parameter GRM model. Further, we found that the 2-parameter GRM model failed to estimate several category thresholds for items with skewed responses (eg, 91% of patients responded “seldom or never” to item 17). Therefore, the 1-parameter PCM measurement model seemed to be a better choice, although the item discrimination parameters were varied among FICQ items. Future studies are needed to validate our results using the 2-parameter model with a larger sample size.
Although our preliminary analyses supported the use of FICQ in patients seeking treatment with FI and FC in outpatient rehabilitation therapy services, test developers should endeavor to improve the quality of existing FICQ items and expand the content coverage (especially for the FC dimension). The clinical implications of the use of laxatives and strain during bowel movement having low factor loadings on the FC scale are unclear. From the clinical perspective, the negative correlation residuals between item 4 (FI-PROBLEMSLEEP) and items 1, 3, and 7 also are unclear, which warrants future investigations. In our initial attempt to interpret the results, 3 therapists provided feedback. Therapists commented that it is rare for a patient to have leakage during sleep and that it is a functional complaint that pelvic-floor therapy may have little to no impact on because the cause of leaking during sleep may be nonmusculoskeletal in nature. For the level of activity that causes the bowel to leak, therapists suggested that patients with PFD do not leak as much with coughing and sneezing as they might with stress urinary incontinence. Fecal incontinence during the day may suggest a result of incomplete evacuation, poor stool form, poor sensory awareness, and poor muscle tone and endurance that fatigues from the hours of just being up against gravity all day rather than a specific activity level. This suggestion may explain why the item 8 (FI-ACTIVITY) showed low factor loading. However, whether the items should be removed is equivocal where some therapists consider that these 2 items (items 4 and 8) are valuable for therapists to know to formulate the treatment plan. For instance, therapists may want to know what activity is causing leakage and whether the leakage occurs when a patient is not aware (is asleep). In some cases, therapists may further evaluate the patient's diet or assess his or her sensation or pelvic awareness.
There were several disordered rating scale categories. As described by Linacre,67 rating scales are used as a means of extracting more information out of an item compared with dichotomous items. A rating scale category should have a clear and clinically logical operational definition, with higher categories consistently representing either higher or lower functioning. The majority of the rating scale definitions of the FICQ items are based on frequency (ie, continuously, several times per day, and once per day) or severity (ie, a serious problem, quite a problem, a bit of problem, not a problem), which suggests that disordered categories are less likely to result from disordered category definitions and more likely to result from categories representing narrow intervals on the latent trait being measured. Therefore, we collapsed rating scale categories that had 6-, 5-, and 4-point rating levels down to 3-point levels (ie, rescoring 1, 2, 3, 4, 5, and 6 to 1, 1, 2, 2, 3, and 3, respectively; rescoring 1, 2, 3, 4, and 5 to 1, 2, 2, 2, and 3, respectively; and rescoring 1, 2, 3, and 4 to 1, 2, 2, and 3, respectively). This process decreased the number of disordered parameters from 16 to 3. The remaining 3 disordered parameters were among items that had a skewed score distribution (item 3) or low frequency counts due to data splitting (items 12_3 and 15_3). Despite this improvement, we decided to retain the original rating scale structure because the current sample size was relatively small for stable parameter estimation. Future improvements are needed to refine the rating scale categories by modifying the operational definitions of rating scale categories or collapsing categories using a larger sample.
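The exploratory rescoring described above can be written compactly as lookup tables. The following Python sketch applies the 3 collapsing rules stated in the text; the function and variable names are hypothetical, and treating an unanswered item as `None` is our assumption about how missing responses might be represented.

```python
# Rescoring rules as stated in the text, keyed by the original number
# of rating scale categories for the item.
COLLAPSE_MAPS = {
    6: {1: 1, 2: 1, 3: 2, 4: 2, 5: 3, 6: 3},
    5: {1: 1, 2: 2, 3: 2, 4: 2, 5: 3},
    4: {1: 1, 2: 2, 3: 2, 4: 3},
}


def collapse_response(score, n_categories):
    """Map one original item score onto the collapsed 3-point scale.

    A missing response (None, our assumed representation) stays missing.
    """
    if score is None:
        return None
    if n_categories not in COLLAPSE_MAPS:
        raise ValueError(f"no collapsing rule for {n_categories} categories")
    mapping = COLLAPSE_MAPS[n_categories]
    if score not in mapping:
        raise ValueError(f"score {score} is outside 1..{n_categories}")
    return mapping[score]


def collapse_responses(scores, n_categories):
    """Collapse a list of scores for an item with n_categories levels."""
    return [collapse_response(s, n_categories) for s in scores]


# Illustrative (made-up) responses to a 5-point item, one unanswered.
print(collapse_responses([1, 4, None, 5, 2], 5))
```

Keeping the rules in a single table makes it straightforward to try alternative collapsing schemes in a larger sample without touching the rescoring logic.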
For the test information function (TIF), there was a concern that it might be overestimated in the FI subscale if it was calculated based on 17 items (after removing 2 items and adjusting for DIF by splitting item 12 by age group and item 15 by number of comorbidities) rather than the original 15 items (original items 1–15). In a follow-up analysis, we compared TIF curves between the 2 scenarios and found that the curves were very similar in shape and magnitude.
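As a rough illustration of how such a comparison can be set up, the Python sketch below computes a test information function under a partial credit model as the sum of item information values, where the information of each item at a given trait level equals the variance of its score. The step difficulties and the 2 item sets are hypothetical and are not the FICQ estimates; the function names are ours, and this is a sketch of the general technique rather than the software used in this study.

```python
import math


def pcm_probs(theta, steps):
    """Category probabilities under a partial credit model.

    theta: latent trait level (logit scale)
    steps: step difficulties for the item (one fewer than categories)
    """
    # Cumulative sums of (theta - step); the lowest category is fixed at 0.
    logits = [0.0]
    for step in steps:
        logits.append(logits[-1] + (theta - step))
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def item_information(theta, steps):
    """Item information at theta: the variance of the item score."""
    probs = pcm_probs(theta, steps)
    mean = sum(k * p for k, p in enumerate(probs))
    return sum((k - mean) ** 2 * p for k, p in enumerate(probs))


def total_information(theta, items):
    """Test information at theta: sum of item information over items."""
    return sum(item_information(theta, steps) for steps in items)


def standard_error(theta, items):
    """Standard error of the trait estimate implied by the information."""
    return 1.0 / math.sqrt(total_information(theta, items))


# Two hypothetical item sets to compare across a grid of trait levels.
scenario_a = [[-1.0, 0.5], [0.0, 1.0], [-0.5, 0.0, 1.5]]
scenario_b = scenario_a + [[0.2, 0.8]]
for theta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    info_a = total_information(theta, scenario_a)
    info_b = total_information(theta, scenario_b)
    print(f"theta={theta:+.1f}  TIF A={info_a:.3f}  TIF B={info_b:.3f}")
```

Plotting the 2 sets of values against the trait level gives curves that can be compared for shape and magnitude. Note that when an item is split by subgroup, each patient responds to only one of the split versions, which is one reason summing information over all split items could overstate precision for any individual patient.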
There were several limitations of this study. Because this study was a secondary analysis of prospectively collected data obtained via a proprietary database management company, the researchers did not control the data collection procedure or the specific timetable for patient assessment, and no special training was provided to the therapists prior to data collection. Additionally, generalizability of results may be limited because participating clinics may differ from clinics that do not collect data using FOTO. Because data were collected in busy outpatient rehabilitation clinics during routine practice, PFD items were selected through the computer-based administrative branching algorithm to reduce the respondent burden. With this data collection approach, the presence of missing data due to unanswered items makes statistical analyses challenging. Last, due to data unavailability, we did not classify patients using medical diagnostic criteria. For example, functional constipation is commonly classified using the Rome III diagnostic criteria, and fecal incontinence can be classified into loose FI and well-formed FI. Future studies should endeavor to collect clinical diagnostic data or link to electronic medical records so that more research questions can be addressed.
Conclusion
Preliminary analyses supported sound psychometric properties of the FICQ items and its use in patients with PFD seeking treatment in outpatient rehabilitation therapy services. Findings from this study will be used to develop an initial pelvic-floor–specific CAT application to be used in outpatient rehabilitation therapy services. More FC items should be added to increase the content coverage. Additional studies are needed to validate the results using a larger dataset.
Appendix.
Fecal Incontinence and Constipation Questionnaire (FICQ)a
a The Fecal Incontinence and Constipation Questionnaire may not be used or reproduced without written permission from Focus on Therapeutic Outcomes, Inc.
Footnotes
Dr Wang provided concept/idea/research design and data analysis. Dr Wang and Dr Yen provided writing. Mr Werneke provided data collection and project management. Dr Deutscher, Dr Yen, and Mr Werneke provided consultation (including review of manuscript before submission). The authors thank Andrea Goldberger, PT, and Joan Rigg, PT, OCS, from St. Luke's–Elks Rehab; Katrina J. Heath, PT, DPT, from Carolinas Rehabilitation–Matthews; and several anonymous therapists for their constructive comments and valuable inputs after reviewing the questionnaire and manuscript.
This project was approved by the Institutional Review Board for the Protection of Human Subjects at the University of Wisconsin–Milwaukee.
This research was presented at the 7th World Congress of the International Society of Physical and Rehabilitation Medicine (ISPRM2013); June 16–20, 2013; Beijing, China.
Dr Wang acknowledges that she is a consultant of Focus On Therapeutic Outcomes, Inc (FOTO), the database management company that manages the data analyzed in the study. Analyses of data such as the analyses presented in the article are part of Dr Wang's regular daily work activities. Mr Mioduski also is an employee of FOTO. He programmed the software that was used to develop the computerized adaptive tests (CATs) and the software that was used to collect the data and managed the data from the CATs once collected. These activities are part of Mr Mioduski's regular daily work activities. Dr Deutscher and Dr Yen are independent of FOTO and were involved in the design of the project and extensive review of the manuscript.
- Received February 15, 2013.
- Accepted October 7, 2013.
- © 2014 American Physical Therapy Association