Concurrent Validity of Accelerometry and the Observation System for Recording Physical Activity in Children – Preschool
- Presented on 03/01/2011
Background To stem the rise in childhood obesity, researchers are increasingly targeting increased physical activity (PA) in very young children (i.e., 3-5 y). Interventions aimed at this age group are few but growing in number and have typically focused on evaluating changes in equipment, staff development, programs, and policies. Rarer are evaluations of how permanent changes in physical topography affect PA. Measuring PA in this age group differs from measuring it in other age groups, in part because young children cannot self-report PA. Immature motor skills and short, transient episodes characterize PA intensity and duration, respectively. Metrics that are sensitive to these characteristics are required for accurate PA measurement. To date, accelerometry, systematic observation, and pedometry have been most frequently used to directly quantify PA in preschool children. There is, however, equivocal evidence concerning the equivalence of these methods. Specifically, accelerometer count thresholds and epoch lengths have recently been questioned in light of the resulting overestimation of no- and light-intensity PA compared to systematic observation using the Children’s Activity Rating Scale (Oliver et al., 2010).
Objectives In conjunction with evaluating the effects on PA of the redesign and re-purposing of an existing outdoor preschool play space, we assessed the concurrent validity of measuring PA with accelerometers and a newer systematic observation instrument – the Observation System for Recording Physical Activity in Children – Preschool (OSRAC-P; Brown et al., 2006). The OSRAC-P includes observational categories that assess PA type and context, social interaction, and location that are not afforded by accelerometry alone.
Methods Children (n=57; 32 girls; age, M=56.9±4.0 months; 84% normal weight) attending a university preschool were observed during 20-minute recess periods while wearing an Actigraph GT3X accelerometer set for 15-s recording epochs. A total of 140 paired observation-accelerometer recordings were collected over an 8-month period. Accelerometer data were converted to percentages of time spent in sedentary, light, moderate, and vigorous PA according to published cutpoints (Sirard et al., 2005). Observation (5-s observe, 25-s record intervals) data were converted to percentages of time spent in the same four intensity categories. Interrater reliability for PA level on the OSRAC-P was assessed 15 times during the latter stages of observer training (M=86.4%, SD=5.5%) and the data collection phase (M=84.3%, SD=3.7%). Concurrent validity was assessed using two-proportion z-tests and Bland-Altman plots.
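The conversion of epoch-level accelerometer counts to percentages of time per intensity category can be sketched as follows. This is a minimal illustration, not the study's processing code: the function name and the threshold values are hypothetical placeholders, and the published Sirard et al. (2005) cutpoints would need to be substituted for a real analysis.

```python
from bisect import bisect_right
from collections import Counter

LEVELS = ("sedentary", "light", "moderate", "vigorous")

# Placeholder thresholds (counts per 15-s epoch), NOT the published cutpoints.
# Each value is the assumed lower bound of light, moderate, and vigorous PA.
PLACEHOLDER_CUTPOINTS = (100, 500, 1000)


def percent_time_by_intensity(epoch_counts, cutpoints):
    """Return the percentage of epochs falling in each intensity category.

    cutpoints holds three ascending lower bounds (light, moderate, vigorous);
    counts below the first bound are classified as sedentary.
    """
    tally = Counter(LEVELS[bisect_right(cutpoints, c)] for c in epoch_counts)
    total = len(epoch_counts)
    return {level: 100.0 * tally[level] / total for level in LEVELS}


# Made-up counts for eight epochs; a 20-minute recess at 15-s epochs
# would supply 80 such values.
example_counts = [0, 50, 100, 600, 1200, 0, 0, 0]
print(percent_time_by_intensity(example_counts, PLACEHOLDER_CUTPOINTS))
```

The same function could be applied to coded observation intervals if each interval were first mapped to one of the four intensity categories.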
Results PA intensities (% ± 95% CI; O=OSRAC-P, A=accelerometer) included: sedentary (O: 57.8±8.2; A: 65.9±7.8; Z=1.40, p=.163), light (O: 27.9±7.4; A: 25.1±8.2; Z=-0.53, p=.596), moderate (O: 11.1±5.2; A: 6.8±4.1; Z=-1.26, p=.208), and vigorous (O: 3.3±2.9; A: 2.2±2.5; Z=-0.56, p=.574). Overlap of confidence intervals indicated that the two methods were not statistically different for estimating PA intensity, although the light- and vigorous-intensity categories appeared more concordant than the moderate-intensity and sedentary categories. Bland-Altman differences (%O – %A) and limits of agreement were: sedentary (-.095±.306), light (.038±.254), moderate (.048±.143), and vigorous (.010±.137). Equivalence of measures based on interpretation of Bland-Altman plots requires that three criteria be met: (1) the average discrepancy between methods is clinically small; (2) the difference between methods does not tend to change as the average increases; and (3) the scatter around the bias line remains constant as the average gets higher. Intensity categories met these criteria in the following manner: sedentary (criterion 2), light (criteria 1-3), moderate (criterion 1), and vigorous (criterion 1).
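The two statistics reported above can be sketched in a few lines. This is a hedged illustration only: it assumes a pooled two-proportion z-test with n=140 recordings per method (the abstract gives the number of paired recordings but does not state the n used in the tests), it assumes limits of agreement of 1.96 SD around the mean difference, and the function names are hypothetical.

```python
import math


def two_proportion_z(p_obs, p_acc, n_obs, n_acc):
    """Pooled two-proportion z statistic (accelerometer minus observation)
    and its two-sided p-value."""
    pooled = (p_obs * n_obs + p_acc * n_acc) / (n_obs + n_acc)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_obs + 1 / n_acc))
    z = (p_acc - p_obs) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


def bland_altman(obs, acc):
    """Mean difference (bias) and half-width of the 95% limits of agreement."""
    diffs = [o - a for o, a in zip(obs, acc)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias, 1.96 * sd


# Sedentary comparison from the Results: 57.8% (OSRAC-P) vs. 65.9%
# (accelerometer), with n=140 per method as an assumption.
z, p = two_proportion_z(0.578, 0.659, 140, 140)
print(f"Z={z:.2f}, p={p:.3f}")

# Made-up paired proportions, only to show the call.
bias, half_width = bland_altman([0.5, 0.6, 0.7], [0.6, 0.6, 0.8])
print(f"bias={bias:.3f}, limits of agreement = bias ± {half_width:.3f}")
```

Under the stated n=140 assumption, the sedentary comparison comes out close to the reported Z=1.40 and p=.163. Criteria 2 and 3 would still need to be judged from the plot of differences against pair averages, which this sketch does not draw.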
Conclusions Both objective measures have limitations, and though it is not possible to determine which is more accurate, the present study suggests potential improvements in methods. The OSRAC-P employs a 5-s sampling interval, which may be more sensitive at detecting higher-intensity PA than accelerometry, which by comparison samples over 15-s epochs. Additionally, some activity types listed on the OSRAC-P (e.g., climbing, riding) cannot be as readily detected by accelerometry. Because the OSRAC-P was originally designed to capture only 10 s of PA during each 1 min of observation, a sizeable duration of PA is not coded. Yet the instrument’s long recording interval allows decisions to be made about intensity level, type, context, location, initiator, and group composition, which can be used to better understand variables associated with PA intensity that accelerometry alone cannot provide. The short bursts and the variety of very young children’s movement necessitate refinement in the measurement of PA. For accelerometry, activity count cutpoints could be developed for shorter epoch durations, and for the OSRAC-P the latency between successive observation intervals could be shortened if adequate levels of reliability are retained. (This may require deletion of several OSRAC-P categories depending on the variables of interest in a particular study.)
Support This study was partially funded by the California Association of Health, Physical Education, Recreation & Dance Foundation for the Promotion of Healthy Lifestyles.