University of Illinois Urbana-Champaign, Champaign, IL
Classification of Cardiovascular Risk Using Accelerometer Data and Machine Learning Algorithms
- Presented on May 30, 2014
Background: Physical activity patterns captured by accelerometers have been used to classify activity type with machine learning (ML) algorithms. ML may also be applied to accelerometer data to predict cardiovascular (CV) health risk directly. While many ML algorithms exist, decision trees and random forests (RF) show great promise based on classification accuracy. Decision trees are efficient constructive search algorithms that develop rules for categorizing data based on the most informative features. RF are ensemble classifiers that consist of bootstrap-aggregated (bagged) decision trees.
Purpose: Classify individuals by CV risk score based on accelerometer-derived physical activity using ML methods.
Methods: National Health and Nutrition Examination Survey (NHANES) 2003-2004 data for 2,158 participants (50% male; aged 30-85 years, M = 56.46, SD = 15.81) were analyzed. For each participant, a 10-year Reynolds CV risk score (Ridker et al., 2007) was calculated, and risk greater than 10% was considered “high” risk. Average daily moderate, vigorous, and moderate-to-vigorous physical activity in 1-minute and 10-minute bouts, obtained from ActiGraph monitors, was input into decision tree and RF classification algorithms to classify participants into “high” and “low” risk categories.
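The labeling and splitting steps described in the Methods can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not name the split criterion, so Gini impurity is assumed here; the function names (`label_risk`, `gini`, `best_split`) are hypothetical; and the six data points are invented toy values, not NHANES data. It shows how a single decision-tree split on one activity feature could be chosen.

```python
from typing import List, Tuple


def label_risk(score: float, cutoff: float = 0.10) -> int:
    """Return 1 ("high" risk) if the 10-year risk score exceeds the cutoff, else 0."""
    return 1 if score > cutoff else 0


def gini(labels: List[int]) -> float:
    """Gini impurity of binary labels (assumed criterion, not stated in the abstract)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(minutes: List[float], labels: List[int]) -> Tuple[float, float]:
    """Find the threshold on one feature that minimizes weighted Gini impurity.

    Returns (threshold, weighted_impurity). Values below the threshold go to
    one branch, values at or above it go to the other.
    """
    n = len(minutes)
    best = (float("nan"), gini(labels))  # baseline: no split at all
    values = sorted(set(minutes))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2.0
        below = [y for x, y in zip(minutes, labels) if x < t]
        above = [y for x, y in zip(minutes, labels) if x >= t]
        weighted = (len(below) * gini(below) + len(above) * gini(above)) / n
        if weighted < best[1]:
            best = (t, weighted)
    return best


# Toy, made-up values for illustration only (not study data).
risk_scores = [0.02, 0.04, 0.08, 0.12, 0.20, 0.35]
moderate_minutes = [25.0, 18.0, 12.0, 8.0, 5.0, 2.0]

labels = [label_risk(s) for s in risk_scores]
threshold, impurity = best_split(moderate_minutes, labels)
```

A full decision tree repeats this search recursively within each branch, and an RF would grow many such trees on bootstrap samples of the participants and combine their votes.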
Results: Reynolds risk scores ranged from 0 to 91% (M = 11%, SD = 5%); 735 (34%) participants in the sample had a “high” risk score greater than 10%. The RF classifier identified average daily moderate activity in 1-minute bouts, followed by average daily moderate or vigorous activity in 1-minute bouts, as the best predictors of CV risk. The algorithm classified participants as “high” or “low” risk with 73% accuracy. Classification accuracy for “low” risk individuals (85%) was higher than for “high” risk individuals (48%). The decision tree pruned to 4 branches indicated that participants averaging 10.18 minutes or more per day of moderate activity in 1-minute bouts were classified as “low” risk with 82% certainty.
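As a rough consistency check on the reported figures, overall accuracy can be recomputed as the class-size-weighted average of the per-class accuracies. The sketch below assumes that the 85% and 48% figures apply to all 1,423 “low” and 735 “high” risk participants respectively; the function name `overall_accuracy` is hypothetical.

```python
def overall_accuracy(n_low: int, n_high: int, acc_low: float, acc_high: float) -> float:
    """Overall accuracy as the class-size-weighted mean of per-class accuracies."""
    correct = n_low * acc_low + n_high * acc_high
    return correct / (n_low + n_high)


n_total = 2158             # participants in the sample
n_high = 735               # "high" risk (score > 10%)
n_low = n_total - n_high   # "low" risk

acc = overall_accuracy(n_low, n_high, acc_low=0.85, acc_high=0.48)
print(f"{acc:.3f}")  # prints 0.724
```

The result, about 72%, is close to the reported 73%; the small gap may reflect rounding of the per-class percentages.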
Conclusion: While better extraction of features is needed to improve the classification accuracy, decision trees and RF classifiers offer promising insight into recognizing CV status directly from accelerometer data.