Statistical and Machine Learning Models for Classification of Human Wear and Delivery Days in Accelerometry Data

Accelerometers are increasingly being used in biomedical research, but the analysis of accelerometry data is often complicated by both the massive size of the datasets and the collection of unwanted data from the process of delivery to study participants. Current methods for removing delivery data i...

Bibliographic Details
Main Authors: Ryan Moore, Kristin R. Archer, Leena Choi
Format: Article
Language: English
Published: MDPI AG 2021-04-01
Series: Sensors
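Subjects: accelerometry; statistical learning; machine learning; predictive modeling; neural networks; physical activity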
Online Access: https://www.mdpi.com/1424-8220/21/8/2726
id doaj-d5e9e83c64d84412bef03b98650f02eb
record_format Article
spelling doaj-d5e9e83c64d84412bef03b98650f02eb 2021-04-13T23:01:37Z eng MDPI AG Sensors 1424-8220 2021-04-01 21 2726 2726 10.3390/s21082726
Statistical and Machine Learning Models for Classification of Human Wear and Delivery Days in Accelerometry Data
Ryan Moore 0; Kristin R. Archer 1; Leena Choi 2
Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN 37232, USA
Department of Orthopaedic Surgery, Center for Musculoskeletal Research, Vanderbilt University Medical Center, Nashville, TN 37232, USA
Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN 37232, USA
Accelerometers are increasingly being used in biomedical research, but the analysis of accelerometry data is often complicated by both the massive size of the datasets and the collection of unwanted data from the process of delivery to study participants. Current methods for removing delivery data involve arduous manual review of dense datasets. We aimed to develop models for the classification of days in accelerometry data as activity from human wear or the delivery process. These models can be used to automate the cleaning of accelerometry datasets that are adulterated with activity from delivery. We developed statistical and machine learning models for the classification of accelerometry data in a supervised learning context using a large accelerometry dataset labeled for human activity and delivery. Model performance was assessed and compared using Monte Carlo cross-validation. We found that a hybrid convolutional recurrent neural network performed best in the classification task with an F1 score of 0.960, but simpler models such as logistic regression and random forest also had excellent performance with F1 scores of 0.951 and 0.957, respectively. The best-performing models and related data processing techniques are made publicly available in the R package PhysicalActivity.
https://www.mdpi.com/1424-8220/21/8/2726
accelerometry; statistical learning; machine learning; predictive modeling; neural networks; physical activity
collection DOAJ
language English
format Article
sources DOAJ
author Ryan Moore
Kristin R. Archer
Leena Choi
spellingShingle Ryan Moore
Kristin R. Archer
Leena Choi
Statistical and Machine Learning Models for Classification of Human Wear and Delivery Days in Accelerometry Data
Sensors
accelerometry
statistical learning
machine learning
predictive modeling
neural networks
physical activity
author_facet Ryan Moore
Kristin R. Archer
Leena Choi
author_sort Ryan Moore
title Statistical and Machine Learning Models for Classification of Human Wear and Delivery Days in Accelerometry Data
title_short Statistical and Machine Learning Models for Classification of Human Wear and Delivery Days in Accelerometry Data
title_full Statistical and Machine Learning Models for Classification of Human Wear and Delivery Days in Accelerometry Data
title_fullStr Statistical and Machine Learning Models for Classification of Human Wear and Delivery Days in Accelerometry Data
title_full_unstemmed Statistical and Machine Learning Models for Classification of Human Wear and Delivery Days in Accelerometry Data
title_sort statistical and machine learning models for classification of human wear and delivery days in accelerometry data
publisher MDPI AG
series Sensors
issn 1424-8220
publishDate 2021-04-01
description Accelerometers are increasingly being used in biomedical research, but the analysis of accelerometry data is often complicated by both the massive size of the datasets and the collection of unwanted data from the process of delivery to study participants. Current methods for removing delivery data involve arduous manual review of dense datasets. We aimed to develop models for the classification of days in accelerometry data as activity from human wear or the delivery process. These models can be used to automate the cleaning of accelerometry datasets that are adulterated with activity from delivery. We developed statistical and machine learning models for the classification of accelerometry data in a supervised learning context using a large accelerometry dataset labeled for human activity and delivery. Model performance was assessed and compared using Monte Carlo cross-validation. We found that a hybrid convolutional recurrent neural network performed best in the classification task with an F1 score of 0.960, but simpler models such as logistic regression and random forest also had excellent performance with F1 scores of 0.951 and 0.957, respectively. The best-performing models and related data processing techniques are made publicly available in the R package PhysicalActivity.
topic accelerometry
statistical learning
machine learning
predictive modeling
neural networks
physical activity
url https://www.mdpi.com/1424-8220/21/8/2726
work_keys_str_mv AT ryanmoore statisticalandmachinelearningmodelsforclassificationofhumanwearanddeliverydaysinaccelerometrydata
AT kristinrarcher statisticalandmachinelearningmodelsforclassificationofhumanwearanddeliverydaysinaccelerometrydata
AT leenachoi statisticalandmachinelearningmodelsforclassificationofhumanwearanddeliverydaysinaccelerometrydata
_version_ 1721528514834857984