Modeling Road Accident Severity with Comparisons of Logistic Regression, Decision Tree and Random Forest

To reduce the damage caused by road accidents, researchers have applied different techniques to explore correlated factors and develop efficient prediction models. The main purpose of this study is to use one statistical and two nonparametric data mining techniques, namely, logistic regression (LR),...

Bibliographic Details
Main Authors: Mu-Ming Chen, Mu-Chen Chen
Format: Article
Language: English
Published: MDPI AG 2020-05-01
Series: Information
Subjects: transportation, road accident severity, logistic regression, decision tree, random forest
Online Access: https://www.mdpi.com/2078-2489/11/5/270
id doaj-9c99fcbc29a046b09d26659b1d1bd322
record_format Article
spelling doaj-9c99fcbc29a046b09d26659b1d1bd322
2020-11-25T03:32:05Z
eng
MDPI AG
Information
2078-2489
2020-05-01
11
270
270
10.3390/info11050270
Modeling Road Accident Severity with Comparisons of Logistic Regression, Decision Tree and Random Forest
Mu-Ming Chen 0
Mu-Chen Chen 1
Department of Transportation and Logistics Management, National Chiao Tung University, Hsinchu City 30010, Taiwan
Department of Transportation and Logistics Management, National Chiao Tung University, Hsinchu City 30010, Taiwan
To reduce the damage caused by road accidents, researchers have applied different techniques to explore correlated factors and develop efficient prediction models. The main purpose of this study is to use one statistical and two nonparametric data mining techniques, namely, logistic regression (LR), classification and regression tree (CART), and random forest (RF), to compare their prediction capability, identify the significant variables (identified by LR) and important variables (identified by CART or RF) that are strongly correlated with road accident severity, and distinguish the variables that have significant positive influence on prediction performance. In this study, three prediction performance evaluation measures, accuracy, sensitivity and specificity, are used to find the best integrated method which consists of the most effective prediction model and the input variables that have higher positive influence on accuracy, sensitivity and specificity.
https://www.mdpi.com/2078-2489/11/5/270
transportation
road accident severity
logistic regression
decision tree
random forest
collection DOAJ
language English
format Article
sources DOAJ
author Mu-Ming Chen
Mu-Chen Chen
spellingShingle Mu-Ming Chen
Mu-Chen Chen
Modeling Road Accident Severity with Comparisons of Logistic Regression, Decision Tree and Random Forest
Information
transportation
road accident severity
logistic regression
decision tree
random forest
author_facet Mu-Ming Chen
Mu-Chen Chen
author_sort Mu-Ming Chen
title Modeling Road Accident Severity with Comparisons of Logistic Regression, Decision Tree and Random Forest
title_sort modeling road accident severity with comparisons of logistic regression, decision tree and random forest
publisher MDPI AG
series Information
issn 2078-2489
publishDate 2020-05-01
description To reduce the damage caused by road accidents, researchers have applied different techniques to explore correlated factors and develop efficient prediction models. The main purpose of this study is to use one statistical and two nonparametric data mining techniques, namely, logistic regression (LR), classification and regression tree (CART), and random forest (RF), to compare their prediction capability, identify the significant variables (identified by LR) and important variables (identified by CART or RF) that are strongly correlated with road accident severity, and distinguish the variables that have significant positive influence on prediction performance. In this study, three prediction performance evaluation measures, accuracy, sensitivity and specificity, are used to find the best integrated method which consists of the most effective prediction model and the input variables that have higher positive influence on accuracy, sensitivity and specificity.
topic transportation
road accident severity
logistic regression
decision tree
random forest
url https://www.mdpi.com/2078-2489/11/5/270
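
The description field names three evaluation measures: accuracy, sensitivity and specificity. As a reading aid, the minimal sketch below shows how these are commonly computed from a binary confusion matrix. It is not the authors' code. The function name `evaluate`, the label encoding (1 = severe, 0 = not severe) and the toy labels are all assumptions made for illustration, and no figures here come from the study.

```python
import math


def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity and specificity for binary labels.

    `positive` marks the class of interest (assumed here: 1 = severe accident).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1

    total = tp + tn + fp + fn
    return {
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / total if total else math.nan,
        # Share of actual positives that were detected (true positive rate).
        "sensitivity": tp / (tp + fn) if (tp + fn) else math.nan,
        # Share of actual negatives that were detected (true negative rate).
        "specificity": tn / (tn + fp) if (tn + fp) else math.nan,
    }


# Made-up toy labels, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
scores = evaluate(y_true, y_pred)
print(scores)
```

Reporting the three measures side by side matters when severity classes are imbalanced: a model can reach high accuracy by predicting the majority class while its sensitivity for the rarer class stays low.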