Modeling Road Accident Severity with Comparisons of Logistic Regression, Decision Tree and Random Forest
To reduce the damage caused by road accidents, researchers have applied different techniques to explore correlated factors and develop efficient prediction models. The main purpose of this study is to use one statistical and two nonparametric data mining techniques, namely, logistic regression (LR),...
Main Authors: | Mu-Ming Chen, Mu-Chen Chen |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2020-05-01 |
Series: | Information |
Subjects: | transportation, road accident severity, logistic regression, decision tree, random forest |
Online Access: | https://www.mdpi.com/2078-2489/11/5/270 |
id | doaj-9c99fcbc29a046b09d26659b1d1bd322 |
---|---|
record_format | Article |
spelling | doaj-9c99fcbc29a046b09d26659b1d1bd322; 2020-11-25T03:32:05Z; eng; MDPI AG; Information; 2078-2489; 2020-05-01; 11; 270; 270; 10.3390/info11050270; Modeling Road Accident Severity with Comparisons of Logistic Regression, Decision Tree and Random Forest; Mu-Ming Chen (0); Mu-Chen Chen (1); Department of Transportation and Logistics Management, National Chiao Tung University, Hsinchu City 30010, Taiwan; Department of Transportation and Logistics Management, National Chiao Tung University, Hsinchu City 30010, Taiwan; To reduce the damage caused by road accidents, researchers have applied different techniques to explore correlated factors and develop efficient prediction models. The main purpose of this study is to use one statistical and two nonparametric data mining techniques, namely, logistic regression (LR), classification and regression tree (CART), and random forest (RF), to compare their prediction capability, identify the significant variables (identified by LR) and important variables (identified by CART or RF) that are strongly correlated with road accident severity, and distinguish the variables that have significant positive influence on prediction performance. In this study, three prediction performance evaluation measures, accuracy, sensitivity and specificity, are used to find the best integrated method which consists of the most effective prediction model and the input variables that have higher positive influence on accuracy, sensitivity and specificity.; https://www.mdpi.com/2078-2489/11/5/270; transportation; road accident severity; logistic regression; decision tree; random forest |
collection | DOAJ |
language | English |
format | Article |
sources | DOAJ |
author | Mu-Ming Chen; Mu-Chen Chen |
spellingShingle | Mu-Ming Chen; Mu-Chen Chen; Modeling Road Accident Severity with Comparisons of Logistic Regression, Decision Tree and Random Forest; Information; transportation; road accident severity; logistic regression; decision tree; random forest |
author_facet | Mu-Ming Chen; Mu-Chen Chen |
author_sort | Mu-Ming Chen |
title | Modeling Road Accident Severity with Comparisons of Logistic Regression, Decision Tree and Random Forest |
title_short | Modeling Road Accident Severity with Comparisons of Logistic Regression, Decision Tree and Random Forest |
title_full | Modeling Road Accident Severity with Comparisons of Logistic Regression, Decision Tree and Random Forest |
title_fullStr | Modeling Road Accident Severity with Comparisons of Logistic Regression, Decision Tree and Random Forest |
title_full_unstemmed | Modeling Road Accident Severity with Comparisons of Logistic Regression, Decision Tree and Random Forest |
title_sort | modeling road accident severity with comparisons of logistic regression, decision tree and random forest |
publisher | MDPI AG |
series | Information |
issn | 2078-2489 |
publishDate | 2020-05-01 |
description | To reduce the damage caused by road accidents, researchers have applied different techniques to explore correlated factors and develop efficient prediction models. The main purpose of this study is to use one statistical and two nonparametric data mining techniques, namely, logistic regression (LR), classification and regression tree (CART), and random forest (RF), to compare their prediction capability, identify the significant variables (identified by LR) and important variables (identified by CART or RF) that are strongly correlated with road accident severity, and distinguish the variables that have significant positive influence on prediction performance. In this study, three prediction performance evaluation measures, accuracy, sensitivity and specificity, are used to find the best integrated method which consists of the most effective prediction model and the input variables that have higher positive influence on accuracy, sensitivity and specificity. |
topic | transportation; road accident severity; logistic regression; decision tree; random forest |
url | https://www.mdpi.com/2078-2489/11/5/270 |
work_keys_str_mv | AT mumingchen modelingroadaccidentseveritywithcomparisonsoflogisticregressiondecisiontreeandrandomforest AT muchenchen modelingroadaccidentseveritywithcomparisonsoflogisticregressiondecisiontreeandrandomforest |
_version_ | 1724569792893943808 |
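
The abstract names three evaluation measures (accuracy, sensitivity and specificity) without defining them. As a rough illustration only, and not the authors' code, the sketch below computes the standard confusion-matrix definitions for a binary severity label. The function name `evaluate_binary`, the coding of the severe class as 1, and the toy labels are all assumptions made for this example; none of them come from the article.

```python
from typing import Dict, Sequence


def evaluate_binary(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, sensitivity and specificity for 0/1 labels.

    Assumption: 1 marks the positive (e.g. severe) class, 0 the negative class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    nan = float("nan")
    return {
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / total if total else nan,
        # Share of actual positives that were predicted positive.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Share of actual negatives that were predicted negative.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Made-up labels for illustration only (not data from the study).
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 1, 0]
metrics = evaluate_binary(toy_true, toy_pred)
print(metrics)
```

Applying the same function to the predictions of each candidate model on a held-out set is one simple way to compare models on these three measures. Reporting all three together matters when severe accidents are rare, because accuracy alone can then look high even if few severe cases are detected.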