Application of Finite Mixture Models for Vehicle Crash Data Analysis

Developing sound and reliable statistical models for analyzing vehicle crashes is very important in highway safety studies. A difficulty arises when crash data exhibit over-dispersion. Over-dispersion caused by unobserved heterogeneity is a serious problem and has been addressed in a variety of ways with...


Bibliographic Details
Main Author: Park, Byung Jung
Other Authors: Lord, Dominique
Format: Others
Language: English
Published: 2010
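Subjects: Highway safety; Over-dispersion; Finite mixture; Negative binomial regression model; Latent class model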
Online Access: http://hdl.handle.net/1969.1/ETD-TAMU-2010-05-7667
id ndltd-tamu.edu-oai-repository.tamu.edu-1969.1-ETD-TAMU-2010-05-7667
record_format oai_dc
spelling ndltd-tamu.edu-oai-repository.tamu.edu-1969.1-ETD-TAMU-2010-05-7667; 2013-01-08T10:41:18Z; Application of Finite Mixture Models for Vehicle Crash Data Analysis; Park, Byung Jung; Highway safety; Over-dispersion; Finite mixture; Negative binomial regression model; Latent class model; Lord, Dominique; 2010-07-15T00:16:10Z; 2010-07-23T21:47:03Z; 2010-05; 2010-07-14; May 2010; Book; Thesis; Electronic Dissertation; text; application/pdf; http://hdl.handle.net/1969.1/ETD-TAMU-2010-05-7667; eng
collection NDLTD
language English
format Others
sources NDLTD
topic Highway safety
Over-dispersion
Finite mixture
Negative binomial regression model
Latent class model
description Developing sound and reliable statistical models for analyzing vehicle crashes is very important in highway safety studies. A difficulty arises when crash data exhibit over-dispersion. Over-dispersion caused by unobserved heterogeneity is a serious problem and has been addressed in a variety of ways within the negative binomial (NB) modeling framework. However, the true factors that affect heterogeneity are often unknown to researchers, and failure to accommodate such heterogeneity in the model can undermine the validity of the empirical results. Given the limitations of the NB regression model for addressing over-dispersion of crash data due to heterogeneity, this research examined an alternative model formulation that could be used for capturing heterogeneity through the use of finite mixture regression models. A finite mixture of Poisson or NB regression models is especially useful when the count data were generated from a heterogeneous population. To evaluate these models, Poisson and NB mixture models were estimated using both simulated and empirical crash datasets, and the results were compared to those from a single NB regression model. For model parameter estimation, a Bayesian approach was adopted, since it provides much richer inference than the maximum likelihood approach. Using simulated datasets, it was shown that the single NB model is biased if the underlying cause of heterogeneity is the existence of multiple counting processes. The implications could be poor prediction performance and poor interpretation. Using two empirical datasets, the results demonstrated that a two-component finite mixture of NB regression models (FMNB-2) was sufficient to characterize the uncertainty about crash occurrence, and it provided opportunities for interpretation of the dataset that are not available from the standard NB model. Based on the models from the empirical dataset (i.e., FMNB-2 and NB models), their relative performance was also examined in terms of hotspot identification and accident modification factors. Finally, using a simulation study, bias properties of the posterior summary statistics for dispersion parameters in the FMNB-2 model were characterized, and guidelines on the choice of priors and the summary statistics to use were presented for different sample sizes and sample-mean values.
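The idea that a heterogeneous population can produce over-dispersed counts can be illustrated with a small simulation. The sketch below is not the thesis's method: it has no covariates and no Bayesian estimation, and the mixing weight, component means, and dispersion values are hypothetical numbers chosen only for illustration. It draws counts from a two-component NB mixture (each NB built as a gamma-Poisson mixture) and compares the sample variance with the sample mean.

```python
import math
import random
import statistics


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_nb(rng, mean, dispersion):
    """Draw one NB variate as a gamma-Poisson mixture.

    With gamma shape 1/dispersion and scale mean*dispersion, the count has
    expectation `mean` and variance mean + dispersion * mean**2.
    """
    lam = rng.gammavariate(1.0 / dispersion, mean * dispersion)
    return sample_poisson(rng, lam)


def simulate_fmnb2(n, weight, means, dispersions, seed=0):
    """Simulate n counts from a two-component NB mixture (no covariates).

    `weight` is the probability of component 0; all values are hypothetical.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n):
        comp = 0 if rng.random() < weight else 1
        counts.append(sample_nb(rng, means[comp], dispersions[comp]))
    return counts


# Hypothetical settings: 60% low-mean sites, 40% high-mean sites.
counts = simulate_fmnb2(n=5000, weight=0.6, means=(1.0, 6.0), dispersions=(0.2, 0.5))
sample_mean = statistics.fmean(counts)
sample_var = statistics.variance(counts)
print(f"mean={sample_mean:.2f} variance={sample_var:.2f} ratio={sample_var / sample_mean:.2f}")
```

A variance-to-mean ratio well above 1 indicates over-dispersion relative to a Poisson model. Part of it comes from each NB component and part from the difference between the component means, which is the kind of heterogeneity a single NB model has to absorb into one dispersion parameter.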
author2 Lord, Dominique
author Park, Byung Jung
title Application of Finite Mixture Models for Vehicle Crash Data Analysis
publishDate 2010
url http://hdl.handle.net/1969.1/ETD-TAMU-2010-05-7667