Application of Finite Mixture Models for Vehicle Crash Data Analysis
Developing sound or reliable statistical models for analyzing vehicle crashes is very important in highway safety studies. A difficulty arises when crash data exhibit over-dispersion. Over-dispersion caused by unobserved heterogeneity is a serious problem and has been addressed in a variety of ways with...
Main Author: | Park, Byung Jung |
---|---|
Other Authors: | Lord, Dominique |
Format: | Others |
Language: | English |
Published: | 2010 |
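Subjects: | Highway safety; Over-dispersion; Finite mixture; Negative binomial regression model; Latent class model |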
Online Access: | http://hdl.handle.net/1969.1/ETD-TAMU-2010-05-7667 |
id |
ndltd-tamu.edu-oai-repository.tamu.edu-1969.1-ETD-TAMU-2010-05-7667 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-tamu.edu-oai-repository.tamu.edu-1969.1-ETD-TAMU-2010-05-7667; 2013-01-08T10:41:18Z; Application of Finite Mixture Models for Vehicle Crash Data Analysis; Park, Byung Jung; Highway safety; Over-dispersion; Finite mixture; Negative binomial regression model; Latent class model; Lord, Dominique; 2010-07-15T00:16:10Z; 2010-07-23T21:47:03Z; 2010-07-15T00:16:10Z; 2010-07-23T21:47:03Z; 2010-05; 2010-07-14; May 2010; Book; Thesis; Electronic Dissertation; text; application/pdf; http://hdl.handle.net/1969.1/ETD-TAMU-2010-05-7667; eng |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
Highway safety Over-dispersion Finite mixture Negative binomial regression model Latent class model |
spellingShingle |
Highway safety Over-dispersion Finite mixture Negative binomial regression model Latent class model Park, Byung Jung Application of Finite Mixture Models for Vehicle Crash Data Analysis |
description |
Developing sound or reliable statistical models for analyzing vehicle crashes is very
important in highway safety studies. A difficulty arises when crash data exhibit over-dispersion.
Over-dispersion caused by unobserved heterogeneity is a serious problem
and has been addressed in a variety of ways within the negative binomial (NB) modeling
framework. However, the true factors that affect heterogeneity are often unknown to
researchers, and failure to accommodate such heterogeneity in the model can undermine
the validity of the empirical results.
Given the limitations of the NB regression model for addressing over-dispersion of crash
data due to heterogeneity, this research examined an alternative model formulation that
could be used for capturing heterogeneity through the use of finite mixture regression
models. A finite mixture of Poisson or NB regression models is especially useful when
the count data were generated from a heterogeneous population. To evaluate these
models, Poisson and NB mixture models were estimated using both simulated and
empirical crash datasets, and the results were compared to those from a single NB
regression model. For model parameter estimation, a Bayesian approach was adopted,
since it provides much richer inference than the maximum likelihood approach.
Using simulated datasets, it was shown that the single NB model is biased if the
heterogeneity arises from the existence of multiple counting processes.
The implications could be poor prediction performance and poor interpretation.
Using two empirical datasets, the results demonstrated that a two-component finite mixture of
NB regression models (FMNB-2) was sufficient to characterize the uncertainty about
the crash occurrence, and it provided more opportunities for interpretation of the dataset
that are not available from the standard NB model. Based on the models from the
empirical dataset (i.e., FMNB-2 and NB models), their relative performances were also
examined in terms of hotspot identification and accident modification factors. Finally,
using a simulation study, bias properties of the posterior summary statistics for
dispersion parameters in the FMNB-2 model were characterized, and guidelines on the
choice of priors and the summary statistics to use were presented for different sample
sizes and sample-mean values. |
author2 |
Lord, Dominique |
author_facet |
Lord, Dominique Park, Byung Jung |
author |
Park, Byung Jung |
author_sort |
Park, Byung Jung |
title |
Application of Finite Mixture Models for Vehicle Crash Data Analysis |
title_short |
Application of Finite Mixture Models for Vehicle Crash Data Analysis |
title_full |
Application of Finite Mixture Models for Vehicle Crash Data Analysis |
title_fullStr |
Application of Finite Mixture Models for Vehicle Crash Data Analysis |
title_full_unstemmed |
Application of Finite Mixture Models for Vehicle Crash Data Analysis |
title_sort |
application of finite mixture models for vehicle crash data analysis |
publishDate |
2010 |
url |
http://hdl.handle.net/1969.1/ETD-TAMU-2010-05-7667 |
work_keys_str_mv |
AT parkbyungjung applicationoffinitemixturemodelsforvehiclecrashdataanalysis |
_version_ |
1716504796258631680 |
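
The abstract's point that count data generated by multiple counting processes appear over-dispersed can be illustrated with a small simulation. The following is a minimal sketch, not the thesis's method: it uses a two-component Poisson mixture with no covariates and no Bayesian estimation, and the component means, mixing weight, sample size, and all function names are hypothetical values chosen for illustration. It compares the variance-to-mean ratio of the mixture with that of a single Poisson process having the same overall mean.

```python
import math
import random


def sample_poisson(rng, mean):
    """Draw one Poisson variate using Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_mixture(n, weight, mean_a, mean_b, seed=0):
    """Simulate n counts; each count comes from component A with
    probability `weight`, otherwise from component B.
    All parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n):
        mean = mean_a if rng.random() < weight else mean_b
        counts.append(sample_poisson(rng, mean))
    return counts


def dispersion_index(counts):
    """Sample variance divided by sample mean (about 1 for Poisson data)."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean


if __name__ == "__main__":
    # Hypothetical setting: half the sites have mean 1, half have mean 6.
    mixed = simulate_mixture(20000, 0.5, 1.0, 6.0, seed=1)
    # Single process with the same overall mean of 3.5.
    single = simulate_mixture(20000, 1.0, 3.5, 3.5, seed=2)
    print("mixture dispersion index:", round(dispersion_index(mixed), 2))
    print("single  dispersion index:", round(dispersion_index(single), 2))
```

For these assumed values the mixture has mean 3.5 and theoretical variance 3.5 + 0.25 × (6 − 1)² = 9.75, so its dispersion index is well above 1 even though each component is itself Poisson, while the single-process index stays close to 1.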