Chi-Square Orthogonal Components for Assessing Goodness-of-fit of Multidimensional Multinomial Data

abstract: It is common in the analysis of data to provide a goodness-of-fit test to assess the performance of a model. In the analysis of contingency tables, goodness-of-fit statistics are frequently employed when modeling social science, educational or psychological data where the interest is often...

Bibliographic Details
Other Authors: Milovanovic, Jelena (Author)
Format: Doctoral Thesis
Language: English
Published: 2011
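Subjects: Statistics; Chi-Square goodness-of-fit tests; decomposition of chi-square statistic; Orthogonal components of chi-square statistic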
Online Access: http://hdl.handle.net/2286/R.I.9431
id ndltd-asu.edu-item-9431
record_format oai_dc
spelling ndltd-asu.edu-item-9431 2018-06-22T03:02:04Z
Chi-Square Orthogonal Components for Assessing Goodness-of-fit of Multidimensional Multinomial Data
Dissertation/Thesis; Doctoral Dissertation; Ph.D. Mathematics 2011; 217 pages; eng
Milovanovic, Jelena (Author); Young, Dennis (Advisor); Reiser, Mark (Advisor); Wilson, Jeffrey (Committee member); Eubank, Randall (Committee member); Yang, Yan (Committee member)
Arizona State University (Publisher)
Statistics; Chi-Square goodness-of-fit tests; decomposition of chi-square statistic; Orthogonal components of chi-square statistic
http://hdl.handle.net/2286/R.I.9431
http://rightsstatements.org/vocab/InC/1.0/ All Rights Reserved 2011
abstract: It is common in the analysis of data to provide a goodness-of-fit test to assess the performance of a model. In the analysis of contingency tables, goodness-of-fit statistics are frequently employed when modeling social science, educational or psychological data where the interest is often directed at investigating the association among multi-categorical variables. Pearson's chi-squared statistic is well known in goodness-of-fit testing, but it is sometimes considered to produce an omnibus test as it gives little guidance to the source of poor fit once the null hypothesis is rejected. However, its components can provide powerful directional tests. In this dissertation, orthogonal components are used to develop goodness-of-fit tests for models fit to the counts obtained from the cross-classification of multi-category dependent variables. Ordinal categories are assumed. Orthogonal components defined on marginals are obtained when analyzing multi-dimensional contingency tables through the use of the QR decomposition. A subset of these orthogonal components can be used to construct limited-information tests that allow one to identify the source of lack-of-fit and provide an increase in power compared to Pearson's test. These tests can address the adverse effects presented when data are sparse. The tests rely on the set of first- and second-order marginals jointly, the set of second-order marginals only, and the random forest method, a popular algorithm for modeling large complex data sets. The performance of these tests is compared to the likelihood ratio test as well as to tests based on orthogonal polynomial components.
The derived goodness-of-fit tests are evaluated with studies for detecting two- and three-way associations that are not accounted for by a categorical variable factor model with a single latent variable. In addition, the tests are used to investigate the case when the model misspecification involves parameter constraints for large and sparse contingency tables. The methodology proposed here is applied to data from the 38th round of the State Survey conducted by the Institute for Public Policy and Social Research at Michigan State University (2005). The results illustrate the use of the proposed techniques in the context of a sparse data set.
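As an illustration of the idea in the abstract, the following is a minimal sketch of how Pearson's statistic might be split into orthogonal components tied to marginals by means of a QR decomposition. It is not the dissertation's procedure: it assumes a fully specified null model with no estimated parameters, and the 2x2 table, the counts, the equal cell probabilities, and the names `H`, `A`, and `gamma` are all hypothetical choices made for the example.

```python
import numpy as np

# Hypothetical 2x2 table, cells ordered (0,0), (0,1), (1,0), (1,1).
counts = np.array([30.0, 20.0, 25.0, 25.0])
n = counts.sum()

# Assumption: a fully specified null model (no estimated parameters),
# here equal cell probabilities.
pi = np.full(4, 0.25)

# Standardized residuals; Pearson's X^2 is their sum of squares.
z = (counts - n * pi) / np.sqrt(n * pi)
x2 = float(z @ z)

# Rows of H pick out marginals: P(x1=1), P(x2=1), P(x1=1, x2=1).
H = np.array([
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
])

# The first column, sqrt(pi), is the direction in which z has no variation.
# The remaining columns are the marginal directions on the residual scale.
s = np.sqrt(pi)
A = np.column_stack([s, (H * s).T])

# QR orthonormalizes the columns of A in order.
Q, _ = np.linalg.qr(A)

# Drop the sqrt(pi) column and project z onto the remaining directions.
gamma = Q[:, 1:].T @ z
components = gamma ** 2

print("Pearson X^2:", x2)
print("Components:", components, "sum:", components.sum())
```

Under these assumptions the squared components add up to Pearson's statistic, and a test built from only the first one or two of them uses first-order marginal information alone. When model parameters are estimated, as in the dissertation's factor-model setting, the covariance of the residuals changes and this simple construction no longer applies as written.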
collection NDLTD
language English
format Doctoral Thesis
sources NDLTD
topic Statistics
Chi-Square goodness-of-fit tests
decomposition of chi-square statistic
Orthogonal components of chi-square statistic
author2 Milovanovic, Jelena (Author)
title Chi-Square Orthogonal Components for Assessing Goodness-of-fit of Multidimensional Multinomial Data
publishDate 2011
url http://hdl.handle.net/2286/R.I.9431