Local Log-Linear Models for Capture-Recapture

Capture-recapture (CRC) models use two or more samples, or lists, to estimate the size of a population. In the canonical example, a researcher captures, marks, and releases several samples of fish in a lake. When the fish that are captured more than once are few compared to the total number that are...
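The fish-in-a-lake intuition can be illustrated with the classical two-sample Lincoln-Petersen estimator and Chapman's small-sample variant. This is a minimal sketch, not the dissertation's method: it assumes a closed population, two independent samples, and equal catchability, and the function names and example counts are hypothetical.

```python
def lincoln_petersen(n1: int, n2: int, m: int) -> float:
    """Two-sample population estimate N = n1 * n2 / m.

    n1: number captured (and marked) in the first sample
    n2: number captured in the second sample
    m:  number in the second sample that were already marked
    """
    if m <= 0:
        raise ValueError("need at least one recapture (m > 0)")
    if m > min(n1, n2):
        raise ValueError("recaptures cannot exceed either sample size")
    return n1 * n2 / m


def chapman(n1: int, n2: int, m: int) -> float:
    """Chapman's variant, which stays finite even when m == 0."""
    if m < 0 or m > min(n1, n2):
        raise ValueError("m must lie between 0 and min(n1, n2)")
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Illustrative counts only: fewer recaptures imply a larger estimate.
many_recaptures = lincoln_petersen(200, 150, 30)
few_recaptures = lincoln_petersen(200, 150, 5)
```

With the same sample sizes, lowering the recapture count from 30 to 5 raises the estimate, which matches the intuition that few recaptures suggest many uncaptured fish.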


Bibliographic Details
Main Author: Kurtz, Zachary Todd
Format: Others
Published: Research Showcase @ CMU 2014
Subjects: log-linear; capture-recapture; covariates; smoothing; model selection; identifiability
Online Access: http://repository.cmu.edu/dissertations/360
http://repository.cmu.edu/cgi/viewcontent.cgi?article=1360&context=dissertations
id ndltd-cmu.edu-oai-repository.cmu.edu-dissertations-1360
record_format oai_dc
collection NDLTD
format Others
sources NDLTD
topic log-linear
capture-recapture
covariates
smoothing
model selection
identifiability
description Capture-recapture (CRC) models use two or more samples, or lists, to estimate the size of a population. In the canonical example, a researcher captures, marks, and releases several samples of fish in a lake. When the fish that are captured more than once are few compared to the total number that are captured, one suspects that the lake contains many more uncaptured fish. This basic intuition motivates CRC models in fields as diverse as epidemiology, entomology, and computer science. We use simulations to study the performance of conventional log-linear models for CRC. Specifically we evaluate model selection criteria, model averaging, an asymptotic variance formula, and several small-sample data adjustments. Next, we argue that interpretable models are essential for credible inference, since sets of models that fit the data equally well can imply vastly different estimates of the population size. A secondary analysis of data on survivors of the World Trade Center attacks illustrates this issue. Our main chapter develops local log-linear models. Heterogeneous populations tend to bias conventional log-linear models. Post-stratification can reduce the effects of heterogeneity by using covariates, such as the age or size of each observed unit, to partition the data into relatively homogeneous post-strata. One can fit a model to each post-stratum and aggregate the resulting estimates across post-strata. We extend post-stratification to its logical extreme by selecting a local log-linear model for each observed point in the covariate space, while smoothing to achieve stability. Local log-linear models serve a dual purpose. Besides estimating the population size, they estimate the rate of missingness as a function of covariates. Simulations demonstrate the superiority of local log-linear models for estimating local rates of missingness for special cases in which the generating model varies over the covariate space. We apply the method to estimate bird species richness in continental North America and to estimate the prevalence of multiple sclerosis in a region of France.
author Kurtz, Zachary Todd
title Local Log-Linear Models for Capture-Recapture
publisher Research Showcase @ CMU
publishDate 2014
url http://repository.cmu.edu/dissertations/360
http://repository.cmu.edu/cgi/viewcontent.cgi?article=1360&context=dissertations