A Greedy Algorithm for Representative Sampling: repsample in Stata

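Quantitative empirical analyses of a population of interest usually aim to estimate the causal effect of one or more independent variables on a dependent variable. However, only in rare instances is the whole population available for analysis. Researchers tend to estimate causal effects on a selected sample and generalize their conclusions to the whole population. The validity of this approach rests on the assumption that the sample is representative of the population on certain key characteristics. A study using a non-representative sample is lacking in external validity by failing to minimize population choice bias. When the sample is large and non-response bias is not an issue, a random selection process is adequate to ensure external validity. If that is not the case, however, researchers could follow a more deterministic approach to ensure representativeness on the selected characteristics, provided these are known, or can be estimated, in the parent population. Although such approaches exist for matched sampling designs, research on representative sampling and the similarity between the sample and the parent population seems to be lacking. In this article we propose a greedy algorithm for obtaining a representative sample and quantifying representativeness in Stata.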

Bibliographic Details
Main Author:	Evangelos Kontopantelis
Format:	Article
Language:	English
Published:	Foundation for Open Access Statistics 2013-11-01
Series:	Journal of Statistical Software
Online Access:	http://www.jstatsoft.org/index.php/jss/article/view/2110

id	doaj-8be0ccf0150b438abe664c75b8d891ca
record_format	Article
spelling	doaj-8be0ccf0150b438abe664c75b8d891ca 2020-11-24T23:24:37Z eng Foundation for Open Access Statistics Journal of Statistical Software 1548-7660 2013-11-01 55111910.18637/jss.v055.c01714 A Greedy Algorithm for Representative Sampling: repsample in Stata Evangelos Kontopantelis http://www.jstatsoft.org/index.php/jss/article/view/2110
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Evangelos Kontopantelis
title	A Greedy Algorithm for Representative Sampling: repsample in Stata
publisher	Foundation for Open Access Statistics
series	Journal of Statistical Software
issn	1548-7660
publishDate	2013-11-01
description	Quantitative empirical analyses of a population of interest usually aim to estimate the causal effect of one or more independent variables on a dependent variable. However, only in rare instances is the whole population available for analysis. Researchers tend to estimate causal effects on a selected sample and generalize their conclusions to the whole population. The validity of this approach rests on the assumption that the sample is representative of the population on certain key characteristics. A study using a non-representative sample is lacking in external validity by failing to minimize population choice bias. When the sample is large and non-response bias is not an issue, a random selection process is adequate to ensure external validity. If that is not the case, however, researchers could follow a more deterministic approach to ensure representativeness on the selected characteristics, provided these are known, or can be estimated, in the parent population. Although such approaches exist for matched sampling designs, research on representative sampling and the similarity between the sample and the parent population seems to be lacking. In this article we propose a greedy algorithm for obtaining a representative sample and quantifying representativeness in Stata.
url	http://www.jstatsoft.org/index.php/jss/article/view/2110
