Introducing an efficient sampling method for national surveys with limited sample sizes: application to a national study to determine quality and cost of healthcare

Abstract Background Sampling a small number of participants from an entire country is not straightforward. In this case, researchers reluctantly sample from a single setting or few settings, which limits the generalizability of findings. Therefore, there is a need to design efficient sampling method...


Bibliographic Details
Main Authors: Mahboubeh Parsaeian, Mahdi Mahdavi, Mojdeh Saadati, Parinaz Mehdipour, Ali Sheidaei, Shahab Khatibzadeh, Farshad Farzadfar, Saeid Shahraz
Format: Article
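Language: English
Published: BMC 2021-07-01
Series: BMC Public Health
Subjects: Survey sampling method; Small sample size; Model-based clustering; Validity; Sampling efficiency; Iran quality of Care in Medicine Program (IQCAMP)
Online Access: https://doi.org/10.1186/s12889-021-11441-0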
id doaj-062037be613b47d9b89eeee8b2964df4
record_format Article
spelling doaj-062037be613b47d9b89eeee8b2964df4
2021-07-18T11:15:10Z
eng
BMC
BMC Public Health
1471-2458
2021-07-01
volume 21, issue 1, pages 1-10
10.1186/s12889-021-11441-0
Mahboubeh Parsaeian (Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences)
Mahdi Mahdavi (National Institute of Health Research (NIHR), Tehran University of Medical Sciences)
Mojdeh Saadati (Department of Computer Science, Iowa State University)
Parinaz Mehdipour (Non-Communicable Diseases Research Center, Endocrinology and Metabolism Population Sciences Institute, Tehran University of Medical Sciences)
Ali Sheidaei (Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences)
Shahab Khatibzadeh (Heller School for Social Policy and Management, Brandeis University)
Farshad Farzadfar (Non-Communicable Diseases Research Center, Endocrinology and Metabolism Population Sciences Institute, Tehran University of Medical Sciences)
Saeid Shahraz (Institute for Clinical Research and Health Policy Studies, Tufts Medical Center)
https://doi.org/10.1186/s12889-021-11441-0
Survey sampling method
Small sample size
Model-based clustering
Validity
Sampling efficiency
Iran quality of Care in Medicine Program (IQCAMP)
collection DOAJ
language English
format Article
sources DOAJ
publisher BMC
series BMC Public Health
issn 1471-2458
publishDate 2021-07-01
description Abstract
Background: Sampling a small number of participants from an entire country is not straightforward. In this case, researchers reluctantly sample from a single setting or a few settings, which limits the generalizability of findings. Therefore, there is a need to design an efficient sampling method for small-sample surveys that can produce generalizable results at the country level.
Methods: The data comprised twenty proxy variables measuring health service demands, structures, and outcomes in 413 districts of Iran. We used two data mining methods, the hierarchical clustering method (HCM) and the model-based clustering method (MCM), to create homogeneous groups of districts, i.e., strata, based on these variables. We compared the internal validity and stability of the methods using statistical indices. An expert group checked the face validity of the methods, particularly regarding the total number of strata and the combination of districts in each stratum. The efficiency of the selected method, measured by the inverse of the variance, was compared with that of simple random sampling (SRS) through simulation. The sampling design was tested in a national study in Iran, which aimed to evaluate the quality and costs of medical care for eight selected diseases by recruiting only 300 participants per disease at the country level.
Results: MCM and HCM divided the districts into eight and two clusters, respectively. The measures of internal validity and stability showed that the clusters created by MCM were more separated, compact, and stable, and these clusters therefore formed our optimum strata. The probability of death from stroke, the probability of death from chronic obstructive pulmonary disease, and the in-hospital mortality rate were the most important indicators distinguishing the eight strata. Based on the simulation results, MCM increased the efficiency of the sampling design by up to 1.7 times compared with SRS.
Conclusions: The use of data mining improved sampling efficiency by up to 1.7 times relative to SRS and markedly reduced the number of strata, to eight for the entire country. The proposed sampling design also identified key variables that could be used to classify districts in Iran when sampling from these target populations in future studies.
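The abstract measures efficiency by the inverse of the variance and compares the stratified design with SRS through simulation. The following is a minimal sketch of that kind of comparison, not the authors' code or data: the population is synthetic, the indicator values, stratum sizes, sample size, and all function names (`make_population`, `srs_mean`, `stratified_mean`, `relative_efficiency`) are assumptions made for illustration, and proportional allocation is assumed because the abstract does not state the allocation rule. The resulting ratio depends entirely on these assumptions and will not reproduce the reported 1.7.

```python
import random
import statistics


def make_population(rng, n_strata=8, districts_per_stratum=50):
    """Build synthetic strata (hypothetical data, not the 413 real districts).

    Each stratum gets its own mean for a made-up district-level indicator,
    so strata are internally homogeneous but differ from one another.
    """
    population = []
    for s in range(n_strata):
        stratum_mean = 10.0 * s  # assumption: well-separated strata
        population.append(
            [rng.gauss(stratum_mean, 5.0) for _ in range(districts_per_stratum)]
        )
    return population


def srs_mean(rng, population, n):
    """Estimate the overall mean from a simple random sample of n districts."""
    flat = [y for stratum in population for y in stratum]
    return statistics.fmean(rng.sample(flat, n))


def stratified_mean(rng, population, n):
    """Estimate the overall mean with proportional allocation across strata."""
    total = sum(len(stratum) for stratum in population)
    estimate = 0.0
    for stratum in population:
        n_h = max(1, round(n * len(stratum) / total))  # per-stratum sample size
        weight = len(stratum) / total                  # stratum weight W_h
        estimate += weight * statistics.fmean(rng.sample(stratum, n_h))
    return estimate


def relative_efficiency(population, n=40, replications=2000, seed=1):
    """Return Var(SRS estimate) / Var(stratified estimate) over many replications.

    Efficiency is taken as the inverse of the variance, so a value above 1
    means the stratified design is more efficient than SRS.
    """
    rng = random.Random(seed)
    srs = [srs_mean(rng, population, n) for _ in range(replications)]
    strat = [stratified_mean(rng, population, n) for _ in range(replications)]
    return statistics.variance(srs) / statistics.variance(strat)


population = make_population(random.Random(0))
efficiency = relative_efficiency(population)
print(f"Relative efficiency (stratified vs SRS): {efficiency:.2f}")
```

The gain in this sketch comes from the between-stratum variance, which SRS has to absorb and a stratified estimator does not. The more homogeneous the strata produced by the clustering step, the larger the ratio is expected to be.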
topic Survey sampling method
Small sample size
Model-based clustering
Validity
Sampling efficiency
Iran quality of Care in Medicine Program (IQCAMP)
url https://doi.org/10.1186/s12889-021-11441-0