Nonparametric Subgroup Identification by PRIM and CART: A Simulation and Application Study

Two nonparametric methods for the identification of subgroups with outstanding outcome values are described and compared to each other in a simulation study and an application to clinical data. The Patient Rule Induction Method (PRIM) searches for box-shaped areas in the given data which exceed a mi...
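As a rough illustration of the box search that the abstract attributes to PRIM, the following Python sketch implements only the peeling phase on a NumPy array. The function name `prim_peel`, the parameters `alpha` (fraction peeled per step) and `min_support` (minimal box size as a share of all observations), and the rule of stopping once the mean outcome no longer improves are assumptions made for this example. They are not taken from the article, and the pasting phase is omitted.

```python
import numpy as np


def prim_peel(X, y, alpha=0.05, min_support=0.1):
    """Simplified PRIM peeling (hypothetical helper, no pasting step).

    Shrinks a box around the data by repeatedly cutting off the alpha
    fraction at one edge of one variable, choosing the cut that yields
    the highest mean outcome among the observations left in the box.
    """
    n, p = X.shape
    lower = X.min(axis=0).astype(float)   # current lower box bounds
    upper = X.max(axis=0).astype(float)   # current upper box bounds
    inside = np.ones(n, dtype=bool)       # observations still in the box

    while True:
        best = None  # (mean outcome, variable, side, cut value, mask)
        for j in range(p):
            xj = X[inside, j]
            for side, q in (("lower", alpha), ("upper", 1.0 - alpha)):
                cut = np.quantile(xj, q)
                if side == "lower":
                    keep = inside & (X[:, j] >= cut)
                else:
                    keep = inside & (X[:, j] <= cut)
                # Skip peels that remove nothing or violate the minimal size.
                if keep.sum() == inside.sum() or keep.sum() < min_support * n:
                    continue
                mean_kept = y[keep].mean()
                if best is None or mean_kept > best[0]:
                    best = (mean_kept, j, side, cut, keep)

        # Assumed stopping rule: no admissible peel, or no gain in mean outcome.
        if best is None or best[0] <= y[inside].mean():
            break

        _, j, side, cut, keep = best
        if side == "lower":
            lower[j] = cut
        else:
            upper[j] = cut
        inside = keep

    return lower, upper, inside


# Synthetic data for demonstration only: high outcomes in one corner.
rng = np.random.default_rng(0)
X_demo = rng.uniform(size=(500, 2))
y_demo = ((X_demo[:, 0] > 0.5) & (X_demo[:, 1] > 0.5)).astype(float)
lower, upper, inside = prim_peel(X_demo, y_demo)
print("box bounds:", lower, upper, "support:", inside.mean())
```

Reference implementations usually record the whole peeling trajectory and let the user pick a box from it, which matches the abstract's remark on the user involvement PRIM requires. The early stop used here is a shortcut to keep the sketch short.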


Bibliographic Details
Main Authors: Armin Ott, Alexander Hapfelmeier
Format: Article
Language: English
Published: Hindawi Limited 2017-01-01
Series: Computational and Mathematical Methods in Medicine
Online Access: http://dx.doi.org/10.1155/2017/5271091
id doaj-8b2614f040554cbb82803134cde83dae
record_format Article
spelling doaj-8b2614f040554cbb82803134cde83dae; 2020-11-24T23:48:55Z; eng; Hindawi Limited; Computational and Mathematical Methods in Medicine; 1748-670X; 1748-6718; 2017-01-01; 2017; 10.1155/2017/5271091; 5271091; Nonparametric Subgroup Identification by PRIM and CART: A Simulation and Application Study; Armin Ott [0]; Alexander Hapfelmeier [1]; [0] Institute of Medical Statistics and Epidemiology, Technische Universität München, Ismaninger Str. 22, 81675 Munich, Germany; [1] Institute of Medical Statistics and Epidemiology, Technische Universität München, Ismaninger Str. 22, 81675 Munich, Germany; Two nonparametric methods for the identification of subgroups with outstanding outcome values are described and compared to each other in a simulation study and an application to clinical data. The Patient Rule Induction Method (PRIM) searches for box-shaped areas in the given data which exceed a minimal size and average outcome. This is achieved via a combination of iterative peeling and pasting steps, where small fractions of the data are removed or added to the current box. As an alternative, Classification and Regression Trees (CART) prediction models perform sequential binary splits of the data to produce subsets which can be interpreted as subgroups of heterogeneous outcome. PRIM and CART were compared in a simulation study to investigate their strengths and weaknesses under various data settings, taking different performance measures into account. PRIM was shown to be superior in rather complex settings such as those with few observations, a smaller signal-to-noise ratio, and more than one subgroup. CART showed the best performance in simpler situations. A practical application of the two methods was illustrated using a clinical data set. For this application, both methods produced similar results but the higher amount of user involvement of PRIM became apparent. PRIM can be flexibly tuned by the user, whereas CART, although simpler to implement, is rather static.; http://dx.doi.org/10.1155/2017/5271091
collection DOAJ
language English
format Article
sources DOAJ
author Armin Ott
Alexander Hapfelmeier
title Nonparametric Subgroup Identification by PRIM and CART: A Simulation and Application Study
publisher Hindawi Limited
series Computational and Mathematical Methods in Medicine
issn 1748-670X
1748-6718
publishDate 2017-01-01
description Two nonparametric methods for the identification of subgroups with outstanding outcome values are described and compared to each other in a simulation study and an application to clinical data. The Patient Rule Induction Method (PRIM) searches for box-shaped areas in the given data which exceed a minimal size and average outcome. This is achieved via a combination of iterative peeling and pasting steps, where small fractions of the data are removed or added to the current box. As an alternative, Classification and Regression Trees (CART) prediction models perform sequential binary splits of the data to produce subsets which can be interpreted as subgroups of heterogeneous outcome. PRIM and CART were compared in a simulation study to investigate their strengths and weaknesses under various data settings, taking different performance measures into account. PRIM was shown to be superior in rather complex settings such as those with few observations, a smaller signal-to-noise ratio, and more than one subgroup. CART showed the best performance in simpler situations. A practical application of the two methods was illustrated using a clinical data set. For this application, both methods produced similar results but the higher amount of user involvement of PRIM became apparent. PRIM can be flexibly tuned by the user, whereas CART, although simpler to implement, is rather static.
url http://dx.doi.org/10.1155/2017/5271091