A machine learned classifier that uses gene expression data to accurately predict estrogen receptor status.
BACKGROUND: Selecting the appropriate treatment for breast cancer requires accurately determining the estrogen receptor (ER) status of the tumor. However, the standard for determining this status, immunohistochemical analysis of formalin-fixed paraffin embedded samples, suffers from numerous technic...
Main Authors: | Meysam Bastani, Larissa Vos, Nasimeh Asgarian, Jean Deschenes, Kathryn Graham, John Mackey, Russell Greiner |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2013-01-01 |
Series: | PLoS ONE |
Online Access: | http://europepmc.org/articles/PMC3846850?pdf=render |
id |
doaj-5c6469c1f5424ab9b95e6e52aa4e3386 |
---|---|
record_format |
Article |
spelling |
doaj-5c6469c1f5424ab9b95e6e52aa4e3386; 2020-11-25T00:44:09Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2013-01-01; volume 8, issue 12, e82144; DOI 10.1371/journal.pone.0082144 |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Meysam Bastani Larissa Vos Nasimeh Asgarian Jean Deschenes Kathryn Graham John Mackey Russell Greiner |
title |
A machine learned classifier that uses gene expression data to accurately predict estrogen receptor status. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS ONE |
issn |
1932-6203 |
publishDate |
2013-01-01 |
description |
BACKGROUND: Selecting the appropriate treatment for breast cancer requires accurately determining the estrogen receptor (ER) status of the tumor. However, the standard for determining this status, immunohistochemical analysis of formalin-fixed paraffin embedded samples, suffers from numerous technical and reproducibility issues. Assessment of ER-status based on RNA expression can provide more objective, quantitative and reproducible test results. METHODS: To learn a parsimonious RNA-based classifier of hormone receptor status, we applied a machine learning tool to a training dataset of gene expression microarray data obtained from 176 frozen breast tumors, whose ER-status was determined by applying ASCO-CAP guidelines to standardized immunohistochemical testing of formalin fixed tumor. RESULTS: This produced a three-gene classifier that can predict the ER-status of a novel tumor, with a cross-validation accuracy of 93.17±2.44%. When applied to an independent validation set and to four other public databases, some on different platforms, this classifier obtained over 90% accuracy in each. In addition, we found that this prediction rule separated the patients' recurrence-free survival curves with a hazard ratio lower than the one based on the IHC analysis of ER-status. CONCLUSIONS: Our efficient and parsimonious classifier lends itself to high throughput, highly accurate and low-cost RNA-based assessments of ER-status, suitable for routine high-throughput clinical use. This analytic method provides a proof-of-principle that may be applicable to developing effective RNA-based tests for other biomarkers and conditions. |
url |
http://europepmc.org/articles/PMC3846850?pdf=render |
_version_ |
1725276236567019520 |