Machine learning on genome-wide association studies to predict the risk of radiation-associated contralateral breast cancer in the WECARE Study.

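The purpose of this study was to identify germline single nucleotide polymorphisms (SNPs) that optimally predict radiation-associated contralateral breast cancer (RCBC) and to provide new biological insights into the carcinogenic process. Fifty-two women with contralateral breast cancer and 153 women with unilateral breast cancer were identified within the Women's Environmental Cancer and Radiation Epidemiology (WECARE) Study who were at increased risk of RCBC because they were ≤ 40 years of age at first diagnosis of breast cancer and received a scatter radiation dose > 1 Gy to the contralateral breast. A previously reported algorithm, preconditioned random forest regression, was applied to predict the risk of developing RCBC. The resulting model produced an area under the curve (AUC) of 0.62 (p = 0.04) on hold-out validation data. The biological analysis identified the cyclic AMP-mediated signaling and Ephrin-A as significant biological correlates, which were previously shown to influence cell survival after radiation in an ATM-dependent manner. The key connected genes and proteins that are identified in this analysis were previously identified as relevant to breast cancer, radiation response, or both. In summary, machine learning/bioinformatics methods applied to genome-wide genotyping data have great potential to reveal plausible biological correlates associated with the risk of RCBC.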

Bibliographic Details
Main Authors: Sangkyu Lee, Xiaolin Liang, Meghan Woods, Anne S Reiner, Patrick Concannon, Leslie Bernstein, Charles F Lynch, John D Boice, Joseph O Deasy, Jonine L Bernstein, Jung Hun Oh
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2020-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0226157
id doaj-202964ef6c624417b13537196a0dd099
record_format Article
spelling doaj-202964ef6c624417b13537196a0dd099; 2021-04-30T04:30:54Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2020-01-01; 15; 2; e0226157; 10.1371/journal.pone.0226157; https://doi.org/10.1371/journal.pone.0226157
collection DOAJ
language English
format Article
sources DOAJ
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2020-01-01
description The purpose of this study was to identify germline single nucleotide polymorphisms (SNPs) that optimally predict radiation-associated contralateral breast cancer (RCBC) and to provide new biological insights into the carcinogenic process. Fifty-two women with contralateral breast cancer and 153 women with unilateral breast cancer were identified within the Women's Environmental Cancer and Radiation Epidemiology (WECARE) Study who were at increased risk of RCBC because they were ≤ 40 years of age at first diagnosis of breast cancer and received a scatter radiation dose > 1 Gy to the contralateral breast. A previously reported algorithm, preconditioned random forest regression, was applied to predict the risk of developing RCBC. The resulting model produced an area under the curve (AUC) of 0.62 (p = 0.04) on hold-out validation data. The biological analysis identified the cyclic AMP-mediated signaling and Ephrin-A as significant biological correlates, which were previously shown to influence cell survival after radiation in an ATM-dependent manner. The key connected genes and proteins that are identified in this analysis were previously identified as relevant to breast cancer, radiation response, or both. In summary, machine learning/bioinformatics methods applied to genome-wide genotyping data have great potential to reveal plausible biological correlates associated with the risk of RCBC.
url https://doi.org/10.1371/journal.pone.0226157