%0 Article %A Nicolas Ugolin %I Public Library of Science (PLoS) %D 2011 %G English %B PLoS ONE %@ 1932-6203 %T Strategy to find molecular signatures in a small series of rare cancers: validation for radiation-induced breast and thyroid tumors. %U http://europepmc.org/articles/PMC3154936?pdf=render %X Methods of classification using transcriptome analysis for case-by-case tumor diagnosis could be limited by tumor heterogeneity and masked information in the gene expression profiles, especially when the number of tumors is small. We propose a new strategy, EMts_2PCA, based on: 1) The identification of a gene expression signature with a great potential for discriminating subgroups of tumors (EMts stage), which includes: a) a learning step, based on an expectation-maximization (EM) algorithm, to select sets of candidate genes whose expressions discriminate two subgroups, b) a training step to select from the sets of candidate genes those with the highest potential to classify training tumors, c) the compilation of genes selected during the training step, and standardization of their levels of expression to finalize the signature. 2) The predictive classification of independent prospective tumors, according to the two subgroups of interest, by the definition of a validation space based on a two-step principal component analysis (2PCA). The present method was evaluated by classifying three series of tumors, and its robustness, in terms of tumor clustering and prediction, was further compared with that of three classification methods (Gene expression bar code, Top-scoring pair(s) and a PCA-based method). Results showed that EMts_2PCA was very efficient in tumor classification and prediction, with scores always better than those obtained by the most common methods of tumor clustering.
Specifically, EMts_2PCA permitted identification of highly discriminating molecular signatures to differentiate post-Chernobyl thyroid or post-radiotherapy breast tumors from their sporadic counterparts, which had previously been classified unsuccessfully or with errors.
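
The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes synthetic data, a tied-variance two-component Gaussian mixture fitted by EM for each gene, a simple separation score, and a single PCA with nearest-centroid assignment standing in for the two-step PCA, whose details the abstract does not give. All function names and parameters (`em_separation_score`, `select_signature`, `fit_pca`, `project`) are hypothetical.

```python
import numpy as np

def em_separation_score(x, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture (shared variance) by EM
    and return |mu1 - mu0| / sigma as a bimodality score (an assumption)."""
    x = np.asarray(x, dtype=float)
    xs = np.sort(x)
    half = len(xs) // 2
    # Initialise the two means from the lower and upper halves of the data.
    mu = np.array([xs[:half].mean(), xs[half:].mean()])
    sigma = x.std() + 1e-9
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each sample.
        log_dens = np.log(w) - 0.5 * ((x[:, None] - mu) / sigma) ** 2
        log_dens -= log_dens.max(axis=1, keepdims=True)
        resp = np.exp(log_dens)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update means, shared standard deviation and weights.
        nk = resp.sum(axis=0) + 1e-12
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum() / len(x)) + 1e-9
        w = nk / nk.sum()
    return abs(mu[1] - mu[0]) / sigma

def select_signature(expr, k):
    """Return the indices of the k genes (columns) with the highest score."""
    scores = np.array([em_separation_score(expr[:, g]) for g in range(expr.shape[1])])
    return np.argsort(scores)[::-1][:k]

def fit_pca(X, n_components=2):
    """Return the column means and the leading principal axes of X."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, components):
    """Project samples into the space spanned by the principal axes."""
    return (X - mean) @ components.T

# Synthetic training series: 20 tumors, 30 genes, 5 of them informative.
rng = np.random.default_rng(0)
n_per_group, n_genes, n_informative = 10, 30, 5
labels = np.repeat([0, 1], n_per_group)
train = rng.normal(0.0, 0.5, size=(2 * n_per_group, n_genes))
train[labels == 1, :n_informative] += 5.0

# Stage 1 (sketch): EM-based gene selection, then standardization.
signature = select_signature(train, k=n_informative)
gene_mean = train[:, signature].mean(axis=0)
gene_std = train[:, signature].std(axis=0)
z_train = (train[:, signature] - gene_mean) / gene_std

# Stage 2 (sketch): PCA space built on training tumors, group centroids.
pca_mean, components = fit_pca(z_train, n_components=2)
scores_train = project(z_train, pca_mean, components)
centroids = np.array([scores_train[labels == g].mean(axis=0) for g in (0, 1)])

# Independent synthetic tumors are projected and assigned to the nearest centroid.
new_labels = np.array([0, 1, 0, 1])
new = rng.normal(0.0, 0.5, size=(len(new_labels), n_genes))
new[new_labels == 1, :n_informative] += 5.0
z_new = (new[:, signature] - gene_mean) / gene_std
scores_new = project(z_new, pca_mean, components)
dists = np.linalg.norm(scores_new[:, None, :] - centroids[None, :, :], axis=2)
predicted = dists.argmin(axis=1)
print(sorted(signature.tolist()), predicted.tolist())
```

The independent tumors are standardized with the training means and standard deviations, so the validation space is defined by the training series alone. The group shift in the synthetic data is deliberately large, so the sketch says nothing about performance on real expression profiles.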