Prediction of unconventional protein secretion by exosomes

Motivation: In eukaryotes, proteins targeted for secretion contain a signal peptide, which allows them to proceed through the conventional ER/Golgi-dependent pathway. However, a substantial number of proteins lacking a signal peptide can be secreted through unconventional routes, including that media...

Bibliographic Details
Main Authors: Gomez-Perosanz, M. (Author), Ras-Carmona, A. (Author), Reche, P.A. (Author)
Format: Article
Language: English
Published: BioMed Central Ltd 2021
Subjects:
Online Access: View Fulltext in Publisher
LEADER 02563nam a2200469Ia 4500
001 10.1186-s12859-021-04219-z
008 220427s2021 CNT 000 0 und d
020 |a 14712105 (ISSN) 
245 1 0 |a Prediction of unconventional protein secretion by exosomes 
260 0 |b BioMed Central Ltd  |c 2021 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1186/s12859-021-04219-z 
520 3 |a Motivation: In eukaryotes, proteins targeted for secretion contain a signal peptide, which allows them to proceed through the conventional ER/Golgi-dependent pathway. However, a substantial number of proteins lacking a signal peptide can be secreted through unconventional routes, including that mediated by exosomes. Currently, no method is available to predict protein secretion via exosomes. Results: Here, we first assembled a dataset including the sequences of 2992 proteins secreted by exosomes and 2961 proteins that are not secreted by exosomes. Subsequently, we trained different random forest models on feature vectors derived from the sequences in this dataset. In tenfold cross-validation, the best model was trained on dipeptide composition, reaching an accuracy of 69.88% ± 2.08% and an area under the curve (AUC) of 0.76 ± 0.03. In an independent dataset, this model reached an accuracy of 75.73% and an AUC of 0.840. Based on these results, we developed ExoPred, a web-based tool that uses random forests to predict protein secretion by exosomes. Conclusion: ExoPred is available for free public use at http://imath.med.ucm.es/exopred/. Datasets are available at http://imath.med.ucm.es/exopred/datasets/. © 2021, The Author(s). 
650 0 4 |a Area under the curves 
650 0 4 |a Best model 
650 0 4 |a Cross validation 
650 0 4 |a Decision trees 
650 0 4 |a Dipeptide composition 
650 0 4 |a exosome 
650 0 4 |a Exosomes 
650 0 4 |a Feature vectors 
650 0 4 |a Forecasting 
650 0 4 |a Golgi Apparatus 
650 0 4 |a Golgi complex 
650 0 4 |a metabolism 
650 0 4 |a Peptides 
650 0 4 |a protein 
650 0 4 |a Protein secretion 
650 0 4 |a Protein Sorting Signals 
650 0 4 |a protein transport 
650 0 4 |a Protein Transport 
650 0 4 |a Proteins 
650 0 4 |a Random forests 
650 0 4 |a signal peptide 
650 0 4 |a Signal peptide 
650 0 4 |a Web-based tools 
700 1 |a Gomez-Perosanz, M.  |e author 
700 1 |a Ras-Carmona, A.  |e author 
700 1 |a Reche, P.A.  |e author 
773 |t BMC Bioinformatics
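
Note on the dipeptide composition feature named in the abstract: the record does not give the authors' exact feature definition, so the following is only a minimal sketch of one common way to compute it. It assumes the 20 standard amino acids, counts overlapping adjacent residue pairs, and normalizes by the number of recognized pairs; the names `AMINO_ACIDS`, `DIPEPTIDES`, and `dipeptide_composition` are hypothetical and not taken from ExoPred.

```python
from itertools import product

# Assumption: only the 20 standard amino acids are considered.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 400 ordered residue pairs, in a fixed order that defines the vector layout.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
_INDEX = {dp: i for i, dp in enumerate(DIPEPTIDES)}


def dipeptide_composition(sequence):
    """Return a 400-element list of dipeptide frequencies for a protein sequence.

    Overlapping adjacent pairs are counted. Pairs containing a non-standard
    residue (for example 'X') are skipped. Frequencies are normalized by the
    number of counted pairs (an assumption, not the paper's stated method).
    """
    seq = sequence.upper()
    counts = [0] * len(DIPEPTIDES)
    total = 0
    for i in range(len(seq) - 1):
        idx = _INDEX.get(seq[i:i + 2])
        if idx is not None:
            counts[idx] += 1
            total += 1
    if total == 0:
        # Sequence too short or without any recognized pair: all-zero vector.
        return [0.0] * len(DIPEPTIDES)
    return [c / total for c in counts]
```

A vector produced this way for each sequence could then serve as input to a random forest classifier, which is the model family the abstract reports; the choice of library and hyperparameters is not specified in this record.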