LEADER |
02448nam a2200565Ia 4500 |
001 |
10.1186-s12859-021-04360-9 |
008 |
220427s2021 CNT 000 0 und d |
020 |
|
|
|a 14712105 (ISSN)
|
245 |
1 |
0 |
|a Identifying homogeneous subgroups of patients and important features: a topological machine learning approach
|
260 |
|
0 |
|b BioMed Central Ltd
|c 2021
|
856 |
|
|
|z View Fulltext in Publisher
|u https://doi.org/10.1186/s12859-021-04360-9
|
520 |
3 |
|
|a Background: This paper exploits recent developments in topological data analysis to present a pipeline for clustering based on Mapper, an algorithm that reduces complex data into a one-dimensional graph. Results: We present a pipeline to identify and summarise clusters based on statistically significant topological features from a point cloud using Mapper. Conclusions: Key strengths of this pipeline include the integration of prior knowledge to inform the clustering process and the selection of optimal clusters; the use of the bootstrap to restrict the search to robust topological features; the use of machine learning to inspect clusters; and the ability to incorporate mixed data types. Our pipeline can be downloaded under the GNU GPLv3 license at https://github.com/kcl-bhi/mapper-pipeline. © 2021, The Author(s).
|
650 |
0 |
4 |
|a adult
|
650 |
0 |
4 |
|a algorithm
|
650 |
0 |
4 |
|a Algorithms
|
650 |
0 |
4 |
|a article
|
650 |
0 |
4 |
|a bootstrapping
|
650 |
0 |
4 |
|a cluster analysis
|
650 |
0 |
4 |
|a Cluster Analysis
|
650 |
0 |
4 |
|a Clustering
|
650 |
0 |
4 |
|a Clustering algorithms
|
650 |
0 |
4 |
|a Clustering process
|
650 |
0 |
4 |
|a Complex data
|
650 |
0 |
4 |
|a data analysis
|
650 |
0 |
4 |
|a Data Analysis
|
650 |
0 |
4 |
|a Graph algorithms
|
650 |
0 |
4 |
|a human
|
650 |
0 |
4 |
|a Humans
|
650 |
0 |
4 |
|a Important features
|
650 |
0 |
4 |
|a licence
|
650 |
0 |
4 |
|a machine learning
|
650 |
0 |
4 |
|a Machine learning
|
650 |
0 |
4 |
|a Machine Learning
|
650 |
0 |
4 |
|a Machine learning approaches
|
650 |
0 |
4 |
|a Mixed data types
|
650 |
0 |
4 |
|a pipeline
|
650 |
0 |
4 |
|a Pipelines
|
650 |
0 |
4 |
|a Prior knowledge
|
650 |
0 |
4 |
|a Topological data analysis
|
650 |
0 |
4 |
|a Topological features
|
650 |
0 |
4 |
|a Topology
|
700 |
1 |
|
|a Carr, E.
|e author
|
700 |
1 |
|
|a Carrière, M.
|e author
|
700 |
1 |
|
|a Chazal, F.
|e author
|
700 |
1 |
|
|a Iniesta, R.
|e author
|
700 |
1 |
|
|a Michel, B.
|e author
|
773 |
|
|
|t BMC Bioinformatics
|