Clustering for 2D chemical structures

The clustering of chemical structures is important and widely used in several areas of chemoinformatics. A little-discussed aspect of clustering is standardization, which ensures that all descriptors in a chemical representation make a comparable contribution to the measurement of similarity. The initial st...
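
As a rough illustration of why standardization affects similarity measurement, the sketch below rescales each descriptor column before computing distances. It assumes z-score scaling purely as a familiar example; this record does not say which seven procedures the thesis compared. The descriptor values and the function names `zscore_columns` and `euclidean` are hypothetical.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Rescale each descriptor (column) to mean 0 and unit standard deviation."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stdevs = [pstdev(col) for col in columns]
    return [
        # A constant column carries no information, so map it to 0.0.
        [(value - m) / s if s > 0 else 0.0
         for value, m, s in zip(row, means, stdevs)]
        for row in rows
    ]


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


# Hypothetical descriptors per molecule: (molecular weight, hydrogen-bond donors).
molecules = [
    [180.0, 1.0],
    [182.0, 5.0],
    [420.0, 1.0],
]

scaled = zscore_columns(molecules)

# Unscaled, the wide-ranging weight column dominates the distance.
raw_01 = euclidean(molecules[0], molecules[1])
raw_02 = euclidean(molecules[0], molecules[2])

# After scaling, both descriptors contribute on a comparable footing.
scaled_01 = euclidean(scaled[0], scaled[1])
scaled_02 = euclidean(scaled[0], scaled[2])

print(f"raw:    d(0,1)={raw_01:.2f}  d(0,2)={raw_02:.2f}")
print(f"scaled: d(0,1)={scaled_01:.2f}  d(0,2)={scaled_02:.2f}")
```

In the unscaled data, molecule 0 looks far closer to molecule 1 than to molecule 2 simply because molecular weight spans a much larger numeric range than the donor count. After scaling, the two distances are of similar size, so a clustering method would weigh both descriptors comparably.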


Bibliographic Details
Main Author: Chu, Chia-Wei
Published: University of Sheffield 2011
Subjects:
541
Online Access: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.531140
id ndltd-bl.uk-oai-ethos.bl.uk-531140
record_format oai_dc
spelling ndltd-bl.uk-oai-ethos.bl.uk-531140 | 2016-06-21T03:28:07Z | Clustering for 2D chemical structures | Chu, Chia-Wei | 2011 | 541 | University of Sheffield | http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.531140 | http://etheses.whiterose.ac.uk/12810/ | Electronic Thesis or Dissertation
collection NDLTD
sources NDLTD
topic 541
description The clustering of chemical structures is important and widely used in several areas of chemoinformatics. A little-discussed aspect of clustering is standardization, which ensures that all descriptors in a chemical representation make a comparable contribution to the measurement of similarity. The initial study compares the effectiveness of seven previously suggested standardization procedures; the results were also compared with those for unstandardized datasets. No single standardization method consistently offered the best performance. Comparative studies of clustering effectiveness help to establish the suitability of different methods and to provide guidelines for their use. To examine the suitability of different clustering methods for chemoinformatics applications, especially methods that had not previously been applied in the field, the second study compares the effectiveness of nine clustering methods. The results suggest that no single clustering method is likely to provide the best partition consistently under all circumstances. Consensus clustering combines multiple input partitions of the same set of objects into a single clustering that is expected to provide a more robust and more generally effective representation of the submitted partitions. The third study reports the use of seven consensus clustering methods that had not previously been applied to sets of chemical compounds represented by 2D fingerprints. Their effectiveness was compared with that of some of the traditional clustering methods discussed in the second study. No consensus clustering method was found to be consistently best.
author Chu, Chia-Wei
title Clustering for 2D chemical structures
publisher University of Sheffield
publishDate 2011
url http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.531140