Ligand-based Methods for Data Management and Modelling

Drug discovery is a complicated and expensive process in the billion-dollar range. One way of making the drug development process more efficient is better information handling, modelling and visualisation. The majority of today's drugs are small molecules, which interact with drug targets to cause an...

Bibliographic Details
Main Author:	Alvarsson, Jonathan
Format:	Doctoral Thesis
Language:	English
Published:	Uppsala universitet, Institutionen för farmaceutisk biovetenskap 2015
Subjects:	QSAR ligand-based drug discovery bioclipse information system cheminformatics bioinformatics
Online Access:	http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-248964 http://nbn-resolving.de/urn:isbn:978-91-554-9237-3

id	ndltd-UPSALLA1-oai-DiVA.org-uu-248964
record_format	oai_dc
spelling	ndltd-UPSALLA1-oai-DiVA.org-uu-248964 | 2015-07-08T04:51:00Z | Ligand-based Methods for Data Management and Modelling | eng | Alvarsson, Jonathan | Uppsala universitet, Institutionen för farmaceutisk biovetenskap | Uppsala : Acta Universitatis Upsaliensis | 2015 | QSAR | ligand-based drug discovery | bioclipse | information system | cheminformatics | bioinformatics | Doctoral thesis, comprehensive summary | info:eu-repo/semantics/doctoralThesis | text | http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-248964 | urn:isbn:978-91-554-9237-3 | Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Pharmacy, 1651-6192 ; 200 | application/pdf | info:eu-repo/semantics/openAccess
collection	NDLTD
language	English
format	Doctoral Thesis
sources	NDLTD
topic	QSAR ligand-based drug discovery bioclipse information system cheminformatics bioinformatics
spellingShingle	QSAR ligand-based drug discovery bioclipse information system cheminformatics bioinformatics Alvarsson, Jonathan Ligand-based Methods for Data Management and Modelling
description	Drug discovery is a complicated and expensive process in the billion-dollar range. One way of making the drug development process more efficient is better information handling, modelling and visualisation. The majority of today's drugs are small molecules, which interact with drug targets to cause an effect. Since the 1980s, large numbers of compounds have been systematically tested by robots in so-called high-throughput screening. Ligand-based drug discovery is based on modelling drug molecules. In the field known as Quantitative Structure–Activity Relationship (QSAR), molecules are described by molecular descriptors, which are used for building mathematical models. Based on these models, molecular properties can be predicted, and using the molecular descriptors, molecules can be compared for, e.g., similarity. Bioclipse is a workbench for the life sciences which provides ligand-based tools through a point-and-click interface. The aims of this thesis were to research and develop new or improved ligand-based methods and open source software, and to work towards making these tools available for users through the Bioclipse workbench. To this end, a series of molecular signature studies was done and various Bioclipse plugins were developed. An introduction to the field is provided in the thesis summary, which is followed by five research papers. Paper I describes the Bioclipse 2 software and the Bioclipse scripting language. In Paper II, the laboratory information system Brunn for supporting work with dose-response studies on microtiter plates is described. In Paper III, the creation of a molecular fingerprint based on the molecular signature descriptor is presented, and the new fingerprints are evaluated for target prediction and found to perform on par with industry-standard commercial molecular fingerprints. In Paper IV, the effect of different parameter choices when using the signature fingerprint together with support vector machines (SVM) using the radial basis function (RBF) kernel is explored, and reasonable default values are found. In Paper V, the performance of SVM-based QSAR using large datasets with the molecular signature descriptor is studied, and a QSAR model based on 1.2 million substances is created and made available from the Bioclipse workbench.
author	Alvarsson, Jonathan
title	Ligand-based Methods for Data Management and Modelling
title_sort	ligand-based methods for data management and modelling
publisher	Uppsala universitet, Institutionen för farmaceutisk biovetenskap
publishDate	2015
url	http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-248964 http://nbn-resolving.de/urn:isbn:978-91-554-9237-3

Similar Items