Ligand-based Methods for Data Management and Modelling

Drug discovery is a complicated and expensive process in the billion-dollar range. One way of making the drug development process more efficient is better information handling, modelling and visualisation. The majority of today's drugs are small molecules, which interact with drug targets to cause an...


Bibliographic Details
Main Author: Alvarsson, Jonathan
Format: Doctoral Thesis
Language: English
Published: Uppsala universitet, Institutionen för farmaceutisk biovetenskap 2015
Subjects: QSAR; ligand-based drug discovery; bioclipse; information system; cheminformatics; bioinformatics
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-248964
http://nbn-resolving.de/urn:isbn:978-91-554-9237-3
id ndltd-UPSALLA1-oai-DiVA.org-uu-248964
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-uu-248964 | 2015-07-08T04:51:00Z | Ligand-based Methods for Data Management and Modelling | eng | Alvarsson, Jonathan | Uppsala universitet, Institutionen för farmaceutisk biovetenskap | Uppsala : Acta Universitatis Upsaliensis | 2015 | QSAR | ligand-based drug discovery | bioclipse | information system | cheminformatics | bioinformatics | Doctoral thesis, comprehensive summary | info:eu-repo/semantics/doctoralThesis | text | http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-248964 | urn:isbn:978-91-554-9237-3 | Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Pharmacy, 1651-6192 ; 200 | application/pdf | info:eu-repo/semantics/openAccess
collection NDLTD
sources NDLTD
topic QSAR
ligand-based drug discovery
bioclipse
information system
cheminformatics
bioinformatics
description Drug discovery is a complicated and expensive process in the billion-dollar range. One way of making the drug development process more efficient is better information handling, modelling and visualisation. The majority of today's drugs are small molecules, which interact with drug targets to cause an effect. Since the 1980s, large numbers of compounds have been systematically tested by robots in so-called high-throughput screening. Ligand-based drug discovery is based on modelling drug molecules. In the field known as Quantitative Structure–Activity Relationship (QSAR), molecules are described by molecular descriptors, which are used for building mathematical models. Based on these models, molecular properties can be predicted, and using the molecular descriptors, molecules can be compared for, e.g., similarity. Bioclipse is a workbench for the life sciences which provides ligand-based tools through a point-and-click interface. The aims of this thesis were to research and develop new or improved ligand-based methods and open-source software, and to work towards making these tools available for users through the Bioclipse workbench. To this end, a series of molecular signature studies was done and various Bioclipse plugins were developed. An introduction to the field is provided in the thesis summary, which is followed by five research papers. Paper I describes the Bioclipse 2 software and the Bioclipse scripting language. Paper II describes Brunn, a laboratory information system for supporting work with dose-response studies on microtiter plates. Paper III presents the creation of a molecular fingerprint based on the molecular signature descriptor; the new fingerprints are evaluated for target prediction and found to perform on par with industry-standard commercial molecular fingerprints. Paper IV explores the effect of different parameter choices when using the signature fingerprint together with support vector machines (SVM) with the radial basis function (RBF) kernel, and finds reasonable default values. Paper V studies the performance of SVM-based QSAR using large datasets with the molecular signature descriptor; a QSAR model based on 1.2 million substances is created and made available from the Bioclipse workbench.