VariED: an integrated database of variants and gene expression profiles for heart diseases

Bibliographic Details
Main Authors: Li-Mei Chiang, 姜莉玫
Other Authors: Eric Y. Chuang
Format: Others
Language: en_US
Published: 2018
Online Access: http://ndltd.ncl.edu.tw/handle/t8m4dg
Description
Summary: Master's thesis === National Taiwan University === Graduate Institute of Biomedical Electronics and Bioinformatics === 107 === Heart disease is among the top ten causes of death worldwide, and its cost is increasing year by year. To improve the understanding of heart diseases, more and more research effort has been devoted to them. However, it is difficult to collect heart tissue directly from human patients, and gene expression profiles obtained from other tissues may differ from those of the heart. As a result, a variant identified as pathogenic may lie in a gene that is not expressed in heart tissue. To overcome this problem and help researchers analyze the relationships among variants, populations, and heart diseases, we developed VariED, a comprehensive web-based database for heart diseases that integrates variants and tissue-based expression profiles in the heart from three species: human, mouse, and zebrafish. VariED provides two major functions, Expression Profiles and Variants Search. The former is used to query gene information and confirm whether the target gene is expressed in heart tissue; the latter is used to obtain more detailed information on variants of interest. In addition, population allele frequencies from the 1000 Genomes Project, the National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project (ESP), the Integrative Japanese Genome Variation Database (IJGVD), and Taiwan Biobank are included. We also collected REVEL, GERP++, and CADD scores, which can help elucidate the functional roles of variants of interest in diseases. Subsequently, an index scoring system was implemented in VariED. What distinguishes this scoring system is that it treats the tissue-based gene expression level as an important factor in the prediction.
Lastly, to help researchers identify causative variants in diseases, we integrated ClinVar, a public database that collects associations between DNA variants and diseases. In this thesis, we use several examples to show the potential applications of VariED. For example, we successfully identified a gene that is not expressed in heart tissue, and we analyzed three Brugada syndrome-related variants to demonstrate how VariED can be used to find pathogenic variants. We believe VariED not only saves researchers time when querying data, but also helps users identify important DNA variants related to diseases.