Corpus construction based on Ontological domain knowledge

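The purpose of this thesis is to contribute a corpus for sentence-level interpretation of biomedical language. The available corpora for the biomedical domain are small in terms of both the amount of text and the number of predicates. In addition, these corpora have been developed rather intuitively. In this effort, which we call BioOntoFN, we created a corpus from the domain knowledge provided by an ontology. By doing this, we believe that we can provide a rough set of rules for creating corpora from ontologies. We also designed an annotation tool specifically for building our corpus. We built a corpus for biological transport events. The ontology we used is the part of the Gene Ontology pertaining to transport: the term transport (GO:0006810) and all of its child concepts, which could be called a sub-ontology. The annotation of the corpus follows the rules of FrameNet, and the output is annotated text in an XML format similar to that of FrameNet. The text for the corpus is taken from abstracts of MEDLINE articles. The annotation tool is a GUI written in Java.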

Bibliographic Details
Main Authors: Benis, Nirupama, Kaliyaperumal, Rajaram
Format: Others
Language: English
Published: Linköpings universitet, Institutionen för datavetenskap 2011
Subjects: Text mining; Biomedical text mining; Natural Language Processing
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-71851
id ndltd-UPSALLA1-oai-DiVA.org-liu-71851
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-liu-71851
2013-01-08T13:50:49Z
Corpus construction based on Ontological domain knowledge
eng
Benis, Nirupama
Kaliyaperumal, Rajaram
Linköpings universitet, Institutionen för datavetenskap
2011
Text mining
Biomedical text mining
Natural Language Processing
Student thesis
info:eu-repo/semantics/bachelorThesis
text
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-71851
application/pdf
info:eu-repo/semantics/openAccess
collection NDLTD
language English
format Others
sources NDLTD
topic Text mining
Biomedical text mining
Natural Language Processing
spellingShingle Text mining
Biomedical text mining
Natural Language Processing
Benis, Nirupama
Kaliyaperumal, Rajaram
Corpus construction based on Ontological domain knowledge
description The purpose of this thesis is to contribute a corpus for sentence-level interpretation of biomedical language. The available corpora for the biomedical domain are small in terms of both the amount of text and the number of predicates. In addition, these corpora have been developed rather intuitively. In this effort, which we call BioOntoFN, we created a corpus from the domain knowledge provided by an ontology. By doing this, we believe that we can provide a rough set of rules for creating corpora from ontologies. We also designed an annotation tool specifically for building our corpus. We built a corpus for biological transport events. The ontology we used is the part of the Gene Ontology pertaining to transport: the term transport (GO:0006810) and all of its child concepts, which could be called a sub-ontology. The annotation of the corpus follows the rules of FrameNet, and the output is annotated text in an XML format similar to that of FrameNet. The text for the corpus is taken from abstracts of MEDLINE articles. The annotation tool is a GUI written in Java.
author Benis, Nirupama
Kaliyaperumal, Rajaram
author_facet Benis, Nirupama
Kaliyaperumal, Rajaram
author_sort Benis, Nirupama
title Corpus construction based on Ontological domain knowledge
title_short Corpus construction based on Ontological domain knowledge
title_full Corpus construction based on Ontological domain knowledge
title_fullStr Corpus construction based on Ontological domain knowledge
title_full_unstemmed Corpus construction based on Ontological domain knowledge
title_sort corpus construction based on ontological domain knowledge
publisher Linköpings universitet, Institutionen för datavetenskap
publishDate 2011
url http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-71851
work_keys_str_mv AT benisnirupama corpusconstructionbasedonontologicaldomainknowledge
AT kaliyaperumalrajaram corpusconstructionbasedonontologicaldomainknowledge
_version_ 1716530453661351936