Use of Ontologies for Data Integration and Curation

Data curation includes the goal of facilitating the re-use and combination of datasets, which is often impeded by incompatible data schema. Can we use ontologies to help with data integration? We suggest a semi-automatic process that involves the use of automatic text searching to help identify over...

Bibliographic Details
Main Authors: Judith Gelernter, Michael Lesk
Format: Article
Language: English
Published: University of Edinburgh 2011-03-01
Series: International Journal of Digital Curation
Online Access: http://www.ijdc.net/index.php/ijdc/article/view/164
id doaj-39ebe45d765345f89ea8007033d89ca1
record_format Article
spelling doaj-39ebe45d765345f89ea8007033d89ca1 2020-11-24T22:00:24Z eng University of Edinburgh International Journal of Digital Curation 1746-8256 2011-03-01 6 1 70 78 10.2218/ijdc.v6i1.173 156
collection DOAJ
language English
format Article
sources DOAJ
author Judith Gelernter
Michael Lesk
title Use of Ontologies for Data Integration and Curation
publisher University of Edinburgh
series International Journal of Digital Curation
issn 1746-8256
publishDate 2011-03-01
description Data curation includes the goal of facilitating the re-use and combination of datasets, which is often impeded by incompatible data schema. Can we use ontologies to help with data integration? We suggest a semi-automatic process that involves the use of automatic text searching to help identify overlaps in metadata that accompany data schemas, plus human validation of suggested data matches.
Problems include different text used to describe the same concept, different forms of data recording and different organizations of data. Ontologies can help by focussing attention on important words, providing synonyms to assist matching, and indicating in what context words are used. Beyond ontologies, data on the statistical behavior of data can be used to decide which data elements appear to be compatible with which other data elements. When curating data which may have hundreds or even thousands of data labels, semi-automatic assistance with data fusion should be of great help.
url http://www.ijdc.net/index.php/ijdc/article/view/164
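
The description above outlines a semi-automatic workflow: search the text of data labels for overlaps, use synonyms to bridge different wording, and hand the suggested matches to a person for validation. The following is only a minimal sketch of that idea, not the authors' method. The synonym table, the function names, the Jaccard similarity measure and the 0.5 threshold are all assumptions made for illustration; a real system would presumably draw its synonyms from an actual ontology.

```python
import re

# Hypothetical synonym table standing in for an ontology lookup (assumption).
SYNONYMS = {
    "temp": "temperature",
    "precipitation": "rainfall",
    "lat": "latitude",
    "lon": "longitude",
}


def normalize(label):
    """Lowercase a label, split it into tokens, and map synonyms to one term."""
    tokens = re.findall(r"[a-z0-9]+", label.lower())
    return {SYNONYMS.get(token, token) for token in tokens}


def jaccard(a, b):
    """Return the Jaccard overlap of two token sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def suggest_matches(labels_a, labels_b, threshold=0.5):
    """Return candidate (label_a, label_b, score) triples for human review.

    The threshold is an arbitrary illustrative cut-off, not a recommended value.
    """
    suggestions = []
    for a in labels_a:
        tokens_a = normalize(a)
        for b in labels_b:
            score = jaccard(tokens_a, normalize(b))
            if score >= threshold:
                suggestions.append((a, b, round(score, 2)))
    # Highest-scoring suggestions first; a curator accepts or rejects each one.
    return sorted(suggestions, key=lambda s: -s[2])


if __name__ == "__main__":
    schema_a = ["air_temp", "lat"]
    schema_b = ["Air Temperature", "latitude", "station id"]
    for suggestion in suggest_matches(schema_a, schema_b):
        print(suggestion)
```

The output is deliberately a list of suggestions rather than a final mapping, so that the human validation step described in the abstract remains the point at which matches are accepted.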