Use of Ontologies for Data Integration and Curation

Bibliographic Details
Main Authors: Judith Gelernter, Michael Lesk
Format: Article
Language: English
Published: University of Edinburgh 2011-03-01
Series: International Journal of Digital Curation
Online Access: http://www.ijdc.net/index.php/ijdc/article/view/164
Description
Summary: Data curation includes the goal of facilitating the re-use and combination of datasets, which is often impeded by incompatible data schema. Can we use ontologies to help with data integration? We suggest a semi-automatic process that involves the use of automatic text searching to help identify overlaps in metadata that accompany data schemas, plus human validation of suggested data matches.

Problems include different text used to describe the same concept, different forms of data recording and different organizations of data. Ontologies can help by focussing attention on important words, providing synonyms to assist matching, and indicating in what context words are used. Beyond ontologies, data on the statistical behavior of data can be used to decide which data elements appear to be compatible with which other data elements. When curating data which may have hundreds or even thousands of data labels, semi-automatic assistance with data fusion should be of great help.
ISSN:1746-8256
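
Illustrative note: the semi-automatic matching described in the summary could be sketched roughly as follows. This is not the authors' implementation. The synonym groups stand in for an ontology lookup, and the function names, the token-overlap (Jaccard) score, and the 0.5 threshold are all assumptions chosen for illustration. The output is a ranked list of candidate label pairs intended for human validation, not a final mapping.

```python
import re

# Hypothetical synonym groups standing in for an ontology lookup (assumption).
SYNONYMS = [
    {"sex", "gender"},
    {"weight", "mass"},
    {"dob", "birthdate"},
]


def tokens(label):
    """Split a data label into a set of lowercase word tokens."""
    return set(re.findall(r"[a-z]+", label.lower()))


def canonical(token):
    """Map a token to one fixed representative of its synonym group, if any."""
    for group in SYNONYMS:
        if token in group:
            return min(group)  # alphabetically first member as representative
    return token


def suggest_matches(labels_a, labels_b, threshold=0.5):
    """Return (label_a, label_b, score) candidates, best first, for human review."""
    suggestions = []
    for a in labels_a:
        ta = {canonical(t) for t in tokens(a)}
        for b in labels_b:
            tb = {canonical(t) for t in tokens(b)}
            union = ta | tb
            if not union:
                continue  # neither label contains any word tokens
            score = len(ta & tb) / len(union)  # Jaccard overlap of tokens
            if score >= threshold:
                suggestions.append((a, b, score))
    return sorted(suggestions, key=lambda s: -s[2])


if __name__ == "__main__":
    schema_a = ["patient_sex", "body weight", "visit_date"]
    schema_b = ["Gender of patient", "mass_kg", "site"]
    for a, b, score in suggest_matches(schema_a, schema_b):
        print(f"{a!r} <-> {b!r} (score {score:.2f}): confirm manually")
```

In this sketch, lowering the threshold surfaces weaker candidates (such as "body weight" and "mass_kg", which share only one synonym-normalized token), at the cost of more pairs for the human reviewer to reject.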