Child prime label approaches to evaluate XML structured queries

The adoption of the eXtensible Markup Language (XML) as the standard format to store and exchange semi-structured data has been gaining momentum. The growing number of XML documents leads to the need for appropriate XML querying algorithms which are able to retrieve XML data efficiently. Due to the i...

Bibliographic Details
Main Author: Alsubai, Shtwai
Other Authors: North, Siobhan
Published: University of Sheffield 2018
Subjects:
004
Online Access: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.736570
id ndltd-bl.uk-oai-ethos.bl.uk-736570
record_format oai_dc
spelling ndltd-bl.uk-oai-ethos.bl.uk-736570; 2019-03-05T15:39:32Z; Child prime label approaches to evaluate XML structured queries; Alsubai, Shtwai; North, Siobhan; 2018; 004; University of Sheffield; https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.736570; http://etheses.whiterose.ac.uk/19459/; Electronic Thesis or Dissertation
collection NDLTD
sources NDLTD
topic 004
spellingShingle 004
Alsubai, Shtwai
Child prime label approaches to evaluate XML structured queries
description The adoption of the eXtensible Markup Language (XML) as the standard format to store and exchange semi-structured data has been gaining momentum. The growing number of XML documents leads to the need for appropriate XML querying algorithms which are able to retrieve XML data efficiently. Due to the importance of twig pattern matching in XML retrieval systems, finding all matching occurrences of a tree pattern query in an XML document is often considered a specific task for XML databases as well as a core operation in XML query processing. This thesis presents the design and implementation of a new indexing technique, called the Child Prime Label (CPL), which exploits the property of prime numbers to identify Parent-Child (P-C) edges in twig pattern queries (TPQs) during query evaluation. The CPL approach can be incorporated efficiently within existing labelling schemes. The major contributions of this thesis are a set of novel twig matching algorithms that apply the CPL approach and focus on reducing the overhead of storing useless elements and performing unnecessary computations during output enumeration. The research presented here is the first to provide an efficient and general solution for TPQs containing ordering constraints and positional predicates specified by the XML query languages. To evaluate the CPL approaches, the holistic model was implemented as an experimental prototype in which the proposed approaches are compared against state-of-the-art holistic twig algorithms. Extensive performance studies on various real-world and artificial datasets were conducted to demonstrate the significant improvement of the CPL approaches over previous indexing and querying methods. The experimental results demonstrate the validity and improvements of the new algorithms over other related methods on various common subclasses of TPQs. Moreover, the scalability tests reveal that the new algorithms are more suitable for processing large XML datasets.
author2 North, Siobhan
author_facet North, Siobhan
Alsubai, Shtwai
author Alsubai, Shtwai
author_sort Alsubai, Shtwai
title Child prime label approaches to evaluate XML structured queries
title_short Child prime label approaches to evaluate XML structured queries
title_full Child prime label approaches to evaluate XML structured queries
title_fullStr Child prime label approaches to evaluate XML structured queries
title_full_unstemmed Child prime label approaches to evaluate XML structured queries
title_sort child prime label approaches to evaluate xml structured queries
publisher University of Sheffield
publishDate 2018
url https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.736570
work_keys_str_mv AT alsubaishtwai childprimelabelapproachestoevaluatexmlstructuredqueries
_version_ 1718995796593999872
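
The description above says only that CPL "exploits the property of prime numbers to identify Parent-Child (P-C) edges"; the record does not give the construction. The sketch below is therefore an illustration under stated assumptions, not the thesis's algorithm: it assumes each distinct tag name is mapped to its own prime and each element is labelled with the product of the primes of its distinct child tags, so that a P-C test reduces to a divisibility check. All function names (`assign_tag_primes`, `child_prime_labels`, `has_child_with_tag`) and the sample document are hypothetical.

```python
from itertools import count
import xml.etree.ElementTree as ET


def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for a small tag vocabulary)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


def assign_tag_primes(root):
    """ASSUMPTION: give every distinct tag name its own prime, in document order."""
    gen = primes()
    tag_prime = {}
    for elem in root.iter():
        if elem.tag not in tag_prime:
            tag_prime[elem.tag] = next(gen)
    return tag_prime


def child_prime_labels(root, tag_prime):
    """ASSUMPTION: an element's label is the product of its distinct child-tag primes."""
    labels = {}
    for elem in root.iter():
        label = 1
        for tag in {child.tag for child in elem}:
            label *= tag_prime[tag]
        labels[elem] = label
    return labels


def has_child_with_tag(elem, tag, labels, tag_prime):
    """P-C test by divisibility; a tag absent from the document never matches."""
    p = tag_prime.get(tag)
    return p is not None and labels[elem] % p == 0


# Hypothetical sample document, not one of the thesis datasets.
doc = ET.fromstring(
    "<book><title/><author><name/></author><author><name/></author></book>"
)
tag_prime = assign_tag_primes(doc)
labels = child_prime_labels(doc, tag_prime)

print(has_child_with_tag(doc, "author", labels, tag_prime))
print(has_child_with_tag(doc, "name", labels, tag_prime))
```

In this sketch `name` is a descendant of `book` but not a child, so the second check fails while the first succeeds; this is the kind of P-C versus ancestor-descendant distinction that a twig pattern query has to make. The products grow quickly with the number of distinct child tags, which a practical scheme would need to address.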