Automating User-Centered Design of Data-Intensive Processes

Business Intelligence (BI) enables organizations to collect and analyze internal and external business data to generate knowledge and business value, and provide decision support at the strategic, tactical, and operational levels. The consolidation of data coming from many sources as a result of man...

Bibliographic Details
Main Author: Theodorou, Vasileios
Other Authors: Technische Universität Dresden, Fakultät Informatik
Format: Doctoral Thesis
Language: English
Published: Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden 2017
Subjects:
ETL
Online Access: http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-229974
http://www.qucosa.de/fileadmin/data/qucosa/documents/22997/vasileios-thesis%281%29.pdf
id ndltd-DRESDEN-oai-qucosa.de-bsz-14-qucosa-229974
record_format oai_dc
collection NDLTD
language English
format Doctoral Thesis
sources NDLTD
topic ETL
process quality
quality measures
user-centered design
ddc:004
rvk:ST 265
spellingShingle ETL
process quality
quality measures
user-centered design
ddc:004
rvk:ST 265
Theodorou, Vasileios
Automating User-Centered Design of Data-Intensive Processes
description Business Intelligence (BI) enables organizations to collect and analyze internal and external business data to generate knowledge and business value, and to provide decision support at the strategic, tactical, and operational levels. The consolidation of data coming from many sources as a result of managerial and operational business processes, usually referred to as Extract-Transform-Load (ETL), is itself a statically defined process, and knowledge workers have little to no control over the characteristics of the presentable data to which they have access. There are two main reasons that dictate the reassessment of this rigid approach in the context of modern business environments. The first reason is that the service-oriented nature of today’s business, combined with the increasing volume of available data, makes it impossible for an organization to proactively design efficient data management processes. The second reason is that enterprises can benefit significantly from analyzing the behavior of their business processes, fostering their optimization. Hence, we took a first step towards quality-aware ETL process design automation by defining, through a systematic literature review, a set of ETL process quality characteristics and the relationships between them, as well as by providing quantitative measures for each characteristic. Subsequently, we produced a model that represents ETL process quality characteristics and the dependencies among them, and we showcased, through the application of a Goal Model with quantitative components (i.e., indicators), how our model can provide the basis for subsequent analysis to reason about and make informed ETL design decisions. In addition, we introduced our holistic view for a quality-aware design of ETL processes by presenting a framework for user-centered declarative ETL.
This included the definition of an architecture and methodology for the rapid, incremental, qualitative improvement of ETL process models, promoting automation and reducing complexity, as well as a clear separation of business users and IT roles, where each user is presented with appropriate views and assigned fitting tasks. In this direction, we built a tool, POIESIS, which facilitates incremental, quantitative improvement of ETL process models, with users being the key participants through well-defined collaborative interfaces. For evaluating different quality characteristics of the ETL process design, we proposed an automated data generation framework for evaluating ETL processes (i.e., Bijoux). To this end, we classified the operations based on the part of input data they access for processing, which helped Bijoux during data generation both to identify the constraints that specific operation semantics imply over input data and to decide at which level the data should be generated (e.g., single field, single tuple, complete dataset). Bijoux offers data generation capabilities in a modular and configurable manner, which can be used to evaluate the quality of different parts of an ETL process. Moreover, we introduced a methodology that can apply to concrete contexts, building a repository of patterns and rules. This generated knowledge base can be used during the design and maintenance phases of ETL processes, automatically exposing understandable conceptual representations of the processes and providing useful insight for design decisions. Collectively, these contributions have raised the level of abstraction of ETL process components, revealing their quality characteristics at a granular level and allowing for evaluation and automated (re-)design, taking into consideration business users’ quality goals.
author2 Technische Universität Dresden, Fakultät Informatik
author_facet Technische Universität Dresden, Fakultät Informatik
Theodorou, Vasileios
author Theodorou, Vasileios
author_sort Theodorou, Vasileios
title Automating User-Centered Design of Data-Intensive Processes
title_short Automating User-Centered Design of Data-Intensive Processes
title_full Automating User-Centered Design of Data-Intensive Processes
title_fullStr Automating User-Centered Design of Data-Intensive Processes
title_full_unstemmed Automating User-Centered Design of Data-Intensive Processes
title_sort automating user-centered design of data-intensive processes
publisher Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden
publishDate 2017
url http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-229974
http://www.qucosa.de/fileadmin/data/qucosa/documents/22997/vasileios-thesis%281%29.pdf
work_keys_str_mv AT theodorouvasileios automatingusercentereddesignofdataintensiveprocesses
_version_ 1718560595083526144
spelling ndltd-DRESDEN-oai-qucosa.de-bsz-14-qucosa-229974 2017-11-09T03:27:11Z Automating User-Centered Design of Data-Intensive Processes Theodorou, Vasileios ETL process quality quality measures user-centered design ddc:004 rvk:ST 265 Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden Technische Universität Dresden, Fakultät Informatik Prof. Dr. Wolfgang Lehner Prof. Dr. Alberto Abello 2017-11-08 doc-type:doctoralThesis application/pdf http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-229974 urn:nbn:de:bsz:14-qucosa-229974 http://www.qucosa.de/fileadmin/data/qucosa/documents/22997/vasileios-thesis%281%29.pdf eng