Dataflow-Driven Crowdsourcing: Relational Models and Algorithms

Recently, microtask crowdsourcing has become a popular approach for addressing various data mining problems. Crowdsourcing workflows for approaching such problems are composed of several data processing stages which require consistent representation for making the work reproducible. This paper is de...

Bibliographic Details
Main Author:	D. A. Ustalov
Format:	Article
Language:	English
Published:	Yaroslavl State University 2016-04-01
Series:	Modelirovanie i Analiz Informacionnyh Sistem
Subjects:	crowdsourcing; dataflow model; relational model; computational linguistics
Online Access:	https://www.mais-journal.ru/jour/article/view/329

id	doaj-f1c752ca9cab43e3ad1639fa6bb8a023
record_format	Article
spelling	doaj-f1c752ca9cab43e3ad1639fa6bb8a023; 2021-07-29T08:15:21Z; eng; Yaroslavl State University; Modelirovanie i Analiz Informacionnyh Sistem; 1818-1015; 2313-5417; 2016-04-01; 23; 2; 195; 210; 10.18255/1818-1015-2016-2-195-210; 291; Dataflow-Driven Crowdsourcing: Relational Models and Algorithms; D. A. Ustalov; 0; N.N. Krasovskii Institute of Mathematics and Mechanics of the Ural Branch of the Russian Academy of Sciences, Sofia Kovalevskaya str., 16, Yekaterinburg, 620990, Russia; Recently, microtask crowdsourcing has become a popular approach for addressing various data mining problems. Crowdsourcing workflows for approaching such problems are composed of several data processing stages which require consistent representation for making the work reproducible. This paper is devoted to the problem of reproducibility and formalization of the microtask crowdsourcing process. A computational model for microtask crowdsourcing based on an extended relational model and a dataflow computational model has been proposed. The proposed collaborative dataflow computational model is designed for processing the input data sources by executing annotation stages and automatic synchronization stages simultaneously. Data processing stages and connections between them are expressed by using collaborative computation workflows represented as loosely connected directed acyclic graphs. A synchronous algorithm for executing such workflows has been described. The computational model has been evaluated by applying it to two tasks from the computational linguistics field: concept lexicalization refining in electronic thesauri and establishing hierarchical relations between such concepts. The “Add–Remove–Confirm” procedure is designed for adding the missing lexemes to the concepts while removing the odd ones. The “Genus–Species–Match” procedure is designed for establishing “is-a” relations between the concepts provided with the corresponding word pairs. The experiments involving both volunteers from popular online social networks and paid workers from crowdsourcing marketplaces confirm applicability of these procedures for enhancing lexical resources.; https://www.mais-journal.ru/jour/article/view/329; crowdsourcing; dataflow model; relational model; computational linguistics
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	D. A. Ustalov
spellingShingle	D. A. Ustalov; Dataflow-Driven Crowdsourcing: Relational Models and Algorithms; Modelirovanie i Analiz Informacionnyh Sistem; crowdsourcing; dataflow model; relational model; computational linguistics
author_facet	D. A. Ustalov
author_sort	D. A. Ustalov
title	Dataflow-Driven Crowdsourcing: Relational Models and Algorithms
title_short	Dataflow-Driven Crowdsourcing: Relational Models and Algorithms
title_full	Dataflow-Driven Crowdsourcing: Relational Models and Algorithms
title_fullStr	Dataflow-Driven Crowdsourcing: Relational Models and Algorithms
title_full_unstemmed	Dataflow-Driven Crowdsourcing: Relational Models and Algorithms
title_sort	dataflow-driven crowdsourcing: relational models and algorithms
publisher	Yaroslavl State University
series	Modelirovanie i Analiz Informacionnyh Sistem
issn	1818-1015 2313-5417
publishDate	2016-04-01
description	Microtask crowdsourcing has recently become a popular approach to various data mining problems. Crowdsourcing workflows for such problems consist of several data processing stages, which need a consistent representation to make the work reproducible. This paper addresses the reproducibility and formalization of the microtask crowdsourcing process. It proposes a computational model for microtask crowdsourcing based on an extended relational model and a dataflow computational model. The proposed collaborative dataflow computational model processes the input data sources by executing annotation stages and automatic synchronization stages simultaneously. Data processing stages and the connections between them are expressed as collaborative computation workflows, represented as loosely connected directed acyclic graphs. A synchronous algorithm for executing such workflows is described. The computational model is evaluated on two tasks from computational linguistics: refining concept lexicalizations in electronic thesauri and establishing hierarchical relations between such concepts. The “Add–Remove–Confirm” procedure adds missing lexemes to concepts while removing odd ones. The “Genus–Species–Match” procedure establishes “is-a” relations between concepts that are provided with the corresponding word pairs. Experiments involving both volunteers from popular online social networks and paid workers from crowdsourcing marketplaces confirm the applicability of these procedures to enhancing lexical resources.
topic	crowdsourcing; dataflow model; relational model; computational linguistics
url	https://www.mais-journal.ru/jour/article/view/329
work_keys_str_mv	AT daustalov dataflowdrivencrowdsourcingrelationalmodelsandalgorithms
_version_	1721256528485285888

Similar Items