Efficient processing and classification of wave energy spectrum data with a distributed pipeline

Processing large amounts of data often consists of several steps, e.g. pre- and post-processing stages, which are executed sequentially with data written to disk after each step. However, when the pre-processing stage differs for each task, a more efficient approach is to construc...

Bibliographic Details
Main Authors: I. G. Gankevich, A. B. Degtyarev
Format: Article
Language: Russian
Published: Institute of Computer Science 2015-06-01
Series: Компьютерные исследования и моделирование
Subjects: distributed system; big data; data processing; parallel computing
Online Access: http://crm.ics.org.ru/uploads/crmissues/crm_2015_3/15718.pdf
id doaj-9f93b81b100d45369ad62e598b6aa92b
record_format Article
spelling doaj-9f93b81b100d45369ad62e598b6aa92b | 2020-11-24T21:30:42Z | rus | Institute of Computer Science | Компьютерные исследования и моделирование | 2076-7633 | 2077-6853 | 2015-06-01 | 7 | 3 | 517 | 520 | 10.20537/2076-7633-2015-7-3-517-520 | 2301 | Efficient processing and classification of wave energy spectrum data with a distributed pipeline | I. G. Gankevich | A. B. Degtyarev | Processing large amounts of data often consists of several steps, e.g. pre- and post-processing stages, which are executed sequentially with data written to disk after each step. However, when the pre-processing stage differs for each task, a more efficient approach is to construct a pipeline that streams data from one stage to another. In the more general case, some processing stages can be factored into several parallel subordinate stages, forming a distributed pipeline in which each stage can have multiple inputs and multiple outputs. This processing pattern emerges in the problem of classifying wave energy spectra based on analytic approximations, which can extract different wave systems and their parameters (e.g. wave system type, mean wave direction) from a spectrum. The distributed pipeline approach achieves good performance compared to conventional "sequential-stage" processing. | http://crm.ics.org.ru/uploads/crmissues/crm_2015_3/15718.pdf | distributed system | big data | data processing | parallel computing
collection DOAJ
language Russian
format Article
sources DOAJ
author I. G. Gankevich
A. B. Degtyarev
title Efficient processing and classification of wave energy spectrum data with a distributed pipeline
publisher Institute of Computer Science
series Компьютерные исследования и моделирование
issn 2076-7633
2077-6853
publishDate 2015-06-01
description Processing large amounts of data often consists of several steps, e.g. pre- and post-processing stages, which are executed sequentially with data written to disk after each step. However, when the pre-processing stage differs for each task, a more efficient approach is to construct a pipeline that streams data from one stage to another. In the more general case, some processing stages can be factored into several parallel subordinate stages, forming a distributed pipeline in which each stage can have multiple inputs and multiple outputs. This processing pattern emerges in the problem of classifying wave energy spectra based on analytic approximations, which can extract different wave systems and their parameters (e.g. wave system type, mean wave direction) from a spectrum. The distributed pipeline approach achieves good performance compared to conventional "sequential-stage" processing.
topic distributed system
big data
data processing
parallel computing
url http://crm.ics.org.ru/uploads/crmissues/crm_2015_3/15718.pdf