Weighted Tensor Approximation Based on Apache Spark
Master's === Yuan Ze University === Department of Computer Science and Engineering === 106 === This thesis implements weighted tensor approximation (WTA) on Apache Spark and achieves much shorter processing times than previous work. Unlike traditional tensor approximation (TA), it takes the validity of the input data into consideration during the data compression process...
Main Authors: | Qin Ding 丁琴 |
Other Authors: | Yu-Ting Tsai 蔡侑庭 |
Format: | Others |
Language: | en_US |
Published: | 2017 |
Online Access: | http://ndltd.ncl.edu.tw/handle/2u2b69 |
id |
ndltd-TW-106YZU05392015 |
record_format |
oai_dc |
spelling |
ndltd-TW-106YZU05392015 2019-05-16T00:15:13Z http://ndltd.ncl.edu.tw/handle/2u2b69 Weighted Tensor Approximation Based on Apache Spark 基於Apache Spark平台的權重張量近似法 Qin Ding 丁琴 碩士 元智大學 資訊工程學系 106 Yu-Ting Tsai 蔡侑庭 2017 學位論文 ; thesis 15 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === Yuan Ze University === Department of Computer Science and Engineering === 106 === This thesis implements weighted tensor approximation (WTA) on Apache Spark and achieves much shorter processing times than previous work. Unlike traditional tensor approximation (TA), this thesis takes the validity of the input data into consideration during the data compression process. By giving different weights to valid and invalid data, we can eliminate the impact of invalid data and obtain better approximation results. However, because of the large amount of raw data, running the WTA compression process on a single stand-alone workstation not only takes a long time but also places heavy demands on the hardware. In this thesis, we investigate how to parallelize the compression process to reduce processing time. We choose to implement WTA on Spark, which often delivers better computational performance than other distributed computing platforms such as Apache Hadoop. The feasibility of WTA on Spark is achieved by transforming the original multilinear problem into an ordinary linear one, as sketched below. The input tensor is also partitioned into small blocks to further reduce compression time.
|
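The record above only summarizes the method, so the following is a minimal, illustrative sketch of the two ideas named in the abstract: a 0/1 validity mask weights valid and invalid entries differently, and each factor update turns the multilinear fitting problem into independent weighted linear least-squares problems. A rank-R CP model on a 3-way NumPy tensor is assumed here purely for concreteness; the thesis itself may use a different decomposition (e.g. Tucker-based), and the function names (weighted_cp_als, reconstruct) and parameters (rank, iters) are hypothetical.

```python
import numpy as np

def weighted_cp_als(X, W, rank=4, iters=20, seed=0):
    """Sketch of weighted tensor approximation for a 3-way tensor.
    X : (I, J, K) data; W : (I, J, K) weights (1 = valid entry, 0 = invalid).
    Returns CP factors A (I,R), B (J,R), C (K,R) fitted only to valid entries."""
    I, J, K = X.shape
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))

    def update(mode, F1, F2, n1, n2):
        # Khatri-Rao design matrix: row (p*n2 + q) is F1[p] * F2[q] (elementwise).
        M = np.repeat(F1, n2, axis=0) * np.tile(F2, (n1, 1))
        Xm = np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)  # mode-m unfolding
        Wm = np.moveaxis(W, mode, 0).reshape(W.shape[mode], -1)
        out = np.empty((X.shape[mode], rank))
        for i in range(X.shape[mode]):
            sw = np.sqrt(Wm[i])                                  # weighted LS via sqrt-weights
            out[i], *_ = np.linalg.lstsq(M * sw[:, None], Xm[i] * sw, rcond=None)
        return out

    for _ in range(iters):
        A = update(0, B, C, J, K)  # each row solve is an ordinary linear problem ...
        B = update(1, A, C, I, K)  # ... and the rows are mutually independent,
        C = update(2, A, B, I, J)  # which is what makes distribution straightforward
    return A, B, C

def reconstruct(A, B, C):
    # x_ijk ≈ sum_r A[i,r] * B[j,r] * C[k,r]
    return np.einsum('ir,jr,kr->ijk', A, B, C)
```

Because the per-row solves inside each update are independent, they map naturally onto Spark. The second sketch distributes one such update over an RDD in local mode; the thesis's actual job layout, block partitioning scheme, and cluster configuration are not given in this record, so the sizes and partition count (numSlices=8) below are placeholders.

```python
# Illustrative only: distributing the independent per-row weighted least-squares
# solves of a single factor update across Spark partitions (requires pyspark).
import numpy as np
from pyspark import SparkContext

if __name__ == "__main__":
    sc = SparkContext("local[*]", "wta-sketch")

    rng = np.random.default_rng(0)
    I, J, K, R = 60, 40, 20, 4
    X = rng.random((I, J, K))                        # data tensor
    W = (rng.random((I, J, K)) > 0.2).astype(float)  # validity mask: ~20% marked invalid
    B = rng.standard_normal((J, R))                  # current mode-1 factor
    C = rng.standard_normal((K, R))                  # current mode-2 factor

    # Shared design matrix (row j*K + k is B[j] * C[k]); broadcast it to executors once.
    M = np.repeat(B, K, axis=0) * np.tile(C, (J, 1))
    bM = sc.broadcast(M)

    def solve_row(row):
        # One small weighted least-squares problem per row of the mode-0 unfolding.
        x, w = row
        sw = np.sqrt(w)
        sol, *_ = np.linalg.lstsq(bM.value * sw[:, None], x * sw, rcond=None)
        return sol

    rows = list(zip(X.reshape(I, -1), W.reshape(I, -1)))
    A = np.vstack(sc.parallelize(rows, numSlices=8).map(solve_row).collect())
    print("updated mode-0 factor:", A.shape)
    sc.stop()
```

Partitioning the input tensor into small blocks, as the abstract mentions, would go one step further: each block can be compressed largely on its own, which reduces both per-executor memory and per-task runtime. The exact block size and how block results are merged are design choices of the thesis that this record does not describe.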
author2 |
Yu-Ting Tsai |
author_facet |
Yu-Ting Tsai Qin Ding 丁琴 |
author |
Qin Ding 丁琴 |
spellingShingle |
Qin Ding 丁琴 Weighted Tensor Approximation Based on Apache Spark |
author_sort |
Qin Ding |
title |
Weighted Tensor Approximation Based on Apache Spark |
publishDate |
2017 |
url |
http://ndltd.ncl.edu.tw/handle/2u2b69 |