An Efficient Method of Sharing Mass Spatio-Temporal Trajectory Data Based on Cloudera Impala for Traffic Distribution Mapping in an Urban City

The efficient sharing of spatio-temporal trajectory data is important for understanding traffic congestion from mass data. However, the data volumes of bus networks in urban cities are growing rapidly, reaching daily volumes of one hundred million datapoints. Accessing and retrieving mass spatio-temporal...

Bibliographic Details
Main Authors: Lianjie Zhou, Nengcheng Chen, Sai Yuan, Zeqiang Chen
Format: Article
Language: English
Published: MDPI AG 2016-10-01
Series: Sensors
Subjects: cloud computing; data retrieval; Beidou positioning sensor; spatial-temporal data
Online Access: http://www.mdpi.com/1424-8220/16/11/1813
id doaj-4b9c28cae2a64ecab014f6b1b755522f
record_format Article
spelling doaj-4b9c28cae2a64ecab014f6b1b755522f
2020-11-25T00:38:56Z
eng
MDPI AG
Sensors
1424-8220
2016-10-01
16
11
1813
10.3390/s16111813
s16111813
An Efficient Method of Sharing Mass Spatio-Temporal Trajectory Data Based on Cloudera Impala for Traffic Distribution Mapping in an Urban City
Lianjie Zhou [0]
Nengcheng Chen [1]
Sai Yuan [2]
Zeqiang Chen [3]
[0] State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan 430079, China
[1] State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan 430079, China
[2] State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan 430079, China
[3] State Key Laboratory for Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, 129 Luoyu Road, Wuhan 430079, China
The efficient sharing of spatio-temporal trajectory data is important for understanding traffic congestion from mass data. However, the data volumes of bus networks in urban cities are growing rapidly, reaching daily volumes of one hundred million datapoints. Accessing and retrieving mass spatio-temporal trajectory data in any field is hard and inefficient due to limited computational capabilities and incomplete data organization mechanisms. Therefore, we propose an optimized and efficient spatio-temporal trajectory data retrieval method based on the Cloudera Impala query engine, called ESTRI, to enhance the efficiency of mass data sharing. As a query tool well suited to mass data, Impala can be applied to mass spatio-temporal trajectory data sharing. In ESTRI we extend the spatio-temporal trajectory data retrieval function of Impala and design a suitable data partitioning method. In our experiments, the Taiyuan BeiDou (BD) bus network is selected; it contains 2300 buses with BD positioning sensors and produces 20 million records every day, resulting in two difficulties as described in the Introduction section. ESTRI and MongoDB are both applied in the experiments. The experiments show that ESTRI retrieves data more efficiently than MongoDB for data volumes of fifty million, one hundred million, one hundred and fifty million, and two hundred million records, with performance approximately seven times higher than that of MongoDB. This indicates that ESTRI is an effective method for retrieving mass spatio-temporal trajectory data. Finally, bus distribution mapping in Taiyuan city is achieved, describing bus density in different regions at different times throughout the day, which can be applied in future studies of transport, such as traffic scheduling, traffic planning and traffic behavior management in intelligent public transportation systems.
http://www.mdpi.com/1424-8220/16/11/1813
cloud computing
data retrieval
Beidou positioning sensor
spatial-temporal data
collection DOAJ
language English
format Article
sources DOAJ
author Lianjie Zhou
Nengcheng Chen
Sai Yuan
Zeqiang Chen
spellingShingle Lianjie Zhou
Nengcheng Chen
Sai Yuan
Zeqiang Chen
An Efficient Method of Sharing Mass Spatio-Temporal Trajectory Data Based on Cloudera Impala for Traffic Distribution Mapping in an Urban City
Sensors
cloud computing
data retrieval
Beidou positioning sensor
spatial-temporal data
author_facet Lianjie Zhou
Nengcheng Chen
Sai Yuan
Zeqiang Chen
author_sort Lianjie Zhou
title An Efficient Method of Sharing Mass Spatio-Temporal Trajectory Data Based on Cloudera Impala for Traffic Distribution Mapping in an Urban City
title_short An Efficient Method of Sharing Mass Spatio-Temporal Trajectory Data Based on Cloudera Impala for Traffic Distribution Mapping in an Urban City
title_full An Efficient Method of Sharing Mass Spatio-Temporal Trajectory Data Based on Cloudera Impala for Traffic Distribution Mapping in an Urban City
title_fullStr An Efficient Method of Sharing Mass Spatio-Temporal Trajectory Data Based on Cloudera Impala for Traffic Distribution Mapping in an Urban City
title_full_unstemmed An Efficient Method of Sharing Mass Spatio-Temporal Trajectory Data Based on Cloudera Impala for Traffic Distribution Mapping in an Urban City
title_sort efficient method of sharing mass spatio-temporal trajectory data based on cloudera impala for traffic distribution mapping in an urban city
publisher MDPI AG
series Sensors
issn 1424-8220
publishDate 2016-10-01
description The efficient sharing of spatio-temporal trajectory data is important for understanding traffic congestion from mass data. However, the data volumes of bus networks in urban cities are growing rapidly, reaching daily volumes of one hundred million datapoints. Accessing and retrieving mass spatio-temporal trajectory data in any field is hard and inefficient due to limited computational capabilities and incomplete data organization mechanisms. Therefore, we propose an optimized and efficient spatio-temporal trajectory data retrieval method based on the Cloudera Impala query engine, called ESTRI, to enhance the efficiency of mass data sharing. As a query tool well suited to mass data, Impala can be applied to mass spatio-temporal trajectory data sharing. In ESTRI we extend the spatio-temporal trajectory data retrieval function of Impala and design a suitable data partitioning method. In our experiments, the Taiyuan BeiDou (BD) bus network is selected; it contains 2300 buses with BD positioning sensors and produces 20 million records every day, resulting in two difficulties as described in the Introduction section. ESTRI and MongoDB are both applied in the experiments. The experiments show that ESTRI retrieves data more efficiently than MongoDB for data volumes of fifty million, one hundred million, one hundred and fifty million, and two hundred million records, with performance approximately seven times higher than that of MongoDB. This indicates that ESTRI is an effective method for retrieving mass spatio-temporal trajectory data. Finally, bus distribution mapping in Taiyuan city is achieved, describing bus density in different regions at different times throughout the day, which can be applied in future studies of transport, such as traffic scheduling, traffic planning and traffic behavior management in intelligent public transportation systems.
topic cloud computing
data retrieval
Beidou positioning sensor
spatial-temporal data
url http://www.mdpi.com/1424-8220/16/11/1813
work_keys_str_mv AT lianjiezhou anefficientmethodofsharingmassspatiotemporaltrajectorydatabasedonclouderaimpalafortrafficdistributionmappinginanurbancity
AT nengchengchen anefficientmethodofsharingmassspatiotemporaltrajectorydatabasedonclouderaimpalafortrafficdistributionmappinginanurbancity
AT saiyuan anefficientmethodofsharingmassspatiotemporaltrajectorydatabasedonclouderaimpalafortrafficdistributionmappinginanurbancity
AT zeqiangchen anefficientmethodofsharingmassspatiotemporaltrajectorydatabasedonclouderaimpalafortrafficdistributionmappinginanurbancity
AT lianjiezhou efficientmethodofsharingmassspatiotemporaltrajectorydatabasedonclouderaimpalafortrafficdistributionmappinginanurbancity
AT nengchengchen efficientmethodofsharingmassspatiotemporaltrajectorydatabasedonclouderaimpalafortrafficdistributionmappinginanurbancity
AT saiyuan efficientmethodofsharingmassspatiotemporaltrajectorydatabasedonclouderaimpalafortrafficdistributionmappinginanurbancity
AT zeqiangchen efficientmethodofsharingmassspatiotemporaltrajectorydatabasedonclouderaimpalafortrafficdistributionmappinginanurbancity
_version_ 1725295697990778880