Summary: | Scalable real-time processing of large amounts of data has become a research topic of particular importance due to the continuously rising amount of data that is generated by devices equipped with sensing components. While existing approaches allow for fault-tolerant and scalable stream processing, we present a pipeline architecture that consists of well-known open source tools to specifically integrate spatiotemporal internet of things (IoT) data streams. In a case study, we utilize the architecture to tackle the online map matching problem, a pre-processing step for trajectory mining algorithms. Given the rising amount of vehicle location data that is generated on a daily basis, existing map matching algorithms have to be implemented in a distributed manner to be executable in a stream processing framework that provides scalability. We demonstrate how to implement state-of-the-art map matching algorithms in our distributed stream processing pipeline and analyze measured latencies.
|