ENBIS-17 in Naples

9 – 14 September 2017; Naples (Italy)
Abstract submission: 21 November 2016 – 10 May 2017

Fast Clustering of Streaming Time Series Summarized by Histograms

12 September 2017, 11:40 – 12:10

Submitted by
Antonio Balzanella
Authors
Antonio Balzanella (Università della Campania Luigi Vanvitelli), Rosanna Verde (Università della Campania Luigi Vanvitelli), Antonio Irpino (Università della Campania Luigi Vanvitelli)
Abstract
This paper deals with the online clustering of multiple data streams. We assume that a sensor network monitors a physical phenomenon over time. Each sensor performs repeated measurements at a very high frequency, so the full data cannot be stored on easily accessible media. We further assume that the monitored phenomenon evolves markedly over time. Examples include temperature monitoring, seismic activity monitoring, and pollution monitoring.
Our aim is to find groups of sensors which behave similarly over time.
The proposed strategy consists of two phases: the online phase summarizes the incoming data; the offline phase partitions the streams into clusters. In the online phase, the incoming observations are split into batches. Each subsequence in a batch is summarized by a histogram. A fast clustering algorithm is then run on the histograms to obtain a local partitioning of the data. The offline phase finds a consensus partition starting from the local partitions of the data streams.
Through an application to real data, we show the effectiveness of our strategy in finding homogeneous groups of data streams.
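
The two-phase strategy described above can be illustrated with a minimal sketch. The abstract does not specify the histogram grid, the clustering algorithm, the distance between histograms, or the consensus rule, so everything below is an assumption made for illustration: equal-width bins on a fixed range, k-means with squared Euclidean distance on bin frequencies, and a consensus that links two streams when they share a local cluster in more than half of the batches. All function names (`histogram`, `kmeans`, `consensus`, `cluster_streams`) are hypothetical, and the streams are kept in memory only to keep the example short.

```python
import random
from itertools import combinations

def histogram(values, lo, hi, n_bins):
    """Relative-frequency histogram on a fixed grid of equal-width bins."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = min(max(int((v - lo) / width), 0), n_bins - 1)  # clamp to grid
        counts[idx] += 1
    return [c / len(values) for c in counts]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=20):
    """Plain k-means with farthest-first initialization (assumed stand-in)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def consensus(local_partitions, threshold=0.5):
    """Offline phase: merge streams that co-cluster in most batches."""
    n = len(local_partitions[0])
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        together = sum(p[i] == p[j] for p in local_partitions)
        if together / len(local_partitions) > threshold:
            parent[find(i)] = find(j)
    roots = [find(i) for i in range(n)]
    relabel = {r: c for c, r in enumerate(dict.fromkeys(roots))}
    return [relabel[r] for r in roots]

def cluster_streams(streams, batch_size, k, n_bins=10, lo=0.0, hi=1.0):
    """Online phase per batch (histograms + local clustering), then consensus."""
    n_obs = min(len(s) for s in streams)
    local_partitions = []
    for start in range(0, n_obs - batch_size + 1, batch_size):
        hists = [histogram(s[start:start + batch_size], lo, hi, n_bins) for s in streams]
        local_partitions.append(kmeans(hists, k))
    return consensus(local_partitions)

# Synthetic data (not from the paper): three sensors near 0.25, three near 0.75.
rng = random.Random(42)
low = [[rng.gauss(0.25, 0.05) for _ in range(300)] for _ in range(3)]
high = [[rng.gauss(0.75, 0.05) for _ in range(300)] for _ in range(3)]
labels = cluster_streams(low + high, batch_size=100, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In a real streaming setting only the per-batch histograms and local partitions would be retained, and each batch of raw observations would be discarded once summarized.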
