
■ DIsCO: DynamIc Data COmpression in Distributed Stream Processing Systems

DIsCO: DynamIc Data COmpression in Distributed Stream Processing Systems, Nikos Zacheilas, Vana Kalogeraki, DAIS 2017, Neuchâtel, Switzerland, June 19-22, 2017

Supporting high throughput in Distributed Stream Processing Systems (DSPSs) has been an important goal in recent years. Existing work either automatically increases system resources whenever the current setup is inadequate, or applies load-shedding techniques that discard some of the incoming data. Both approaches have significant shortcomings: the former requires on-the-fly application reconfiguration, in which the application must be stopped and re-uploaded to the cluster with the new configuration, while the latter can lead to significant information loss. One approach that has not yet been considered for improving the throughput of DSPSs is exploiting compression algorithms to minimize the communication overhead between components, especially when the data items are large, such as live CCTV camera reports. This work is the first to provide a framework, built on top of Apache Storm, that enables dynamic compression of incoming streaming data. Our approach uses a profiling algorithm to automatically determine which compression algorithm should be applied, and supports both lossless and lossy compression techniques. Furthermore, we propose a novel algorithm for determining when profiling should be applied. Finally, our detailed experimental evaluation with commonly used stream processing applications indicates a clear improvement in the applications' throughput when our proposed techniques are applied.
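To make the profiling idea concrete, here is a minimal Python sketch of how a codec could be selected from a sample of the stream. It is not the paper's algorithm. The candidate codecs (zlib, bz2, lzma from the Python standard library), the cost model (compression time plus estimated transfer time) and the names CODECS, profile and pick_codec are all assumptions made for illustration. Decompression time and lossy codecs are ignored.

```python
import bz2
import lzma
import time
import zlib

# Candidate lossless codecs (an assumption; the abstract does not list
# the codecs DIsCO actually profiles). "none" means send data uncompressed.
CODECS = {
    "none": (lambda b: b, lambda b: b),
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def profile(sample: bytes, bandwidth_bytes_per_s: float) -> dict:
    """Estimate a per-codec cost in seconds for one sample.

    Hypothetical cost model: time spent compressing the sample plus the
    time needed to send the compressed bytes over a link of the given
    bandwidth.
    """
    costs = {}
    for name, (compress, _decompress) in CODECS.items():
        start = time.perf_counter()
        compressed = compress(sample)
        elapsed = time.perf_counter() - start
        costs[name] = elapsed + len(compressed) / bandwidth_bytes_per_s
    return costs


def pick_codec(sample: bytes, bandwidth_bytes_per_s: float) -> str:
    """Return the name of the codec with the lowest estimated cost."""
    costs = profile(sample, bandwidth_bytes_per_s)
    return min(costs, key=costs.get)
```

A caller would run pick_codec on a recent sample of tuples and apply the chosen codec to later tuples until the next profiling round. When to re-profile is the subject of the paper's second algorithm and is not sketched here.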

BibTeX Entry:
  author    = {Nikos Zacheilas and
               Vana Kalogeraki},
  title     = {DIsCO: DynamIc Data COmpression in Distributed Stream Processing Systems},
  booktitle = {Distributed Applications and Interoperable Systems - 17th {IFIP} {WG}
               6.1 International Conference, {DAIS} 2017, Held as Part of the 12th
               International Federated Conference on Distributed Computing Techniques,
               DisCoTec 2017, Neuch{\^{a}}tel, Switzerland, June 19-22, 2017, Proceedings},
  pages     = {19--33},
  year      = {2017},
  crossref  = {DBLP:conf/dais/2017},
  url       = {},
  doi       = {10.1007/978-3-319-59665-5_2},
  timestamp = {Tue, 06 Jun 2017 17:26:32 +0200},
  biburl    = {},
  bibsource = {dblp computer science bibliography}