
VaVeL presented at Data Science Summit in Warsaw

Dr. Maciej Grzenda (Warsaw University of Technology) gave a talk and presentation on 16 May 2017 at the Data Science Summit (http://dssconf.pl/) in Warsaw, where he discussed the state of the art in data-intensive architecture planning. The talk took place at a major IT event attended by hundreds of IT developers and data science professionals from industry and academia. Overall, the event attracted the interest of 1500+ participants and partners, including Accenture, Roche, ING Bank, RedHat, Samsung and others. In particular, Dr. Grzenda highlighted VaVeL developments in the domain of big data architectures. The presentation was co-authored by Jarosław Legierski, PhD (Orange Polska).

Big Data deployments require a carefully tuned system architecture, frequently combining multiple data storage and processing frameworks. Architecture patterns such as Lambda and Kappa have to be adapted to the unique needs of individual projects. The VaVeL project provides a number of components and solutions that address the need to process heterogeneous data streams and periodically updated data sets, and that can contribute to big data architecture planning.

The talk included an overview of data storage systems (relational database management systems, Apache Hadoop, NoSQL platforms) and a discussion of the applicability of each of these system categories. Furthermore, it addressed the need for stream processing and stream mining, and for combining them with the processing of cached data sets. The design patterns and components developed within the VaVeL project that were discussed included those inspired by the needs of the City of Warsaw, such as: converting the most recent locations polled from RESTful service output into deduplicated location data streams, combining periodic downloads of data sets with the processing of geodata streams, joining both categories of data, and processing news streams, including downloading and persisting linked web documents.
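
As an illustration of the first pattern mentioned above, the following minimal Python sketch shows one way that repeated snapshots polled from a RESTful service could be turned into a deduplicated location stream. It is not the VaVeL implementation: the record fields (`vehicle_id`, `timestamp`, `lat`, `lon`), the function name `deduplicate` and the sample data are all assumptions made for this example, and the polling itself is replaced by in-memory lists.

```python
from typing import Dict, Iterable, Iterator, List

# Hypothetical record layout: one dict per vehicle per poll.
Record = Dict[str, object]


def deduplicate(polls: Iterable[List[Record]]) -> Iterator[Record]:
    """Yield a record only if it is newer than the last one seen for that vehicle.

    A polled service returns the most recent known location of every vehicle,
    so consecutive polls often repeat the same record. Tracking the last
    timestamp per vehicle removes those repeats.
    """
    last_seen: Dict[str, float] = {}
    for snapshot in polls:
        for record in snapshot:
            vehicle_id = str(record["vehicle_id"])
            timestamp = float(record["timestamp"])  # assumed: epoch seconds
            if timestamp > last_seen.get(vehicle_id, float("-inf")):
                last_seen[vehicle_id] = timestamp
                yield record


# Three simulated polls (made-up data); some records are repeated unchanged.
polls = [
    [
        {"vehicle_id": "A", "timestamp": 100, "lat": 52.230, "lon": 21.010},
        {"vehicle_id": "B", "timestamp": 100, "lat": 52.240, "lon": 21.020},
    ],
    [
        {"vehicle_id": "A", "timestamp": 100, "lat": 52.230, "lon": 21.010},
        {"vehicle_id": "B", "timestamp": 110, "lat": 52.241, "lon": 21.021},
    ],
    [
        {"vehicle_id": "A", "timestamp": 120, "lat": 52.232, "lon": 21.012},
        {"vehicle_id": "B", "timestamp": 110, "lat": 52.241, "lon": 21.021},
    ],
]

stream = list(deduplicate(polls))
for record in stream:
    print(record["vehicle_id"], record["timestamp"])
```

In this sketch the per-vehicle state is kept in a plain dictionary. A production deployment would more likely keep it in the state store of a stream processing framework, but the comparison logic would stay the same.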