In transportation applications, sensor data are heterogeneous, noisy, and unlabelled. Among many other challenges, there is the little-studied question of how to aggregate the data and provide the aggregated information such that the transportation system is stable in a very precise, closed-loop fashion...
Current Big Data technologies, including new Big Data architectures as well as scalable knowledge discovery and machine learning algorithms, are already very efficient at managing massive streams of information. However, despite the enormous recent progress in the Big Data space, current approaches have severe limitations when managing urban data streams. Veracity and variety are the main impediments, and addressing them is therefore the main topic of current work.
Considering data sources in concert is an important factor in improving our understanding of the data; however, it cannot fully address the inherently noisy nature of the data, or the fact that in general we lack reliable information to label or classify the data. The size and complexity of the datasets make reliance on manual resources impractical; however, having ground-truth information is very important for any algorithm that attempts to improve the citizen experience. To address this problem, we introduce novel Visual Analytics and Crowdsourcing techniques that optimize the efficiency with which user input can be incorporated into the analysis tasks. The following list summarizes the challenges that have to be met in order to effectively exploit urban data.