Visualizing Public Complex Data
At the Public-Private Platform, we not only host seminars, workshops and conferences, we also provide seed funding for research projects that have a connection to our work. Currently, the PP Platform is funding parts of a project conducted by Professor Dolores Romero Morales from the Department of Economics, CBS, and Associate Professor Daniel Hardt from the Department of IT Management, CBS. They are conducting a cross-departmental project under the title ‘Visualizing Public Complex Data’.
We asked Dolores and Daniel a few questions about the project. Firstly, we asked them to explain the overall purpose of the project:
“Our project concerns the interpretation and visualization of linguistic data, such as the news feeds in conventional media and social media. We have built tools to assess the importance of the different words and phrases, and furthermore to determine how closely related they are. Determining the importance and relatedness of words and phrases raises a variety of challenges in syntactic and semantic interpretation, which are at the frontier of current research in computational linguistics.
After performing this linguistic analysis, and with the help of cutting-edge Data Science tools, we construct a visualization of the data, where the size of each word displays its importance, and the relative position of words indicates their relatedness. We provide a temporal sequence of such visualizations, resulting in a powerful and intuitive representation of a news feed: the user understands at a glance what the important terms are, how they are related, and how this changes over time.”
Second, we asked them how it is actually possible to visualize public complex data:
“To construct this visualization, we have developed a mathematical optimization model that builds dynamic visualization maps. We consider a set of N words and T time periods. Two attributes are associated with the words: the word frequency and the similarity between pairs of words, both of which are dynamic. We consider three objectives to construct our dynamic visualization map. The first objective ensures that, in each time period, the words are placed roughly according to the dissimilarity measure. The second objective aims at spreading the words across the map to ensure the readability of the dynamic visualization map. The third objective ensures a smooth transition between the visualization maps.”
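To make the three objectives concrete, here is a minimal Python sketch of how they might be combined into a single cost for a candidate layout. The researchers do not give their formulas in this interview, so everything specific below is an assumption made for illustration: the function name map_cost, the squared-error fit term, the inverse-distance spread penalty, the squared-displacement smoothness term and the weights are all hypothetical and should not be read as the project's actual model. Word frequency would set the drawn size of each word and is left out of the sketch.

```python
import math


def map_cost(positions, dissim, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of three illustrative objectives for a dynamic word map.

    positions[t][i] is the (x, y) location of word i in time period t.
    dissim[t][i][j] is the dissimilarity between words i and j in period t.
    All functional forms are assumptions, not the researchers' model.
    """
    w_fit, w_spread, w_smooth = weights
    fit = spread = smooth = 0.0
    num_periods = len(positions)
    num_words = len(positions[0])
    for t in range(num_periods):
        for i in range(num_words):
            for j in range(i + 1, num_words):
                d = math.dist(positions[t][i], positions[t][j])
                # Objective 1: distances on the map should match dissimilarities.
                fit += (d - dissim[t][i][j]) ** 2
                # Objective 2: penalise pairs of words that sit close together.
                spread += 1.0 / (d + 1e-9)
            if t > 0:
                # Objective 3: penalise movement of a word between periods.
                smooth += math.dist(positions[t][i], positions[t - 1][i]) ** 2
    return w_fit * fit + w_spread * spread + w_smooth * smooth


# Two words over two periods; the layout matches the dissimilarities exactly.
layout = [[(0.0, 0.0), (1.0, 0.0)], [(0.0, 0.0), (1.0, 0.0)]]
dissimilarity = [[[0.0, 1.0], [1.0, 0.0]], [[0.0, 1.0], [1.0, 0.0]]]
print(map_cost(layout, dissimilarity))
```

A general-purpose numerical optimizer could then search for the positions that minimise this cost. Raising the third weight would favour maps that change little from one period to the next, at the expense of fitting each period's dissimilarities.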
Lastly, Dolores and Daniel were asked to explain whether or not they had reached any specific findings yet:
“The major innovation is that we have constructed a visualization tool that combines linguistic research and computational optimization work. We believe that this visualization tool, while still at the prototype stage, has the potential to provide a new approach to the understanding and presentation of news feeds, which have become omnipresent in today's informational landscape.”