Description
In many domains, such as biology, chemistry, medicine, and the humanities, large amounts of data exist. Visual exploratory analysis of these data is often not practicable due to their size and their unstructured nature. Traditional machine learning (ML) requires large-scale labeled training data and a clear target definition, neither of which is typically available when exploring unknown data. For such large-scale, unstructured, open-ended, and domain-specific problems, we need an interactive approach that combines the strengths of ML and human analytical skills into a unified process that helps users to "detect the expected and discover the unexpected".
In a joint project with FH St. Pölten, we investigate how humans and machines can learn about and from the data together. TU Wien focuses on suitable visual interfaces that facilitate this joint human-machine data exploration process.
Projects
Within this project, multiple student project and thesis topics are available:
(PR/DA, 1-2 persons) User Study: Analogue Exploratory Analysis: The goal is to better understand how people explore sets of unstructured data, such as a collection of images or documents, by observing their behavior when they analyze a set of printed artifacts. The task is twofold: 1) setting up an environment where user activities can be tracked and logged, and 2) conducting and evaluating the study on-site. This work can potentially be conducted as a paid student assistant (details to be discussed). It extends a previous study comparing document analysis in analogue and digital settings: link to master thesis.
(PR, 1-2 persons) Joint Human-Machine Data Exploration Framework: The goal of this work is to design, develop, and test an extensible system that supports users in annotating their data while they explore it, and that trains an ensemble of machine learning models in the background during the process. The core of the system is the user interface for exploratory visual analysis of large unstructured data sets, potentially from heterogeneous data sources. This work can potentially be conducted as a paid student assistant (details to be discussed).
(BA/DA, 1 person) Prototypical Visualization: Exploratory analysis of unstructured data requires inspection of individual data samples to incrementally understand how to structure the data. If the data is very large, comprehensive visual inspection is no longer feasible. In this work, the student shall design and develop a visual interface that selectively shows representative data instances supporting or contradicting the structure users expect within the data. The foundation could be a prototypical network, which can be used to select such representative data items. Depending on the type of work (BA or DA), the focus should lie on a reasonable sampling of the design space, comprising different prototype selection strategies and layout options.
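To make the idea of prototype selection more concrete, the following is a minimal sketch of one possible strategy, not a prescribed method for the project. It assumes that each data item already has an embedding vector (for example, produced by a trained encoder) and a user-assigned group label. As in a prototypical network, each group's prototype is taken to be the mean of its embeddings. The function names `mean_vector` and `select_prototypes` and the margin-based ranking are illustrative assumptions.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def select_prototypes(embeddings, labels, k=1):
    """Pick the k most supporting and k most contradicting items per label.

    Hypothetical strategy: an item's margin is its distance to the nearest
    prototype of another group minus its distance to its own group's
    prototype. A large margin supports the user's grouping; a small or
    negative margin contradicts it.
    """
    # Group item indices by their user-assigned label.
    by_label = {}
    for index, label in enumerate(labels):
        by_label.setdefault(label, []).append(index)

    # Group prototype = mean embedding of the group's items.
    prototypes = {
        label: mean_vector([embeddings[i] for i in indices])
        for label, indices in by_label.items()
    }

    selection = {}
    for label, indices in by_label.items():
        other_prototypes = [p for l, p in prototypes.items() if l != label]

        def margin(i):
            own = math.dist(embeddings[i], prototypes[label])
            # With a single group there is no competing prototype.
            nearest_other = min(
                (math.dist(embeddings[i], p) for p in other_prototypes),
                default=0.0,
            )
            return nearest_other - own

        ranked = sorted(indices, key=margin, reverse=True)
        selection[label] = {
            "supporting": ranked[:k],
            "contradicting": ranked[-k:],
        }
    return selection


# Toy data (made up): item 3 is labeled "a" but lies between the two groups.
embeddings = [[0, 0], [1, 0], [0, 1], [6, 6], [10, 10], [11, 10], [10, 11]]
labels = ["a", "a", "a", "a", "b", "b", "b"]
print(select_prototypes(embeddings, labels, k=1))
```

In this toy example, the item at `[6, 6]` is selected as the contradicting instance for group "a" because it is almost as close to the prototype of "b" as to its own. Other selection strategies (for example, clustering within a group or sampling along the margin) and the visual layout of the selected items would be part of the design space explored in the thesis.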
Requirements
- Strong interest in visualization, user interfaces, machine learning, and human-computer interaction
- Very good programming skills
- Experience with web technologies (JavaScript, d3, ...) advantageous
- Experience with ML libraries also advantageous
Environment
Most sub-projects are best suited to a web-based approach, using d3.js or WebGL for the front end and Python for the back end.