Project Overview
This project implements the methodology from "Extracting Movement-based Topics for Analysis of Space Use" (G. Andrienko, N. Andrienko, D. Hecker, EuroVis Workshop on Visual Analytics, 2023). The implementation was developed for the course 193.166 Visualisierung (VU 4.0), TU Wien, 2025W.
Reference Paper
Dataset
Instead of the Milan car movement data used in the original paper, this project uses the Beijing T-Drive taxi trajectory dataset (2008). The focus is on implementing the analytical pipeline and enabling interactive exploration of spatial and movement patterns in urban taxi trajectories.
Key Objectives
Space Tessellation
Extract characteristic points from trajectories and construct Voronoi cells to partition the urban space.
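As a rough illustration of what extracting characteristic points can look like, the sketch below flags stops (low speed) and turns (large heading change) in a single trajectory. It assumes samples are `(t_seconds, x_metres, y_metres)` tuples in a projected coordinate system. The function names and both thresholds are hypothetical and are not taken from the project's code.

```python
import math

def heading(p, q):
    """Direction of travel from p to q in degrees, measured from the +x axis."""
    return math.degrees(math.atan2(q[2] - p[2], q[1] - p[1]))

def characteristic_points(traj, max_stop_speed=1.0, min_turn_deg=45.0):
    """Return (index, kind) pairs for one trajectory.

    traj: list of (t_seconds, x_metres, y_metres) samples in time order.
    Both thresholds are illustrative defaults, not values from the project.
    """
    if len(traj) < 2:
        return [(i, "start") for i in range(len(traj))]
    points = [(0, "start")]
    for i in range(1, len(traj) - 1):
        prev, cur, nxt = traj[i - 1], traj[i], traj[i + 1]
        dt = cur[0] - prev[0]
        if dt <= 0:
            continue  # skip duplicate or out-of-order timestamps
        speed = math.hypot(cur[1] - prev[1], cur[2] - prev[2]) / dt
        if speed < max_stop_speed:
            points.append((i, "stop"))
            continue
        # Smallest absolute difference between incoming and outgoing headings.
        delta = abs(heading(cur, nxt) - heading(prev, cur)) % 360.0
        delta = min(delta, 360.0 - delta)
        if delta >= min_turn_deg:
            points.append((i, "turn"))
    points.append((len(traj) - 1, "end"))
    return points

# A taxi drives east, turns north, then almost stands still.
demo = [(0, 0, 0), (10, 100, 0), (20, 200, 0), (30, 200, 100),
        (40, 200, 200), (50, 200, 201), (60, 200, 202)]
print(characteristic_points(demo))
```

The trajectory's first and last samples are always kept so that every trip contributes at least its origin and destination.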
Topic Modeling
Apply Non-negative Matrix Factorization (NMF) to discover place-based and movement-based topics.
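To make the factorization concrete, here is a minimal NumPy sketch of NMF using the classic multiplicative update rules for the Frobenius objective. It is not the project's implementation (which may well rely on a library routine); the function name `nmf`, the toy matrix, and the iteration count are assumptions made for illustration.

```python
import numpy as np

def nmf(V, n_topics, n_iter=500, seed=0, eps=1e-9):
    """Factorise a non-negative matrix V (docs x terms) as W @ H.

    W (docs x topics) holds topic weights per trajectory;
    H (topics x terms) holds term weights per topic.
    Uses multiplicative updates for the Frobenius objective.
    """
    rng = np.random.default_rng(seed)
    n_docs, n_terms = V.shape
    W = rng.random((n_docs, n_topics)) + eps
    H = rng.random((n_topics, n_terms)) + eps
    for _ in range(n_iter):
        # eps keeps the denominators strictly positive.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy counts: rows are trajectories, columns are four cells.
V = np.array([
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 4, 1],
    [0, 0, 1, 4],
], dtype=float)

W, H = nmf(V, n_topics=2)
print(np.round(H, 2))
print("relative error:", np.linalg.norm(V - W @ H) / np.linalg.norm(V))
```

Each row of `H` is one topic expressed as non-negative weights over cells, and each row of `W` says how strongly a trajectory expresses each topic.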
Interactive Visualization
Explore discovered patterns through an interactive web-based visualization built with D3.js and Leaflet.
How the Visualization Works
The visualization provides an interactive interface to explore the discovered spatial and movement topics. Built with D3.js and Leaflet, it allows users to examine patterns across different tessellation configurations and topic numbers.
Visualization Features
- Dataset Selection
  Choose from multiple tessellation configurations (different radii) and topic numbers to explore various granularities of spatial patterns.
- Dual View Modes
  Toggle between Places visualization (colored Voronoi polygons showing areas) and Moves visualization (curved arcs showing transitions between cells).
- Topic Filtering
  View all topics at once or focus on a specific topic. Filter features by minimum topic weight to highlight the most significant patterns.
- Interactive Map Controls
  Hover over features to highlight them, click for detailed information including topic distributions, and zoom/pan to explore different areas of Beijing.
- Visual Customization
  Adjust opacity, stroke width, color schemes, and toggle hover effects to create the optimal visualization for your analysis needs.
Places Visualization
The places view displays Voronoi cells colored by their primary topic, that is, the topic with the highest weight for the cell. Each color represents a different spatial topic, revealing areas that tend to be visited together by the same taxi trajectories.
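One plausible way to derive a cell's primary topic is to take the index of its largest topic weight. The sketch below does that and also applies a minimum-weight filter like the one offered in the interface. The data layout (a dict of weight lists), the function name, and the palette are assumptions for illustration, not the project's actual structures.

```python
def primary_topics(cell_weights, min_weight=0.0):
    """Map each cell id to (topic_index, weight) for its strongest topic.

    cell_weights: dict of cell id -> list of non-negative topic weights.
    Cells whose strongest weight is below min_weight are left out,
    mirroring a minimum-weight filter.
    """
    result = {}
    for cell_id, weights in cell_weights.items():
        if not weights:
            continue  # no topic information for this cell
        topic = max(range(len(weights)), key=weights.__getitem__)
        if weights[topic] >= min_weight:
            result[cell_id] = (topic, weights[topic])
    return result

PALETTE = ["#1b9e77", "#d95f02", "#7570b3"]  # one colour per topic (illustrative)

weights = {
    "cell_1": [0.7, 0.1, 0.2],
    "cell_2": [0.05, 0.08, 0.02],
    "cell_3": [0.1, 0.3, 0.6],
}
for cell_id, (topic, w) in primary_topics(weights, min_weight=0.25).items():
    print(cell_id, "topic", topic, "colour", PALETTE[topic], "weight", w)
```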
Moves Visualization
The moves view shows transitions between cells as curved arcs, colored by movement topic. This reveals common routes and movement patterns in the Beijing taxi network.
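The geometry behind such arcs can be sketched independently of the rendering library. The example below samples a quadratic Bezier curve whose control point is offset sideways from the chord between two cell centroids. It treats coordinates as planar and is only an assumption about how curved arcs might be constructed; the visualization's actual D3.js code is not shown here, and `arc_points` and `curvature` are hypothetical names.

```python
def arc_points(start, end, curvature=0.2, n=16):
    """Sample a quadratic Bezier arc from start to end.

    The control point sits at the chord midpoint, pushed sideways by
    curvature * chord length. Swapping start and end bends the arc the
    other way, so A->B and B->A do not overlap.
    """
    (x0, y0), (x1, y1) = start, end
    dx, dy = x1 - x0, y1 - y0
    # (-dy, dx) is the left-hand normal of the chord, same length as the chord.
    cx = (x0 + x1) / 2 - dy * curvature
    cy = (y0 + y1) / 2 + dx * curvature
    pts = []
    for i in range(n + 1):
        t = i / n
        a, b, c = (1 - t) ** 2, 2 * (1 - t) * t, t ** 2
        pts.append((a * x0 + b * cx + c * x1, a * y0 + b * cy + c * y1))
    return pts

forward = arc_points((0.0, 0.0), (10.0, 0.0))
backward = arc_points((10.0, 0.0), (0.0, 0.0))
print(forward[8], backward[8])  # arc midpoints lie on opposite sides of the chord
```

Bending opposite directions to opposite sides keeps two-way traffic between the same pair of cells visually distinguishable.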
How the Backend Works
The backend consists of a Python pipeline that processes raw trajectory data through multiple stages to extract meaningful spatial and movement patterns.
Pipeline Architecture
- Data Preprocessing & Filtering
  Load raw taxi trajectory data from the T-Drive dataset. Apply spatial bounding box filtering, temporal filtering (date/time-of-day), and trajectory validation to clean the input data.
- Characteristic Point Extraction
  Identify significant points in trajectories: stops (low speed, stationary periods), turns (significant direction changes), and transitions. These points capture the essential structure of movement patterns.
- Spatial Clustering & Tessellation
  Cluster characteristic points using a configurable radius parameter. Construct Voronoi polygons around cluster centroids to create a tessellation of the urban space. Each cell represents a distinct spatial area.
- Sequence Generation
  Transform trajectories into sequences of cell visits (places) and cell-to-cell transitions (moves). Create document-term matrices where documents are trajectories and terms are cells or transitions.
- Topic Modeling with NMF
  Apply Non-negative Matrix Factorization independently to place sequences and move sequences. This discovers latent topics that represent frequently co-occurring places or movement patterns. Each topic is a non-negative weight vector over cells or transitions; normalizing it to sum to 1 lets it be read as a distribution.
- Evaluation & Export
  Run ensemble NMF with multiple random seeds. Compute quality metrics (reconstruction error, sparsity). Export results as GeoJSON for visualization, including topic assignments and weights for each cell/transition.
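The sequence-generation stage above can be sketched in a few lines of plain Python. Assigning a point to its nearest centroid is equivalent to finding the Voronoi cell that contains it, so no polygon test is needed for this step. The function names, the use of raw counts as matrix entries, and the toy coordinates are assumptions for illustration; the project may weight or binarize the entries differently.

```python
import math
from collections import Counter

def nearest_cell(point, centroids):
    """Index of the closest centroid, i.e. the Voronoi cell containing point."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def place_sequence(traj, centroids):
    """Cell ids visited by a trajectory, with consecutive repeats collapsed."""
    seq = []
    for p in traj:
        c = nearest_cell(p, centroids)
        if not seq or seq[-1] != c:
            seq.append(c)
    return seq

def move_sequence(places):
    """Cell-to-cell transitions as (from, to) pairs."""
    return list(zip(places, places[1:]))

def count_matrix(docs):
    """Build a document-term count matrix plus its sorted vocabulary."""
    vocab = sorted({term for doc in docs for term in doc})
    rows = []
    for doc in docs:
        counts = Counter(doc)
        rows.append([counts.get(term, 0) for term in vocab])
    return rows, vocab

centroids = [(0, 0), (10, 0), (10, 10)]
trajs = [
    [(0, 1), (1, 0), (9, 1), (10, 9)],
    [(10, 10), (9, 1), (1, 1)],
]
places = [place_sequence(t, centroids) for t in trajs]
moves = [move_sequence(p) for p in places]
place_matrix, place_vocab = count_matrix(places)
move_matrix, move_vocab = count_matrix(moves)
print(places, moves)
print(move_vocab, move_matrix)
```

The two matrices are the inputs to the two independent NMF runs: one with cells as terms, one with transitions as terms.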
Technology Stack
Key Files & Scripts
run_tessellation_pipeline.py
Main script for space tessellation. Configurable parameters for radius, subset size, and spatial/temporal filtering.
run_nmf.py
Runs NMF topic modeling on tessellation results. Supports both places and moves with evaluation metrics.
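The exact metric definitions used by `run_nmf.py` are not given here, so the following is only one common way to compute the two quantities mentioned: a relative Frobenius reconstruction error and a sparsity score defined as the fraction of near-zero entries. The function names and the tolerance are assumptions.

```python
import numpy as np

def reconstruction_error(V, W, H):
    """Relative Frobenius error ||V - WH|| / ||V||; 0 means a perfect fit.

    Assumes V is not all zeros.
    """
    return float(np.linalg.norm(V - W @ H) / np.linalg.norm(V))

def sparsity(M, tol=1e-6):
    """Fraction of entries in M that are numerically zero."""
    return float(np.mean(np.abs(M) <= tol))

# Toy factors whose product reproduces V exactly.
W = np.array([[1.0, 0.0], [0.0, 2.0]])
H = np.array([[1.0, 1.0, 0.0], [0.0, 0.0, 3.0]])
V = W @ H

print("error:", reconstruction_error(V, W, H))
print("sparsity of H:", sparsity(H))
```

When comparing runs with different random seeds, lower error indicates a better fit, while higher sparsity in `H` usually means topics that are easier to interpret on the map.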
run_all_exp.py
Batch experiment runner that tests multiple parameter combinations for systematic exploration.
generate_datasets_list.py
Generates the dataset catalog JSON file consumed by the web visualization.
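The catalog format is not documented in this overview, so the sketch below only illustrates the general idea: scan a results directory for GeoJSON files and write a JSON list the front end could load. The file naming scheme (`<kind>_r<radius>_k<topics>.geojson`), the output name `datasets.json`, and the entry fields are all hypothetical.

```python
import json
import re
import tempfile
from pathlib import Path

# Hypothetical naming scheme: <kind>_r<radius>_k<topics>.geojson
NAME_RE = re.compile(r"^(places|moves)_r(\d+)_k(\d+)\.geojson$")

def build_catalog(results_dir):
    """List catalog entries for every GeoJSON file matching the assumed scheme."""
    entries = []
    for path in sorted(Path(results_dir).glob("*.geojson")):
        m = NAME_RE.match(path.name)
        if m is None:
            continue  # skip files that do not follow the assumed scheme
        entries.append({
            "file": path.name,
            "kind": m.group(1),
            "radius": int(m.group(2)),
            "topics": int(m.group(3)),
        })
    return entries

def write_catalog(results_dir, out_name="datasets.json"):
    """Write the catalog next to the results and return its path."""
    out_path = Path(results_dir) / out_name
    out_path.write_text(json.dumps(build_catalog(results_dir), indent=2), encoding="utf-8")
    return out_path

# Demonstration in a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["places_r500_k10.geojson", "moves_r500_k10.geojson", "notes.txt"]:
        (Path(tmp) / name).write_text("{}", encoding="utf-8")
    print(write_catalog(tmp).read_text(encoding="utf-8"))
```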
Getting Started
Quick Start
To run the visualization with existing results:
- Start a local web server from the project root:
  python -m http.server 8000
- Open your browser to:
  http://localhost:8000/visualization
- Select a dataset and explore the discovered topics
Running the Full Pipeline
To process your own trajectory data or experiment with different parameters:
- Set up the Python environment and install dependencies
- Download the T-Drive dataset or prepare your own trajectory data
- Run the tessellation pipeline:
  python run_tessellation_pipeline.py
- Run NMF topic modeling:
  python run_nmf.py
- Generate the dataset catalog:
  python generate_datasets_list.py
- Launch the visualization using a local web server