Latent Space Cartography (Fork)

1. Project Overview

Original Repository (uwdata)

The original Latent Space Cartography was a research tool designed to create and explore visual projections of vector space embeddings. While it contained core algorithms for projecting high-dimensional data, the workflow was static. Users had to manually prepare data and execute individual scripts with hardcoded parameters to generate visualizations. It primarily focused on image data, with no native code included for processing text embeddings.

Fork Objectives

This fork re-engineers the project into a dynamic, full-stack web application.


2. Key Differences & Features

Text Processing Evolution

Dynamic Projection Jobs (t-SNE & PCA)

Additional Features

3. Technical Stack

| Component | Upstream (Original) | GMK-TU Fork |
| --- | --- | --- |
| Language | Python 2.7 | Python 3.12+ |
| Backend Framework | Flask 0.x | Flask 3.x |
| ML Engine | TensorFlow 1.x / Keras 2 | TensorFlow 2.16+ / Keras 3 |
| Database | MySQL | SQLite3 |
| Frontend | Vue.js / Legacy Webpack | Vue.js 2.6 / Webpack 5 |
| Numerical Libs | NumPy (Legacy) | NumPy, Pandas, Scikit-Learn 1.4+ |

4. Installation & Setup Guide

Prerequisites

Step 1: Backend Setup

  1. Clone the repository and navigate to the root directory.
  2. Create a virtual environment:
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    
  3. Install dependencies:
    pip install -r requirements.txt
    

Step 2: Client Build & Execution

The application logic, including the server entry point (server.py), is located within the client folder in this fork.

  1. Navigate to the client directory:
    cd client
    
  2. Install Node dependencies:
    npm install
    

Option A: Production / Standard Use

To build the static assets and run the standard server:

  1. Build the Vue.js frontend:
    npm run build
    
    This compiles the Vue assets into the client/build/ directory.
  2. Execute the Python server:
    python server.py
    

Option B: Development

To run the frontend with hot-reloading enabled during development:

  1. Execute the development server:
    npm run dev
    
  2. Ensure the Python backend is running separately (via python server.py inside the client folder) to handle API requests.

Access: Open your browser and navigate to http://localhost:5000 (or the port specified by the dev runner).
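Before opening the browser, it can help to confirm that the backend is actually answering. The snippet below is a minimal sketch using only the Python standard library; the helper name `server_is_up` is hypothetical and not part of this repository, and the default URL simply mirrors the address given above.

```python
import urllib.error
import urllib.request


def server_is_up(url: str = "http://localhost:5000", timeout: float = 2.0) -> bool:
    """Return True if any HTTP response comes back from `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a 2xx status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False
```

Calling `server_is_up()` after `python server.py` should return `True` once the server has finished starting; pass a different URL if the dev runner reports another port.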


5. User Workflows

Workflow A: Image Latent Space (VAE Pipeline)

This workflow transforms raw images into a navigable latent space using a Variational Autoencoder.
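For readers new to VAEs: the encoder produces a mean and a log-variance for each latent dimension, and a latent vector is drawn from them with the reparameterization trick. The sketch below illustrates only that sampling step and the matching KL term in plain Python. It is not the fork's training code (which, per the stack table, runs on TensorFlow/Keras), and the function names are illustrative assumptions.

```python
import math
import random


def sample_latent(mu, log_var, rng=random):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    z = []
    for m, lv in zip(mu, log_var):
        sigma = math.exp(0.5 * lv)  # log-variance -> standard deviation
        eps = rng.gauss(0.0, 1.0)
        z.append(m + sigma * eps)
    return z


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, log_var))
```

The resulting vectors `z` are the points that the later projection step reduces to two dimensions for display.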

1. Data Ingestion

2. Job: Vectorization

3. Job: Training

4. Job: Projection (Dynamic)
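To make the idea of a projection job concrete, here is a dependency-free sketch of PCA that reduces latent vectors to two coordinates. It uses power iteration with deflation rather than a library routine, so it is an illustration of the technique and not the fork's implementation; in practice a library such as scikit-learn (listed in the stack table) would normally do this work. All function names are hypothetical.

```python
import math


def _matvec(m, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def _top_component(cov, iterations=200):
    """Leading eigenvector of a symmetric matrix via power iteration."""
    n = len(cov)
    # Fixed, non-uniform start vector. Assumes it is not orthogonal to the answer.
    v = [float(i + 1) for i in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = _matvec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance left in any direction
            break
        v = [x / norm for x in w]
    return v


def pca_project(rows, n_components=2):
    """Project rows (equal-length lists of floats) onto the top principal components."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    components = []
    for _ in range(n_components):
        v = _top_component(cov)
        eigenvalue = sum(a * b for a, b in zip(v, _matvec(cov, v)))
        components.append(v)
        # Deflate: remove the variance explained by this component.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]

    return [[sum(c[j] * comp[j] for j in range(d)) for comp in components]
            for c in centered]
```

Each returned pair can be plotted directly as an (x, y) point. The sketch assumes the data has at least `n_components` directions with non-negligible variance; degenerate inputs need a more careful solver.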


Workflow B: Text Latent Space (NLP)

This workflow visualizes semantic relationships using imported word vectors.
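As a rough illustration of what "semantic relationships" means here, the sketch below parses word vectors and ranks neighbours by cosine similarity. The input layout (one `word v1 v2 ... vN` entry per line, as in GloVe-style text files) is an assumption, as are the function names; the file format this fork's import job actually accepts may differ.

```python
import math


def load_word_vectors(lines):
    """Parse lines of the form 'word v1 v2 ... vN' into {word: [floats]}."""
    vectors = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(word, vectors, k=3):
    """The k words most similar to `word`, best match first."""
    target = vectors[word]
    scored = [(cosine_similarity(target, v), w)
              for w, v in vectors.items() if w != word]
    return [w for _, w in sorted(scored, reverse=True)[:k]]
```

With a real embedding file, `load_word_vectors(open(path, encoding="utf-8"))` followed by `nearest("king", vectors)` would list the closest words; words that sit close together under this measure are the ones that end up near each other in the projected view.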

1. Job: Text Import

2. Visualization


6. API & Configuration Reference

Job System API

The frontend triggers asynchronous jobs via these REST endpoints:

Database Schema (SQLite)

Data is managed in datasets.db (auto-created in the root).
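The following sketch shows how a SQLite file such as datasets.db can be created on first use and queried with Python's built-in `sqlite3` module. The `datasets` table, its columns, and the helper names are hypothetical stand-ins chosen for illustration; they are not the schema this fork actually uses.

```python
import sqlite3


def open_db(path="datasets.db"):
    """Open the SQLite file, creating it (and the table) on first use."""
    conn = sqlite3.connect(path)  # sqlite3 creates the file if it does not exist
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS datasets (   -- hypothetical table
            id   INTEGER PRIMARY KEY AUTOINCREMENT,
            name TEXT NOT NULL UNIQUE,
            kind TEXT NOT NULL CHECK (kind IN ('image', 'text'))
        )
        """
    )
    conn.commit()
    return conn


def add_dataset(conn, name, kind):
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO datasets (name, kind) VALUES (?, ?)", (name, kind)
        )
    return cur.lastrowid


def list_datasets(conn):
    return conn.execute("SELECT id, name, kind FROM datasets ORDER BY id").fetchall()
```

Passing `":memory:"` as the path gives a throwaway database, which is convenient for experimenting without touching the real file.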

Configuration

7. Example Videos

Image imports

Text imports

Dataset Change