At a glance
- app.py: serves static/index.html and JSON endpoints (/api/datasets, /api/diagram).
- core.py: loads CSV, parses members, builds zones (set combinations), builds super-dual edges.
- main.js: fetches JSON, lays out zones on rings, draws dual edges and set curves.
- labels.js: adds set and member labels to the final diagram (Step 4) so the visualization is readable.
app.py (Flask server)
From README_app.md
Purpose
app.py runs the Flask web server that:
- serves the frontend UI (static/index.html)
- provides JSON endpoints that return the computed diagram data (zones + dual edges)
Key configuration
Creates the Flask app with static_folder="static" and static_url_path="", so files in static/ are served at the site root (e.g., / → static/index.html, /src/js/main.js → static/src/js/main.js).
Datasets
Datasets are defined in DATASETS as key → file path, e.g.
- "Simpsons" → data/simpsons.csv
- "Harry Potter" → data/harrypotter.csv
- "Full (4)" → data/full4.csv
- "Full (5)" → data/full5.csv
Routes / API
Provides a small REST API used by the frontend: /api/datasets lists available datasets, and /api/diagram returns zones + dual edges for a selected dataset. The UI calls these endpoints on page load and when the dataset changes.
GET `/`
Serves the UI (static/index.html).
GET `/api/datasets`
Returns an array of objects {"key": "..."} for the dataset dropdown. Keys come from DATASETS and are sorted alphabetically.
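As a rough illustration of the response shape described above, the following sketch builds the list from a mapping like DATASETS. The helper name dataset_payload is hypothetical and is not taken from app.py; the real route would additionally serialize the list as JSON.

```python
# Mapping that mirrors the DATASETS example in this README.
DATASETS = {
    "Simpsons": "data/simpsons.csv",
    "Harry Potter": "data/harrypotter.csv",
    "Full (4)": "data/full4.csv",
    "Full (5)": "data/full5.csv",
}


def dataset_payload(datasets):
    """Return one {"key": ...} object per dataset, keys sorted alphabetically."""
    # Hypothetical helper: the actual route handler may be structured differently.
    return [{"key": key} for key in sorted(datasets)]


payload = dataset_payload(DATASETS)
```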
GET `/api/diagram?dataset=<key>`
Loads the CSV for the selected dataset, computes zones + dual edges, and returns them as JSON (the frontend reads zones and dualEdges from the response).
Running
Start the development server with python app.py. By default it runs on port 5001 (debug mode). Open http://127.0.0.1:5001/ in a browser.
core.py (Data + graph)
From README_core.md
Purpose
core.py converts a CSV dataset into:
- members (each with incident sets)
- zones (unique set-combination nodes)
- super-dual edges (between zones differing by one set)
Loading CSV
Reads CSV rows via csv.DictReader. If no delimiter is provided, it tries to sniff the delimiter; if sniffing fails it falls back to a default (comma for most files, semicolon for datasets like EU.csv).
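The sniff-then-fall-back behaviour can be sketched with the standard csv module. This is a minimal sketch, not the code in core.py: the function name read_rows, the 2048-character sample size, and the candidate delimiter list are assumptions made for the example.

```python
import csv
import io


def read_rows(text, delimiter=None, fallback=","):
    """Parse CSV text into a list of dicts, sniffing the delimiter if needed."""
    if delimiter is None:
        try:
            # Assumption: sniff on a short sample, restricted to common delimiters.
            dialect = csv.Sniffer().sniff(text[:2048], delimiters=",;\t")
            delimiter = dialect.delimiter
        except csv.Error:
            # Sniffing failed: use the caller-supplied default
            # (for example ";" for a semicolon-separated file).
            delimiter = fallback
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


rows = read_rows("name;A;B\nHomer;1;0\nMarge;1;1\n")
single_column = read_rows("name\nHomer\n")  # nothing to sniff, falls back to ","
```

Passing the fallback as a parameter keeps the per-dataset default (comma or semicolon) a decision of the caller rather than of the parser.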
Parsing members
Chooses a name column (tries common headers like name, Name, id, label, etc.) and treats all other columns as set names. A cell is interpreted as membership when it contains a truthy value (e.g., 1, true, yes, y, x).
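A possible shape for this step is sketched below. The header list and truthy values come from the description above; the function name parse_members, the case-insensitive comparison, and the fallback to the first column when no known header is found are assumptions for illustration only.

```python
NAME_HEADERS = ("name", "Name", "id", "label")  # subset of the headers tried
TRUTHY = {"1", "true", "yes", "y", "x"}


def parse_members(rows):
    """Turn CSV rows (dicts) into members with their incident sets."""
    if not rows:
        return []
    headers = list(rows[0].keys())
    # Assumption: fall back to the first column if no known name header exists.
    name_col = next((h for h in NAME_HEADERS if h in headers), headers[0])
    set_cols = [h for h in headers if h != name_col]
    members = []
    for row in rows:
        # Assumption: cells are compared case-insensitively after trimming.
        sets = [s for s in set_cols if (row[s] or "").strip().lower() in TRUTHY]
        members.append({"name": row[name_col], "sets": sets})
    return members


members = parse_members([
    {"name": "Homer", "A": "1", "B": "0"},
    {"name": "Marge", "A": "x", "B": "Yes"},
])
```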
Building zones
Groups members by their set-combination signature (sorted set names joined by |, e.g., A|B|C). The number of sets in the signature is the zone rank. Zones are sorted by (rank, signature) and assigned sequential IDs.
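The grouping and ordering can be sketched as follows. The field names id, signature, rank, and members, the zero-based IDs, and the choice to skip members that belong to no set are assumptions of this sketch; only setNames appears elsewhere in this README.

```python
def build_zones(members):
    """Group members by sorted set combination and assign sequential ids."""
    groups = {}
    for member in members:
        if not member["sets"]:
            continue  # assumption: members in no set do not form a zone
        key = tuple(sorted(member["sets"]))
        groups.setdefault(key, []).append(member["name"])

    zones = []
    # Sort by (rank, signature), where rank is the number of sets in the zone.
    for key in sorted(groups, key=lambda k: (len(k), "|".join(k))):
        zones.append({
            "id": len(zones),          # assumption: ids start at 0
            "signature": "|".join(key),
            "setNames": list(key),
            "rank": len(key),
            "members": groups[key],
        })
    return zones


zones = build_zones([
    {"name": "a", "sets": ["B", "A"]},
    {"name": "b", "sets": ["A"]},
    {"name": "c", "sets": ["A", "B"]},
    {"name": "d", "sets": ["B"]},
])
```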
Building the super-dual graph
Builds directed edges from subset → superset between zones that differ by exactly one set. Each edge is annotated with addedSet, the set that is present in the target zone but not in the source zone.
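One way to express the "differ by exactly one set" rule is a strict-subset test combined with a size check, as in the sketch below. The edge field names source and target are assumptions; addedSet is the annotation named above. The quadratic loop over zone pairs is chosen for clarity, not speed.

```python
def build_dual_edges(zones):
    """Return subset -> superset edges between zones differing by one set."""
    edges = []
    for src in zones:
        src_sets = set(src["setNames"])
        for dst in zones:
            dst_sets = set(dst["setNames"])
            # Strict subset with exactly one extra set in the target.
            if src_sets < dst_sets and len(dst_sets) == len(src_sets) + 1:
                (added,) = dst_sets - src_sets
                edges.append({
                    "source": src["id"],
                    "target": dst["id"],
                    "addedSet": added,
                })
    return edges


example_zones = [
    {"id": 0, "setNames": ["A"]},
    {"id": 1, "setNames": ["B"]},
    {"id": 2, "setNames": ["A", "B"]},
    {"id": 3, "setNames": ["A", "B", "C"]},
]
edges = build_dual_edges(example_zones)
```

Note that zone 0 (A) is not connected to zone 3 (A|B|C), because they differ by two sets.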
Output usage
app.py uses these outputs to build the JSON returned by /api/diagram.
main.js (D3 renderer)
From README_mainjs.md
Purpose
main.js renders the visualization in the browser using SVG + D3:
1. fetch diagram JSON from the backend
2. place zone nodes on concentric rings
3. draw the super-dual edges
4. draw one closed curve per set
Data flow
On startup, the frontend fetches /api/datasets to populate the dataset dropdown, then fetches /api/diagram?dataset=… to get zones and dualEdges. It then renders Steps 2–4 from that JSON.
Step 2 — Layout (rings)
Computes a ring layout where zones are placed by rank (rank 1 outer ring, higher ranks inward). Combination nodes are assigned angles “between” their parent zones (immediate subsets), and angles are adjusted to keep a minimum separation so nodes on the same ring don’t overlap.
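The ring placement idea is sketched here in Python for readability, although main.js implements it in JavaScript. This sketch makes several assumptions: rank-1 zones are spread evenly, "between" is taken to be the circular mean of the parent angles, ring radii shrink linearly with rank, and the minimum-separation adjustment is left out entirely.

```python
import math


def ring_layout(zones, outer_radius=300.0):
    """Return {zone id: (x, y)} with rank 1 on the outer ring."""
    max_rank = max(z["rank"] for z in zones)
    by_sets = {frozenset(z["setNames"]): z["id"] for z in zones}
    angles = {}

    # Assumption: single-set zones are spaced evenly around the circle.
    singles = [z for z in zones if z["rank"] == 1]
    for i, z in enumerate(singles):
        angles[z["id"]] = 2 * math.pi * i / len(singles)

    # Higher ranks: circular mean of the angles of the immediate subsets.
    for z in sorted(zones, key=lambda zone: zone["rank"]):
        if z["rank"] == 1:
            continue
        sets = frozenset(z["setNames"])
        parents = [by_sets[sets - {s}] for s in sets if (sets - {s}) in by_sets]
        parent_angles = [angles[p] for p in parents]
        if parent_angles:
            angles[z["id"]] = math.atan2(
                sum(math.sin(a) for a in parent_angles),
                sum(math.cos(a) for a in parent_angles),
            )
        else:
            angles[z["id"]] = 0.0  # assumption: no parent zone present

    positions = {}
    for z in zones:
        radius = outer_radius * (max_rank - z["rank"] + 1) / max_rank
        a = angles[z["id"]]
        positions[z["id"]] = (radius * math.cos(a), radius * math.sin(a))
    return positions


positions = ring_layout([
    {"id": 0, "setNames": ["A"], "rank": 1},
    {"id": 1, "setNames": ["B"], "rank": 1},
    {"id": 2, "setNames": ["C"], "rank": 1},
    {"id": 3, "setNames": ["A", "B"], "rank": 2},
])
```

In this example the A|B zone lands on the inner ring at an angle halfway between A and B.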
Step 3 — Dual edges
Draws the super-dual edges as straight SVG lines between zone positions. Each edge is colored by addedSet so it’s clear which set the edge corresponds to.
Step 4 — Curves
For each set S, builds a closed curve from boundary evidence points: midpoints of dual edges that add S, plus outer anchor points slightly outside the outermost ring containing S. Optional phantom points can be added for stability when true boundary evidence is missing. Points are ordered around a centroid and connected with a closed Catmull–Rom spline.
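The last two operations (ordering points around a centroid, then closing them with a Catmull–Rom spline) can be sketched as below. This is written in Python for illustration and uses the uniform Catmull–Rom formulation; the spline parameterization used by main.js through D3 may differ, and the function names here are hypothetical.

```python
import math


def order_around_centroid(points):
    """Sort 2D points by angle around their centroid."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))


def catmull_rom_closed(points, samples=8):
    """Sample a closed uniform Catmull-Rom spline through the given points."""
    n = len(points)
    curve = []
    for i in range(n):
        # Indices wrap around so the curve closes on itself.
        p0, p1 = points[(i - 1) % n], points[i]
        p2, p3 = points[(i + 1) % n], points[(i + 2) % n]
        for k in range(samples):
            t = k / samples
            t2, t3 = t * t, t * t * t
            curve.append(tuple(
                0.5 * (2 * b + (c - a) * t
                       + (2 * a - 5 * b + 4 * c - d) * t2
                       + (-a + 3 * b - 3 * c + d) * t3)
                for a, b, c, d in zip(p0, p1, p2, p3)
            ))
    return curve


ordered = order_around_centroid([(1, 1), (-1, -1), (1, -1), (-1, 1)])
curve = catmull_rom_closed(ordered)
```

Each segment starts exactly at one of the input points, so the sampled curve passes through every boundary evidence point.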
Rendering entry point
The script initializes the dropdown and triggers rendering when the dataset changes.
labels.js (Labels + Members)
From README_labelsjs.md
Purpose
labels.js adds readable text to the final visualization (Step 4):
- Set labels: places each set name outside its curve (so labels don’t sit in intersections).
- Member labels: places member names inside the correct intersection region (zone), avoiding overlaps.
Context / inputs
The module receives shared context from main.js via setLabelContext(...):
- sets: list of all set names
- nodePoints: zone node positions (used to keep labels away from nodes)
- placedMemberLabelRects: rectangles of already placed member labels (collision avoidance)
- center, labelOffset: geometry parameters used for “outward” label placement
Note: labels.js imports safeId from main.js to build stable SVG ids and relies on the global d3 (loaded in index.html).
Set labels (outside curves)
placeSetLabels(...) places each set name using an SVG <textPath>:
- Starts from the curve polygon points for a set (computed in main.js).
- Builds an “outward offset” version of the curve (pushing points away from the center).
- Samples candidate positions along the curve and scores them to:
  - avoid intersections (hard penalty if label center is inside another set’s fill)
  - avoid overlaps with already placed set labels (rectangle collision checks)
  - avoid nodes (prefer positions farther from zone node points)
  - prefer outside (slight preference for positions farther from the center)
- Creates a hidden label path and attaches the text with startOffset so it follows the curve.
Member labels (inside the correct zone)
placeMemberLabels(...) tries to place member names inside the exact intersection region of each zone:
- Uses hidden filled curve paths + SVG isPointInFill to test whether a point is inside/outside each set curve.
- A point is accepted for a zone if it matches all sets:
  - inside curve S if S is in zone.setNames
  - outside curve S if S is not in zone.setNames
- Samples many candidate positions around the zone node (different radii + angles).
- Requires the whole text rectangle (center + corners) to remain inside the zone region.
- Avoids overlaps with set labels and already placed member labels.
- Uses font-size fallbacks (smaller fonts for dense zones). If still too crowded, it falls back to a “+N” style overflow for remaining members.
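The zone acceptance rule and the rectangle check in the list above can be sketched outside the browser. In this Python sketch a ray-casting point-in-polygon test stands in for SVG isPointInFill, and each set curve is approximated by a polygon; both are substitutions made for illustration, and all function names are hypothetical.

```python
def point_in_polygon(pt, poly):
    """Ray-casting test; a stand-in for SVG isPointInFill on a closed curve."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def point_in_zone(pt, zone_sets, curves):
    """Inside every curve of the zone's sets and outside every other curve."""
    return all(
        point_in_polygon(pt, poly) == (name in zone_sets)
        for name, poly in curves.items()
    )


def rect_in_zone(cx, cy, width, height, zone_sets, curves):
    """Require the label center and all four corners to lie in the zone."""
    hw, hh = width / 2, height / 2
    probes = [(cx, cy), (cx - hw, cy - hh), (cx + hw, cy - hh),
              (cx - hw, cy + hh), (cx + hw, cy + hh)]
    return all(point_in_zone(p, zone_sets, curves) for p in probes)


# Two overlapping squares standing in for set curves A and B.
curves = {
    "A": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "B": [(2, 0), (6, 0), (6, 4), (2, 4)],
}
fits = rect_in_zone(1, 2, 1, 1, {"A"}, curves)
spills = rect_in_zone(1.8, 2, 1, 1, {"A"}, curves)
```

Here the first rectangle lies entirely in the A-only region, while the second has corners that cross into B and would be rejected for the A-only zone.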
Debugging
When debug mode is enabled (SHOW_DEBUG_POINTS), the module can draw helper points to make it easier to verify where curves and label candidates are being evaluated.