ControVis

ControVis - Information and Overview

This website was created for the course "Visualisierung 2" at the Vienna University of Technology. It demonstrates our implementation of a visual analysis tool introduced in this paper by Ulrik Brandes and Jürgen Lerner. The tool supports the visual analysis of controversy in user-generated encyclopedias, which is also where the name of our project comes from: ControversyVisualization.

Time for some theory

First, we explain the theoretical background of the project, so that it is clear how the visualization works and what it shows. Our implementation visualizes controversies between the authors of Wikipedia articles. It does so using the revision history of an article and the relations between authors that can be derived from it.

The basic idea behind the visualization is to build a so-called "who revises whom" network. This is done by analyzing the list of revisions of a page, ordered by time. Consecutive revisions can be interpreted as one author revising the changes of the other. Since we are interested in visualizing conflicts, the revision edges (edges in the revision network between the authors of consecutive revisions) are weighted in such a way that the weights can be interpreted as the disagreement between the authors.
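The construction described above can be sketched in a few lines of Python. This is only a minimal illustration, not our actual implementation: the function name `build_revision_network`, the `(timestamp, author)` input format and the placeholder weight of 1 per revision are assumptions made for the example. The paper's actual disagreement weighting is not reproduced here, and skipping consecutive edits by the same author is likewise an assumption.

```python
from collections import defaultdict


def build_revision_network(revisions, weight=lambda prev, cur: 1.0):
    """Build a "who revises whom" network from a revision list.

    revisions: iterable of (timestamp, author) tuples (hypothetical format).
    weight:    placeholder for a disagreement measure; here every revision
               simply counts as 1.0, which is NOT the paper's weighting.
    Returns a dict mapping (reviser, revised) -> accumulated weight.
    """
    ordered = sorted(revisions, key=lambda rev: rev[0])  # order by time
    edges = defaultdict(float)
    for prev, cur in zip(ordered, ordered[1:]):
        if prev[1] == cur[1]:
            continue  # assumption: an author does not revise themselves
        edges[(cur[1], prev[1])] += weight(prev, cur)
    return dict(edges)


history = [(3, "carol"), (1, "alice"), (2, "bob"), (4, "bob"), (5, "bob")]
network = build_revision_network(history)
```

In this toy history, bob revises alice, carol revises bob, and bob revises carol; bob's final consecutive edit to his own revision adds no edge.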

The final visualization of the conflicts is produced by plotting the revision network in 2D space and mapping the authors (themselves drawn as ellipses) onto one big ellipse. This mapping is done by computing the eigenvectors belonging to the two smallest eigenvalues of the adjacency matrix built from the revision network, and then normalizing the resulting coordinate values and mapping them onto an ellipse. A detailed explanation can be found in the original paper. In addition, several visual properties of the visualization represent different characteristics of the revision network and the authors.
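As an illustration of the eigenvector step, here is a small sketch of orthogonal iteration in plain Python. It is not taken from our implementation. It assumes the adjacency matrix is real and symmetric and is given as a list of rows; the function name `smallest_two_eigenpairs`, the iteration count and the random start vectors are choices made for this example. To reach the smallest eigenvalues, the sketch iterates on the shifted matrix c*I - A, where c is a simple upper bound on the eigenvalues.

```python
import math
import random


def smallest_two_eigenpairs(a, iterations=500, seed=0):
    """Orthogonal iteration for the two smallest eigenvalues of a
    symmetric matrix `a` (list of rows). Returns ((ev1, e1), (ev2, e2))."""
    n = len(a)
    # Row-sum (Gershgorin) bound plus 1: shift*I - A is positive definite,
    # and its largest eigenvalues correspond to the smallest ones of A.
    shift = 1.0 + max(sum(abs(x) for x in row) for row in a)

    def apply_shifted(vec):
        return [shift * vec[i] - sum(a[i][j] * vec[j] for j in range(n))
                for i in range(n)]

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    def normalized(vec):
        norm = math.sqrt(dot(vec, vec))
        return [x / norm for x in vec]

    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    q1 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    q2 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(iterations):
        # Multiply both vectors by the shifted matrix, then re-orthonormalize
        # them with one Gram-Schmidt step.
        q1 = normalized(apply_shifted(q1))
        w = apply_shifted(q2)
        overlap = dot(w, q1)
        q2 = normalized([wi - overlap * qi for wi, qi in zip(w, q1)])

    # Rayleigh quotients of the shifted matrix, mapped back to A.
    ev1 = shift - dot(q1, apply_shifted(q1))
    ev2 = shift - dot(q2, apply_shifted(q2))
    return (ev1, q1), (ev2, q2)


matrix = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
(ev1, e1), (ev2, e2) = smallest_two_eigenpairs(matrix)
```

For this small test matrix the two smallest eigenvalues are 2 - √2 and 2. A fixed number of iterations keeps the sketch short; a real implementation would more likely stop once the vectors no longer change noticeably.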

Implementation

We chose to implement the visualization tool using web technologies because we thought it would be practical to offer the visualization online, just like the Wikipedia articles themselves. This decision turned out to be a big challenge, mainly because these technologies struggle to handle the huge amount of data in the Wikipedia revision history. Especially in the beginning, we therefore had to overcome many minor and sometimes even major problems.

Achievements

Used Libraries

Challenges

Open Issues

List of Calculated Values:

Calculation of the eigenvectors e1, e2 for the two smallest eigenvalues ev1 and ev2 of the adjacency matrix: These values are used for the calculation of the viewing parameters. The paper suggests the use of "orthogonal iteration".
Calculation of the position p(v) of user v: p(v) = (p1(v), p2(v)) = (x_v, s*y_v), where x_v and y_v are the entries for v in e1 and e2, and the skewness is s = ev2/ev1. These values are then normalized onto the main ellipse, where r1 is the length of the horizontal ellipse axis and r2 = s*r1 is the length of the vertical ellipse axis. Position on the ellipse: (r1 * p1(v)/i(v), r2 * p2(v)/i(v)).
Calculation of the involvement i(v) of user v (i(v) is used to compute the area of the user nodes): i(v) = √(p1(v)^2 + p2(v)^2).
Reviser vs. being revised: The out-degree of user v is the sum of all weights ω(v,u) in the revision matrix, and the in-degree is the sum of all weights ω(u,v). The user node is drawn as an ellipse with the ratio height/width = out/in and the area i(v).
Direction of the involvement: The line thickness of the edge between users u and v is given by ω(u,v).
Steady vs. unsteady: Let e_i(v), with i = 1..n, be the number of revisions by user v in week i. The mean and the variance are then μ = Sum(e_i)/n and σ^2 = Sum((e_i-μ)^2)/n. For authors with a minimum number of edits (not defined in the paper), the relative standard deviation σ/μ is calculated and normalized over all users so that the smallest value becomes 0 and the biggest one 1 (1 = black = unsteady; 0 = red = steady).
Total number of edits: This bar chart shows the times at which the page undergoes heavy changes. It can be used for further filtering of the revisions.
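The formulas in the list above can be written down directly. The following Python sketch is meant only as an illustration of those formulas and is not our implementation. All function names, the `(reviser, revised)` key format for the weights, the `min_edits` threshold and the handling of users with zero involvement are assumptions for the example.

```python
import math


def position_and_involvement(e1_v, e2_v, ev1, ev2, r1=1.0):
    """Return ((x, y) on the main ellipse, involvement i(v)) for one user."""
    s = ev2 / ev1                     # skewness s = ev2/ev1
    p1, p2 = e1_v, s * e2_v           # p(v) = (x_v, s*y_v)
    involvement = math.hypot(p1, p2)  # i(v) = sqrt(p1^2 + p2^2)
    if involvement == 0.0:
        return (0.0, 0.0), 0.0        # assumption: uninvolved users sit at the centre
    r2 = s * r1                       # vertical axis of the main ellipse
    return (r1 * p1 / involvement, r2 * p2 / involvement), involvement


def degrees(weights, v):
    """weights maps (reviser, revised) -> ω. Returns (out-degree, in-degree) of v."""
    out_deg = sum(w for (a, _), w in weights.items() if a == v)
    in_deg = sum(w for (_, b), w in weights.items() if b == v)
    return out_deg, in_deg


def node_axes(area, out_deg, in_deg):
    """Width and height of an ellipse with the given area and
    height/width = out/in (assumes both degrees are positive)."""
    ratio = out_deg / in_deg
    width = 2.0 * math.sqrt(area / (math.pi * ratio))
    return width, ratio * width


def relative_std(weekly_edits):
    """Relative standard deviation σ/μ of the weekly edit counts."""
    n = len(weekly_edits)
    mu = sum(weekly_edits) / n
    variance = sum((e - mu) ** 2 for e in weekly_edits) / n
    return math.sqrt(variance) / mu


def steadiness(weekly_by_user, min_edits=1):
    """Min-max normalize σ/μ over all users: 0 = steady, 1 = unsteady.
    min_edits is a hypothetical threshold; the paper does not define it."""
    raw = {u: relative_std(w) for u, w in weekly_by_user.items()
           if sum(w) >= min_edits}
    if not raw:
        return {}
    lo, hi = min(raw.values()), max(raw.values())
    span = hi - lo
    return {u: (x - lo) / span if span else 0.0 for u, x in raw.items()}


pos, inv = position_and_involvement(0.3, 0.8, -4.0, -2.0, r1=2.0)
colours = steadiness({"a": [2, 2], "b": [0, 4], "c": [1, 3]})
```

In the example, user "a" edits the same amount every week and gets the value 0 (steady), while user "b" makes all edits in a single week and gets the value 1 (unsteady).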