Large Data Scalability in Interactive Visual Analysis

Harald Piringer
Large Data Scalability in Interactive Visual Analysis
Supervisor: Meister Eduard Gröller
Duration: Oct 2008 to May 2011

Abstract

In many areas of science and industry, the amount of data is growing fast and often already exceeds the ability to evaluate it. On the other hand, the unprecedented amount of available data holds enormous potential for supporting decision-making. Turning data into comprehensible knowledge is thus a key challenge of the 21st century. The power of the human visual system makes visualization an appropriate method to comprehend large data. In particular, interactive visualization enables a discourse between the human brain and the data that can transform a cognitive problem into a perceptual one.

However, the visual analysis of large and complex datasets involves both visual and computational challenges. Visual limits involve perceptual and cognitive limitations of the user and restrictions of the display devices, while computational limits are related to the computational complexity of the algorithms involved. The goal of this thesis is to advance the state of the art in visual analysis with respect to scalability to large datasets. Due to the multifaceted nature of scalability, the contributions span a broad range: enhancing computational scalability, improving the visual scalability of selected visualization approaches, and supporting the analysis of high-dimensional data.

Concerning computational scalability, this thesis describes a generic architecture to facilitate the development of highly interactive visual analysis tools using multi-threading. The architecture builds on the separation of the main application thread and dedicated visualization threads, which can be cancelled early in response to user interaction. A quantitative evaluation shows fast visual feedback during continuous interaction even for millions of entries.

Two variants of scatterplots address the visual scalability of different types of data and tasks. For continuous data, a combination of 2D and 3D scatterplots aims to combine the advantages of 2D interaction and 3D visualization. Several extensions improve depth perception in 3D and address the problem of unrecognizable point densities in both 2D and 3D. For partly categorical data, the thesis contributes Hierarchical Difference Scatterplots to relate multiple hierarchy levels and to explicitly visualize differences between them in the context of the absolute position of pivoted values. While comparisons in Hierarchical Difference Scatterplots are only qualitative, this thesis also contributes an approach for quantifying subsets of the data by means of statistical moments for a potentially large number of dimensions. This approach has proven useful as an initial overview as well as for a quantitative comparison of local features like clusters.

As an important application of visual analysis, the validation of regression models also involves scalability to multi-dimensional data. This thesis describes a design study of an approach called HyperMoVal for this task. The key idea is to visually relate n-dimensional scalar functions to known validation data within a combined visualization. The integration with other multivariate views is a step towards a user-centric workflow for model building. Being the result of collaboration with experts in engine design, HyperMoVal demonstrates how visual analysis can significantly improve real-world tasks.

Positive user feedback suggests a high impact of the contributions of this thesis beyond the visualization research community. Moreover, most contributions of this thesis have been combined in a commercially distributed software framework for engineering applications that will hopefully raise awareness of visual analysis and promote its use in multiple application domains.
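
The early-cancellation idea mentioned in the abstract (dedicated visualization threads that stop as soon as user interaction makes their result obsolete) can be illustrated with a minimal Python sketch. This is not the architecture described in the thesis; the class name `CancellableRenderer`, the `draw_chunk` callback, and the chunk size are hypothetical and chosen only for illustration.

```python
import threading


class CancellableRenderer:
    """Runs one visualization pass on a worker thread.

    Hypothetical illustration: a new request cancels the pass
    that is still running, so stale work is abandoned early.
    """

    def __init__(self, draw_chunk, chunk_size=10_000):
        self._draw_chunk = draw_chunk  # assumed callback that draws a slice of the data
        self._chunk_size = chunk_size
        self._cancel = None            # Event of the current pass
        self._worker = None            # Thread of the current pass

    def cancel(self):
        """Ask the current pass (if any) to stop at the next chunk boundary."""
        if self._cancel is not None:
            self._cancel.set()

    def wait(self):
        """Block until the current pass has finished or stopped."""
        if self._worker is not None:
            self._worker.join()

    def request(self, data):
        """Start a new pass; called from the main application thread."""
        self.cancel()
        self.wait()  # bounded by the time needed to draw one chunk
        cancel = threading.Event()
        worker = threading.Thread(
            target=self._run, args=(data, cancel), daemon=True
        )
        self._cancel, self._worker = cancel, worker
        worker.start()

    def _run(self, data, cancel):
        # Draw the data chunk by chunk, checking for cancellation in between.
        for start in range(0, len(data), self._chunk_size):
            if cancel.is_set():
                return
            self._draw_chunk(data[start:start + self._chunk_size])
```

In this sketch the main thread would call `request` on every interaction event, for example while a selection is being dragged. Smaller chunks make cancellation more responsive at the cost of more frequent checks.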

BibTeX

@phdthesis{PH-2011-LDS,
  title =      "Large Data Scalability in Interactive Visual Analysis",
  author =     "Harald Piringer",
  year =       "2011",
  abstract =   "In many areas of science and industry, the amount of data is
               growing fast and often already exceeds the ability to
               evaluate it. On the other hand, the unprecedented amount of
               available data holds enormous potential for supporting
               decision-making. Turning data into comprehensible knowledge
               is thus a key challenge of the 21st century. The power of
               the human visual system makes visualization an appropriate
               method to comprehend large data. In particular, interactive
               visualization enables a discourse between the human brain
               and the data that can transform a cognitive problem into a
               perceptual one. However, the visual analysis of large and
               complex datasets involves both visual and computational
               challenges. Visual limits involve perceptual and cognitive
               limitations of the user and restrictions of the display
               devices, while computational limits are related to the
               computational complexity of the algorithms involved. The
               goal of this thesis is to advance the state of the art in
               visual analysis with respect to scalability to large
               datasets. Due to the multifaceted nature of scalability, the
               contributions span a broad range: enhancing computational
               scalability, improving the visual scalability of selected
               visualization approaches, and supporting the analysis of
               high-dimensional data. Concerning computational scalability,
               this thesis describes a generic architecture to facilitate
               the development of highly interactive visual analysis tools
               using multi-threading. The architecture builds on the
               separation of the main application thread and dedicated
               visualization threads, which can be cancelled early in
               response to user interaction. A quantitative evaluation
               shows fast visual feedback during continuous interaction
               even for millions of entries. Two variants of scatterplots
               address the visual scalability of different types of data
               and tasks. For continuous data, a combination of 2D and 3D
               scatterplots aims to combine the advantages of 2D
               interaction and 3D visualization. Several extensions improve
               depth perception in 3D and address the problem of
               unrecognizable point densities in both 2D and 3D. For partly
               categorical data, the thesis contributes Hierarchical
               Difference Scatterplots to relate multiple hierarchy levels
               and to explicitly visualize differences between them in the
               context of the absolute position of pivoted values. While
               comparisons in Hierarchical Difference Scatterplots are only
               qualitative, this thesis also contributes an approach for
               quantifying subsets of the data by means of statistical
               moments for a potentially large number of dimensions. This
               approach has proven useful as an initial overview as well as
               for a quantitative comparison of local features like
               clusters. As an important application of visual analysis,
               the validation of regression models also involves
               scalability to multi-dimensional data. This thesis describes
               a design study of an approach called HyperMoVal for this
               task. The key idea is to visually relate n-dimensional
               scalar functions to known validation data within a combined
               visualization. The integration with other multivariate views
               is a step towards a user-centric workflow for model
               building. Being the result of collaboration with experts in
               engine design, HyperMoVal demonstrates how visual analysis
               can significantly improve real-world tasks. Positive user
               feedback suggests a high impact of the contributions of this
               thesis beyond the visualization research community.
               Moreover, most contributions of this thesis have been
               combined in a commercially distributed software framework
               for engineering applications that will hopefully raise
               awareness of visual analysis and promote its use in multiple
               application domains.",
  month =      may,
  address =    "Favoritenstrasse 9-11/186, A-1040 Vienna, Austria",
  school =     "Institute of Computer Graphics and Algorithms, Vienna
               University of Technology",
  keywords =   "high dimensionality, Visualization, Scalability,
               Interaction, Data analysis, multi-threading, scatter plots",
  URL =        "https://www.cg.tuwien.ac.at/research/publications/2011/PH-2011-LDS/",
}