High-Performance Framework for Dataset Generation

Information

  • Publication Type: Bachelor Thesis
  • Workgroup(s)/Project(s):
  • Date: August 2020
  • Date (to): 5. August 2020
  • Matrikelnummer: 1634877
  • First Supervisor: Philipp Erler
  • Keywords: dataset generation, framework, surface reconstruction

Abstract

This bachelor thesis develops a Python framework for generating datasets for surface reconstruction. The datasets are used to train a neural network that reconstructs a mesh from a point cloud of that mesh. Because training such a network requires a large amount of data, the framework uses multiprocessing to generate datasets faster than processing one mesh after another sequentially.

In addition, the framework can handle any similar pipeline. The user defines the steps of such a pipeline in an XML document, and each step can call an arbitrary program. This makes the framework a general-purpose tool for any task that processes many data items independently of each other.
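As an illustration of how pipeline steps defined in an XML document could be turned into calls to external programs, here is a minimal sketch. The XML schema (a `pipeline` element with `step` children carrying `name` and `command` attributes), the `{input}` and `{stem}` placeholders, and the program names `clean_mesh` and `sample_points` are all assumptions made for this example; the thesis's actual format is not shown here and may differ.

```python
import shlex
import xml.etree.ElementTree as ET

# Hypothetical pipeline description. Element names, attribute names and
# the external programs are illustrative assumptions, not the real schema.
PIPELINE_XML = """
<pipeline>
  <step name="clean" command="clean_mesh --in {input} --out {stem}.clean.ply"/>
  <step name="sample" command="sample_points --in {stem}.clean.ply --out {stem}.xyz"/>
</pipeline>
"""


def build_commands(xml_text, input_file):
    """Return a list of (step name, argv list) pairs for one input file."""
    root = ET.fromstring(xml_text)
    # File name without its last extension, e.g. "bunny.ply" -> "bunny".
    stem = input_file.rsplit(".", 1)[0]
    commands = []
    for step in root.findall("step"):
        template = step.get("command")
        # Split first, substitute afterwards, so that file names containing
        # spaces remain a single argument.
        argv = [part.format(input=input_file, stem=stem)
                for part in shlex.split(template)]
        commands.append((step.get("name"), argv))
    return commands


if __name__ == "__main__":
    for name, argv in build_commands(PIPELINE_XML, "bunny.ply"):
        print(name, argv)
```

Each resulting argument list could then be handed to `subprocess.run(argv, check=True)` to run the step for that file.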

The benchmarks show a large performance increase when generating datasets. The execution time for a fixed number of files was measured under different modes of execution. The custom process pool we developed is faster overall than using Python's process pool for each step of the pipeline independently, and it is much faster than running every step for each file sequentially.
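The three modes of execution compared above can be sketched roughly as follows. This is not the thesis's custom process pool, whose design is not described here. It only contrasts a sequential baseline, one standard-library pool per step, and a single pool in which each worker carries one file through all steps. The step functions and the string "files" are hypothetical stand-ins for real mesh-processing programs.

```python
from concurrent.futures import ProcessPoolExecutor


# Hypothetical stand-ins for real pipeline steps.
def step_clean(item):
    return item.strip()


def step_sample(item):
    return item.upper()


STEPS = [step_clean, step_sample]


def run_all_steps(item):
    """Carry one file through every step; files are independent of each other."""
    for step in STEPS:
        item = step(item)
    return item


def sequential(items):
    """Baseline: every step for each file, one file after another."""
    return [run_all_steps(item) for item in items]


def per_step_pools(items, workers=2):
    """One pool per step: all files must finish step k before step k+1 starts."""
    for step in STEPS:
        with ProcessPoolExecutor(max_workers=workers) as pool:
            items = list(pool.map(step, items))
    return items


def per_file_pool(items, workers=2):
    """One shared pool: no barrier between steps, each task is a whole file."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_all_steps, items))


if __name__ == "__main__":
    files = [" a ", " b ", " c "]
    print(sequential(files) == per_step_pools(files) == per_file_pool(files))
```

All three functions produce the same results; they differ only in scheduling. With one pool per step, a single slow file delays the start of the next step for every file, which is one plausible reason a single pool spanning the whole pipeline can finish sooner.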

Additional Files and Images

Additional images and videos

teaser: The processing steps of the dataset generation shown as a graph.

BibTeX

@bachelorsthesis{riegler_2020_framework,
  title =      "High-Performance Framework for Dataset Generation",
  author =     "Maximilian Riegler",
  year =       "2020",
  abstract =   "This bachelor thesis develops a Python framework for
               generating datasets for surface reconstruction. The
               datasets are used to train a neural network that
               reconstructs a mesh from a point cloud of that mesh.
               Because training such a network requires a large amount of
               data, the framework uses multiprocessing to generate
               datasets faster than processing one mesh after another
               sequentially.  In addition, the framework can handle any
               similar pipeline. The user defines the steps of such a
               pipeline in an XML document, and each step can call an
               arbitrary program. This makes the framework a
               general-purpose tool for any task that processes many data
               items independently of each other.  The benchmarks show a
               large performance increase when generating datasets. The
               execution time for a fixed number of files was measured
               under different modes of execution. The custom process pool
               we developed is faster overall than using Python's process
               pool for each step of the pipeline independently, and it is
               much faster than running every step for each file
               sequentially.",
  month =      aug,
  address =    "Favoritenstrasse 9-11/E193-02, A-1040 Vienna, Austria",
  school =     "Research Unit of Computer Graphics, Institute of Visual
               Computing and Human-Centered Technology, Faculty of
               Informatics, TU Wien",
  keywords =   "dataset generation, framework, surface reconstruction",
  URL =        "/research/publications/2020/riegler_2020_framework/",
}