Dimensionality Reduction: Overview, Technical Details, and Some Applications

Research output: Chapter in Book/Report/Conference proceeding › Book chapter › Research › peer-review

Abstract

Dimensionality reduction is an Exploratory Data Analysis (EDA) approach that allows fast visualization of high-dimensional data and can reveal hidden systematic patterns within a data set. While linear dimensionality reduction techniques, such as Principal Component Analysis (PCA), are considered the gold standard in many areas of data science, they can be inadequate for analyzing non-linear high-dimensional data (e.g., images, text, gene expression). In such cases, non-linear dimensionality reduction techniques such as t-distributed Stochastic Neighbor Embedding (tSNE) and Uniform Manifold Approximation and Projection (UMAP) have been widely used and are among the state-of-the-art methods for exploring high-dimensional data. This chapter gives an overview of dimensionality reduction techniques, with a particular focus on PCA, tSNE, and UMAP and their applications within the fields of data science and computational biology.

Original language: English
Title of host publication: Applied data science in tourism
Subtitle of host publication: Interdisciplinary approaches, methodologies, and applications
Editors: Roman Egger
Publisher: Springer Nature
Pages: 151-167
Number of pages: 17
ISBN (Electronic): 978-3-030-88389-8
ISBN (Print): 978-3-030-88388-1
Publication status: Published - 2022

Publication series

Name: Tourism on the verge
Publisher: Springer
ISSN (Print): 2366-2611
ISSN (Electronic): 2366-262X

Subject classification (UKÄ)

  • Bioinformatics (Computational Biology)

Free keywords

  • High-dimensional data
  • MDS
  • PCA
  • The Curse of Dimensionality
  • tSNE
  • UMAP
