Eurographics Conference on Visualization (EuroVis) (2015) STAR – State of The Art Report

R. Borgo, F. Ganovelli, and I. Viola (Editors)

Visualizing High-Dimensional Data: Advances in the Past Decade

S. Liu¹, D. Maljovec¹, B. Wang¹, P.-T. Bremer² and V. Pascucci¹

¹ Scientific Computing and Imaging Institute, University of Utah
² Lawrence Livermore National Laboratory

Abstract
Massive simulations and arrays of sensing devices, in combination with increasing computing resources, have generated large, complex, high-dimensional datasets used to study phenomena across numerous fields of study. Visualization plays an important role in exploring such datasets. We provide a comprehensive survey of advances in high-dimensional data visualization over the past 15 years. We aim at providing actionable guidance for data practitioners to navigate through a modular view of the recent advances, allowing the creation of new visualizations along the enriched information visualization pipeline and identifying future opportunities for visualization research.
Categories and Subject Descriptors (according to ACM CCS): I.3.3 [Computer Graphics]: Picture/Image
Generation—Line and curve generation

© The Eurographics Association 2015.

1 Introduction

With the ever-increasing amount of available computing resources, our ability to collect and generate a wide variety of large, complex, high-dimensional datasets continues to grow. High-dimensional datasets show up in numerous fields of study, such as economics, biology, chemistry, political science, astronomy, and physics, to name a few. Their wide availability and their increasing size and complexity have led to new challenges and opportunities for their effective visualization. The physical limitations of display devices and of our visual system prevent the direct display and instantaneous recognition of structures with more than two or three dimensions. In the past decade, a variety of approaches have been introduced to visually convey high-dimensional structural information by utilizing low-dimensional projections or abstractions: from dimension reduction to visual encoding, and from quantitative analysis to interactive exploration.

A number of surveys have focused on different aspects of high-dimensional data visualization, such as parallel coordinates [Ins09, HW13], quality measures [BTK11], clutter reduction [ED07], visual data mining [HG02, Kei02, DOL03], and interactive techniques [BCS96]. High-dimensional aspects of scientific data have also been investigated within the surveys [BH07, KH13]. The surveys [WB94, Cha06, Mun14] focus on the various aspects of visual encoding techniques for multivariate data. These papers provide a valuable summary of existing techniques and inspiring discussions of future directions in their respective domains. However, few surveys in the past decade have aimed at providing a general, coherent, and unified picture that addresses the full spectrum of techniques for visualizing high-dimensional data.

We provide a comprehensive survey of advances in high-dimensional data visualization over the past 15 years, with the following objectives: providing actionable guidance for data practitioners to navigate through a modular view of the recent advances, allowing the creation of new visualizations along the enriched information visualization pipeline, and identifying opportunities for future visualization research.

Our contributions are as follows. First, we propose a categorization of recent advances based on the information visualization (InfoVis) pipeline [CMS99] enriched with customized action-driven classifications (Figure 2, Section 2). We further assess the amount of interplay between user interactions and pipeline-based categorization and put user interactions into a measurable context (Table 1, Section 6). Second, we highlight key contributions of each advancement (Sections 3, 4, and 5). In particular, we provide an extensive survey of visualization techniques derived from topological data analysis (Sections 3.5 and 4.4), a new area of study that provides a multi-scale and coordinate-free summary of high-dimensional data [Car09]. Furthermore, we connect advances in high-dimensional data visualization with volume rendering and machine learning (Section 7). Finally, we reflect on our categorization with respect to actionable tasks, and identify emerging future directions in subspace analysis, model manipulation, uncertainty quantification, and topological data analysis (Section 8).

Figure 1: Interactive survey website for paper navigation.

2 Survey Method and Categorization

We conduct a thorough literature review based on relevant works from major visualization venues, namely VisWeek, EuroVis, PacificVis, and the journal IEEE Transactions on Visualization and Computer Graphics (TVCG), from the period between 2000 and 2014. To ensure the survey covers the state of the art, we further selectively search through references within the initial set of papers. Beyond the visualization field, we also dedicate special attention to the exploratory data analysis techniques in the statistics community. Through such a rigorous search process, we have identified more than 200 papers that focus on a wide spectrum of techniques for high-dimensional data visualization. To help organize the large quantity of papers, we have produced an interactive survey website (www.sci.utah.edu/~shusenl/highDimSurvey/website, based on the SurVis [Bec14] framework; a screenshot is shown in Figure 1) that allows readers to interactively select and filter papers through various tags. However, due to space limitations, only a subset of the complete list of references (available through the survey website) is mentioned in the paper.

As illustrated in Figure 2, we base our main categorization on the three transformation steps of the information visualization pipeline [CMS99] (and its minor variation in [BTK11]), namely, data transformation, visual mapping, and view transformation. Each category is enriched with novel, customized subcategories. Data transformation (Section 3) corresponds to the analysis-centric methods such as dimension reduction, regression, subspace clustering, feature extraction, topological analysis, data sampling, and abstraction. Visual mapping (Section 4), the key for most visual encoding tasks, focuses on organizing the information from the data transformation stage for visual representation. This category includes visual encodings based on axes (e.g., scatterplots and parallel coordinate plots), glyphs, pixels, and hierarchical representations, together with animation and perception. View transformation (Section 5) corresponds to methods focusing on screen space and rendering, including illustrative rendering for various visual structures, as well as screen space measures for reducing clutter or artifacts and highlighting important features.

Such a design allows us to easily classify the core contribution of vastly different methods that operate on entirely different objects and, at the same time, to reveal their interconnections through the linked pipeline. In addition, the pipeline-based categorization provides the reader with a modular view of the recent advances, allowing new systems to be configured based on possibilities provided by the reviewed methods.

User interactivity is an integral part of each processing step of the pipeline, as illustrated in Figure 2. Based on the amount of user interaction, we can classify all high-dimensional data visualization methods into three categories: computation-centric, interactive exploration, and model manipulation. The distinction between interactive exploration and model manipulation is made to emphasize a particular manipulation paradigm, where the underlying data model is modified based on interaction to reflect user intention. A summary of the interplay between processing steps and interactions is given in Table 1, where user interactions are put into a measurable context. The corresponding details are discussed in Section 6.

3 Data Transformation

We start by describing different types of high-dimensional datasets. We then give an in-depth discussion of the action-driven subcategories centered around typical analysis techniques during data transformation, namely, dimension reduction, clustering (in particular, subspace clustering), and regression analysis. We focus especially on their usage in visualization methods. In addition, we pay special attention to topological data analysis, which is a promising emerging field.

3.1 High-Dimensional Data

We provide an overview of the different aspects of high-dimensional datasets, to define the scope of our discussion and highlight distinct properties of these datasets. Our discussions of different data types are inspired by the book by Munzner [Mun14].

Data Types. In our survey, we limit our exposition to table-based data, and exclude (potentially high-dimensional) graph/network data from the discussion. A high-dimensional dataset is commonly modeled as a point cloud embedded in a high-dimensional space, with the values of attributes corresponding to the coordinates of the points. Based on the underlying model of the data and the analysis and visualization


[Figure 2 diagram. Pipeline: Source Data → Data Transformation → Transformed Data → Visual Mapping → Visual Structure → View Transformation → Views → User. User Interactions apply at every transformation step. Subcategories and representative methods per step:]

Data Transformation
  Dimension Reduction: Linear Projection [KC03], Non-linear DR [WM04], Control Points Projection [DST04], Distance Metric [LMZ∗14], Precision Measures [LV09]
  Subspace Clustering: Dimension Space Exploration [TFH11, YRWG13], Subset of Dimension [TMF∗12], Non-Axis-Parallel Subspace [Vid11, AWD12]
  Regression Analysis: Optimization & Design Steering [BPFG11, DW13], Structural Summaries [PBK10, GBPW10]
  Topological Data Analysis: Morse-Smale Complex [GBPW10, CL11], Reeb Graph & Contour Tree [PSBM07], Topological Features [WSPVJ11]

Visual Mapping
  Axis Based: Scatterplot Matrix [WAG06], Parallel Coordinate [JJ09], Radial Layout [LT13], Hybrid Construction [YGX∗09, CvW11]
  Glyphs: Per-Element Glyphs [CCM10, GWRR11], [CCM13], Multi-Object Glyphs [War08, CGSQ11]
  Pixel-Oriented: Jigsaw Map, Pixel Bar Charts [KHL01], Value & Relation Display [YHW∗07]
  Hierarchy Based: Dimension Hierarchy [WPWR03], Topology-based Hierarchy [HW10, OHWS13], Others [ERHH11]
  Animation: GGobi [SLBC03], TripAdvisorND [NM13], Rolling the Dice [EDF08]
  Evaluation: Scatterplot Guideline [SMT13], PCPs Effectiveness [HVW10], Animation [HR07]

View Transformation
  Illustrative Rendering: Illustrative PCP [MM08], Illuminated 3D Scatterplot [SW09], PCP Density-Based Transfer Function [JLJC05]
  Continuous Visual Representation: Continuous Scatterplot [BW08], Continuous Parallel Coordinate [HW09, LT11], Splatterplots [MG13]
  Accurate Color Blending: Hue-Preserving Blending [KGZ∗12], Weaving vs. Blending [HSKIH07]
  Image Space Metrics: Clutter Reduction [AdOL04, JC08], Pargnostics [DK10], Pixnostic [SSK06]

Figure 2: Categorization based on transformation steps within the information visualization pipeline, with customized action-driven subcategories.
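
The three transformation steps that organize Figure 2 can be illustrated with a toy example. The following is only a minimal sketch and is not taken from any surveyed system: the function names, the use of an axis-aligned projection as the data transformation, and the character grid standing in for the screen are all assumptions made for illustration.

```python
# Toy InfoVis pipeline: data transformation -> visual mapping -> view transformation.
# All names are hypothetical; none of them come from the surveyed papers.

def data_transformation(rows, keep=(0, 1)):
    """Reduce each n-D row to the dimensions in `keep` (an axis-aligned projection)."""
    return [tuple(row[k] for k in keep) for row in rows]

def visual_mapping(points):
    """Map 2D values to normalized [0, 1] x [0, 1] scatterplot positions."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    def scale(v, lo, hi):
        # A constant dimension is placed in the middle of its axis.
        return 0.5 if hi == lo else (v - lo) / (hi - lo)

    return [(scale(x, min(xs), max(xs)), scale(y, min(ys), max(ys))) for x, y in points]

def view_transformation(marks, width=8, height=4):
    """Rasterize normalized marks into a character grid (screen space)."""
    grid = [["." for _ in range(width)] for _ in range(height)]
    for x, y in marks:
        col = min(int(x * width), width - 1)
        row = height - 1 - min(int(y * height), height - 1)  # y axis points up
        grid[row][col] = "*"
    return ["".join(line) for line in grid]

table = [(0.0, 1.0, 7.0), (2.0, 3.0, 5.0), (4.0, 5.0, 9.0)]
transformed = data_transformation(table)
marks = visual_mapping(transformed)
view = view_transformation(marks)
print("\n".join(view))
```

Because each stage only consumes the output of the previous one, any stage can be swapped independently, for example replacing the axis-aligned projection with a dimension reduction method, which mirrors the modular view of the pipeline described in Section 2.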

goals, the attributes consist of input parameters and output observations, and the data could be modeled as a scalar or vector-valued function (where the function values are based on the output observations) on the point cloud defined by the input parameters. Topological data analysis (Section 3.5) applies to both point cloud data and functions on point cloud data (e.g., [GBPW10, SMC07]), while regression analysis (Section 3.4) typically applies to the latter (e.g., [PBK10]).

Attribute Types. The attribute type (e.g., nominal vs. numerical) can greatly impact the visualization method. In many fields and applications, the values of the attributes are nominal in nature. However, most commonly available high-dimensional data visualization techniques, such as scatterplots or parallel coordinate plots, are designed to handle numerical values only. When these methods are used to visualize nominal data, overlapping information and stacked visual elements usually result. One way to address the challenge is mapping the nominal values to numerical values [RRB∗04] (e.g., as implemented in the XmdvTool [War94]). Through such a mapping, each axis is used more efficiently and the spacing becomes more meaningful. In the Parallel Sets work [BKH05], the authors introduce a new visual representation that adapts the notion of parallel coordinates but replaces the data points with a frequency-based visual representation that is designed for nominal data. The Conjunctive Visual Form [Wea09] allows users to rapidly query nominal values with certain conjunctive relationships through simple interactions. The GPLOM (Generalized Plot Matrix) [IML13] extends the Scatterplot Matrix (SPLOM) to handle nominal data.

Spatiotemporal Data. Some recent advances focus on developing visual encodings that capture the spatiotemporal aspects of high-dimensional data. Visual analysis of financial time series data is explored in the work by Ziegler et al. [ZJGK10]. The work presented by Tam et al. [TFA∗11] studies facial dynamics utilizing the analysis of time-series data in parameter space. Datasets with spatial information such as multivariate volumes [BDSW13] or multi-spectral images [LAK∗11] are very common in scientific visualization, and numerous methods have been introduced within the scientific visualization domain; see [BH07, KH13] for comprehensive surveys on these topics. We discuss the intrinsic interconnections between these two areas in Section 7.

3.2 Dimension Reduction

Dimension reduction techniques are key components for many visualization tasks. Existing work either extends the state-of-the-art techniques, or improves upon their capabilities with additional visual aids.

Linear Projection. Linear projection uses a linear transformation to project the data from a high-dimensional space to a low-dimensional one. It includes many classical methods, such as principal component analysis (PCA), multidimensional scaling (MDS), linear discriminant analysis (LDA), and various factor analysis methods.

PCA [Jol05] is designed to find an orthogonal linear transformation that maximizes the variance of the resulting embedding. PCA can be calculated by an eigendecomposition of the data's covariance matrix or a singular value decomposition of the data matrix. The interactive PCA (iPCA) [JZF∗09] introduces a system that visualizes the results of PCA using multiple coordinated views. The system allows synchronized exploration and manipulations among the original data space, the eigenspace, and the projected space, which aids the user in understanding both the PCA


process and the dataset. When visualizing labeled data, class separation is usually desired. Methods such as LDA aim to provide a linear projection that maximizes the class separation. The recent work by Koren et al. [KC03] generalizes PCA and LDA by providing a family of flexible linear projections to cope with different kinds of data.

Non-linear Dimension Reduction. There are two distinct groups of techniques in non-linear dimension reduction, under either the metric or non-metric setting. The graph-based techniques are designed to handle metric inputs, such as Isomap [TDSL00], Locally Linear Embedding (LLE) [RS00], and Laplacian Eigenmap (LE) [BN03], where a neighborhood graph is used to capture local distance proximities and to build a data-driven model of the space.

The other group of techniques addresses non-metric problems, commonly referred to as non-metric MDS or stress-based MDS, by capturing non-metric dissimilarities. The fundamental idea behind non-metric MDS is to minimize the mapping error directly through iterative optimizations. The well-known Shepard-Kruskal algorithm [Kru64] begins by finding a monotonic transformation that maps the non-metric dissimilarities to the metric distances, which preserves the rank-order of dissimilarity. Then, the resulting embedding is iteratively improved based on stress. The progressive and iterative nature of these methods has been exploited recently by Williams et al. [WM04], where the user is presented with a coarse approximation from partial data. The refinement is on-demand based on user inputs.

Control Points Based Projection. For handling large and complex datasets, the traditional linear or non-linear dimension reductions are limited by their computational efficiency. Some recent developments, e.g., [DST04, PNML08, PEP∗11a, JPC∗11, PSN10], utilize a two-phase approach, where the control points (anchor points) are projected first, followed by the projection of the rest of the points based on the control points' locations and local feature preservation. Such designs lead to a much more scalable system. Furthermore, the control points allow the user to easily manipulate and modify the outcome of the dimension reduction computation to achieve desirable results.

Distance Metric. For a given dimension reduction algorithm, a suitable distance metric is essential for the computation outcome as it is more likely to reveal important structural information. Brown et al. [BLBC12] introduce the distance function learning concept, where a new distance metric is calculated from the manipulation of point layouts by an expert user. In [Gle13], the author attempts to associate a linear basis with a certain meaningful concept constructed based on user-defined examples. Machine learning techniques can then be employed to find a set of simple linear bases that achieve an accurate projection according to the prior examples. The structure-based analysis method [LMZ∗14] introduces a data-driven distance metric inspired by the perceptual processes of identifying distance relationships in parallel coordinates using polylines.

Dimension Reduction Precision Measure. One of the fundamental challenges in dimension reduction is assessing and measuring the quality of the resulting embeddings. Lee et al. introduce the ranking-based metric [LV09] that assesses the ranking discrepancy before and after applying dimension reduction. This technique is then generalized [MLGH13] and used for visualizing dimension reduction quality. A projection precision measure is introduced in [SvLB10], where a local precision score is calculated for each point with a certain neighborhood size. In the distortion-guided exploration work [LWBP14], several distortion measures are proposed for different dimension reduction techniques, where these measures aid in understanding the cause of highly distorted areas during interactive manipulation and exploration. For MDS, the stress can be used as a precision measure. Seifert et al. [SSK10] further develop this idea by incorporating the analysis and visualization for better understanding of the localized stress phenomena.

3.3 Subspace Clustering

Clustering is one of the most widely used data-driven analysis methods. Instead of providing an in-depth discussion on all clustering techniques, in this survey, we focus on subspace clustering techniques, which have a great impact on understanding and visualizing high-dimensional datasets. Dimension reduction aims to compute one single embedding that best describes the structure of the data. However, this could become ineffective due to the increasing complexity of the data. Alternatively, one could perform subspace clustering, where multiple embeddings can be generated through clustering either the dimensions or the data points, for capturing various aspects of the data.

Dimension Space Exploration. Guided by the user, dimension space exploration methods interactively group relevant dimensions into subsets. Such exploration allows us to better understand their relationships and to identify shared patterns among the dimensions. Turkay et al. introduce a dual visual analysis model [TFH11] where both the dimension embedding and point embedding can be explored simultaneously. Their later improvement [TLLH12] allows for the grouping of a collection of dimensions as a factor, which permits effective exploration of the heterogeneous relationships among them. The Projection Matrix/Tree work [YRWG13] extends a similar concept to allow a recursive exploration of both the dimension space and data space. Several visual encoding methods also rely on the concept of dimension space exploration. These methods are discussed in Section 4.3.

Clustering Subsets of Dimensions. Compared to dimension space exploration, where the user is responsible for identifying patterns and relationships, subspace clustering/finding methods automatically group related dimensions into clusters. Subspace clustering filters out the interferences introduced by irrelevant dimensions, allowing


lower-dimensional structures to be discovered. These methods, such as ENCLUS [CFZ99], originate from the data mining and knowledge discovery community. They introduce some very interesting exploration strategies for high-dimensional datasets, and can be particularly effective when the dimensions are not tightly coupled. The TripAdvisorND [NM13] system employs a sightseeing metaphor for high-dimensional space navigation and exploration. It utilizes subspace clustering to identify the sights for the exploration. The subspace search and visualization work [TMF∗12] utilizes the SURFING (subspaces relevant for clustering) [BPR∗04] algorithm to search the high-dimensional space and automatically identifies a large candidate set of interesting subspaces. For the work presented by Ferdosi et al. [FBT∗10], morphological operators are applied on the density field generated from the (3D) PCA projection of the high-dimensional data for identifying subspace clusters.

Non-Axis-Aligned Subspaces. Instead of clustering the dimensions, which essentially creates axis-aligned linear subspaces, identifying non-axis-aligned subspaces is a more flexible alternative. Projection Pursuit [FT74] is one of the earliest works aimed at automatically identifying the interesting non-axis-aligned subspaces. Projections are considered to be more interesting when they deviate more from a normal distribution. Some advances have been made in the machine learning community to perform non-axis-aligned subspace clustering [Vid11]. Instead of clustering the dimensions, the points are grouped together for sharing similar linear subspaces. In particular, we assume the complex structure of the data can be approximated by a mixture of linear subspaces (of varying dimensions), and each of the linear subspaces corresponds to a set of points where their relationships can be approximately captured by the same linear subspace.

For very high-dimensional data, the subspace finding algorithms typically have a relatively high computational complexity. By utilizing random projection, Anand et al. [AWD12] introduce an efficient subspace finding algorithm for data with thousands of dimensions. It generates a set of candidate subspaces through random projections and presents the top-scoring subspaces in an exploration tool.

3.4 Regression Analysis

Regression analysis in high dimension is an extensive and active field of research in its own right. We make no attempt to survey the entire area, but rather focus on the interplay between visualization and regression analysis.

Optimization and Design Steering. Pure optimization problems often are not the focus in the visualization community. What is more common are design steering methods where, in addition to a multivariate input space, the user has one or several output or response variables they want to explore (e.g., [BPFG11, TWSM∗11]), where the results require a qualitative examination, or are used to inform decisions.

HyperMoVal [PBK10] is a software system used for validating regression models against actual data. It uses support vector regression (SVR) [SS04b] to fit a model to high-dimensional data, highlights discrepancies between the data and the model, and computes sensitivity information on the model. The software allows for adding more model parameters to refine their regression to an acceptable level of accuracy. Berger et al. [BPFG11] utilize two different types of regression models (SVR and nearest neighbor regression) to analyze a trade-off study in performance car engine design. Utilizing the predictive power of the regression, they are able to provide a guided navigation of the high-dimensional space centered around a user-selected focal point. The user adjusts the focal point through multiple linked views, and sensitivity and uncertainty information are encoded around the focal point.

Tuner [TWSM∗11] begins as an automated adaptive sampling algorithm where a sparse sampling of the parameter space is refined by building a Gaussian Process Model (GPM) (see [RW06] for a good overview) and using adaptive sampling to focus additional samples in areas with either a high goodness of fit or high uncertainty. The software then relies heavily on user interaction to study the sensitivities with respect to each input parameter and steers the computation toward the user-defined optimal solution. Demir et al. [DW13] improve the effectiveness of GPMs by utilizing a block-wise matrix inversion scheme that can be implemented on the GPU, greatly increasing efficiency. In addition, their method involves progressive refinement of the GPM and can be halted at any point, if the improvement becomes insignificant.

Most of these methods convey sensitivity information through user exploration of the input space. In Section 4.2, explicit visual encodings for understanding sensitivity information are also discussed.

Structural Summaries. Researchers have also used regression to summarize data, as in the works by Reddy et al. [RPH08] and Gerber et al. [GBPW10]. Both methods summarize the structures of the data via skeleton representations. Reddy et al. [RPH08] use a clustering algorithm followed by construction of a minimum spanning tree of the cluster centroids in order to determine possible trends in the data. These trends are then fitted with principal curves [HS89], which go through the medial axis of the data. HDViz [GBPW10], on the other hand, approximates a topological segmentation (for more details, see Section 3.5) and constructs an inverse linear regression for each segment of the data. In both examples, regression is used as a post-processing step of the algorithms in order to present summaries of the extracted subsets of the data.

3.5 Topological Data Analysis

A crucial step in gaining insights from large, complex, high-dimensional data involves feature abstraction, extraction, and evaluation in the spatiotemporal domain for effective exploration and visualization. Topological data analysis (TDA), a new field of study (see [Zom05, BDFF∗08, EH08, EH10, Car09, Ghr08] for seminal works and surveys), has provided efficient and reliable feature-driven analysis and visualization capabilities. Specifically, the construction of topological structures [Ree46, Sma61] from scalar functions on point clouds (e.g., Morse-Smale complexes, contour trees, and Reeb graphs) as "summaries" over data is at the core of such TDA methods. Reeb graphs/contour trees capture very different structural information of a real-valued function compared to the Morse-Smale complexes, as the former are contour-based and the latter are gradient-based (Figure 3). They both provide meaningful abstractions of the high-dimensional data, which reduces the amount of data needed to be processed or stored; and they utilize sophisticated hierarchical representations capturing features at multiple scales, which enables progressive simplifications of features differentiating small and large scale structures in the data.

Morse-Smale Complexes. The Morse-Smale complex (MSC) [EHNP03, EHZ03] describes the topology of a function by clustering the points in the domain into regions of monotonic gradient flow, where each region is associated with a sink-source pair defined by local minima and maxima of the function. The MSC can be represented using a graph where the vertices are critical points and the edges are the boundaries of areas of similar gradient behavior. The simplification of the MSC is obtained by removing pairs of vertices in the graph and updating connectivities among their neighboring vertices, merging nearby clusters by redirecting the gradient flow. MSCs have been shown to be effective in identifying, ordering, and selectively removing features of large-scale data in scientific visualizations (e.g., [BEHP04, GBPH08, GNP∗05]).

HDViz [GBPW10] employs an approximation of the MSC (in high dimensions) to analyze scalar functions on point cloud data. It creates a hierarchical segmentation of the data by clustering points based on their monotonic flow behavior, and designs new visual metaphors based on such a segmentation. Correa et al. [CL11] suggest that by considering a different type of neighborhood structure, we can improve the accuracy in the extracted topology compared to those obtained within HDViz.

Reeb Graphs and Contour Trees. The Reeb graph of a real-valued function describes the connectivity of its level sets. A contour tree is a special case of Reeb graph that arises in simply-connected domains. The Reeb graph stores information regarding the number of components at any function value as well as how these components split and merge as the function value changes. Such an abstraction offers a global summary of the topology of the level sets and enables the development of compact and effective methods for modeling and visualizing scientific data, especially in high dimensions (e.g., [NLC11, SMC07]).

Efficient algorithms for computing the contour tree [CSA03] and Reeb graph [PSBM07] in arbitrary dimensions have been developed. A generalization of the contour tree, called the joint contour net (JCN), has been introduced by Carr et al. [CD14, DCK∗12]; it allows for the analysis of multi-field data.

Figure 3: Contour- and gradient-based topological structure of a 2D scalar function. [Panels: 2D scalar function; Reeb graph/contour tree/merge tree; Morse-Smale complex.]

Other Topological Features. Ghrist [Ghr08] and Carlsson [Car09] both offer several applications of TDA and in particular highlight the topological theory used in a study of statistics of natural images [LPM03]. Mapper [SMC07] decomposes data into a simplicial complex resembling a generalized Reeb graph, and visualizes the data using a graph structure with varying node sizes. The software is shown to extract salient features in a study of diabetes by correctly classifying normal patients and patients with two causes of diabetes. Wang et al. [WSPVJ11] utilize TDA techniques developed by de Silva et al. [dSMVJ09] to recover important structures in high-dimensional data containing non-trivial topology. Specifically, they are interested in high-dimensional branching and circular structures. The circle-valued coordinate functions are constructed to represent such features. Subsequently, they perform dimension reduction on the data while ensuring such structures are visually preserved.

4 Visual Mapping

Visual mapping plays an essential role in converting the analysis result or the original dataset into visual structures based on various visual encodings. Here, we divide the approaches based on their structural patterns, compositions, and movements (i.e., animations). In addition, the methods that evaluate the effectiveness of visual encoding are also discussed.

4.1 Axis-Based Methods

Axis-based methods refer to visual mappings where element relationships are expressed through axes representing the dimensions/variables. These methods include some of the most well-known visual mapping approaches, such as scatterplot matrices (SPLOMs) and parallel coordinate plots (PCPs).
Scatterplot Matrix. A scatterplot matrix, or SPLOM, is a

c The Eurographics Association 2015.


S. Liu, D. Maljovec, B. Wang, P.-T Bremer & V. Pascucci / Visualizing High-Dimensional Data:Advances in the Past Decade

collection of bivariate scatterplots that allows users to view multiple bivariate relationships simultaneously. One of the primary drawbacks of SPLOMs is scalability. The number of bivariate scatterplots increases quadratically with respect to the dataset’s dimensionality. Numerous studies have introduced methods for improving the scalability of SPLOMs by automatically or semi-automatically identifying more interesting plots.

Scagnostics are a set of measures designed for identifying interesting plots originally introduced by John W. Tukey. The recent works of Wilkinson et al. [WAG05, WAG06] extend the concept to include nine measures capturing properties such as outliers, shape, trend, and density. In addition, they improve the computational efficiency by using graph-theoretic measures. Scagnostics is also extended to handle time series data [DAW13]. Guo [Guo03] introduces an interactive feature selection method for finding interesting plots by evaluating the maximum conditional entropy of all possible axis-parallel scatterplots. The rank by feature framework [SS04a, SS06] allows users to choose a ranking criterion, such as histogram distribution properties and correlation coefficients between axes, for scatterplots in SPLOMs.

Data class labels can play an important role in identifying interesting plots and selecting a meaningful ranking order. Sips et al. utilize class consistency [SNLH09] as a quality metric for 2D scatterplots. The class consistency measure is defined by the distance to the class’s center or entropies of the spatial distributions of classes. Tatu et al. [TAE∗09] introduce different metrics for ranking the “interestingness” of scatterplots and PCPs for both classified and unclassified datasets. For data with labels, a class density measure and a histogram density measure are adopted as ranking functions for the scatterplots.

The ranking order provides only an indirect way to assess the scatterplots. Lehmann et al. [LAE∗12] introduce a system for visually exploring all the plots as a whole. By reordering the rows and columns in the SPLOMs, this method groups relevant plots in the spatial vicinity of one another. In addition, an abstraction can be obtained from the reordered SPLOM to provide a global view.

Parallel Coordinates. Compared to a SPLOM, where only bivariate relationships can be directly expressed, the Parallel Coordinate Plot (PCP) [Ins09, ID91] allows patterns that highlight multivariate relations to be revealed by showing all the axes at once (typically, in a vertical layout). However, due to the linear ordering of the PCP axes, for a given n-dimensional dataset, there are n! permutations of the ordering of the axes. Each of the orderings highlights certain aspects of the high-dimensional structure. Therefore, one of the significant challenges when dealing with PCPs is determining an appropriate order of the axes. In addition, as the number of points increases, the line density in the PCP increases dramatically, which can lead to overplotting and visual clutter, thus hindering the discovery of patterns.

A few methods have proposed metrics for ordering the axes automatically. Tatu et al. [TAE∗09] introduce PCP ranking methods for both classified and unclassified datasets. For unlabeled data, the Hough space measure is used, and for labeled data, a similarity measure and overlap measures are adopted. Ferdosi et al. introduce a dimension ordering method [FR11] that is applicable for both PCPs and SPLOMs utilizing the subspace analysis method from their earlier work [FBT∗10] discussed in Section 3.3. Johansson and Johansson [JJ09] propose an interactive system adopting a weighted combination of quality metrics for dimension selection and automatic ordering of the axes to enhance visual patterns such as clustering and correlation. Hurley et al. utilize Eulerian tours and Hamiltonian decompositions of complete graphs, which represent the relationship between the dimensions, in their recent work [HO10] to address the axis ordering challenge.

Clutter reduction is another important aspect in PCPs, especially for large point counts. Peng et al. [PWR04] were able to reduce clutter for both SPLOMs and PCPs without altering the information content simply by reordering the dimensions. A focus+context visualization scheme can also be used for reducing the clutter and highlighting the essential features in the PCP [NH06]. In this context, the overview captures both the outliers and the trends in the dataset. The outliers are indicated by single lines, and the trends that capture the overall relationship between axes are approximated by polygon strips. The selected data items are emphasized through visual highlighting. In addition, several of the clutter reduction methods employing screen space measures are discussed in detail in Section 5.4.

Finally, many visual encoding improvements exist for PCPs. Progressive PCPs [RZH12] demonstrate the power of a progressive refinement scheme for enhancing the ability of PCPs to handle large datasets. In the work of Dang et al. [DWA10], density is expressed by stacking overlapping elements. For the PCP case, a 3D visualization is presented, where either the edges are stacked as curves or the points on the axes are stacked vertically as dots.

Radial Layout. The star coordinate plot [Kan00], also referred to as a bi-plot [HGM∗97], is a generalization of the axis-aligned bivariate scatterplot. The star coordinate axes represent the unit basis vectors of an affine projection. The user is allowed to modify the orientation and the length of the axes as a way of altering the projection. However, due to the unbounded manipulation, star coordinates may produce affine projections where substantial distortion occurs. Lehmann et al. extend the star coordinate concept with an orthographic constraint [LT13] to restrict the generated projection to be orthographic, which better preserves the structure of the original dataset.

Similar to the star coordinates, Radviz [HGM∗97] adopts a circular pattern. The difference is that Radviz does not define an explicit projection matrix. In Radviz, n dimensional


anchors are placed along the perimeter of a circle, each representing one of the dimensions of an n-dimensional dataset. A spring model is constructed for each point, where one end of a spring is attached to a dimensional anchor and the other is attached to the data point. The point is then displayed where the sum of the spring forces equals zero. Albuquerque et al. [AEL∗10] devise a RadViz quality measure allowing automatic optimization of the dimensional anchor layout.

DataMeadow [EST07] introduces a radial visual encoding named DataRoses, which is represented as a PCP laid out radially as opposed to linearly. Lastly, PolarEyez [JN02] introduces a focus+context visualization where the high-dimensional function parameter space is encoded in a radial fashion around a user-controlled focal point. Data near the focal point is represented with more precision, and the focal point can be altered to focus on different parts of the data.

Figure 4: Scattering points in parallel coordinates by Yuan et al. [YGX∗09].

Hybrid Construction. The axis-based methods can also be combined to create new visualizations. The scattering points in parallel coordinates work [YGX∗09] (Figure 4) embeds an MDS plot between a pair of PCP axes. The flexible linked axes work [CvW11] is a generalization of the PCP and the SPLOM. The tool gives the user the ability to create new configurations by drawing and linking axes in either scatterplot or PCP style. Proposed by Fanea et al., the integration of parallel coordinates and star glyphs [FCI05] provides a way to “unfold” the overlapped values in the PCP axis in 3D space. In this work [FCI05], each axis in the PCP is replaced with a star glyph that represents the values across all points, and then each high-dimensional point is described as a set of line segments in 3D connecting the individual values in the star glyphs.

In addition, there are a number of visual representations that derive from the well-known visual mappings. Angular histograms [GPL∗11] introduce a novel visual encoding that improves the scalability of PCPs by overcoming the overplotting issue. The tiled PCP [CMR07] adopts a row-column 2D configuration instead of the 1D linear layout of the traditional PCP for simultaneous visualization of multiple time steps and variables.

4.2 Glyphs

Chernoff faces [Che73] are one of the first attempts to map a high-dimensional data point into a single glyph. The system works by mapping different facial features to separate dimensions. In a few recent works, glyphs have been utilized to provide statistical and sensitivity information in order to present trends in the data. By utilizing local linear regression to compute partial derivatives around sampled data points and representing the information in terms of glyph shape, sensitivity information can be visually encoded into scatterplots [CCM09, CCM10, GWRR11, CCM13].

Correa et al. [CCM09] aimed at incorporating uncertainty information into PCA projections and k-means clustering and accomplished this goal by augmenting scatterplots with tornado plots. Together these glyphs encode uncertainty and partial derivative information. The idea of mapping sensitivity information to a line segment through each data point has been extended in their later work [CCM10] with the introduction of the flow-based scatterplot (FBS) that highlights functional relationships between inputs and outputs. The works by Guo et al. [GWRR11] and Chan et al. [CCM13] attempt to encode more than a single partial derivative in their scatterplots by experimenting with different glyph shapes such as star plots, among others. [GWRR11] also uses a bar chart similar to the tornado plot used in [CCM09], and [CCM13] provides two other interpretations. The first is a generalization of the FBS called the generalized sensitivity scatterplot (GSS). By using orthogonal regression, GSSs can represent the partial derivative of any variable with respect to any other variable. The other is a fan glyph that works similarly to the star glyph, allowing for viewing multiple partial derivatives, but rather than displaying magnitude as in the star glyph, the fan glyph highlights the direction of each partial derivative, since all line segments are normalized in length.

The methods described above all deal with encoding extra information per data point into glyphs, but the DICON system [CGSQ11] attempts to show the trend of data within a collection of data points by visually encoding statistical information about the set of points being represented. DICON uses dynamic icons based on treemap visualization to encode clusters of data into separate icons, and allows the user to interactively merge, split, filter, regroup, and highlight clusters or data within clusters. Due to the interactive nature, the authors have developed a stabilized Voronoi layout that allows data within the treemap to maintain spatial coherence as the user edits the clusters. They further encode skew and kurtosis into the shape of the icon before applying the Voronoi algorithm, thus allowing for statistical details to be presented.

Finally, Ward [War08] gives a thorough, practical treatment of generating and organizing effective glyphs for multivariate data, paying particular attention to the common pitfalls involving the use of glyphs.

4.3 Pixel-Oriented Approaches

In an effort to encode the maximal amount of information, several works have targeted dense pixel displays. Researchers have focused on encoding data values as individual pixels and creating separate displays, or subwindows, for each dimension.

Some of the earliest works in this area date back to the mid-1990s [KK94, AKpK96]. VisDB [KK94] visualizes database queries by creating a 2D image for each dimension involved in the query and mapping individual values of a dimension to pixels. The mapped data is sorted and colored by relevance such that the data most related to the query appears in the center of the image, and the data spirals outward as it loses relevance to the query. Circle segments [AKpK96] arrange multidimensional data in a radial fashion with equal-size sectors being carved out for each dimension.

The pixel concept can be applied to bar charts to create pixel bar charts [KHL∗01]. Pixel bar charts first separate data into separate bars based on one dimension or attribute, and they can also split the data along the orthogonal direction using another dimension, although most results are reported using only one direction for splitting data. Once split, the data points are sorted along the horizontal axis within the bars using one dimension and ordered along the vertical axis using another dimension. Wattenberg introduces the jigsaw map [Wat05], which again maps data points to pixels and uses discrete space-filling curves in order to fill a 2D plane in a more sensible fashion than a comparative treemap layout.

The Value and Relation (VaR) displays [YPS∗04, YHW∗07] combine the recursive pattern displays [KKA95] with MDS in order to lay out the separate subwindows such that similar dimensions are placed closer together. A later iteration [YHW∗07] enhances the work by providing more robust visualizations including jigsaw maps, scatterplot glyphs, and a novel concept known as the Rainfall metaphor geared at establishing the relationship of all dimensions to a single dimension of interest.

4.4 Hierarchy-Based Approaches

For visualizing high-dimensional datasets, hierarchical visual representations are used to capture dimensional relationships, represent contour tree structure, and provide new visual encodings for representing high-dimensional structures.

Dimension Hierarchies. Large numbers of dimensions hinder our ability to navigate the data space and cause scalability issues for visual mapping. A hierarchical organization of dimensions explicitly reveals the dimension relationships, helping to alleviate the complexity of the dataset. Yang et al. propose an interactive hierarchical dimension ordering, spacing, and filtering approach [WPWR03] based on dimension similarity. The dimension hierarchy is represented and navigated by a multiple ring structure (InterRing [YWR02]), where the innermost ring represents the coarsest level in the hierarchy.

Topology-Based Hierarchies. In Section 3.5, we have discussed topological structures, which can provide a ranking of features with the help of persistence simplification and thus be treated as a hierarchy.

Various visual metaphors have been designed for contour trees [PCMS09, WBP12]. In particular, variations of topological landscapes have been proposed [BMW∗12, DBW∗12, HW10, OHJS10, OHJ∗11, WBP07]. These visual metaphors have, or potentially have, capabilities for the visualization of high-dimensional datasets. In particular, Weber et al. [WBP07] have presented such a metaphor for visually mapping the contour tree of high-dimensional functions to a 2D terrain where the relative size, volume, and nesting of the topological features are preserved. Harvey and Wang [HW10] have extended this work by computing all possible planar landscapes, and they are able to preserve exactly the volumes of the high-dimensional features in the areas of the terrain. In addition, the works of Oesterling et al. [OHJS10, OHJ∗11] have used this same metaphor to visualize a related structure, the join tree. They use a novel high-dimensional interpolation scheme in order to estimate the density from the raw data points, and visually map the density as points on top of their generated terrains.

Oesterling et al. [OHWS13] continued this line of work by creating a linked view software system that incorporates user interaction into the analysis by allowing users to brush and link with PCPs and PCA projections of the data. In addition, they have presented a new method of sorting the features based on either persistence, cluster size, or cluster stability, thus adjusting the placement of features in the topological landscape.

Other Hierarchical Structures. In the structure-based brushes work [FWR00], a data hierarchy is constructed to be visualized by both a PCP and a treemap [Shn92], allowing users to navigate among different levels-of-detail and select the feature(s) of interest. The structure decomposition tree [ERHH11] presents a novel technique that embeds a cluster hierarchy in a dimensional anchor-based visualization using a weighted linear dimension reduction technique. It provides a detail plus overview structural representation, and conveys coordinate value information in the same construction. The system supports user-guided pruning, optimization of the decision tree, and encoding the tree structure in an explorable visual hierarchy. Kreuseler et al. present a novel visualization technique [KS02] for visualizing complex hierarchical graphs in a focus+context manner for visual data mining tasks.

4.5 Animation

Many techniques for visualizing high-dimensional data utilize animated transitions to enhance the perception of point and structure correspondences among multiple relevant plots.

The GGobi system [SLBC03] provides a mechanism for calculating a continuous linear projection transition between any pair of linear projections based on the principal angles between them. In the Rolling the Dice work [EDF08], a transition between any pair of scatterplots in a SPLOM is made possible by connecting a series of 3D transitions between scatterplots that share an axis. RnavGraph [WO11] constructs a graph connecting a number of interesting scatterplots. A smooth animation is generated between all scatterplots that are connected by an edge. The TripAdvisorND [NM13] system allows users to explore the neighborhood of a subspace by tilting the projection plane using a polygonal touchpad interface.

4.6 Perception Evaluation

The design goal of visual mapping and encoding is to directly convey the information to the user through visual perception. The evaluation of this mapping is vitally important in determining the effectiveness of the overall visualization.

Sedlmair et al. have carried out an extensive investigation of the effectiveness of visual encoding choices [SMT13], including 2D scatterplot, interactive 3D scatterplot, and SPLOMs. Their findings reveal that the 2D scatterplot is often decent, and certain dimension reduction techniques provide a good alternative. In addition, SPLOMs sometimes add additional value, and the interactive 3D scatterplot rarely helps and often hurts the perception of class separation. The efficacy of several PCP variants for cluster identification has been studied in [HVW10]. The comparison is performed among nine PCP variations based on existing methods and combinations of them. The evaluation reveals that, aside from the scatterplots embedded into parallel coordinates [YGX∗09], a number of seemingly valid improvements do not result in significant performance gains for cluster identification tasks. Heer et al. investigate the effectiveness of animated transitions between statistical graphics [HR07] such as bar charts, pie charts, and scatterplots. Their results reveal that animated transitions, when used appropriately, can significantly improve graphical perception.

5 View Transformation

View transformations dictate what we ultimately see on the screen. As pointed out by Bertini et al. [BTK11], the view transformation can also be described as the rendering process that generates images in the screen space.

5.1 Illustrative Rendering

Illustrative rendering describes methods that focus on achieving a specific visual style by applying custom-designed rendering algorithms. The illustrative PCPs work [MM08] provides a set of artistic style rendering techniques for enhancing parallel coordinate visualization. Some of the rendering techniques include spline-based edge bundling, opacity-based hints to convey cluster density, and shading effects to illustrate local line density. Illuminated scatterplots [SW09] (Figure 5) classify points based on the eigenanalysis of the covariance matrix, and give the user the opportunity to see effects such as planarity and linearity when visualizing dense scatterplots. Johansson et al. [JLJC05] reveal structures in PCPs by adopting the transfer function concept commonly used in volume rendering. Based on user input, the transfer function maps the line densities into different opacities to highlight features.

Illustrative rendering techniques are also used for highlighting the focused areas, such as the well-known TableLens approach [RC94] for visualizing large tables. Such a magic lens based approach permits fast exploration of an area of interest without presenting all the details and therefore reduces clutter in the view. MoleView [HTE11], for visualizing scatterplots and graphs, adopts a semantic lens that allows users to focus on the area of interest and keeps the in-focus data unchanged while simplifying or deforming the rest of the data to maintain context. A survey on the distortion-oriented magic lens techniques is presented by Leung and Apperley [LA94].

Figure 5: Illuminated 3D scatterplot by Sanftmann et al. [SW09].

5.2 Continuous Visual Representation

For most high-dimensional visualization techniques, a discrete visual representation is assumed since each element corresponds to a data point. However, due to limitations such as visual clutter and computational cost, many applications prefer a continuous representation.

The work of Bachthaler and Weiskopf [BW08] presents a mathematical model for constructing a continuous scatterplot. The follow-up work [BW09] introduces an adaptive rendering extension for continuous scatterplots that increases the rendering efficiency. This concept is extended to create continuous PCPs [HW09] based on the point and line duality between scatterplots and parallel coordinates. The authors propose a mathematical model that maps density from a continuous scatterplot [BW08] to parallel coordinates. Lehmann et al. introduce a feature detection algorithm designed for continuous PCPs [LT11].

Clutter caused by overlapping in PCPs and scatterplots occludes data distribution and outliers. In the splatterplot work [MG13], the authors introduce a hybrid representation for scatterplots to overcome the overdraw issue when scaling to very large datasets. The proposed abstraction automatically groups dense data points into an abstract contour representation and renders the rest of the area using selected representatives, thus preserving the visual cue for outliers. A


splatting framework for extracting clusters in PCPs is presented by [ZCQ∗09], where a polyline splatter is introduced for cluster detection, and a segment splatter is used for clutter reduction.

5.3 Accurate Color Blending

When rendering semi-transparent objects, color blending methods have a significant impact on the perception of order and structure.

As stated in the Hue-Preserving Color-Blending work [KGZ∗12], the commonly adopted alpha-compositing can result in false colors that may lead to a deceiving visualization. The authors propose a data-driven machine learning model for optimizing and predicting a hue-preserving blending. This model can be applied to high-dimensional visualization techniques such as illustrative PCPs [MM08], where a depth ordering cue is better preserved. In the Weaving vs. Blending work [HSKIH07], the authors investigate the effectiveness of two color mixing schemes: color blending and color weaving (interleaved pattern). The results indicate that color weaving allows users to better infer the value of individual components; however, as the number of components increases, the advantage of color weaving diminishes.

5.4 Image Space Metrics

As discussed in Section 4.1, a number of quality measures have been proposed to analyze the visual structure and automatically identify interesting patterns in PCPs or scatterplots. In this section, we discuss the image space based quality measures that are applied in the screen space.

Artero et al. propose a method [AdOL04] for uncovering clusters and reducing clutter by analyzing the density or frequency of the plot. Image processing based techniques such as grayscale manipulation and thresholding are used to achieve the desired visualization. Johansson et al. introduce a screen space quality measure for clutter reduction [JC08] to address the challenge of very large datasets. The metric is based on distance transformation, and the computation is carried out on the GPU for interactive performance.

Pargnostics [DK10], a portmanteau for parallel coordinates and diagnostics (similar to Scagnostics [WAG05]), is a set of screen space measures for identifying distance patterns among pairs of axes in PCPs. The metrics include line crossings, crossing angles, convergence, and over-plotting. For each of the metrics, the system provides ranked views for pairs of axes, allowing the user to guide exploration and visualization. Pixnostics [SSK06] is an image space based quality metric for ranking interestingness for pixel based (Section 4.3) visualization such as Pixel Bar Charts [KHL∗01].

6 User Interaction

As illustrated in Figure 2, interaction is integrated with each of the processing steps. An alternative sub-categorization for each of the processing steps based on the amount of user interaction is shown in Table 1. In this categorization, each step is further divided into computation-centric approaches, interactive exploration, and model manipulation. In both of the recent surveys [MPG∗14, TJHH14] on the user interaction in visualization applications, the level of integration between the computation and visualization (indicating user interaction) is used for classifying the methods. In many ways, their classifications are aligned with our proposed approach, with the distinction that our discussion is directly connected to the information visualization pipeline.

6.1 Computation-Centric Approaches

Computation-centric approaches require only limited user input such as setting initial parameters. These methods center around algorithms designed for well-defined computational problems such as dimension reduction [RS00, MRC02, KC03, WM04], subspace analysis [CFZ99, TMF∗12, FBT∗10, AWD12], regression analysis [BPFG11], quality metric based ranking [WAG05, TAE∗09], etc. Computation-centric approaches exist at each of the processing steps, but are most concentrated in the data transformation step.

6.2 Interactive Exploration

Interactive methods navigate, query, and filter the existing model interactively for more effective visual communication. In this section, we focus only on representative methods where the interactive exploration mechanism is their key contribution.

In the data transformation step, the interactive exploration scheme allows users to guide progressive dimension reduction, where a partial result is presented upon request [WM04]. In works by Turkay et al. [TFH11, TLLH12] and Yuan et al. [YRWG13], a subset of dimensions is interactively selected and explored in dimension space.

In the visual mapping step, there are a large number of methods focused on interactive exploration and querying the high-dimensional dataset. Such methods play an important role in the Knowledge Discovery in Databases (KDD) process, where the term visual data mining [KK96, Kei02, DOL03] is used to describe these applications. Interactive filtering, zooming, distortion, linking and brushing, or a combination of them have been adopted to include the user as part of the exploring and querying process. Polaris [STH02] is a visual query and analysis system designed for relational databases. This system was later developed into the well-known commercial product Tableau. Stolte et al. introduce an approach for zooming along one or more dimensions for multi-scale exploration by traversing a graph [STH03]. In this system, relational queries can be defined by visual specifications allowing fast incremental development and intuitive understanding of the data. Hao et al. have introduced the Intelligent Visual Analytics Queries [HDK∗07]. Their approach utilizes correlation and similarity measurements for mining data relationships. We believe new research directions could stem from visual


                        | Data Transformation | Visual Mapping | View Transformation
Computation-Centric     | dimension reduction [WM04], subspace finding [TMF∗12], regression analysis | automatic parallel coordinate axis reordering [PWR04], scatterplot ranking [WAG05] | quality metrics in image space [JC08], continuous visual representation [BW08]
Interactive Exploration | interactive, progressive dimension reduction [WM04], dimension space exploration [TFH11] | visual querying and filtering [SVW10], animated transition [SLBC03] | interactive magic lens effects [HTE11], illuminated 3D scatterplot [SW09]
Model Manipulation      | user-guided embedding manipulation [LWBP14], control point based projection [JPC∗11] | distance function learning [BLBC12, Gle13], visual to parameter interaction [HBM∗13] | PCP transfer function [JLJC05], inverse projection extrapolation [PdSABD∗12]

Table 1: The transformation pipelines intertwine with user interaction. The subcategorizing is based on the different levels of
user involvement.
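
To make the Model Manipulation row of Table 1 more concrete, the following is a minimal Python sketch of the general idea behind interactive distance function learning [BLBC12]: the user indicates which points should be closer together or farther apart, and the system adjusts a distance model to reflect that input. The per-dimension weighting, the linear objective, the update rule, and all names (weighted_sq_dist, update_weights, closer_pairs, farther_pairs, lr, steps) are assumptions made for illustration only; this is not the algorithm of [BLBC12].

```python
def weighted_sq_dist(x, y, w):
    """Squared Euclidean distance with one non-negative weight per dimension."""
    return sum(wk * (xk - yk) ** 2 for wk, xk, yk in zip(w, x, y))


def update_weights(points, w, closer_pairs, farther_pairs, lr=0.1, steps=50):
    """Hypothetical update rule (an assumption, not taken from the survey).

    Minimizes  sum_{closer} d_w(i, j) - sum_{farther} d_w(i, j)  by projected
    gradient descent, keeping the weights non-negative and summing to one.
    """
    dims = len(w)
    # The objective is linear in w, so its gradient does not depend on w.
    grad = [0.0] * dims
    for i, j in closer_pairs:
        for k in range(dims):
            grad[k] += (points[i][k] - points[j][k]) ** 2
    for i, j in farther_pairs:
        for k in range(dims):
            grad[k] -= (points[i][k] - points[j][k]) ** 2

    w = list(w)
    for _ in range(steps):
        # Gradient step, then clip to non-negative values and renormalize.
        w = [max(0.0, wk - lr * gk) for wk, gk in zip(w, grad)]
        total = sum(w)
        w = [wk / total for wk in w] if total > 0 else [1.0 / dims] * dims
    return w


# Toy data: points 0 and 1 differ only in dimension 0,
# points 0 and 2 differ only in dimension 1.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
w0 = [0.5, 0.5]

# Simulated user feedback: 0 and 1 belong together, 0 and 2 do not.
w1 = update_weights(points, w0, closer_pairs=[(0, 1)], farther_pairs=[(0, 2)])
print(w1)
```

In this toy setting the weight of the dimension that separates the pair the user wants closer shrinks, while the weight of the dimension that separates the pair the user wants apart grows. An interactive system would then recompute the embedding with the updated distance and show the result to the user.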

data mining and visual queries. The Select and Slice Table [SVW10] allows users to study the relationships between data subsets and the semantic zones (user-defined areas of interest). The semantic zones are arranged along one axis of the table, while the data subsets are arranged along the other axis. In addition, the method enables the combination and manipulation of the semantic zones for further exploration. More recent works [GLG∗13, GGL∗14] by Gratzl et al. introduce interactive methods for analyzing multi-attribute rankings and exploring subsets of tabular datasets.

The works of Poco et al. [PEP∗11b] and Sanftmann and Weiskopf [SW12] both present methods for navigating a 3D projection, but their approaches are quite different. The method introduced by Poco et al. [PEP∗11b] focuses on enhancing the visual encoding and exploration usability of a 3D projection calculated by the Least-Square Projection [PNML08] algorithm. On the other hand, Sanftmann and Weiskopf [SW12] present an interpolation scheme for generating 3D rigid body rotations between a pair of 3D axis-aligned scatterplots that share a common axis.

In the view transformation step, interactivity is inherent in both the magic lens based methods [HTE11, LA94] and illuminated 3D scatterplots [SW09] (discussed in Section 5.1).

6.3 Model Manipulation

Model manipulation techniques represent a class of methods that integrate user manipulation as part of the algorithm, and update the underlying model to reflect the user input to obtain new insights.

Take the distance function learning work [BLBC12], for example. The initial embedding is created using a default distance measure. Through interaction, the initial point layout is modified based on the expert user’s domain knowledge. The system then adjusts the underlying distance model to reflect the user input. Hu et al. present a method [HBM∗13] for improving the translation of user interaction to algorithm input (visual to parameter interaction) for distance learning scenarios. The explainers [Gle13] are projection functions created from a set of user-defined annotations.

The control point based projection methods [DST04, PNML08, PEP∗11a, JPC∗11, PSN10] update the overall projection result based on user manipulation of the control points. In the iLAMP method [PdSABD∗12], inverse projection extrapolation is used for generating synthetic multidimensional data out of existing projections for parameter space exploration. In the Local Clustering Operation work [GXWY10], the visual structure is modified in PCPs through user-guided deformation operators. Finally, Liu et al. [LWBP14] allow for direct manipulation of the dimension reduction embedding to resolve structural ambiguities. The interactively updated distortion measure is used for feedback during manipulation.

7 Connections with Related Fields

We investigate the connections between recent advances in high-dimensional data visualization and related fields in the hope of inspiring new research directions.

7.1 Multivariate Volume Visualization

Multivariate volume visualization and high-dimensional visualization are often studied under different contexts: the former is normally considered as scientific visualization research [BH07, KH13], while the latter is mostly studied from the perspective of information visualization and visual analytics. In addition, they focus on different kinds of data and attempt to accomplish distinct goals.

Despite the differences, recent advances in both areas have shown that they share a number of fundamental techniques and principles. Standard high-dimensional data visualization techniques, such as PCPs, scatterplots, and dimension reduction, have found their way into the multivariate volume visualization literature. For example, the scattering points in parallel coordinates work [YGX∗09] is adopted
by [GXY12] as a design space for multivariate volume transfer functions. In the work of Liu et al. [LWT∗14a], dynamic projection and subspace analysis are utilized for exploring the high-dimensional parameter space of volumetric data. We believe useful and interesting techniques may be developed by sharing ideas and discovering new connections between these two fields.

7.2 Machine Learning

Machine learning algorithms have in many situations been treated as “black box” approaches, and the parameter tuning process can be tedious and unpredictable. To resolve such a challenge, several visualization approaches have been introduced to aid the understanding of the various machine learning algorithms. Tzeng et al. present a visualization system that helps users design neural networks more efficiently [TM05]. The works of Teoh and Ma [TM03] and van den Elzen and van Wijk [vdEvW11] investigate visualization methods for interactively constructing and analyzing decision trees. Visualization has also been used to aid model validation [Rd00, MW10]. Numerous challenges for understanding machine learning algorithms coincide with high-dimensional visualization. We believe high-dimensional visualization will play an important role in designing, tuning, and validating machine learning algorithms.

8 Reflections and Future Directions

One of our primary objectives in presenting this survey is to provide actionable guidance for data practitioners to navigate through a modular view of the recent advances. To do so, we provide a categorization of recent works along an enriched information visualization pipeline. We reflect on the chosen categories and subcategories (as described briefly in Section 2) and describe at a high level how they provide actionable guidance. To allow the creation of new visualizations along the pipeline, one should think beyond data tasks to be performed in any single stage, and focus on understanding how results from one stage could be utilized most effectively in the remaining stages. We argue that the subcategories discussed during each pipeline stage correspond to sets of actionable items or toolsets that the data practitioner could choose from and rely upon. The combinations of techniques they choose to apply are largely data-driven and application-dependent. Nevertheless, the techniques surveyed following our categorization aim to provide a modular view during the design process.

We now discuss the challenges addressed by the techniques surveyed in the paper, and those that remain to be tackled. Our discussion is partially inspired by Donoho’s AMS lecture [Don00] where he discusses the curses and blessings of dimensionality when it comes to high-dimensional data analysis.

Data analysis falls under the data transformation stage within our categorization. Some of the surveyed, standard data analysis tasks are widely applicable for studying various aspects of high-dimensional data: dimension reduction for feature selection and extraction; clustering for exploratory data mining and classification; regression for relationship inference and prediction. However, we identify several different directions in which we expect to see further progress, namely: robust analysis and data de-noising; multi-scale analysis; data skeletonization; and high-dimensional approximations. First, more advanced regression techniques could be developed that are robust to noise and outliers, in particular, a new class of regression techniques inspired by geometric and topological intuitions (e.g., [GBPW10]). Second, topological data analysis has built-in capabilities in separating features from noise at multiple scales; such a multi-scale notion is expected to be transferable to a larger class of analysis techniques. Third, developing frameworks to extract as well as to simplify “skeletons” from high-dimensional data can be extremely useful for visual data abstraction and exploration (e.g., [SMC07]). Finally, as pointed out by Donoho [Don00], perhaps there exist new notions of high-dimensional approximation theory, where we make different regularity assumptions and obtain a very different picture in approximating high-dimensional functions. Approximating the Morse-Smale complex in high dimension is one such example.

During visual mapping, our surveyed techniques convert the analysis result into visual structures with various visual encodings. Development of new analysis results, for example, new approximations of high-dimensional structures, would inevitably lead to new visual metaphors (e.g., in the case of topological landscape [HW10, DBW∗12]). Under visual mapping and view transformation, we also see various methods aimed at summarizing trends in data, such as glyph representations, edge bundling in PCPs, splatting as presented in splatterplots [MG13] and PCP-based splatters [ZCQ∗09], and hierarchical approaches. These could be further enhanced with new data skeletonization techniques.

Finally, we identify a few opportunities for future visualization research and discuss them in detail.

Subspace Clustering. Finding interesting projections (views) has been an active and important research area for visualizing high-dimensional data. The motivation behind the various view selection schemes can be traced back to much earlier work such as projection pursuit [FT74].

Along a similar line of research, scatterplot ranking methods [SS04a, WAG06, TAE∗09] are introduced to automatically identify the interesting scatterplots. However, a scatterplot matrix captures only limited bivariate relationships. Subspace selection methods [CFZ99, BPR∗04], originally developed in the data mining community, have recently been adapted for high-dimensional data visualization [FBT∗10, TMF∗12] to capture more complicated multivariate structures. Despite the added flexibility, the search is still limited to axis-aligned subspaces. Recent advances in machine learning, such as subspace clustering (e.g., [Vid11]), assume
the high-dimensional dataset can be represented by a mixture of low-dimensional linear subspaces with mixed dimensions. Such methods produce non-axis-aligned subspaces, which work well for datasets where different dimensions are closely related. In addition, instead of capturing a single linear subspace, they can approximate non-linear structures by fitting together multiple linear subspaces.

We believe exploring various (non-axis-aligned) subspace clustering methods will lead to new developments in high-dimensional view selection techniques (e.g., some of the recent work by the authors [LWT∗14a, LWT∗14b]).

Model Manipulation. We have seen an emerging user interaction paradigm, referred to as model manipulation [BLBC12, PdSABD∗12, Gle13, HBM∗13] in this survey. What differentiates the model manipulation interaction from other types of interaction is the change of the underlying data model to reflect user intention. These model manipulation based methods allow users to easily transfer their domain knowledge into the exploratory analysis process, allowing for effective analysis and visualization. However, since such interactive manipulations give users an enormous amount of freedom, one of the main challenges in model manipulation is to understand whether or not the manipulation faithfully conveys the user intention. Rigorous validation between the user intended operations and manipulation outcomes is essential for evaluating the effectiveness and usability of these methods.

Uncertainty Quantification. Along with the large scale and high dimensionality of the data, information pertaining to uncertainty is becoming increasingly available and important. The addition of uncertainty information within visualizations has been deemed a top research problem in scientific visualization [Joh04], due to the greater availability of this information from simulation and quantification, and the importance of understanding data quality, confidence, and error issues when interpreting scientific results. Some recent works in high-dimensional data visualization have focused on analyzing the uncertainty stemming from the input data or with respect to the accuracy of a fitted model (see Section 3.4 and [ZSWR06]). We believe that extending and generalizing existing uncertainty visualization capabilities (e.g., [DKLP02, PWB∗09, SZD∗10]) to high-dimensional data is one of the important future directions.

Another interesting aspect of uncertainty quantification is based on uncertainty-aware visual analytics discussed by Correa et al. [CCM09], and further explored by Liu et al. [LWBP14] and Schreck et al. [SvLB10], where the uncertainty (e.g., bias and distortions) arises from the Data transformation step. The work by Correa et al. [CCM09] measures the uncertainty introduced by three common Data transformation techniques, and the works of Liu et al. [LWBP14] and Schreck et al. [SvLB10] quantify the amount of distortion for projection techniques. While these methods apply to the uncertainty stemming from the Data transformation step, more work can be done to define measures of uncertainty associated with the two latter processing steps in the visualization pipeline, namely Visual mapping and View transformation.

Topological Data Analysis and Visualization. Another important and interesting recent advance is the introduction of TDA to visualization (e.g., [GBPW10, WSPVJ11, DCK∗12]). TDA provides an interesting alternative for capturing the structure in high-dimensional data. Since topological structures are typically scale-invariant, designing meaningful and effective visual encodings that capture their inherent properties is essential for future development. Approximation algorithms exist for computing topological structures in high dimensions; therefore, it is important to strike a balance between speed and accuracy, and to convey appropriately the approximation error in the visualization. Some initial work has been done to provide bounds or estimations on the accuracy of these approximated models (e.g., [GBPW10, CL11, TFO09]).

Other Directions. Finally, as discussed in Section 7, fields such as multivariate volume visualization and machine learning share a number of common research problems with high-dimensional data visualization. Finding connections and sharing ideas among these related topics will likely not only yield interesting future research directions, but also help resolve many challenges in high-dimensional data visualization.

Acknowledgments

The first two authors contributed equally to this work. This work was performed in part under the auspices of the US DOE by LLNL under Contract DE-AC52-07NA27344, LLNL-CONF-658933. This work is also supported in part by NSF 0904631, DE-EE0004449, DE-NA0002375, DE-SC0007446, DE-SC0010498, NSG IIS-1045032, NSF EFT ACI-0906379, DOE/NEUP 120341, DOE/Codesign P01180734.

References

[AdOL04] ARTERO A., DE OLIVEIRA M., LEVKOWITZ H.: Uncovering clusters in crowded parallel coordinates visualizations. In IEEE Symposium on Information Visualization (2004), pp. 81–88. 11
[AEL∗10] ALBUQUERQUE G., EISEMANN M., LEHMANN D. J., THEISEL H., MAGNOR M.: Improving the visual analysis of high-dimensional datasets using quality measures. In IEEE Symposium on Visual Analytics Science and Technology (2010), IEEE, pp. 19–26. 8
[AKpK96] ANKERST M., KEIM D. A., PETER KRIEGEL H.: Circle segments: A technique for visually exploring large multidimensional data sets. In Proceedings of IEEE Visualization, Hot Topic Session (1996). 9
[AWD12] ANAND A., WILKINSON L., DANG T. N.: Visual pattern discovery using random projections. In IEEE Conference on Visual Analytics Science and Technology (2012), IEEE, pp. 43–52. 5, 11
[BCS96] BUJA A., COOK D., SWAYNE D. F.: Interactive high-
dimensional data visualization. Journal of Computational and Graphical Statistics 5, 1 (1996), pp. 78–99. 1
[BDFF∗08] BIASOTTI S., DE FLORIANI L., FALCIDIENO B., FROSINI P., GIORGI D., LANDI C., PAPALEO L., SPAGNUOLO M.: Describing shapes by geometrical-topological properties of real functions. ACM Computing Surveys 40, 4 (2008), 12:1–12:87. 6
[BDSW13] BISWAS A., DUTTA S., SHEN H.-W., WOODRING J.: An information-aware framework for exploring multivariate data sets. IEEE Transactions on Visualization and Computer Graphics 19, 12 (2013), 2683–2692. 3
[Bec14] BECK F.: Survis. https://github.com/fabian-beck/survis, 2014. 2
[BEHP04] BREMER P.-T., EDELSBRUNNER H., HAMANN B., PASCUCCI V.: A topological hierarchy for functions on triangulated surfaces. IEEE Transactions on Visualization and Computer Graphics 10, 385-396 (2004). 6
[BH07] BÜRGER R., HAUSER H.: Visualization of multi-variate scientific data. EuroGraphics State of the Art Reports (STARs) (2007), 117–134. 1, 3, 12
[BKH05] BENDIX F., KOSARA R., HAUSER H.: Parallel sets: visual analysis of categorical data. In IEEE Symposium on Information Visualization (2005), pp. 133–140. 3
[BLBC12] BROWN E. T., LIU J., BRODLEY C. E., CHANG R.: Dis-function: Learning distance functions interactively. In IEEE Conference on Visual Analytics Science and Technology (2012), IEEE, pp. 83–92. 4, 12, 14
[BMW∗12] BEKETAYEV K., MOROZOV D., WEBER G. H., ABZHANOV A., HAMANN B.: Geometry-preserving topological landscapes. In Proceedings of the Workshop at SIGGRAPH Asia (2012), pp. 155–160. 9
[BN03] BELKIN M., NIYOGI P.: Laplacian eigenmaps for dimensionality reduction and data representation. Neural computation 15, 6 (2003), 1373–1396. 4
[BPFG11] BERGER W., PIRINGER H., FILZMOSER P., GRÖLLER E.: Uncertainty-aware exploration of continuous parameter spaces using multivariate prediction. Computer Graphics Forum 30, 3 (2011), 911–920. 5, 11
[BPR∗04] BAUMGARTNER C., PLANT C., RAILING K., KRIEGEL H.-P., KROGER P.: Subspace selection for clustering high-dimensional data. In Fourth IEEE International Conference on Data Mining (2004), IEEE, pp. 11–18. 5, 13
[BTK11] BERTINI E., TATU A., KEIM D.: Quality metrics in high-dimensional data visualization: an overview and systematization. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 2203–2212. 1, 2, 10
[BW08] BACHTHALER S., WEISKOPF D.: Continuous scatterplots. IEEE Transactions on Visualization and Computer Graphics 14, 6 (2008), 1428–1435. 10, 12
[BW09] BACHTHALER S., WEISKOPF D.: Efficient and adaptive rendering of 2-d continuous scatterplots. Computer Graphics Forum 28, 3 (2009), 743–750. 10
[Car09] CARLSSON G.: Topology and data. Bulletin of the American Mathematical Society 46, 2 (2009), 255–308. 2, 6
[CCM09] CORREA C., CHAN Y.-H., MA K.-L.: A framework for uncertainty-aware visual analytics. In IEEE Symposium on Visual Analytics Science and Technology (2009), pp. 51–58. 8, 14
[CCM10] CHAN Y.-H., CORREA C., MA K.-L.: Flow-based scatterplots for sensitivity analysis. In IEEE Symposium on Visual Analytics Science and Technology (2010), IEEE, pp. 43–50. 8
[CCM13] CHAN Y.-H., CORREA C., MA K.-L.: The generalized sensitivity scatterplot. IEEE Transactions on Visualization and Computer Graphics 19, 10 (2013), 1768–1781. 8
[CD14] CARR H., DUKE D.: Joint contour nets. IEEE Transactions on Visualization and Computer Graphics 20, 8 (2014), 1100–1113. 6
[CFZ99] CHENG C.-H., FU A. W., ZHANG Y.: Entropy-based subspace clustering for mining numerical data. In Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining (1999), ACM, pp. 84–93. 5, 11, 13
[CGSQ11] CAO N., GOTZ D., SUN J., QU H.: Dicon: Interactive visual analysis of multidimensional clusters. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 2581–2590. 8
[Cha06] CHAN W. W.-Y.: A survey on multivariate data visualization. Department of Computer Science and Engineering. Hong Kong University of Science and Technology 8, 6 (2006), 1–29. 1
[Che73] CHERNOFF H.: The use of faces to represent points in k-dimensional space graphically. Journal of the American Statistical Association 68, 342 (1973), 361–368. 8
[CL11] CORREA C., LINDSTROM P.: Towards robust topology of sparsely sampled data. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 1852–1861. 6, 14
[CMR07] CAAT M., MAURITS N., ROERDINK J.: Design and evaluation of tiled parallel coordinates visualization of multichannel eeg data. IEEE Transactions on Visualization and Computer Graphics 13, 1 (2007), 70–79. 8
[CMS99] CARD S. K., MACKINLAY J. D., SHNEIDERMAN B.: Readings in information visualization: using vision to think. Morgan Kaufmann, 1999. 1, 2
[CSA03] CARR H., SNOEYINK J., AXEN U.: Computing contour trees in all dimensions. Computational Geometry 24, 2 (2003), 75–94. Special Issue on the Fourth CGC Workshop on Computational Geometry. 6
[CvW11] CLAESSEN J., VAN WIJK J.: Flexible linked axes for multivariate data visualization. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 2310–2316. 8
[DAW13] DANG T. N., ANAND A., WILKINSON L.: Timeseer: Scagnostics for high-dimensional time series. IEEE Transactions on Visualization and Computer Graphics 19, 3 (2013), 470–483. 7
[DBW∗12] DEMIR D., BEKETAYEV K., WEBER G. H., BREMER P.-T., PASCUCCI V., HAMANN B.: Topology exploration with hierarchical landscapes. In Proceedings of the Workshop at SIGGRAPH Asia (2012), pp. 147–154. 9, 13
[DCK∗12] DUKE D., CARR H., KNOLL A., SCHUNCK N., NAM H. A., STASZCZAK A.: Visualizing nuclear scission through a multifield extension of topological analysis. IEEE Transactions on Visualization and Computer Graphics 18, 12 (2012), 2033–2040. 6, 14
[DK10] DASGUPTA A., KOSARA R.: Pargnostics: Screen-space metrics for parallel coordinates. IEEE Transactions on Visualization and Computer Graphics 16, 6 (2010), 1017–1026. 11
[DKLP02] DJURCILOV S., KIM K., LERMUSIAUX P., PANG A.: Visualizing scalar volumetric data with uncertainty. Computers and Graphics 26 (2002), 239–248. 14
[DOL03] DE OLIVEIRA M. C. F., LEVKOWITZ H.: From visual data exploration to visual data mining: A survey. IEEE Transactions on Visualization and Computer Graphics 9, 3 (2003), 378–394. 1, 11
[Don00] DONOHO D. L.: High-dimensional data analysis: The curses and blessings of dimensionality. AMS Lecture: Math Challenges of the 21st Century, 2000. 13
[dSMVJ09] DE SILVA V., MOROZOV D., VEJDEMO-JOHANSSON M.: Persistent cohomology and circular coordinates. In Proceedings 25th Annual Symposium on Computational Geometry (2009), pp. 227–236. 6
[DST04] DE SILVA V., TENENBAUM J. B.: Sparse multidimensional scaling using landmark points. Tech. rep., Technical report, Stanford University, 2004. 4, 12
[DW13] DEMIR I., WESTERMANN R.: Progressive high-quality response surfaces for visually guided sensitivity analysis. Computer Graphics Forum 32, 3pt1 (2013), 21–30. 5
[DWA10] DANG T. N., WILKINSON L., ANAND A.: Stacking graphic elements to avoid over-plotting. IEEE Transactions on Visualization and Computer Graphics 16, 6 (2010), 1044–1052. 7
[ED07] ELLIS G., DIX A.: A taxonomy of clutter reduction for information visualisation. IEEE Transactions on Visualization and Computer Graphics 13, 6 (2007), 1216–1223. 1
[EDF08] ELMQVIST N., DRAGICEVIC P., FEKETE J.-D.: Rolling the dice: Multidimensional visual exploration using scatterplot matrix navigation. IEEE Transactions on Visualization and Computer Graphics 14, 6 (2008), 1539–1148. 10
[EH08] EDELSBRUNNER H., HARER J.: Persistent homology – a survey. Contemporary Mathematics 453 (2008), 257. 6
[EH10] EDELSBRUNNER H., HARER J.: Computational Topology - an Introduction. American Mathematical Society, 2010. 6
[EHNP03] EDELSBRUNNER H., HARER J., NATARAJAN V., PASCUCCI V.: Morse-Smale complexes for piece-wise linear 3-manifolds. In Proceedings 19th Annual symposium on Computational geometry (2003), pp. 361–370. 6
[EHZ03] EDELSBRUNNER H., HARER J., ZOMORODIAN A. J.: Hierarchical Morse-Smale complexes for piecewise linear 2-manifolds. Discrete and Computational Geometry 30, 87-107 (2003). 6
[ERHH11] ENGEL D., ROSENBAUM R., HAMANN B., HAGEN H.: Structural decomposition trees. Computer Graphics Forum 30, 3 (2011), 921–930. 9
[EST07] ELMQVIST N., STASKO J., TSIGAS P.: Datameadow: A visual canvas for analysis of large-scale multivariate data. In IEEE Symposium on Visual Analytics Science and Technology (2007), pp. 187–194. 8
[FBT∗10] FERDOSI B. J., BUDDELMEIJER H., TRAGER S., WILKINSON M. H., ROERDINK J. B.: Finding and visualizing relevant subspaces for clustering high-dimensional astronomical data using connected morphological operators. In IEEE Symposium on Visual Analytics Science and Technology (2010), IEEE, pp. 35–42. 5, 7, 11, 13
[FCI05] FANEA E., CARPENDALE S., ISENBERG T.: An interactive 3d integration of parallel coordinates and star glyphs. In IEEE Symposium on Information Visualization (2005), pp. 149–156. 8
[FR11] FERDOSI B. J., ROERDINK J. B.: Visualizing high-dimensional structures by dimension ordering and filtering using subspace analysis. Computer Graphics Forum 30, 3 (2011), 1121–1130. 7
[FT74] FRIEDMAN J., TUKEY J.: A projection pursuit algorithm for exploratory data analysis. IEEE Transactions on Computers C-23, 9 (1974), 881–890. 5, 13
[FWR00] FUA Y.-H., WARD M., RUNDENSTEINER E.: Structure-based brushes: a mechanism for navigating hierarchically organized data and information spaces. IEEE Transactions on Visualization and Computer Graphics 6, 2 (2000), 150–159. 9
[GBPH08] GYULASSY A., BREMER P.-T., PASCUCCI V., HAMANN B.: A practical approach to Morse-Smale complex computation: Scalability and generality. IEEE Transactions on Visualization and Computer Graphics 14, 6 (2008), 1619–1626. 6
[GBPW10] GERBER S., BREMER P., PASCUCCI V., WHITAKER R.: Visual exploration of high dimensional scalar functions. IEEE Transactions on Visualization and Computer Graphics 16, 6 (2010), 1271–1280. 3, 5, 6, 13, 14
[GGL∗14] GRATZL S., GEHLENBORG N., LEX A., PFISTER H., STREIT M.: Domino: Extracting, comparing, and manipulating subsets across multiple tabular datasets. IEEE Transactions on Visualization and Computer Graphics 20, 12 (2014), 2023–2032. 12
[Ghr08] GHRIST R.: Barcodes: The persistent topology of data. Bulletin of the American Mathematical Society 45 (2008), 61–75. 6
[Gle13] GLEICHER M.: Explainers: Expert explorations with crafted projections. IEEE Transactions on Visualization and Computer Graphics 19, 12 (2013), 2042–2051. 4, 12, 14
[GLG∗13] GRATZL S., LEX A., GEHLENBORG N., PFISTER H., STREIT M.: Lineup: Visual analysis of multi-attribute rankings. IEEE Transactions on Visualization and Computer Graphics 19, 12 (2013), 2277–2286. 12
[GNP∗05] GYULASSY A., NATARAJAN V., PASCUCCI V., BREMER P.-T., HAMANN B.: Topology-based simplification for feature extraction from 3D scalar fields. In Proceedings of IEEE Visualization (2005), pp. 535–542. 6
[GPL∗11] GENG Z., PENG Z., LARAMEE R., ROBERTS J., WALKER R.: Angular histograms: Frequency-based visualizations for large, high dimensional data. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 2572–2580. 8
[Guo03] GUO D.: Coordinating computational and visual approaches for interactive feature selection and multivariate clustering. Information Visualization 2, 4 (2003), 232–246. 7
[GWRR11] GUO Z., WARD M., RUNDENSTEINER E., RUIZ C.: Pointwise local pattern exploration for sensitivity analysis. In IEEE Conference on Visual Analytics Science and Technology (2011), pp. 131–140. 8
[GXWY10] GUO P., XIAO H., WANG Z., YUAN X.: Interactive local clustering operations for high dimensional data in parallel coordinates. In IEEE Pacific Visualization Symposium (2010), pp. 97–104. 12
[GXY12] GUO H., XIAO H., YUAN X.: Scalable multivariate volume visualization and analysis based on dimension projection and parallel coordinates. IEEE Transactions on Visualization and Computer Graphics (2012). 13
[HBM∗13] HU X., BRADEL L., MAITI D., HOUSE L., NORTH C.: Semantics of directly manipulating spatializations. IEEE Transactions on Visualization and Computer Graphics 19, 12 (2013), 2052–2059. 12, 14
[HDK∗07] HAO M., DAYAL U., KEIM D., MORENT D., SCHNEIDEWIND J.: Intelligent visual analytics queries. In IEEE Symposium on Visual Analytics Science and Technology (2007), pp. 91–98. 11
[HG02] HOFFMAN P. E., GRINSTEIN G. G.: A survey of visualizations for high-dimensional data mining. Information visualization in data mining and knowledge discovery (2002), 47–82. 1
[HGM∗97] HOFFMAN P., GRINSTEIN G., MARX K., GROSSE I., STANLEY E.: DNA visual and analytic data mining. In Proceedings of IEEE Visualization (1997), pp. 437–441. 7
[HO10] HURLEY C., OLDFORD R.: Pairwise display of high-dimensional information via eulerian tours and hamiltonian decompositions. Journal of Computational and Graphical Statistics 19, 4 (2010). 7
[HR07] HEER J., ROBERTSON G. G.: Animated transitions in statistical data graphics. IEEE Transactions on Visualization and Computer Graphics 13, 6 (2007), 1240–1247. 10
[HS89] HASTIE T., STUETZLE W.: Principal curves. Journal of the American Statistical Association 84, 406 (1989), 502–516. 5
[HSKIH07] HAGH-SHENAS H., KIM S., INTERRANTE V., HEALEY C.: Weaving versus blending: a quantitative assessment of the information carrying capacities of two alternative methods for conveying multivariate data with color. IEEE Transactions on Visualization and Computer Graphics 13, 6 (2007), 1270–1277. 11
[HTE11] HURTER C., TELEA A., ERSOY O.: Moleview: An attribute and structure-based semantic lens for large element-based plots. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 2600–2609. 10, 12
[HVW10] HOLTEN D., VAN WIJK J. J.: Evaluation of cluster identification performance for different PCP variants. Computer Graphics Forum 29, 3 (2010), 793–802. 10
[HW09] HEINRICH J., WEISKOPF D.: Continuous parallel coordinates. IEEE Transactions on Visualization and Computer Graphics 15, 6 (2009), 1531–1538. 10
[Joh04] JOHNSON C. R.: Top scientific visualization research problems. IEEE Computer Graphics and Applications (2004). 14
[Jol05] JOLLIFFE I.: Principal component analysis. Wiley Online Library, 2005. 3
[JPC∗11] JOIA P., PAULOVICH F., COIMBRA D., CUMINATO J., NONATO L.: Local affine multidimensional projection. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 2563–2571. 4, 12
[JZF∗09] JEONG D. H., ZIEMKIEWICZ C., FISHER B., RIBARSKY W., CHANG R.: iPCA: An interactive system for pca-based visual analytics. Computer Graphics Forum 28, 3 (2009), 767–774. 3
[Kan00] KANDOGAN E.: Star coordinates: A multi-dimensional visualization technique with uniform treatment of dimensions. In IEEE Information Visualization Symposium, Late Breaking Hot Topics (2000), pp. 9–12. 7
[KC03] KOREN Y., CARMEL L.: Visualization of labeled data using linear transformations. In IEEE Symposium on Information Visualization (2003), pp. 121–128. 4, 11
[Kei02] KEIM D. A.: Information visualization and visual data mining. IEEE Transactions on Visualization and Computer Graphics 8, 1 (2002), 1–8. 1, 11
[KGZ∗12] KUHNE L., GIESEN J., ZHANG Z., HA S., MUELLER K.: A data-driven approach to hue-preserving color-blending. IEEE Transactions on Visualization and Computer Graphics 18, 12 (2012), 2122–2129. 11
[KH13] KEHRER J., HAUSER H.: Visualization and visual analysis of multifaceted scientific data: A survey. IEEE Transactions on Visualization and Computer Graphics 19, 3 (2013), 495–513. 1, 3, 12
[HW10] H ARVEY W., WANG Y.: Topological landscape en- [KHL∗ 01] K EIM D., H AO M., L ADISCH J., H SU M., DAYAL
sembles for visualization of scalar-valued functions. Computer U.: Pixel bar charts: a new technique for visualizing large multi-
Graphics Forum 29, 3 (2010), 993–1002. 9, 13 attribute data sets without aggregation. In IEEE Symposium on
[HW13] H EINRICH J., W EISKOPF D.: State of the art of parallel Information Visualization (2001), pp. 113–120. 9, 11
coordinates. STAR Proceedings of Eurographics 2013 (2013), [KK94] K EIM D., K RIEGEL H.-P.: Visdb: database exploration
95–116. 1 using multidimensional visualization. Computer Graphics and
[ID91] I NSELBERG A., D IMSDALE B.: Parallel coordinates. In Applications, IEEE 14, 5 (1994), 40–49. 9
Human-Machine Interactive Systems. Springer, 1991, pp. 199– [KK96] K EIM D. A., K RIEGEL H.-P.: Visualization techniques
233. 7 for mining large databases: A comparison. Knowledge and Data
[IML13] I M J.-F., M C G UFFIN M., L EUNG R.: Gplom: The gen- Engineering, IEEE Transactions on 8, 6 (1996), 923–938. 11
eralized plot matrix for visualizing multidimensional multivariate [KKA95] K EIM D., K RIEGEL H.-P., A NKERST M.: Recursive
data. IEEE Transactions on Visualization and Computer Graph- pattern: a technique for visualizing very large amounts of data.
ics 19, 12 (2013), 2606–2614. 3 In Proceedings of IEEE Visualization (1995), pp. 279–286, 463.
[Ins09] I NSELBERG A.: Parallel Coordinates : Visual Multidi- 9
mensional Geometry and its Applications. Springer, 2009. 1,
[Kru64] K RUSKAL J. B.: Multidimensional scaling by optimiz-
7
ing goodness of fit to a nonmetric hypothesis. Psychometrika 29,
[JC08] J OHANSSON J., C OOPER M.: A screen space quality 1 (1964), 1–27. 4
method for data abstraction. Computer Graphics Forum 27, 3
(2008), 1039–1046. 11, 12 [KS02] K REUSELER M., S CHUMANN H.: A flexible approach
for visual data mining. IEEE Transactions on Visualization and
[JJ09] J OHANSSON S., J OHANSSON J.: Interactive dimensional- Computer Graphics 8, 1 (2002), 39–51. 9
ity reduction through user-defined combinations of quality met-
rics. IEEE Transactions on Visualization and Computer Graphics [LA94] L EUNG Y. K., A PPERLEY M. D.: A review and taxon-
15, 6 (2009), 993–1000. 7 omy of distortion-oriented presentation techniques. ACM Trans-
actions on Computer-Human Interaction 1, 2 (1994), 126–160.
[JLJC05] J OHANSSON J., L JUNG P., J ERN M., C OOPER M.: Re- 10, 12
vealing structure within clustered parallel coordinates displays.
In IEEE Symposium on Information Visualization (2005), IEEE, [LAE∗ 12] L EHMANN D. J., A LBUQUERQUE G., E ISEMANN
pp. 125–132. 10, 12 M., M AGNOR M., T HEISEL H.: Selecting coherent and relevant
plots in large scatterplot matrices. Computer Graphics Forum 31,
[JN02] JAYARAMAN S., N ORTH C.: A radial focus+context vi-
6 (2012), 1895–1908. 7
sualization for multi-dimensional functions. In Proceedings of
IEEE Visualization (2002), pp. 443–450. 8 [LAK∗ 11] L AWRENCE J., A RIETTA S., K AZHDAN M., L EPAGE

D., O’Hagan C.: A user-assisted approach to visualizing multidimensional images. IEEE Transactions on Visualization and Computer Graphics 17, 10 (2011), 1487–1498. 3
[LMZ∗14] Lee J. H., McDonnell K. T., Zelenyuk A., Imre D., Mueller K.: A structure-based distance metric for high-dimensional space exploration with multidimensional scaling. IEEE Transactions on Visualization and Computer Graphics 20, 3 (2014), 351–364. 4
[LPM03] Lee A. B., Pedersen K. S., Mumford D.: The nonlinear statistics of high-contrast patches in natural images. International Journal of Computer Vision 54, 1-3 (2003), 83–103. 6
[LT11] Lehmann D., Theisel H.: Features in continuous parallel coordinates. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 1912–1921. 10
[LT13] Lehmann D. J., Theisel H.: Orthographic star coordinates. IEEE Transactions on Visualization and Computer Graphics 19, 12 (2013), 2615–2624. 7
[LV09] Lee J. A., Verleysen M.: Quality assessment of dimensionality reduction: Rank-based criteria. Neurocomputing 72, 7 (2009), 1431–1443. 4
[LWBP14] Liu S., Wang B., Bremer P.-T., Pascucci V.: Distortion-guided structure-driven interactive exploration of high-dimensional data. Computer Graphics Forum 33, 3 (2014), 101–110. 4, 12, 14
[LWT∗14a] Liu S., Wang B., Thiagarajan J. J., Bremer P.-T., Pascucci V.: Multivariate volume visualization through dynamic projections. Large Data Analysis and Visualization (LDAV), 2014 IEEE Symposium on (2014). 13, 14
[LWT∗14b] Liu S., Wang B., Thiagarajan J. J., Bremer P.-T., Pascucci V.: Visual exploration of high-dimensional data: Subspace analysis through dynamic projections. Tech. Rep. UUSCI-2014-003, SCI Institute, University of Utah, 2014. 14
[MG13] Mayorga A., Gleicher M.: Splatterplots: Overcoming overdraw in scatter plots. IEEE Transactions on Visualization and Computer Graphics 19, 9 (2013), 1526–1538. 10, 13
[MLGH13] Mokbel B., Lueks W., Gisbrecht A., Hammer B.: Visualizing the quality of dimensionality reduction. Neurocomputing 112 (2013), 109–123. 4
[MM08] McDonnell K. T., Mueller K.: Illustrative parallel coordinates. Computer Graphics Forum 27, 3 (2008), 1031–1038. 10, 11
[MPG∗14] Muhlbacher T., Piringer H., Gratzl S., Sedlmair M., Streit M.: Opening the black box: Strategies for increased user involvement in existing algorithm implementations. IEEE Transactions on Visualization and Computer Graphics 20, 12 (2014), 1643–1652. 11
[MRC02] Morrison A., Ross G., Chalmers M.: A hybrid layout algorithm for sub-quadratic multidimensional scaling. In IEEE Symposium on Information Visualization (2002), IEEE, pp. 152–158. 11
[Mun14] Munzner T.: Visualization Analysis and Design. CRC Press, 2014. 1, 2
[MW10] Migut M., Worring M.: Visual exploration of classification models for risk assessment. In IEEE Symposium on Visual Analytics Science and Technology (2010), pp. 11–18. 13
[NH06] Novotny M., Hauser H.: Outlier-preserving focus+context visualization in parallel coordinates. IEEE Transactions on Visualization and Computer Graphics 12, 5 (2006), 893–900. 7
[NLC11] Nicolau M., Levine A. J., Carlsson G.: Topology based data analysis identifies a subgroup of breast cancers with a unique mutational profile and excellent survival. In Proceedings of the National Academy of Sciences (2011), vol. 108, pp. 7265–7270. 6
[NM13] Nam J. E., Mueller K.: Tripadvisor-nd: A tourism-inspired high-dimensional space exploration framework with overview and detail. IEEE Transactions on Visualization and Computer Graphics 19, 2 (2013), 291–305. 5, 10
[OHJ∗11] Oesterling P., Heine C., Janicke H., Scheuermann G., Heyer G.: Visualization of high-dimensional point clouds using their density distribution’s topology. IEEE Transactions on Visualization and Computer Graphics 17, 11 (2011), 1547–1559. 9
[OHJS10] Oesterling P., Heine C., Jänicke H., Scheuermann G.: Visual analysis of high dimensional point clouds using topological landscape. In IEEE Pacific Visualization Symposium (2010), pp. 113–120. 9
[OHWS13] Oesterling P., Heine C., Weber G., Scheuermann G.: Visualizing nd point clouds as topological landscape profiles to guide local data analysis. IEEE Transactions on Visualization and Computer Graphics 19, 3 (2013), 514–526. 9
[PBK10] Piringer H., Berger W., Krasser J.: Hypermoval: Interactive visual validation of regression models for real-time simulation. In Proceedings of the 12th Eurographics / IEEE-VGTC Conference on Visualization (2010), EuroVis’10, Eurographics Association, pp. 983–992. 3, 5
[PCMS09] Pascucci V., Cole-McLaughlin K., Scorzelli G.: The toporrery: Computation and presentation of multi-resolution topology. Mathematical Foundations of Scientific Visualization, Computer Graphics, and Massive Data Exploration (2009), 19–40. 9
[PdSABD∗12] Portes dos Santos Amorim E., Brazil E. V., Daniels J., Joia P., Nonato L. G., Sousa M. C.: ilamp: Exploring high-dimensional spacing through backward multidimensional projection. In IEEE Conference on Visual Analytics Science and Technology (2012), IEEE, pp. 53–62. 12, 14
[PEP∗11a] Paulovich F., Eler D., Poco J., Botha C., Minghim R., Nonato L.: Piece wise laplacian-based projection for interactive data exploration and organization. Computer Graphics Forum 30, 3 (2011), 1091–1100. 4, 12
[PEP∗11b] Poco J., Etemadpour R., Paulovich F., Long T., Rosenthal P., Oliveira M., Linsen L., Minghim R.: A framework for exploring multidimensional data with 3d projections. Computer Graphics Forum 30, 3 (2011), 1111–1120. 12
[PNML08] Paulovich F., Nonato L., Minghim R., Levkowitz H.: Least square projection: A fast high-precision multidimensional projection technique and its application to document mapping. IEEE Transactions on Visualization and Computer Graphics 14, 3 (2008), 564–575. 4, 12
[PSBM07] Pascucci V., Scorzelli G., Bremer P.-T., Mascarenhas A.: Robust on-line computation of reeb graphs: Simplicity and speed. ACM Transactions on Graphics 26, 3 (2007). 6
[PSN10] Paulovich F., Silva C., Nonato L.: Two-phase mapping for projecting massive data sets. IEEE Transactions on Visualization and Computer Graphics 16, 6 (2010), 1281–1290. 4, 12
[PWB∗09] Potter K., Wilson A., Bremer P.-T., Williams D., Pascucci V., Johnson C.: Ensemblevis: A flexible approach for the statistical visualization of ensemble data. In Proceedings of IEEE Workshop on Knowledge Discovery from Climate Data: Prediction, Extremes, and Impacts (2009). 14
[PWR04] Peng W., Ward M. O., Rundensteiner E. A.: Clutter reduction in multi-dimensional data visualization using dimension reordering. In IEEE Symposium on Information Visualization (2004), IEEE, pp. 89–96. 7, 12
[RC94] Rao R., Card S. K.: The table lens: merging graphical and symbolic representations in an interactive focus+context visualization for tabular information. In Proceedings of the SIGCHI conference on Human factors in computing systems (1994), ACM, pp. 318–322. 10
[Rd00] Rheingans P., desJardins M.: Visualizing high-dimensional predictive model quality. In Proceedings of IEEE Visualization (2000), pp. 493–496. 13
[Ree46] Reeb G.: Sur les points singuliers d’une forme de pfaff complètement intégrable ou d’une fonction numérique [on the singular points of a complete integral pfaff form or of a numerical function]. Comptes Rendus Acad. Science Paris 222 (1946), 847–849. 6
[RPH08] Reddy C. K., Pokharkar S., Ho T. K.: Generating hypotheses of trends in high-dimensional data skeletons. In IEEE Symposium on Visual Analytics Science and Technology (2008), IEEE, pp. 139–146. 5
[RRB∗04] Rosario G. E., Rundensteiner E. A., Brown D. C., Ward M. O., Huang S.: Mapping nominal values to numbers for effective visualization. Information Visualization 3, 2 (2004), 80–95. 3
[RS00] Roweis S. T., Saul L. K.: Nonlinear dimensionality reduction by locally linear embedding. Science 290, 5500 (2000), 2323–2326. 4, 11
[RW06] Rasmussen C. E., Williams C. K. I.: Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning). The MIT Press, 2006. 5
[RZH12] Rosenbaum R., Zhi J., Hamann B.: Progressive parallel coordinates. In IEEE Pacific Visualization Symposium (2012), pp. 25–32. 7
[Shn92] Shneiderman B.: Tree visualization with tree-maps: 2-d space-filling approach. ACM Transactions on graphics (TOG) 11, 1 (1992), 92–99. 9
[SLBC03] Swayne D. F., Lang D. T., Buja A., Cook D.: GGobi: evolving from XGobi into an extensible framework for interactive data visualization. Computational Statistics & Data Analysis 43, 4 (2003), 423–444. 9, 12
[Sma61] Smale S.: On gradient dynamical systems. The Annals of Mathematics 74 (1961), 199–206. 6
[SMC07] Singh G., Memoli F., Carlsson G.: Topological methods for the analysis of high dimensional data sets and 3d object recognition. In Symposium on Point Based Graphics (2007), pp. 91–100. 3, 6, 13
[SMT13] Sedlmair M., Munzner T., Tory M.: Empirical guidance on scatterplot and dimension reduction technique choices. IEEE Transactions on Visualization and Computer Graphics 19, 12 (2013), 2634–2643. 10
[SNLH09] Sips M., Neubert B., Lewis J. P., Hanrahan P.: Selecting good views of high-dimensional data using class consistency. Computer Graphics Forum 28, 3 (2009), 831–838. 7
[SS04a] Seo J., Shneiderman B.: A rank-by-feature framework for unsupervised multidimensional data exploration using low dimensional projections. In IEEE Symposium on Information Visualization (2004), IEEE, pp. 65–72. 7, 13
[SS04b] Smola A. J., Schölkopf B.: A tutorial on support vector regression. Statistics and Computing 14, 3 (2004), 199–222. 5
[SS06] Seo J., Shneiderman B.: Knowledge discovery in high-dimensional data: Case studies and a user survey for the rank-by-feature framework. IEEE Transactions on Visualization and Computer Graphics 12, 3 (2006), 311–322. 7
[SSK06] Schneidewind J., Sips M., Keim D. A.: Pixnostics: Towards measuring the value of visualization. In IEEE Symposium on Visual Analytics Science and Technology (2006), IEEE, pp. 199–206. 11
[SSK10] Seifert C., Sabol V., Kienreich W.: Stress maps: analysing local phenomena in dimensionality reduction based visualisations. In IEEE International Symposium on Visual Analytics Science and Technology (2010). 4
[STH02] Stolte C., Tang D., Hanrahan P.: Polaris: a system for query, analysis, and visualization of multidimensional relational databases. IEEE Transactions on Visualization and Computer Graphics 8, 1 (2002), 52–65. 11
[STH03] Stolte C., Tang D., Hanrahan P.: Multiscale visualization using data cubes. IEEE Transactions on Visualization and Computer Graphics 9, 2 (2003), 176–187. 11
[SvLB10] Schreck T., von Landesberger T., Bremm S.: Techniques for precision-based visual analysis of projected data. Information Visualization 9, 3 (2010), 181–193. 4, 14
[SVW10] Shrinivasan Y. B., van Wijk J. J.: Supporting exploratory analysis with the select & slice table. Computer Graphics Forum 29, 3 (2010), 803–812. 12
[SW09] Sanftmann H., Weiskopf D.: Illuminated 3d scatterplots. Computer Graphics Forum 28, 3 (2009), 751–758. 10, 12
[SW12] Sanftmann H., Weiskopf D.: 3d scatterplot navigation. IEEE Transactions on Visualization and Computer Graphics 18, 11 (2012), 1969–1978. 12
[SZD∗10] Sanyal J., Zhang S., Dyer J., Mercer A., Amburn P., Moorhead R. J.: Noodles: A tool for visualization of numerical weather model ensemble uncertainty. IEEE Transactions on Visualization and Computer Graphics 16, 6 (2010), 1421–1430. 14
[TAE∗09] Tatu A., Albuquerque G., Eisemann M., Schneidewind J., Theisel H., Magnor M., Keim D.: Combining automated analysis and visualization techniques for effective exploration of high-dimensional data. In IEEE Symposium on Visual Analytics Science and Technology (2009), IEEE, pp. 59–66. 7, 11, 13
[TDSL00] Tenenbaum J. B., De Silva V., Langford J. C.: A global geometric framework for nonlinear dimensionality reduction. Science 290, 5500 (2000), 2319–2323. 4
[TFA∗11] Tam G. K. L., Fang H., Aubrey A. J., Grant P. W., Rosin P. L., Marshall D., Chen M.: Visualization of time-series data in parameter space for understanding facial dynamics. Computer Graphics Forum 30, 3 (2011), 901–910. 3
[TFH11] Turkay C., Filzmoser P., Hauser H.: Brushing dimensions - a dual visual analysis model for high-dimensional data. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 2591–2599. 4, 11, 12
[TFO09] Takahashi S., Fujishiro I., Okada M.: Applying manifold learning to plotting approximate contour trees. IEEE Transactions on Visualization and Computer Graphics 15, 6 (2009), 1185–1192. 14
[TJHH14] Turkay C., Jeanquartier F., Holzinger A., Hauser H.: On computationally-enhanced visual analysis of heterogeneous data and its application in biomedical informatics. In Interactive Knowledge Discovery and Data Mining in Biomedical Informatics. Springer, 2014, pp. 117–140. 11
[TLLH12] Turkay C., Lundervold A., Lundervold A. J., Hauser H.: Representative factor generation for the interactive visual analysis of high-dimensional data. IEEE Transactions on Visualization and Computer Graphics 18, 12 (2012), 2621–2630. 4, 11
[TM03] Teoh S. T., Ma K.-L.: Paintingclass: interactive construction, visualization and exploration of decision trees. In Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining (2003), ACM, pp. 667–672. 13
[TM05] Tzeng F.-Y., Ma K.-L.: Opening the black box - data driven visualization of neural networks. In Proceedings of IEEE Visualization (2005), pp. 383–390. 13
[TMF∗12] Tatu A., Maas F., Farber I., Bertini E., Schreck T., Seidl T., Keim D.: Subspace search and visualization to make sense of alternative clusterings in high-dimensional data. In IEEE Conference on Visual Analytics Science and Technology (2012), IEEE, pp. 63–72. 5, 11, 12, 13
[TWSM∗11] Torsney-Weir T., Saad A., Moller T., Hege H.-C., Weber B., Verbavatz J., Bergner S.: Tuner: Principled parameter finding for image segmentation algorithms using visual response surface exploration. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 1892–1901. 5
[vdEvW11] van den Elzen S., van Wijk J.: Baobabview: Interactive construction and analysis of decision trees. In IEEE Conference on Visual Analytics Science and Technology (2011), pp. 151–160. 13
[Vid11] Vidal R.: A tutorial on subspace clustering. IEEE Signal Processing Magazine (2011). 5, 13
[WAG05] Wilkinson L., Anand A., Grossman R.: Graph-theoretic scagnostics. In IEEE Symposium on Information Visualization (2005), vol. 0, p. 21. 7, 11, 12
[WAG06] Wilkinson L., Anand A., Grossman R.: High-dimensional visual analytics: Interactive exploration guided by pairwise views of point distributions. IEEE Transactions on Visualization and Computer Graphics 12, 6 (2006), 1363–1372. 7, 13
[War94] Ward M. O.: Xmdvtool: Integrating multiple methods for visualizing multivariate data. In Proceedings of IEEE Visualization (1994), pp. 326–333. 3
[War08] Ward M. O.: Multivariate data glyphs: Principles and practice. In Handbook of Data Visualization. Springer, 2008, pp. 179–198. 8
[Wat05] Wattenberg M.: A note on space-filling visualizations and space-filling curves. In IEEE Symposium on Information Visualization (2005), pp. 181–186. 9
[WB94] Wong P. C., Bergeron R. D.: 30 years of multidimensional multivariate visualization. In Proceedings of Scientific Visualization, Overviews, Methodologies, and Techniques (1994), pp. 3–33. 1
[WBP07] Weber G., Bremer P.-T., Pascucci V.: Topological landscapes: A terrain metaphor for scientific data. IEEE Transactions on Visualization and Computer Graphics 13, 6 (2007), 1416–1423. 9
[WBP12] Weber G. H., Bremer P.-T., Pascucci V.: Topological cacti: Visualizing contour-based statistics. Topological Methods in Data Analysis and Visualization II Mathematics and Visualization (2012), 63–76. 9
[Wea09] Weaver C.: Conjunctive visual forms. IEEE Transactions on Visualization and Computer Graphics 15, 6 (2009), 929–936. 3
[WM04] Williams M., Munzner T.: Steerable, progressive multidimensional scaling. In IEEE Symposium on Information Visualization (2004), pp. 57–64. 4, 11, 12
[WO11] Waddell A., Oldford R. W.: RnavGraph: A visualization tool for navigating through high-dimensional data. 10
[WPWR03] Wang J., Peng W., Ward M. O., Rundensteiner E. A.: Interactive hierarchical dimension ordering, spacing and filtering for exploration of high dimensional datasets. In IEEE Symposium on Information Visualization (2003), IEEE, pp. 105–112. 9
[WSPVJ11] Wang B., Summa B., Pascucci V., Vejdemo-Johansson M.: Branching and circular features in high dimensional data. IEEE Transactions on Visualization and Computer Graphics 17, 12 (2011), 1902–1911. 6, 14
[YGX∗09] Yuan X., Guo P., Xiao H., Zhou H., Qu H.: Scattering points in parallel coordinates. IEEE Transactions on Visualization and Computer Graphics 15, 6 (2009), 1001–1008. 8, 10, 12
[YHW∗07] Yang J., Hubball D., Ward M. O., Rundensteiner E. A., Ribarsky W.: Value and relation display: interactive visual exploration of large data sets with hundreds of dimensions. IEEE Transactions on Visualization and Computer Graphics 13, 3 (2007), 494–507. 9
[YPS∗04] Yang J., Patro A., Shiping H., Mehta N., Ward M., Rundensteiner E.: Value and relation display for interactive exploration of high dimensional datasets. In IEEE Symposium on Information Visualization (2004), pp. 73–80. 9
[YRWG13] Yuan X., Ren D., Wang Z., Guo C.: Dimension projection matrix/tree: Interactive subspace visual exploration and analysis of high dimensional data. IEEE Transactions on Visualization and Computer Graphics 19, 12 (2013), 2625–2633. 4, 11
[YWR02] Yang J., Ward M. O., Rundensteiner E. A.: Interring: An interactive tool for visually navigating and manipulating hierarchical structures. In IEEE Symposium on Information Visualization (2002), IEEE, pp. 77–84. 9
[ZCQ∗09] Zhou H., Cui W., Qu H., Wu Y., Yuan X., Zhuo W.: Splatting the lines in parallel coordinates. Computer Graphics Forum 28, 3 (2009), 759–766. 11, 13
[ZJGK10] Ziegler H., Jenny M., Gruse T., Keim D.: Visual market sector analysis for financial time series data. In IEEE Symposium on Visual Analytics Science and Technology (2010), pp. 83–90. 3
[Zom05] Zomorodian A. J.: Topology for Computing (Cambridge Monographs on Applied and Computational Mathematics). Cambridge University Press, 2005. 6
[ZSWR06] Zaixian X., Shiping H., Ward M., Rundensteiner E.: Exploratory visualization of multivariate data with variable quality. In IEEE Symposium on Visual Analytics Science and Technology (2006), pp. 183–190. 14


Brief Biographies of the Authors

Shusen Liu received his bachelor's degree in Biomedical
Engineering and Computer Science from Huazhong University
of Science and Technology, China, in 2009, where he
worked at Wuhan National Laboratory for Optoelectronics on
GPU-accelerated biophotonics applications. He is currently
a PhD student at the University of Utah. His research interests
lie primarily in high-dimensional data visualization and
multivariate volume visualization.

Dan Maljovec is a graduate student working toward his PhD
in the School of Computing at the University of Utah. Dan
has been a research assistant at the University of Utah’s
Scientific Computing and Imaging Institute since 2012. He
received his B.S. in computer science from Gannon University
in 2009. His research focuses on analysis and visualization
of high-dimensional scientific data and intuitive visualization.

Bei Wang received her Ph.D. in Computer Science from
Duke University in 2010. She is currently a Research Scientist
at the Scientific Computing and Imaging Institute,
University of Utah. Her main research interests are computational
topology, computational geometry, scientific data
analysis and visualization. She is also interested in computational
biology and bioinformatics, machine learning and data
mining. She is a member of the IEEE Computer Society.

Peer-Timo Bremer is a member of technical staff and
project leader at the Center for Applied Scientific Computing
(CASC) at the Lawrence Livermore National Laboratory
(LLNL) and Associate Director for Research at the Center
for Extreme Data Management, Analysis, and Visualization
at the University of Utah. His research interests include
large-scale data analysis, performance analysis and visualization,
and he recently co-organized a Dagstuhl Perspectives workshop
on integrating performance analysis and visualization.
Prior to his tenure at CASC, he was a postdoctoral research
associate at the University of Illinois, Urbana-Champaign.
Peer-Timo earned a Ph.D. in Computer Science at the University
of California, Davis in 2004 and a Diploma in Mathematics
and Computer Science from the Leibniz University
in Hannover, Germany in 2000. He is a member of the IEEE
Computer Society and ACM.

Valerio Pascucci is the founding Director of the Center
for Extreme Data Management Analysis and Visualization
(CEDMAV) of the University of Utah. Valerio is also a faculty
member of the Scientific Computing and Imaging Institute, a
Professor of the School of Computing, University of Utah,
and a DOE Laboratory Fellow of the Pacific Northwest National
Laboratory. Previously, Valerio was a Group Leader
and Project Leader in the Center for Applied Scientific Computing
at the Lawrence Livermore National Laboratory, and
Adjunct Professor of Computer Science at the University of
California Davis. Prior to his CASC tenure, he was a senior
research associate at the University of Texas at Austin, Center
for Computational Visualization, CS and TICAM Departments.
Valerio earned a Ph.D. in computer science at Purdue
University in May 2000, and an EE Laurea (Master) at the
University “La Sapienza” in Roma, Italy, in December 1993,
as a member of the Geometric Computing Group. His recent
research interest is in developing new methods for massive
data analysis and visualization.
