DVT Unit2 1
Visualization stages:
The stages of visualization can vary depending on the framework or methodology being used.
However, a commonly referenced framework is the one proposed by Colin Ware in his book
"Information Visualization: Perception for Design." Here are the eight stages according to Ware:
➢ Data Acquisition: This stage involves gathering the raw data that you want to visualize. This
could involve collecting data from databases, spreadsheets, sensors, or other sources.
➢ Data Filtering and Cleaning: Once you have acquired the data, it often needs to be processed
to remove noise, errors, or irrelevant information. This step ensures that the data is accurate
and ready for visualization.
➢ Data Representation: Here, you decide how to represent the data visually. This could involve
choosing appropriate charts, graphs, maps, or other visual forms based on the nature of the
data and the insights you want to convey.
➢ Visual Mapping: In this stage, you map the data attributes to visual properties such as size,
color, shape, position, and motion. This mapping is crucial for effectively communicating the
information encoded in the data.
➢ Interaction Design: Interactive visualizations allow users to explore the data dynamically. This
stage involves designing interactive features such as tooltips, filters, zooming, panning, and
brushing to enhance user engagement and understanding.
➢ Rendering: Once the visualization design is finalized, the data and visual mappings are rendered
into graphical elements on the screen. This involves programming and rendering techniques to
display the visual representation of the data accurately and efficiently.
➢ Knowledge Generation: The primary purpose of visualization is to help users gain insights and
make informed decisions. In this stage, users interact with the visualization to extract
meaningful patterns, trends, and relationships from the data.
➢ Presentation: Finally, the insights derived from the visualization need to be communicated
effectively to stakeholders or decision-makers. This could involve creating reports, dashboards,
or presentations that convey the key findings and recommendations.
These stages are iterative and often involve feedback loops, where insights gained from one stage
inform decisions in subsequent stages.
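As a rough illustration, four of the stages above (acquisition, cleaning, visual mapping, rendering) can be sketched as a chain of small functions. This is only a minimal sketch: the records, the function names, the 25 mpg threshold, and the text-based "rendering" are all hypothetical and are not taken from any particular framework.

```python
# Hypothetical raw records: (car name, price in dollars, miles per gallon).
RAW = [("A", 4000, 30.0), ("B", None, 22.0), ("C", 9000, 18.0), ("D", 6500, 25.0)]


def acquire():
    """Data acquisition: a hard-coded list stands in for a real data source."""
    return list(RAW)


def clean(rows):
    """Filtering and cleaning: drop records that contain missing values."""
    return [r for r in rows if all(v is not None for v in r)]


def visual_map(rows, width=40):
    """Visual mapping: price -> horizontal position, mileage -> mark shape."""
    prices = [price for _, price, _ in rows]
    lo, hi = min(prices), max(prices)
    marks = []
    for name, price, mpg in rows:
        x = round((price - lo) / (hi - lo) * (width - 1)) if hi > lo else 0
        mark = "+" if mpg >= 25 else "o"  # assumed threshold, for illustration
        marks.append((name, x, mark))
    return marks


def render(marks, width=40):
    """Rendering: draw each mark on its own row of text."""
    return [" " * x + mark + " " * (width - 1 - x) + " " + name
            for name, x, mark in marks]


lines = render(visual_map(clean(acquire())))
print("\n".join(lines))
```

Because each stage is a separate function, a change in one stage (for example, a different cleaning rule) can be made without touching the others, which mirrors the iterative feedback loops described above.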
Semiology of Graphical Symbols:
We consider a visual object called a graphical symbol. Figure 4.4(a) is an example. Such
symbols are easily recognized. They often make up parts of visualizations (arrows, labels, . . . ). We will
look at how a graphical object or representation can be well designed, and how it is perceived. The
science of graphical symbols and marks is called semiology. Every possible construction in the Euclidean
plane is a graphical representation made up of graphical symbols. This includes diagrams, networks,
maps, plots, and other common visualizations. Semiology uses the qualities of the plane and objects on
the plane to produce similarity features, ordering features, and proportionality features of the data that
are visible for human consumption.
Important:
o Without external (cognitive) identification, a graphic is unusable. The external identification
must be directly readable and understandable. Since much of our perception is driven by
physical interpretations, meaningful images must have easily interpretable x-, y-, and z-
dimensions and the graphics elements of the image must be clear.
o Discovery of relations or patterns occurs through two main steps. The first is a mapping
between any relationship of the graphic symbols and the data that these symbols represent. In
other words, any pattern on the screen must imply a pattern in the data. If it does not, then it is
an artifact of the selected representation (and is disturbing). This can happen. Similarly, any
perceived pattern variation in the graphic or symbol cognitively implies such a similar variation
in the data. The same is true for ordering. Any perceived order in graphic symbols is directly
correlated with a perceived order in the data.
Features of Graphics:
Rules of a graphic.
All graphics are represented on the screen. All objects will be interpreted as flat (in 2D) or
as physical objects (in 3D). As we saw in Chapter 3, any perceptual interpretation of the
data will assume that the graphic represents parts of a 3D scene. So 3D is the medium by
which we need to interpret the graphic.
3. Within the (x, y, z)-construction, permutations and classifications solve the problem
of the upper level of information;
4. Every graphic with more than three factors that differs from the (x, y, z)-construction
destroys the unity of the graphic and the upper level of information; and
5. Pictures must be read and understood by the human.
Analysis of a graphic.
We discussed perception and cognition earlier. When analyzing a graphic, we first perceive groups of
objects (preattentively). We then attempt to characterize these groups (cognitively). Finally, we examine
special cases not within the groups or relationships between the groups (combination of both). This
process can be done at many levels and with many different visualizations. Supporting analysis plays a
significant role (for example, we can cluster the data and show the results of the computation, hence
speeding up the likely perception of groups).
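A very simple form of such supporting analysis is to group values by proximity before drawing them. The sketch below splits one-dimensional positions into groups wherever the gap between neighbours exceeds a threshold. The function name and the `max_gap` parameter are hypothetical, and real clustering methods are considerably more elaborate.

```python
def group_by_proximity(xs, max_gap):
    """Split 1D positions into groups wherever sorted neighbours
    are more than max_gap apart."""
    groups = []
    for x in sorted(xs):
        if groups and x - groups[-1][-1] <= max_gap:
            groups[-1].append(x)   # close enough: extend the current group
        else:
            groups.append([x])     # gap too large: start a new group
    return groups


print(group_by_proximity([1, 2, 3, 10, 11, 30], max_gap=2))
# [[1, 2, 3], [10, 11], [30]]
```

The computed groups could then be shown with a distinct color or mark per group, so that the viewer does not have to discover them unaided.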
o Rendering transformations.
The final stage involves mapping from geometry data to the image. This includes interfacing
with a computer graphics Application Programmer’s Interface (API). We need to select the viewing
parameters, shading technique if 3D, device transformations (for display, printers, . . . ). This stage of the
pipeline is very dependent on the underlying graphics library. In Appendix C and on the book’s web site,
we have provided examples using OpenGL, Processing, Java, and Flex. There are many others. We have
already precisely defined measures and distance metrics. We now define two measures of visualizations
mathematically. Such measures and modifications can actually be applied at all stages of the pipeline.
This is becoming increasingly important as we want to measure information transfer. The measures of
visualization are:
1) Expressiveness.
An expressive visualization presents all the information, and only the information. Expressiveness thus
measures the concentration of information. Given information that we actually display to the user, we
can define one measure of expressiveness as the ratio Mexp of that information, divided by the
information we want to present to the user. We have Mexp ≥ 0. If Mexp = 1, we have ideal
expressiveness. If the information displayed is less than that desired to be presented, then Mexp < 1. If
Mexp > 1, we are presenting too much information. Expressing additional information is potentially
dangerous, because it may not be correct and may interfere with the interpretation of the essential
information. Such a measure of expressiveness may be extended to include various sets of information,
in which case it becomes a function on sets (see projects).
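The ratio can be made concrete with a small sketch. It assumes, purely for illustration, that "information" can be counted as a set of attribute names; the function name and the example attributes are hypothetical.

```python
def expressiveness(displayed, desired):
    """Ratio of displayed information to desired information (Mexp).

    Both arguments are collections of hypothetical 'information items'
    (here, attribute names); counting items is a simplifying assumption.
    """
    desired = set(desired)
    if not desired:
        raise ValueError("desired information must be non-empty")
    return len(set(displayed)) / len(desired)


desired = {"price", "mileage"}
print(expressiveness({"price"}, desired))                       # 0.5: too little
print(expressiveness({"price", "mileage"}, desired))            # 1.0: ideal
print(expressiveness({"price", "mileage", "weight"}, desired))  # 1.5: too much
```

A plain count cannot tell whether the displayed items are the desired ones, which is one reason a set-based extension of the measure is of interest.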
2) Effectiveness.
A visualization is effective when it can be interpreted accurately and quickly and when it can be
rendered in a cost-effective manner. Effectiveness thus measures a specific cost of information
perception. We can define a measure of effectiveness Meff as some ratio similar to that for
expressiveness. However, it is a bit more complex. What we want is a measure such that for
small data sets we measure interpretation time (since rendering is usually very fast) and when
that time increases, either due to the increasing complexity or the size of the data set, Meff
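decreases, emphasizing the rendering time. We define Meff = 1 / (1 + interpret + render), where
interpret is the interpretation time and render is the rendering time.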
We then have 0 < Meff ≤ 1. The larger Meff is, the greater the visualization’s effectiveness. If
Meff is small, then either the interpretation time is very large, or the rendering time is large. If
Meff is large (close to 1), then both the interpretation and the rendering time are very small.
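The behaviour of this measure can be checked numerically. The sketch below assumes the form Meff = 1 / (1 + interpret + render), which is consistent with the bounds 0 < Meff ≤ 1 stated above; the function name and the use of seconds as the time unit are assumptions.

```python
def effectiveness(interpret_time, render_time):
    """Meff = 1 / (1 + interpret + render), with both times non-negative.

    The time unit (seconds here) is an assumption; the value of Meff
    depends on the unit chosen.
    """
    if interpret_time < 0 or render_time < 0:
        raise ValueError("times must be non-negative")
    return 1.0 / (1.0 + interpret_time + render_time)


print(effectiveness(0.0, 0.0))  # 1.0: instant interpretation and rendering
print(effectiveness(1.0, 0.0))  # 0.5: interpretation dominates
print(effectiveness(2.0, 7.0))  # 0.1: both times are large
```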
Figures 4.3(a) and 4.3(b) show displays for which Mexp can be considered very close, if not identical, for
the task of presenting the car prices and mileage for 1979; both display all the information, and only the
information, and both can be rendered quickly (there’s very little data to be displayed). However, Meff
differs between the two displays:
(a) The scatterplot, using plus as its symbol, provides good query-answering capabilities, but is slower
for simple one-variable queries.
(b) The bar charts clearly display cost and mileage, but don’t provide as much flexibility in answering
some other queries.
In total there are eight ways in which graphical objects can encode information, i.e., eight
visual variables: position, shape, size, brightness, color, orientation, texture, and motion. These eight
variables can be adjusted as necessary to maximize the effectiveness of a visualization to convey
information:
1) Position
2) Mark
3) Size (Length, Area, and Volume)
4) Brightness
5) Color
6) Orientation
7) Texture
8) Motion
Position:
➢ The first and most important visual variable is that of position, the placement of
representative graphics within some display space, be it one-, two-, or three-dimensional.
Position has the greatest impact on the display of information, because the spatial
arrangement of graphics is the first step in reading a visualization. In essence, the
maximization of the spread of representational graphics throughout the display space
maximizes the amount of information communicated, to some degree. The visualization
display with the worst case positioning scheme maps all graphics to the exact same
position; consequently, only the last-drawn graphic is seen, and little information is
exchanged. The best positioning scheme maps each graphic to unique positions, such that
all the graphics can be seen with no overlaps. Interestingly, for the standard computer
screen with a resolution setting of 1024 by 768, the maximum number of individual pixels is
only 786,432; hence, if each data representation is mapped to a unique pixel, we are still
not able to display a million values. And since most graphics employed to represent data
take up considerably more visual real estate than a single pixel, the actual number of
displayable marks diminishes rapidly.
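The arithmetic behind this limit is easy to reproduce. The sketch below counts how many non-overlapping square marks fit on a screen; the function name and the 5-pixel mark size are illustrative assumptions.

```python
def max_marks(width, height, mark_size=1):
    """Upper bound on non-overlapping square marks of
    mark_size x mark_size pixels on a width x height display."""
    return (width // mark_size) * (height // mark_size)


print(max_marks(1024, 768))     # 786432: one mark per pixel
print(max_marks(1024, 768, 5))  # 31212: marks of 5 x 5 pixels
```

Even a modest 5 x 5 pixel mark reduces the budget from roughly 786 thousand to roughly 31 thousand marks.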
(a) using position to convey information. Displayed here is the minimum price versus the maximum price
for cars with a 1993 model year. The spread of points appears to indicate a linear relationship between
minimum and maximum price;
(b) another visualization using a different set of variables. This figure compares minimum price with
engine size for the 1993 cars data set.
Unlike (a), there does not appear to be a strong relationship between these two variables. A logarithmic
scale can be used to map exponentially increasing variables into more compact ranges.
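The effect of a logarithmic scale on position can be sketched as follows. The function names, the 300-pixel axis length, and the sample values are assumptions chosen for illustration.

```python
import math


def linear_scale(value, lo, hi, length):
    """Map value in [lo, hi] linearly onto an axis of the given pixel length."""
    return (value - lo) / (hi - lo) * length


def log_scale(value, lo, hi, length):
    """Map value in [lo, hi] onto the axis using base-10 logarithms.

    Requires value, lo, and hi to be strictly positive.
    """
    span = math.log10(hi) - math.log10(lo)
    return (math.log10(value) - math.log10(lo)) / span * length


for v in [1, 10, 100, 1000]:
    print(v, round(linear_scale(v, 1, 1000, 300), 1), round(log_scale(v, 1, 1000, 300), 1))
```

On the linear axis the values 1, 10, and 100 are crowded into the first tenth of the axis, while on the logarithmic axis the four values are evenly spaced.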
Mark:
➢ The second visual variable is the mark or shape: points, lines, areas, volumes, and their
compositions. Marks are graphic primitives that represent data. For example, both
visualizations described above use the default point to display individual values. Any graphical object can
be used as a mark, including symbols, letters, and words. When working purely with marks,
it is important not to consider differences in sizes, shades of intensity, or orientation, as
these are additional visual variables that will be described later.
➢ When using marks, it is important to consider how well one mark can be differentiated
from other marks. Within a single visualization there can be hundreds or thousands of
marks to observe; therefore, we try not to select marks that are too similar.
➢ Example: one such set of marks is (T and L) or (+ and −), which harnesses our
perceptual systems. The goal is to be able to easily distinguish between different
marks quickly, while maintaining an overall view of the projected data space. Also,
different mark shapes in a given visualization must have similar area and complexity,
to avoid visually emphasizing one or more of them inadvertently.
➢ The previous two visual variables, position and marks, are required to define a
visualization. Without these two variables there would not be much to see. The
remaining visual variables affect the way individual representations are displayed;
these are the graphical properties of marks other than their shape.
Size:
➢ The third visual variable and first graphic property is size. Size determines how small
or large a mark will be drawn. Size easily maps to interval and continuous data
variables, because that property supports gradual increments over some range. And
while size can also be applied to categorical data, it is more difficult to distinguish
between marks of nearly similar size, and thus size can only support categories with
very small cardinality.
➢ A confounding problem with using size is the type of mark. However, when marks
are represented with graphics that contain sufficient area, the quantitative aspects
of size fall, and the differences between marks become more qualitative.
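One common way to keep size quantitative for marks with area is to make the area, rather than the radius, proportional to the data value. This convention is not stated in the text above; it is an assumption of this sketch, as are the function name and parameters.

```python
import math


def radius_for_value(value, max_value, max_radius):
    """Radius of a circular mark whose AREA is proportional to value.

    Assumes 0 <= value <= max_value and max_value > 0.
    """
    return max_radius * math.sqrt(value / max_value)


r = radius_for_value(25, 100, 10.0)
print(r)                                        # 5.0
print((math.pi * r**2) / (math.pi * 10.0**2))   # area ratio, about 0.25
```

Scaling the radius directly would instead give a mark for the value 25 with only one sixteenth of the area of the mark for 100, which exaggerates the difference.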
Brightness:
➢ The fourth visual variable is brightness or luminance. Brightness is the second visual
variable used to modify marks to encode additional data variables. While it is possible
to use the complete numerical range of brightness values, as discussed in Chapter 3,
human perception cannot distinguish between all pairs of brightness values.
Color:
➢ The fifth visual variable is color; see Chapter 3 for a detailed discussion of color
and of how humans perceive color. While brightness affects how white or black
colors are displayed, it is not actually color. Color can be defined by the two
parameters, hue and saturation.
➢ The use of color to display information requires mapping data values to individual
colors. The mapping of color usually entails defining color maps that specify the
relationship between value ranges and color values. Color maps are useful for
handling both interval and continuous data variables, since a color map is generally
defined as a continuous range of hue and saturation values, as illustrated in Figure
4.15 and Figure 4.16. When working with categorical or interval data with very low
cardinality, it is generally acceptable to manually select colors for individual data
values.
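A continuous color map of the kind described above can be sketched with Python's standard `colorsys` module. The blue-to-red hue range, the fixed saturation, and the function name are assumptions made for illustration; they are not the color maps shown in the figures.

```python
import colorsys


def color_map(value, lo, hi, hue_start=2 / 3, hue_end=0.0, saturation=1.0):
    """Map value in [lo, hi] to an (r, g, b) tuple with components in 0..255.

    Hue is interpolated linearly from hue_start (blue) to hue_end (red);
    brightness is held constant so that only hue varies.
    """
    t = min(max((value - lo) / (hi - lo), 0.0), 1.0)  # clamp to [0, 1]
    hue = hue_start + t * (hue_end - hue_start)
    r, g, b = colorsys.hsv_to_rgb(hue, saturation, 1.0)
    return tuple(round(c * 255) for c in (r, g, b))


print(color_map(0, 0, 100))    # lowest value: blue
print(color_map(50, 0, 100))   # midpoint: green
print(color_map(100, 0, 100))  # highest value: red
```

For categorical data with very few categories, a hand-picked dictionary from category to color would replace the interpolation.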
Orientation:
➢ The sixth visual variable is orientation or direction. Orientation is a principal graphic component
behind iconographic stick figure displays, and is tied directly to preattentive vision (see Chapter 3).
This graphic property describes how a mark is rotated in connection with a data variable. Clearly,
orientation cannot be used with all marks; for instance, a circle looks the same under any rotation.
The best marks for using orientation are those with a natural single axis; the graphic exhibits
symmetry about a major axis. These marks can display the entire range of orientations.
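Mapping a data value to the rotation of a line mark can be sketched as below. The function name, the 0 to 90 degree angle range, and the mark length are assumptions for illustration.

```python
import math


def orientation_segment(cx, cy, value, lo, hi, length=10.0, max_angle=90.0):
    """Endpoints of a line mark centred at (cx, cy), rotated by the data value.

    value in [lo, hi] maps linearly onto angles in [0, max_angle] degrees,
    measured counter-clockwise from the horizontal axis.
    """
    t = min(max((value - lo) / (hi - lo), 0.0), 1.0)  # clamp to [0, 1]
    theta = math.radians(t * max_angle)
    dx = 0.5 * length * math.cos(theta)
    dy = 0.5 * length * math.sin(theta)
    return (cx - dx, cy - dy), (cx + dx, cy + dy)


print(orientation_segment(0.0, 0.0, 0, 0, 100))    # horizontal mark
print(orientation_segment(0.0, 0.0, 100, 0, 100))  # vertical mark (up to rounding)
```

A line segment is used because it has a single natural axis; applying the same rotation to a circle would produce no visible change.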
Texture:
➢ The seventh visual variable is texture. Texture can be considered as a combination of many
of the other visual variables, including marks (texture elements), color (associated with
each pixel in a texture region), and orientation (conveyed by changes in the local color).
Dashed and dotted lines, which constitute some of the textures of linear features, can be
readily differentiated, as long as only a modest number of distinct types exist. Varying the
color of the segments or dots can also be perceived as a texture.
➢ Texture is most commonly associated with a polygon, region, or surface. In 3D, a texture
can be an attribute of the geometry, such as with ridges of varying height, frequency, and
orientation. Similarly, it can be associated with the color of the graphical entity, with
regular or irregular variations in color with different ranges and distributions. In fact,
geometric textures can be readily emulated with color textures, with color variations similar
to those obtained via lighting effects. Finally, the distribution and orientation of marks
themselves can form regions of texture.
Motion:
➢ The eighth visual variable is motion. In fact, motion can be associated with any of the other
visual variables, since the way a variable changes over time can convey more information.
One common use of motion is in varying the speed at which a change is occurring (such as
position change or flashing, which can be seen as changing the opacity). The eye will be
drawn to graphical entities based not only on similarities in behavior, but also on outliers.
The other aspect of motion is in the direction; for position, this can be up, down, left, right,
diagonal, or basically any slope, while for other variables it can be larger/smaller,
brighter/dimmer, steeper/shallower angles, and so on. As conveying this concept is difficult
on the static page, we’ve included some examples on the book’s web site.
Different visual variables can serve different purposes. We can categorize these purposes in a
variety of ways. Below, we give one such categorization, provide some examples, and indicate which
visual variables are effective for the purpose.
Selective visual variables.
After coding with such variables, different data values are spontaneously divided by the human
into distinguished groups (e.g., for visualizing nominal values):
• Size (length, area/volume);
• Brightness;
• Texture;
• Color (only primary colors): varies with the brightness value;
• Direction/orientation.
Taxonomies, within the context of data visualization techniques, have a history intertwined with
the evolution of information visualization and graphic representation. Here's a historical perspective on
taxonomies in data visualization:
✓ Scientific Revolution:
The Scientific Revolution of the 16th and 17th centuries brought about
advancements in data collection and analysis. Taxonomies were used to classify scientific
phenomena; Linnaeus' 18th-century taxonomy of plants and animals, for example, influenced
how data was organized and represented.
✓ Statistical Graphics:
The 18th and 19th centuries saw the emergence of statistical graphics, pioneered
by individuals like William Playfair and Florence Nightingale. Playfair introduced techniques such
as line charts and bar graphs to represent statistical data visually, while Nightingale used polar
area diagrams to illustrate the causes of mortality in the Crimean War. These early examples laid
the foundation for taxonomies in statistical visualization.
✓ Information Age:
The advent of the Information Age in the 20th century brought about a proliferation
of data and the need for more sophisticated visualization techniques. Taxonomies became
essential for organizing complex datasets and facilitating data exploration and analysis.
✓ Digital Visualization:
The rise of computers and digital technologies revolutionized data visualization.
Taxonomies were used to categorize visualization techniques based on their purpose, function,
and design principles. For example, Jacques Bertin's seminal work "Semiology of Graphics"
introduced a taxonomy of visual variables, such as shape, size, color, and texture, which
influence how data is encoded and perceived in visualizations.
✓ Contemporary Practices:
In contemporary data visualization practice, taxonomies are used to classify
visualization techniques, tools, and methods. Taxonomies help practitioners and researchers
navigate the vast landscape of visualization possibilities and understand the strengths and
limitations of different approaches. They also facilitate communication and collaboration within
the visualization community.
1. **Semiotic Analysis of Visual Elements**: In data visualization, visual elements such as colors,
shapes, sizes, and positions serve as signs that convey information to users. Experimental semiotics can
be used to analyze how these visual elements are perceived and interpreted by users. For example,
researchers may conduct experiments to investigate how different color schemes or symbol shapes
affect users' understanding of data in visualizations.
3. **Cognitive Semiotics**: Cognitive semiotics explores how signs are processed and understood
by the human mind. In the context of data visualization, cognitive semiotics can be used to investigate
how users' cognitive processes, such as attention, memory, and decision-making, interact with visual
representations of data. For example, researchers may use eye-tracking techniques to study how users
scan and process information in visualizations.
4. **Symbolic Representation**: Data visualizations often use symbolic representation to convey
complex information in a concise and intuitive manner. Experimental semiotics can be used to study
how symbolic representations, such as icons, graphs, and maps, are interpreted by users from different
cultural or linguistic backgrounds. For instance, researchers may examine how cultural norms and
conventions influence the interpretation of color symbolism in visualizations.
5. **Perceptual Salience**: Perceptual salience refers to the degree to which a visual element
stands out from its surroundings. Experimental semiotics can be used to investigate how perceptual
salience influences users' attention and comprehension in data visualizations. For example, researchers
may manipulate the salience of visual elements in visualizations to study its impact on users' perception
and interpretation of data.
Overall, experimental semiotics based on perception provides a valuable framework for studying how
users perceive, interpret, and interact with visual representations of data, shedding light on the
underlying cognitive and perceptual processes involved in data visualization.
3. **Action-Effect Coupling**: Gibson's theory emphasizes the coupling between perception and
action, suggesting that perception is inherently linked to the possibilities for action in the environment.
In data visualization, this principle can be applied to how users interact with visualizations to extract
meaning or insights from the data. For example, users may interact with interactive visualizations by
manipulating parameters or selecting data points to observe how changes affect the visualization.
4. **Information Pickup**: Gibson proposed that perception involves the pickup of information
that is directly available in the environment, without the need for complex cognitive processing. In data
visualization, this concept can be seen in how users quickly perceive patterns, trends, and relationships
in visualizations based on their visual properties, such as color, shape, and position.
5. **Perceptual Learning**: Gibson argued that perception is shaped by experience and learning in
specific environments. In the context of data visualization, users' perceptual processing of visualizations
may be influenced by their familiarity with certain types of visualizations or their domain expertise. For
example, individuals with expertise in data analysis may perceive and interpret visualizations differently
than novices.
Overall, Gibson's affordance theory provides a useful framework for understanding how users perceive
and interact with data visualizations, highlighting the importance of the affordances offered by
visualizations and the coupling between perception and action in the interpretation of visual data.
A model of perceptual processing can be described as a structured way to understand how humans
perceive, interpret, and understand visual information. When applying this to data visualization
techniques, several key stages and principles from perceptual psychology can be incorporated to
optimize how information is presented and comprehended. Here’s a concise outline of such a model:
1. *Sensation*
This is the initial stage where the sensory organs (eyes, in this case) receive visual stimuli from the
environment.
2. *Pre-Attentive Processing*
This involves the automatic detection of basic visual features without focused attention.
3. *Attention*
Focused attention allows for the selective concentration on specific visual elements while ignoring
others.
4. *Perception*
This stage involves organizing and interpreting sensory information to recognize patterns and objects.
- *Key Factors*: Gestalt principles (proximity, similarity, continuity, closure, and figure-ground).
- *Visualization Techniques*: Apply Gestalt principles to organize information intuitively (e.g., grouping
related data points).
5. *Short-Term Memory*
Information that is attended to is temporarily stored for further processing.
7. *Decision Making*
The final stage where perceived and processed information leads to judgments and decisions.