
Visualization Combined


DADS304: Visualization Manipal University Jaipur (MUJ)

MASTER OF BUSINESS ADMINISTRATION


SEMESTER 3

DADS304
VISUALIZATION

Unit 1 : Introduction to Data Visualization 1



Unit 1
Introduction to Data Visualization
Table of Contents

SL No   Topic                                                       Fig No / Table / Graph               SAQ / Activity
1       Introduction to Data Visualization                          -                                    -
1.1     Objectives                                                  -                                    -
2       Emerging Need of Data Visualization in Business Analytics   1                                    1
3       Data Visualization in Data Analytics Lifecycle              2                                    2
4       Audiences of Data Visualization                             -                                    I
5       Techniques of Data Visualization                            3, 4, 5, 6, 7, 8, 9, 10, 11, 12      3, 4, 5, II, III, IV, V, VI
6       Pros and Cons of Data Visualization                         -                                    -
7       Summary                                                     -                                    -
8       Glossary                                                    -                                    -
9       Terminal Questions                                          -                                    -
10      Answers                                                     -                                    -
11      Case Study                                                  13                                   -
12      References                                                  -                                    -


1. INTRODUCTION TO DATA VISUALIZATION

The graphical representation of data and information is known as data visualization. Data
visualization tools offer an easy approach to observe and analyze trends, outliers, and
patterns in data by utilizing visual elements like charts, graphs, and maps. Additionally, it
offers a great tool for employees or business executives to clearly deliver data to non-
technical audiences.

In data visualization, data may be displayed graphically as a bar graph, pie chart, line
chart, or another type of visual representation. This style of representation yields visual
insights that other data presentation methods cannot produce. It helps the brain comprehend
patterns, trends, and outliers and gain insights into them. Humans are not wired to digest
information primarily through written text; once we become familiar with a pattern, however,
the brain is very good at recognizing it. Visualizations greatly accelerate research and
data analysis, and they are also effective communication tools.

1.1 Objectives:

After studying this unit, you should be able to:

❖ Define what data visualization is


❖ Discuss the need for data visualization
❖ Discuss why data visualization is important in business analytics
❖ Illustrate where data visualization stands in the lifecycle of data analytics
❖ Learn various data visualization techniques
❖ Explain the different pros and cons of data visualization.


2. EMERGING NEED OF DATA VISUALIZATION IN BUSINESS ANALYTICS

Most organizations leverage data visualization in their decision-making process. Because
businesses can now comprehend data in graphical or pictorial form, they can spot trends
more quickly.

Data analysis has become increasingly popular, in part because of the greater emphasis it
now places on visualization.

[Figure: a central node, "Role of Data Visualization in Business Analytics", linked to
Data Analysis, Discover Trends, Better Market Analysis, Find Patterns, Decision Making,
Create Impact on Audience, and Improve Customer Acquisition]

Fig 1.1: Illustrating role of data visualization in business analytics

Some of the scenarios in which data visualization plays an important role, as shown in
Fig. 1.1, are listed below.

1. Better approach for data analysis: Business stakeholders can concentrate on areas
that need attention by analyzing visualization reports containing different graphs,
charts and tables for data comparison and analysis. These visual mediums aid analysts
in comprehending crucial information required for their line of work. Whether it's a
sales report or a marketing strategy, a visual depiction of the data helps businesses
make better analyses and decisions that further enhance business revenues.
2. Quick decision making: People process images more quickly than they do long,
laborious tabular forms or reports. Decision-makers can move fast based on fresh data
insights if the data is well-communicated, accelerating both decision-making and
corporate growth.
3. Discover patterns and trends in data: Business users can use data visualizations
to understand their massive data sets. Data analysts benefit from visualizations
because they help them spot new patterns and errors in the data. By making sense of
these patterns, users can focus on areas that show red flags or progress, which in turn
propels the company forward. Without data visualization, it is difficult to find
correlations between variables that appear to be independent; if we can make sense of
those relationships, we can improve our business decisions. Although this may seem like
an obvious application of data visualization, it is actually one of the most beneficial.
Forecasts cannot be made without the required knowledge of the past and the present:
trends over time show us where we have been and where we might go.
4. Highlighting and strengthening impact of message for audiences: Data
visualization increases the impact of your message on your target audiences and
delivers the findings from data research in the most convincing way. It unifies the
messaging used across all organizational groups and departments. With the use of
visualization, you can make sense of large amounts of data quickly and effectively. It
aids in better comprehending the data to assess its impact on the business, and it
visually conveys the information to all stakeholders, internal participants and external
audiences.
5. Formulating better customer acquisition strategies: Frequency is closely tied to
patterns over time. We can get a clearer sense of how potential new consumers would
behave and respond to various marketing and customer acquisition efforts by looking
at the rate, or how frequently, they make purchases and when they do so. This helps
business personnel in formulating better approaches for customer acquisition.

6. Improving market examination, analysis and reaction: Data visualization uses
information from many markets to provide insights into the audiences you should pay
attention to and those you should ignore. Presenting this information in charts and
graphs makes the potential inside those markets easier to see. Without data
visualization techniques, analyzing value and risk measurements requires skill, because
we must work through complex spreadsheets and figures. When data is visualized, we can
identify areas that may or may not need action. When information is quickly and readily
available, shown clearly on a useful dashboard, businesses can act on findings fast and
avoid mistakes.

Self-Assessment Questions - 1

1. What do you mean by data visualization?
a) Graphical representation of data and information
b) Numerical representation of data and information
c) Character representation of data and information
d) None of the above
2. Which one of the following is true of data visualization?
a) It helps users to analyze large amounts of data in a simpler way.
b) It makes complex data more understandable and usable.
c) It is a graphical representation of data.
d) All the above.
3. The primary purpose of a data visualization tool is to provide an easier way to
analyze and understand .................... in data.
a) Trends
b) Patterns
c) Outliers
d) All the above


3. DATA VISUALIZATION IN DATA ANALYTICS LIFECYCLE


The data analytics lifecycle includes the following steps based on the CRISP-DM methodology
as shown in Fig.1.2. CRISP-DM, which stands for Cross-Industry Standard Process for Data
Mining, is an industry-proven way to guide your data mining efforts. The data analytics
lifecycle has the following parts:

1. Understanding business issues


2. Understanding your data set
3. Preparing the data
4. Performing exploratory analysis and modeling
5. Validating your data
6. Visualizing and presenting your findings

[Figure: a cycle of six stages: 1. Business Understanding, 2. Data Understanding,
3. Data Preparation, 4. Exploratory Data Analysis and Data Modeling, 5. Data Validation,
6. Data Visualization]

Fig. 1.2: Data visualization in data analytics lifecycle


1. Understanding business issues: Understanding your goals from a business standpoint
is the first step in the CRISP-DM process. Your organization may have conflicting goals
and restrictions that need to be carefully balanced. The aim of this phase is to uncover
significant factors that might affect the project's outcome. If this phase is skipped, a
lot of time and energy may be spent producing the right answers to the wrong questions.
The phase requires learning in more detail about all the resources, restrictions,
assumptions, and other factors that must be taken into account when defining the data
analysis goal and project plan. It also involves learning what the project's desired
outcomes are, and creating a project plan that outlines the procedures to be followed
throughout the remainder of the project, including the initial selection of tools and
techniques.
2. Understanding your data set: The second step of the CRISP-DM process is to obtain the
data listed in the project resources. This initial collection includes data loading if
loading is required to understand the data; for instance, it makes sense to load your
data into the program you use specifically for exploring data. If you acquire several
data sources, consider how and when you will combine them. This step includes the
following tasks:
a. Describe acquired data: Examine the properties of the acquired data and create a
Data Description Report (DDR) on the results.
b. Data exploration: Address the relevant data exploration questions using querying,
data visualization and reporting techniques. These may cover the key data attributes,
relationships between pairs of attributes, simple data aggregations and basic
statistical analyses.
c. Confirm data quality: Examine the quality of the data to find out whether it is
complete, whether it is correct or contains errors, and whether any values are
missing.
d. Formulate a Data Quality Report (DQR): This report lists the results of the data
quality checks above and suggests probable solutions for data quality issues.


3. Preparing the data: This task entails improving the data quality to the standard
required by the analytic methods you've chosen. In order to do this, one may choose
clean subsets of the data, insert appropriate defaults, or use more ambitious strategies,
including modeling to estimate missing data.
4. Performing exploratory analysis and modeling: Exploratory data analysis is the
crucial preliminary analysis of the data, carried out to find patterns, identify
anomalies, and test hypotheses with the help of graphical representations. After
completing exploratory data analysis, you choose the actual modeling technique to be
used in the initial modeling phase. A tool may already have been chosen during the
business understanding step, but now you have to choose the specific modeling technique,
such as decision-tree construction using C5.0 or neural network training with
backpropagation.
5. Validating your data: Earlier steps of the lifecycle addressed the accuracy and
generality of the model. In this step, you evaluate how well the model satisfies your
business objectives and look for any business reasons why the model may be deficient.
If time and budget allow, another option is to test the model(s) on test applications
in the real application. The evaluation also covers any additional data mining findings
you may have produced. The outcomes of data mining include models that are necessarily
tied to the original business goals, as well as other discoveries that may not be
related to those goals but may reveal new problems, details, or hints for future work.
6. Visualizing and presenting your findings: The last step of the lifecycle is to
communicate the evaluated results. It involves determining the best method of
presenting the insights, based on the analysis and the audience concerned. It begins
with creating dynamic dashboards that highlight the business analysis and convey the
business problems to the audience with greater impact. Next, the insights are combined
into a compelling story about the business problem, ending with relevant
recommendations.
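
The data quality check in step 2(c) and the simplest preparation strategy in step 3
(inserting appropriate defaults) can be sketched in a few lines of Python. This is only
an illustration under stated assumptions: the records, the column names and the choice of
the column mean as the numeric default are all hypothetical and are not prescribed by
CRISP-DM.

```python
from statistics import mean

# Hypothetical sales records; None marks a missing value.
records = [
    {"region": "North", "units": 120, "revenue": 2400.0},
    {"region": "South", "units": None, "revenue": 1800.0},
    {"region": "East", "units": 95, "revenue": None},
    {"region": None, "units": 110, "revenue": 2200.0},
]


def missing_counts(rows):
    """Count missing (None) values per column: a tiny data quality report."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


def fill_defaults(rows, text_default="Unknown"):
    """Return new rows in which numeric gaps are filled with the column mean
    and all other gaps with a fixed text default.
    Assumes every row has the same columns as the first row."""
    numeric_means = {}
    for column in rows[0]:
        present = [r[column] for r in rows if isinstance(r[column], (int, float))]
        if present:
            numeric_means[column] = mean(present)
    cleaned = []
    for row in rows:
        new_row = {}
        for column, value in row.items():
            if value is None:
                value = numeric_means.get(column, text_default)
            new_row[column] = value
        cleaned.append(new_row)
    return cleaned


report = missing_counts(records)   # input for a Data Quality Report
cleaned = fill_defaults(records)   # prepared copy; the original rows are untouched
print(report)
print(cleaned)
```

Mean imputation is only one possible default. As step 3 notes, more ambitious strategies
such as modeling the missing values may be more appropriate for a real project.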

Understanding your audience before selecting a visual chart or graph can help you select the
one that will effectively convey your message. The findings you want to communicate to your
audience will have a direct impact on the chart you use. To decide, you can ask some
relevant questions, such as:

❖ Do you wish to demonstrate how combining data columns can result in insightful
information?
❖ Do you wish to display the patterns in a dataset?
❖ Do you wish to demonstrate the comparison of different data variables?
❖ Would you like to illustrate the connections between the data variables?

Answering a few of these questions can help determine which charts are most appropriate
for you. Choosing the best chart usually requires some experimentation with various charts.

Self-Assessment Questions - 2

4. Can you use data visualization during the exploratory data analysis step?
a) Yes
b) No


4. AUDIENCES OF DATA VISUALIZATIONS

There are various participants in the process of Data Visualization. But there are some
important audiences for whom the visualization outputs are most important and crucial
because these audiences are directly related to the process of decision-making. These
visualizations help them achieve the goal of quick and reliable decision-making for business
progress. These data visualization audiences are listed below:

1. C-Suite/Executive Management (Chief Executive Officer, Chief Operations Officer,
Chief Financial Officer, etc.): Unless one works in mid-level data management or is a
senior analyst, one is unlikely to work with this group. This group of management is
typically older, extremely experienced, and less technically skilled than analysts. Since
they are the ones running the business, executive managers require concise summaries.
Most of the time they need only the high-level picture to make decisions quickly, yet
occasionally they may want to go deep into the numbers. Tools such as Tableau, Power BI
and R Shiny suit this audience because they make it possible to create reports quickly
rather than build intricate dashboards.
2. Upper-Level Management (Vice President of Marketing, Vice President of Sales,
Director of IT, etc.): If you are leading a team of Data Analysts, you will likely work
with this group more often than with executive management. This demographic is often
experienced, middle-aged, and of above-average technical skill. They enjoy going a
little deeper into the numbers, but they generally prefer to maintain a high-level
perspective. This category of managers might benefit from using Tableau since they enjoy
exploring the data on their own.

Study Notes
Did you know? Popular tools used for data visualization include Tableau, Dundas BI,
Jupyter, Zoho Reports, Google Charts, Visual.ly, RAW, IBM Watson, Sisense, Plotly,
Datawrapper, Highcharts, FusionCharts, Power BI and QlikView.

3. Mid-Level Management (Marketing Automation Manager, Sales Development
Manager, etc.): This is one of the groups you will work with most frequently if you are
at the beginning of your career or have been promoted to work in a Data Analytics team.
This group is more technically skilled, younger and comparatively less experienced in
business management. They are more interested in delving into the figures than other
groups because they grew up at the dawn of the information age and are keener to learn
new technologies for their career progress.
4. Specialized Positions/Individuals (BI Developer, Web Analyst, Customer
Development Representative, etc.): This group is young and relatively inexperienced, but
its members have strong technical aptitude. All reporting and visualization tools are
effective with this group, since its members are likely to understand them. They are
tool specialists, because they continuously work with new tools, depending on the
projects and scenarios given to them, to achieve a particular solution.

Activity I
Assume that you are a project manager in company ABC. Do you think data
visualization is more attractive than tables and other formats? Justify your answer.



5. TECHNIQUES OF DATA VISUALIZATION

Charts and graphs are crucial components of working with data because they allow for the
condensing of large amounts of data into a format that is simple to comprehend. Data
visualizations can communicate findings to people who won't see the raw data as well as
reveal insights to someone looking at the data for the first time. There are innumerable chart
types, each with a unique set of applications. Choosing the right sort of chart for the task at
hand is frequently the most challenging aspect of developing a data visualization. The type
of chart you choose will rely on a number of variables: the categories of metrics,
characteristics, or other variables you intend to plot; the type of inferences you want the
audience to draw; and so on.

The following types of data visualization techniques may help:

1. Bar Chart: Bar charts let us compare numerical quantities such as percentages and
integers. The length of each bar represents the value of the corresponding category.
Using simple, evenly spaced bars or rectangles, bar charts show differences between
categories or subcategories by scaling the bars' height or width. Quantitative measures
can be displayed either vertically, on the y-axis, or horizontally, on the x-axis; the
choice depends on the data and the questions the visualization attempts to answer. The
qualitative dimension is placed on the axis perpendicular to the quantitative measure.
The baseline of a bar chart is usually zero. If a different starting point is chosen,
the axis should be clearly labeled to avoid misleading the viewer.

There are many bar chart variations, including stacked, side-by-side, and grouped
(clustered) bar charts. Labels and legends enable the audience to understand and
interpret the details represented in bar charts, as shown in Fig. 1.3.

A great bar chart will abide by these guidelines:

❖ The base is zero.


❖ The axes have distinct labels.
❖ Colors are predictable and well-defined.
❖ There aren't too many bars in the bar chart.

Things to avoid when making a bar chart:

❖ Making the bars of varying widths
❖ Overcrowding a category with too many bars
❖ Leaving the axes unlabeled
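
To make the zero-baseline guideline concrete, the following minimal Python sketch draws a
horizontal bar chart as plain text. The function name `text_bar_chart`, its parameters
and the sales figures are hypothetical and serve only as an illustration; a real report
would normally use a charting tool instead.

```python
def text_bar_chart(values, width=40, symbol="#"):
    """Render a horizontal bar chart as text lines with a zero baseline.

    Every bar has the same thickness (one line), every bar carries its
    category label and value, and bar length is proportional to the value,
    with the largest value filling `width` characters.
    """
    if any(v < 0 for v in values.values()):
        raise ValueError("this sketch only handles non-negative values")
    largest = max(values.values())
    label_width = max(len(label) for label in values)
    lines = []
    # Sorting from largest to smallest is a presentation choice, not a rule.
    for label, value in sorted(values.items(), key=lambda kv: kv[1], reverse=True):
        bar_length = round(width * value / largest) if largest else 0
        lines.append(f"{label:<{label_width}} | {symbol * bar_length} {value}")
    return lines


# Hypothetical units sold per store.
sales = {"Store A": 120, "Store B": 180, "Store C": 90, "Store D": 150}
chart = text_bar_chart(sales, width=30)
print("\n".join(chart))
```

Because each bar starts at zero, a bar that is twice as long always represents a value
that is twice as large, which is the property a truncated axis would break.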

Fig. 1.3: A vertical bar chart showing the most populated countries as per 2020 World
Population Data

Fig. 1.4: A horizontal bar chart showing the most populated countries as per 2020 World
Population Data


Activity II
Activity: Make a bar chart that represents exotic pet ownership in the UK. The data are:

8,000,000 fish, 1,500,000 rabbits, 1,300,000 turtles, 1,000,000 poultry, 900,000 hamsters.

Activity III
Activity: Consider that you have drawn a bar chart (overview) to compare your
product with another’s product. Is it correct? Justify the answer.

Fig.1.5: Product comparison bar chart

2. Pie Chart: A pie chart is useful for organizing and displaying data as percentages of
a total. True to its name, this type of visualization uses a circle to represent the
whole and slices of that circle to represent the categories that make up the whole.
With this sort of chart, a user can compare the shares of various dimension members
(such as categories, products, people, countries, etc.) within a particular context.
The numerical data (measure) is typically divided into percentages of the overall sum.
Each slice represents its value's percentage of the total and should be sized
accordingly, as shown in Fig. 1.6.


Analyses supported by pie charts:

❖ Pie charts should be used to illustrate how various components relate to the whole.
❖ They perform best when applied to dimensions with a small number of categories.
❖ A pie chart can help the data story shine if you need to show that one part of the
whole is overrepresented or underrepresented.
❖ Pie charts are ineffective for comparing precise figures.

Apply a pie chart when:

❖ You have a sum that can be divided into two to five groups.
❖ There is a big difference between the weight of each category.

Avoid using pie charts when:

❖ There are too many categories or varieties in your dimension.


❖ There are comparable percentages or figures between various values within the
selected dimension.
❖ The percentages do not add up to 100 per cent or the data does not represent a
consistent ‘whole’.
❖ Your measure value contains negative values or complicated fractions.

Best practices for pie charts:

❖ Each pie slice needs to be properly labelled and have the correct number or percentage
assigned to it.
❖ To make it simple for the user to compare the slices, the slices should be arranged
according to size, either smallest to biggest or biggest to smallest.
❖ When possible, labels must be given to the slices. Try not to make the visual display too
complex.
❖ If the chart includes more than five slices, make sure to use a legend, list, or table to
provide the reader more context.
❖ If there needs to be a comparison between several categories, think about using a line
chart. Line charts offer a quick overview of the patterns and trends present in various
data sets as well as how they interact with one another.
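
The arithmetic behind a pie chart can be sketched as follows: each value is converted to
a percentage of the total and to the matching slice angle out of 360 degrees, and the
slices are ordered by size. The function name `pie_slices` and the market-share figures
are hypothetical; the checks mirror the cases listed above in which a pie chart should
be avoided.

```python
def pie_slices(values):
    """Return (label, percentage, angle_in_degrees) for each category,
    ordered from the biggest slice to the smallest."""
    if not values:
        raise ValueError("need at least one category")
    if any(v < 0 for v in values.values()):
        raise ValueError("a pie chart cannot show negative values")
    total = sum(values.values())
    if total == 0:
        raise ValueError("the categories must add up to a non-zero whole")
    ordered = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    return [(label, 100 * v / total, 360 * v / total) for label, v in ordered]


# Hypothetical unit sales for four products.
shares = {"Product A": 50, "Product B": 25, "Product C": 15, "Product D": 10}
slices = pie_slices(shares)
for label, pct, angle in slices:
    print(f"{label}: {pct:.1f}% ({angle:.0f} degrees)")
```

Because every percentage is computed against the same total, the slices always add up to
100 per cent, which is the consistent 'whole' that a pie chart requires.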


Fig. 1.6: Pie chart showing land area and density comparison

Self-Assessment Questions - 3

5. Can you draw a pie chart for multiple groups with more than 100 categories?

a) Yes
b) No


3. Line Chart: A line chart, also known as a line graph or a line plot, uses a line to
connect a series of data points, as shown in Fig. 1.7. This type of graph uses sequential
values to show trends. The x-axis (horizontal axis) usually shows a sequence of values in
consecutive order, and the y-axis (vertical axis) shows the values of a chosen metric
across that sequence. This basic chart works very well when you need to illustrate data
across time. One use case is tracking customer interest in a particular category of
goods or services over the course of a year in order to create forecasts for the coming
year.

Analyses supported by line charts:

A line graph makes it possible to track the behavior of a data set, and these graphs can
be used for purposes other than observing change over time.

These graphs also help to bring out variations and connections in your data. A line chart
can also assist a viewer in forecasting potential future events.
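
One simple way to turn the trend in a line chart into a rough forecast is to fit a
straight line through the points and extend it by one step. The sketch below uses
ordinary least squares for this; it is an assumption made for illustration, not a method
prescribed by this unit, and the monthly interest figures are hypothetical.

```python
def fit_trend_line(ys):
    """Least-squares line through the points (0, ys[0]), (1, ys[1]), ...
    Returns (slope, intercept)."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical customer-interest index for six consecutive months.
monthly_interest = [100, 110, 120, 130, 140, 150]
slope, intercept = fit_trend_line(monthly_interest)

# Extending the line one step beyond the last observed month.
next_month = slope * len(monthly_interest) + intercept
print(f"trend: {slope:+.1f} per month, forecast for next month: {next_month:.1f}")
```

A straight line is reasonable only when the plotted points look roughly linear. Seasonal
data, such as the real estate example in Case 1 below, would need a different model.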

Case 1: Consider a line graph that shows how the real estate market in India is seasonal. This
knowledge could be used by a user to research many things before making a property
purchase. They can try to determine the ideal time to buy or sell a house or how a recession
might affect the availability of homes.

Case 2: Consider a stock market line graph for a particular company's stock. Investors
frequently use line charts to help make buying and selling decisions. Line graphs can
display how a value has changed over any time scale, from yearly to minute-by-minute.


Fig.1.7: Line chart for stock price monitoring over years

Activity IV
The following table provides information on the favorite colors of a group of people.
Draw a line graph for this information.

4. Tree Map: A tree map is a visualization made of nested rectangles. These rectangles are
arranged in a hierarchy, or ‘tree’, to represent specific categories within a chosen
dimension, as shown in Fig. 1.8. Quantities and patterns can be compared and displayed
in a limited chart space. Tree maps show relationships between parts and wholes. This
visualization was created by University of Maryland computer science professor
Ben Shneiderman to make the most of the available space.

Best practices of tree maps:

❖ With the help of tree maps, readers may quickly and easily analyze their data.
❖ Color can represent either dimensions (such as categories) or measures (such as
KPIs).
❖ If a KPI is being shown, a darker color may draw attention to extremes, either high or
low. For dimensions, a user may choose a categorical palette, assigning a different
color to each member of the dimension.


Fig.1.8: Tree map showing region wise sales nested year wise

Case 1: A user may use a categorical palette for a dimension, assigning a different color
to each delivery option. A continuous color palette suits measures such as a business's
sales figures or profit. When looking for insights in a tree map, note that the largest
box represents the largest portion of the whole and the smallest box the smallest
portion. These boxes can be nested to show various categories for a more in-depth
investigation. The ‘Total Sales’ data set, for instance, might have a box labeled
‘Region wise Sales’, and that box may contain nested boxes showing ‘Year wise Sales’.
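
The nesting described in Case 1 can be sketched with a simple slice-and-dice layout: the
chart area is first split among regions in proportion to their sales, and each region's
rectangle is then split among its years. The function names and the sales figures below
are hypothetical, and production tools generally use more refined layouts that keep the
boxes closer to squares.

```python
def total(node):
    """Sum of all leaf values under a node (a number or a nested dict)."""
    if isinstance(node, dict):
        return sum(total(child) for child in node.values())
    return node


def slice_and_dice(node, x, y, w, h, split_horizontally=True, path=()):
    """Lay out a nested dict of positive numbers as rectangles.

    Returns a list of (path, x, y, width, height), one entry per leaf.
    The split direction alternates at each level of the hierarchy.
    """
    if not isinstance(node, dict):
        return [(path, x, y, w, h)]
    whole = total(node)
    if whole <= 0:
        raise ValueError("each branch needs a positive total")
    boxes = []
    offset = 0.0  # fraction of the parent rectangle already used
    for name, child in node.items():
        share = total(child) / whole
        if split_horizontally:
            boxes += slice_and_dice(child, x + offset * w, y, share * w, h,
                                    False, path + (name,))
        else:
            boxes += slice_and_dice(child, x, y + offset * h, w, share * h,
                                    True, path + (name,))
        offset += share
    return boxes


# Hypothetical region wise sales, nested year wise.
sales = {
    "North": {"2021": 30, "2022": 30},
    "South": {"2021": 10, "2022": 30},
}
boxes = slice_and_dice(sales, 0, 0, 100, 50)
for path, x, y, w, h in boxes:
    print(" / ".join(path), f"x={x:.1f} y={y:.1f} w={w:.1f} h={h:.1f}")
```

Each leaf's area is proportional to its value, so the largest box is the largest share
of total sales, exactly as a reader of the tree map would expect.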

5. Histogram: Histograms, a particular type of bar chart, offer a way to display data
distributions, as shown in Fig. 1.9. A histogram displays the values of a single
variable as a series of adjacent bars. Histograms divide a single continuous measure
into groups, or bins, each of which covers a particular range of values. The data
points are then assigned to these equal-width bins, and each bin is drawn as a bar
placed side by side with its neighbors. The height of each bar is the number of
occurrences within that bin's range of values, so the shape of the chart depends on
where the data's values are concentrated. Skew is the term used when values are
concentrated on one side of the midpoint.
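
Binning can be sketched in a few lines of Python: each value is assigned to an
equal-width bin and the occurrences per bin are counted. The function name
`histogram_counts`, the bin width of 100 and the stock prices are hypothetical and were
chosen only for illustration.

```python
def histogram_counts(values, bin_width, start):
    """Count how many values fall into each equal-width bin.

    Bin k covers the range [start + k * bin_width, start + (k + 1) * bin_width).
    Returns a list of (lower_edge, upper_edge, count), including empty bins.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    if min(values) < start:
        raise ValueError("start must not exceed the smallest value")
    n_bins = int((max(values) - start) // bin_width) + 1
    counts = [0] * n_bins
    for v in values:
        counts[int((v - start) // bin_width)] += 1
    return [(start + k * bin_width, start + (k + 1) * bin_width, c)
            for k, c in enumerate(counts)]


# Hypothetical daily closing prices of a stock.
prices = [120, 180, 250, 310, 320, 350, 390, 410, 480, 560]
bins = histogram_counts(prices, bin_width=100, start=100)
for lower, upper, count in bins:
    print(f"{lower}-{upper}: {'#' * count} ({count})")
```

In this made-up sample the tallest bar is the 300-400 bin, and the counts fall away on
both sides of it.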

Types of distributions seen in histograms:

❖ Bimodal distribution – It has two peaks.
❖ Plateau distribution – It rises to a level and stays there for most of the bins.
❖ Edge peak distribution – It resembles a normal frequency distribution, except that one
bin at the end of a tail is much higher than its neighbors.

Fig.1.9: Histogram showing stock price distribution. Maximum count in bin 300-400


Self-Assessment Questions - 4

6. Binning is the grouping of _____________

a) Continuous value
b) Categorical value
c) All the above
d) Skewed value

6. Map: Much of the data gathered contains a location variable, which makes map
visualization straightforward, as shown in Fig. 1.10.

Case 1: For instance, a map visualization might show how many clients a business has in
each country of the world, with each country marked according to its number of
customers. With the aid of such location information, a business can expand into areas
where it does not yet have as strong a presence as elsewhere.


Fig. 1.10: A geographical map showing countries in the lower middle income group

7. Scatter Plot: A scatter plot is also known as an XY graph, scatter chart, or
scattergram. It displays the relationship between pairs of numerical data by plotting
them with one variable on each axis, as shown in Fig. 1.11.
The following situations call for the use of scatter plots:
❖ When you have paired numerical data
❖ When more than one value of the dependent variable is associated with a
particular value of the independent variable
❖ When trying to determine whether two variables are related, for example to
identify potential root causes of a problem, or to check whether two effects that
appear to be connected share the same root cause


Correlation in a Scatter Plot:

Correlation is a statistical indicator of the relationship between the relative movements
of two variables. If the variables are correlated, the points will form a line or curve.
The more tightly the points cluster around that line, the stronger the correlation.

Correlation types:

A scatter plot reveals the correlation between two characteristics or variables and shows
how closely related they are. There are three possible scenarios for the relationship
between the two variables:

• Positive Correlation
• Negative Correlation
• No Correlation

Positive Correlation: The scatter plot reveals a positive association when the points rise
from left to right. It indicates that as one variable's values increase, the other's
tend to increase as well.

Negative Correlation: A negative correlation is present when the points in the scatter
plot fall from left to right. It indicates that as one variable's values increase, the
other's tend to decrease.

No Correlation: There is no association between the variables if the points are dispersed
around the graph and it is impossible to determine whether the values are rising or falling.
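
The three scenarios can also be checked numerically. The sketch below computes the
Pearson correlation coefficient r, which ranges from -1 to +1, and labels the result.
The function names, the sample data and the cut-off of 0.3 used to call a correlation
positive or negative are assumptions made for illustration, not standards set by this
unit.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


def describe(r, threshold=0.3):
    """Label r using a hypothetical cut-off; real projects choose their own."""
    if r >= threshold:
        return "positive correlation"
    if r <= -threshold:
        return "negative correlation"
    return "no clear correlation"


# Hypothetical paired observations.
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]        # tends to rise with ad spend
discount = [1, 2, 3, 4, 5]
margin = [10, 8, 6, 4, 2]      # falls steadily as the discount grows

r_sales = pearson_r(ad_spend, sales)
r_margin = pearson_r(discount, margin)
print(f"{r_sales:.2f} -> {describe(r_sales)}")
print(f"{r_margin:.2f} -> {describe(r_margin)}")
```

The coefficient measures only straight-line association, so a scatter plot remains
useful for spotting curved relationships that r alone would understate.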

Activity V
Consider a scenario in which you want to buy a secondhand car. Consider the two
variables: age of the car vs. price. What kind of correlation is this?

Justify it.


Figure 1.11 below shows the types of correlation in detail, including the strength of each
correlation. Figure 1.12 shows scatter plots of the region-wise profit distribution for each
category of expenses, with the correlations indicated by trend lines.

Fig.1.11: Types of Correlation (Source: Wikipedia)

Fig.1.12: Scatter plots showing the region-wise profit distribution for each category of
expenses, with correlations shown by trend lines


Activity VI
Assume that you are a data analyst. You have been given two features (variables): the
color of a house and the price of the house.

What kind of correlation exists between these variables? Justify your answer.
------------------------------------------------------------------------------------------------------------


Self-Assessment Questions - 5

7. Which chart is helpful for making comparisons?
a) Bar chart
b) Pie chart
c) Tree map
d) All the above
8. A data visualization tool that updates in real time and presents multiple
outputs is called a __________________
a) Data dashboard
b) Data table
c) Metrics table
d) None of the above
9. A _______________ shows the approximate relationship between variables.
a) Trend line
b) Grid line
c) Spark line
d) All the above
10. The benefits of data visualization are ____________
a) Better analysis
b) Pattern identification
c) Exploring business insights
d) All the above


6. PROS AND CONS OF DATA VISUALIZATION


Pros:

• It makes data quickly accessible to a larger audience.
• It conveys a lot of information in a small amount of space.
• It makes a report more visually appealing.

Cons:

• An inappropriate choice of visual representation may misrepresent the information.
• Visuals that are skewed or used excessively may be distracting.


7. SUMMARY
Let us review the main ideas covered in this unit:

❖ Data visualization is the act of converting raw data into visual representations such
as charts, graphs, and dashboards.
❖ The purpose of data visualization is to make data simple and quick to understand.

The following are some uses of data visualization:

❖ To recognize patterns, such as whether sales are declining
❖ To quickly comprehend complex information, for example through dashboards
❖ To explore trends and patterns, assess risks, and take care of problems before they
arise
❖ To tell a business story that highlights issues and recommends relevant solutions

8. GLOSSARY

Visualization - Graphical representation of data.

Histogram - Shows the distribution of a variable.

Bar chart - Used to compare variables.

Pie chart - Shows how parts compare to a whole.

Line chart - Tracks changes over short and long periods of time.

Tree map - Captures the relative size of data categories.

Scatter plot - Shows the relationship between two variables.


9. TERMINAL QUESTIONS

1. Explain the need for data visualization.
2. What is correlation? Explain its types in detail.
3. When do you use a scatter plot?
4. Explain the types of displays in a histogram.
5. Explain the various participants of data visualization.
6. Explain how data visualization plays an important role in business analytics.
7. Discuss in detail data visualization in the data analytics lifecycle.
8. What is a pie chart? When do you apply it? Discuss in detail.
9. What is a dashboard? Explain its benefits.
10. What is a line chart? Discuss in detail with an example.

10. ANSWERS
Self-Assessment Questions Answers

1. a. Graphical representation of data and information
2. d. All the above
3. d. All the above
4. a. Yes
5. b. No
6. c. All the above
7. a. Bar chart
8. a. Data dashboard
9. a. Trend line
10. d. All the above

Terminal Questions Short Answers

1. Need for visualization

Most organizations use data visualization to support decision-making. Because data
presented in graphical or pictorial form is easier to comprehend, businesses can spot
trends more quickly. The growing popularity of data analysis has further increased the
importance of visualization.

2. Correlation describes the relationship between two variables. Its types are:
• Positive Correlation - As one variable's values rise, the other's tend to rise as
well.
• Negative Correlation - As one variable's values rise, the other's tend to fall.
• No Correlation - There is no relationship between the variables.
3. A scatter plot can be used in the following scenarios:
• In the case of paired numerical data
• When more than one value of the dependent variable is associated with a particular
value of the independent variable
• When exploring relationships between variables, for example to identify potential
root causes of a problem or to check whether two effects that appear related share
the same cause.
4. The types of displays in a histogram are:

Bimodal Distribution – It has two peaks.

Plateau Distribution – It rises to a level and stays roughly flat across most of the bins.

Edge Peak Distribution – It resembles a normal frequency distribution, except that one
bin at the end of a tail is higher than those around it.

Terminal Questions Long Answers

5. The various participants of data visualization are:
• C-Suite/Executive Management (Chief Operations Officer, Chief Executive Officer,
Chief Financial Officer, etc.)
• Upper-Level Management (Vice President of Sales, Vice President of Marketing,
Director of IT, etc.)
• Mid-Level Management (Marketing Automation Manager, Sales Development
Manager, etc.)
• Specialized Positions/Individuals (BI Developer, Web Analyst, Customer
Development Representative, etc.)
(For more details refer Section 1.4 Audiences of Data Visualization.)
6. Data visualization plays an important role in business analytics.

It is particularly valuable for:

• Analyzing data more effectively
• Making decisions quickly
• Discovering patterns and trends in data
• Highlighting and strengthening the impact of a message for audiences
• Formulating better customer acquisition strategies
• Improving market examination, analysis and reaction
(For more details refer Section 1.2 Emerging Need of Data Visualization in Business
Analytics.)
7. Data visualization in the data analytics lifecycle

It includes the following steps:

• Understanding Business Issues
• Understanding Your Data Set
• Preparing the Data
• Performing Exploratory Analysis and Modeling
• Validating Your Data
• Visualizing and Presenting Your Findings
(For more details refer Section 1.3 Data Visualization in Data Analytics Lifecycle.)
8. Pie chart

A pie chart is useful for organizing and displaying data as percentages of a total. As
its name suggests, it uses a circle to represent the whole and slices of that circle to
represent the categories that make up the whole. It lets a user compare dimensions
(such as categories, products, people or countries) within a particular context. The
numerical data (measure) is typically expressed as percentages of the overall sum, and
the size of each slice should be proportional to the percentage it represents.

Apply a pie chart when:

• You have a total that can be divided into two to five groups.
• The weights of the categories differ markedly.
(For more details refer Section 1.5 Techniques of Data Visualization.)
9. Dashboard
• A dashboard is a visual display of your key data. Its primary purpose is to provide
information, such as key performance indicators (KPIs), at a glance.
• It allows all kinds of professionals to monitor performance easily and then create
a report. The benefits of dashboards are:
✓ The ability to identify trends
✓ An easy way to measure efficiency
✓ A detailed report with a single click
✓ Support for decision-making
✓ Easy identification of data outliers and correlations
10. Line chart

A line chart, also known as a line graph or a line plot, uses a line to connect a series of
data points, as shown in Fig.1.13. This type of graph uses sequential values to show
trends. The x-axis (horizontal axis) usually shows a sequence of values in consecutive
order, such as dates. The y-axis (vertical axis) shows the values of a chosen metric
across that sequence. This basic chart works very well when you need to illustrate data
over time. One use case is tracking customer interest in a particular category of goods
or services over the course of a year in order to create forecasts for the coming year.

(For more details refer Section 1.5 Techniques of Data Visualization.)


11. CASE STUDY

Case study: Britons' Diet Data Visualization

Britons' diet: This dataset shows how the diet of Britons has changed over the past decades.

The trend lines show that consumption of fatty foods has risen while consumption of healthy
foods has fallen. Presenting the data as trend lines makes it easier to understand and
analyze.

Fig.1.13. Britons Diet data


12. REFERENCES:

Recommended Readings

• Hoelscher, J., & Mortimer, A. (2018). Using Tableau to visualize data and drive decision-
making. Journal of Accounting Education, vol. 44, pp. 49-59.
• Friendly, M. (2008). A brief history of data visualization. In Handbook of data
visualization (pp. 15-56). Springer, Berlin, Heidelberg.
• Healy, K. (2018). Data visualization: a practical introduction. Princeton University
Press.


Unit 2
Basic Visualization Using R
Table of Contents

SL No   Topic                                   Fig No / Table / Graph   SAQ / Activity
1       Introduction                            -                        -
1.1     Objectives                              -                        -
2       Features of R Software                  -                        -
3       Steps to Install R                      1 to 9                   -
4       Basic Commands using R                  10, 11, 12               1
5       Basic Statistical Commands using R      13 to 14                 2
6       Basic Plotting using R                  15 to 22                 3
7       Advanced Plotting using R               23 to 30                 4
8       Activities                              -                        -
9       Summary                                 -                        -
10      Glossary                                -                        -
11      Study Notes and Did You Know            -                        -
12      Case Study                              -                        -
13      Terminal Questions                      -                        -
14      Answers Self Assessment Questions       -                        -
15      Terminal Answer Key                     -                        -
16      Concept Map                             -                        -
17      References                              -                        -


1. INTRODUCTION

An enormous amount of data is now available. In recent times, online platforms have come
to be used more than offline ones. Data is stored in multiple formats, often with a high
degree of redundancy, and its volume grows exponentially day by day. If this continues,
storage will eventually become a serious problem, so the issue needs to be addressed.
Data visualization is the process of representing raw data visually so that inferences can
be drawn from it; based on those inferences, suitable data pre-processing methods can then
be applied. The pre-processed data is ready for further analysis and yields more reliable
inferences. Removing redundant data in this way also frees up storage to a certain extent,
so that it can be used more effectively. There are many platforms for visualizing data.
This unit uses R, which is free, open-source software and part of the GNU project. R is a
statistical programming language in which data can be visualized at a basic or advanced
level, both before and after statistical modelling.

1.1 Objectives:

After studying this unit, you should be able to:

❖ Identify basic commands in R
❖ Explain basic statistical commands in R
❖ Discuss the basics of data visualization using R
❖ Identify advanced data visualization techniques in R


2. FEATURES OF R SOFTWARE

A few major features of R are as follows:

2.1 Open Source

R is an open-source software environment in which packages can be added and removed as
the requirements of an application or project dictate. It is available free of cost.

2.2 Strong Graphical Environment

R can produce static graphics as well as dynamic, interactive graphics. This helps in both
data visualization and data representation.

2.3 Exclusive Active Community Support

R has a strong open-source library ecosystem. A large number of contributors develop R
libraries, which provides strong community support.

2.4 A Wide Selection of Packages

The Comprehensive R Archive Network (CRAN) hosts more than 10,000 packages, which help
users solve a wide range of problems.

2.5 Comprehensive Environment

R supports object-oriented programming. It also has a robust package, Shiny (often called
R Shiny), which can be used to develop web apps.

2.6 Cross Platform Support

R is platform independent. It runs on the major operating systems, including Windows,
macOS and Linux, and supports cross-platform work.

2.7 Support Other Programming Languages

R can interface with other programming languages. Many of the functions in R are written
in C, C++ or Fortran.

2.8 Advanced Statistical Calculation

R supports statistical calculations ranging from basic to advanced, which help draw sound
inferences from the data.


3. STEPS TO INSTALL R
Step 1: Go to the website of The Comprehensive R Archive Network (CRAN) at r-project.org.

Fig. 2.1: CRAN Project Website

Step 2: Download the version of R that matches your operating system, using the
corresponding link below.

https://cran.r-project.org/bin/linux/

https://cran.r-project.org/bin/macosx/

https://cran.r-project.org/bin/windows/

Step 3: To install R on Windows, follow the Windows link.

Step 4: Click on the base subdirectory link.

Step 5: Click the link to download the current R version for Windows.

Fig. 2.2: Download R Version


Step 6: Run the downloaded .exe file and follow the instructions.

Step 6a: Select the desired language and click next.

Fig. 2.3: Select the desired language

Step 6b: Read the license agreement and click next.

Fig. 2.4: License Agreement


Step 6c: Select the components you wish to install and click next.

Fig. 2.5: Select Components

Step 6d: Enter the path where you want to install R.

Fig. 2.6: Selection of Path


Step 6e: Select Additional Tasks.

Fig. 2.7: Selection of Additional Task

Step 6f: Installation Process begins.

Fig. 2.8: Installation Process


Step 6g: Click Finish to complete the installation.

Fig. 2.9: Complete Installation Process


4. BASIC COMMANDS USING R

This section covers basic R commands. The R console is shown in Fig. 2.10.

Fig. 2.10: R Console

1. print() – Prints a value to the console, for example the text Welcome.

Fig. 2.11: Print Command

2. c() – Combines manually entered values into a vector.

Fig. 2.12: c() Command


3. data() – Loads a built-in dataset (often into a data frame)
4. dim() – Gets or sets the dimensions of an object such as a data frame
5. names() – Lists the names of the variables in a data frame
6. View() – Opens a spreadsheet-style viewer showing the contents of a data frame
7. str() – Displays the internal structure of an R object

Fig. 2.13: data(), names(), dim(), str(), View() Commands

8. read.csv() – Reads a .csv file into a data frame
9. read.table() – Reads a file in table format into a data frame
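
The commands listed above can be tried together in one short session. The following is a minimal sketch, assuming the built-in mtcars dataset as sample data and a temporary file for the read.csv() and read.table() calls; the variable names (marks, tmp, cars_csv, cars_tbl) are illustrative choices, not part of R.

```r
# print(): display a value on the console
print("Welcome")

# c(): combine manually entered values into a vector
marks <- c(72, 85, 90, 66)

# data(): load a built-in dataset (mtcars ships with R)
data(mtcars)

dim(mtcars)     # number of rows and columns
names(mtcars)   # names of the variables (columns)
str(mtcars)     # internal structure: type and first values of each column

# View() opens a spreadsheet-style viewer, so it needs an interactive session
if (interactive()) View(mtcars)

# read.csv() and read.table(): write a temporary file, then read it back
tmp <- tempfile(fileext = ".csv")
write.csv(mtcars, tmp, row.names = FALSE)
cars_csv <- read.csv(tmp)
cars_tbl <- read.table(tmp, header = TRUE, sep = ",")
```

R is case-sensitive, so the function names must be typed exactly as shown: str() and read.table() work, whereas Str() and read.Table() produce an error.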

Self-Assessment Questions - 1

1. The ________________ command lists the names of the variables in a data frame.
2. The _______________ command is used to read .csv files.


5. BASIC STATISTICAL COMMANDS USING R


R is widely used for statistical operations and has many built-in statistical functions.

1. mean() – Computes the mean of the data
2. median() – Computes the median of the data
3. summary() – Gives a summary of the data (minimum, quartiles, mean and maximum)
4. var() – Computes the variance of the data
5. sd() – Computes the standard deviation of the data
6. quantile() – Computes the quantiles of the data

Fig. 2.14: mean (), median (), summary (), var (), sd (), quantile () Command
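
The statistical commands above can be tried on a small vector. This is a minimal sketch with made-up values; the name scores is an illustrative assumption. Note that var() and sd() compute the sample variance and standard deviation, dividing by n - 1.

```r
# Made-up values for illustration
scores <- c(2, 4, 4, 4, 5, 5, 7, 9)

mean(scores)       # arithmetic mean: 5
median(scores)     # middle value: 4.5
var(scores)        # sample variance: 32/7, about 4.57
sd(scores)         # sample standard deviation: square root of the variance
quantile(scores)   # minimum, quartiles and maximum
summary(scores)    # minimum, quartiles, mean and maximum in one call
```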

Self-Assessment Questions - 2

3. The _______________ command is used to summarize the data.
4. The sd() command is used for ______________.


6. BASIC PLOTTING USING R

1. Bar Plot: A bar plot compares a numerical value across categories. It is drawn with
barplot().

Fig. 2.15: Bar Plot

2. Box Plot: It shows the location, spread and skewness of the data. It is drawn with
boxplot().

Fig. 2.16: Box Plot


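3. Scatter Plot: It shows the relationship between two variables in a data set. It is drawn
with plot().

Fig. 2.17: Scatter Plot

4. Histogram: It summarizes continuous or discrete data measured on an interval scale by
grouping the values into bins.

hist()

Fig. 2.18: Histogram

5. pnorm(), qnorm(): pnorm() is the cumulative distribution function of the normal
distribution; it returns the probability that a value is less than or equal to a given number.
qnorm() is its inverse; it returns the boundary value below which a given probability lies.

pnorm(), qnorm()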


Fig. 2.19: pnorm (), qnorm ()
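
The relationship between the two functions can be seen in a short sketch. The exam-score figures below (mean 60, standard deviation 10) are made-up values for illustration, and the variable names are assumptions.

```r
# Standard normal distribution (mean 0, sd 1)
p <- pnorm(1.96)     # P(Z <= 1.96), about 0.975
q <- qnorm(0.975)    # value below which 97.5% of the distribution lies, about 1.96

# qnorm() undoes pnorm(), and vice versa
round_trip <- pnorm(qnorm(0.9))   # gives back 0.9

# Other normal distributions: pass mean and sd
# Probability that a score is at most 70 when scores follow N(60, 10)
p_score <- pnorm(70, mean = 60, sd = 10)   # about 0.841

print(c(p = p, q = q, round_trip = round_trip, p_score = p_score))
```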

6. Line Graph: It plots the relationship between two variables as connected line segments.

plot()

Fig. 2.20: Line Graph


7. Pie Chart: A pie chart plots the percentage distribution of the data.

pie()

Fig. 2.21: Pie Chart

8. Stacked Bar Graphs: Bar charts use horizontal or vertical bars to show numerical
comparisons between categories; a stacked bar graph further divides each bar into
segments, one per sub-category.

Fig. 2.22: Stacked Bar Graphs
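
The plot types in this section can all be produced with base R functions. The following is a minimal sketch, assuming the built-in mtcars and pressure datasets as sample data; the titles, labels and the output file name are illustrative choices. The plots are written to a PDF in the temporary directory so that the script also runs without a display.

```r
data(mtcars)
out_file <- file.path(tempdir(), "basic_plots.pdf")
pdf(out_file)   # one plot per page

# Bar plot: number of cars for each cylinder count
cyl_counts <- table(mtcars$cyl)
barplot(cyl_counts, main = "Cars by cylinders",
        xlab = "Cylinders", ylab = "Count")

# Box plot: spread of fuel economy within each cylinder group
boxplot(mpg ~ cyl, data = mtcars, main = "Fuel economy by cylinders",
        xlab = "Cylinders", ylab = "Miles per gallon")

# Scatter plot: relationship between weight and fuel economy
plot(mtcars$wt, mtcars$mpg, main = "Weight vs fuel economy",
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")

# Histogram: distribution of fuel economy
hist(mtcars$mpg, main = "Distribution of fuel economy",
     xlab = "Miles per gallon")

# Line graph: type = "l" joins the points with a line
plot(pressure$temperature, pressure$pressure, type = "l",
     main = "Vapor pressure of mercury",
     xlab = "Temperature", ylab = "Pressure")

# Pie chart: share of each cylinder count
pie(cyl_counts, labels = paste(names(cyl_counts), "cylinders"),
    main = "Share of cars by cylinders")

# Stacked bar graph: a matrix input gives one stacked bar per column
gear_by_cyl <- table(mtcars$gear, mtcars$cyl)
barplot(gear_by_cyl, legend.text = TRUE, main = "Gears within each cylinder group",
        xlab = "Cylinders", ylab = "Count")

dev.off()
```

In an interactive R session, leaving out the pdf() and dev.off() calls draws each plot on screen instead.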


Self-Assessment Questions -3

5. ________________ is a chart used to visualize large data by connecting points in a
continuous line.
6. __________________ describes the distribution of data over a continuous interval or
particular period of time.


7. ADVANCED PLOTTING USING R

Data visualization is very important for identifying trends in data. Different types of
plots support different inferences. Advanced plotting in R requires additional packages,
which are loaded with library().

1. ggplot2: It is a plotting package for building complex plots from data in a data frame.
It offers a programmatic interface for specifying plots and produces clear, readable
graphics that support sound inference.

Fig. 2.23: ggplot2
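
A ggplot2 plot is built by adding layers to a base plot. The following is a minimal sketch, assuming the ggplot2 package has been installed from CRAN (it is not part of base R) and using the built-in mtcars dataset; the labels and the output file name are illustrative choices.

```r
out_file <- file.path(tempdir(), "ggplot2_scatter.pdf")

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Map columns to aesthetics, then add a point layer and labels
  p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
    geom_point(size = 2) +
    labs(title = "Weight vs fuel economy",
         x = "Weight (1000 lbs)", y = "Miles per gallon",
         colour = "Cylinders")

  ggsave(out_file, plot = p, width = 6, height = 4)
} else {
  message("ggplot2 is not installed; run install.packages(\"ggplot2\") first")
}
```

In an interactive session, typing p at the console displays the plot directly.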

2. Lattice: It is a graphics and data visualization package derived from the Trellis
graphics package. It is suited to multivariate data. A basic plot can be produced first
and then refined using the package's further options.

Fig. 2.24: Lattice
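
The multivariate strength of lattice shows in its conditioning formulas. The following is a minimal sketch, assuming a standard R installation (where lattice is included as a recommended package) and the built-in mtcars dataset; the labels and output file name are illustrative choices.

```r
library(lattice)

# y ~ x | g draws one panel of y against x for each level of g
p <- xyplot(mpg ~ wt | factor(cyl), data = mtcars,
            layout = c(3, 1),
            xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
            main = "Fuel economy by weight, one panel per cylinder count")

# Inside a script, a lattice plot is drawn only when it is printed
out_file <- file.path(tempdir(), "lattice_panels.pdf")
pdf(out_file)
print(p)
dev.off()
```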


3. Highcharter: It is an R wrapper for the Highcharts JavaScript library and its modules.
It is highly flexible and customizable, and it has a powerful API for producing charts
from data.

Fig. 2.25: Highcharter

4. Leaflet: Leaflet is an open-source JavaScript library used to create dynamic online
maps. Maps are built up layer by layer.

Fig. 2.26: Leaflet


5. RColorBrewer: It is an important tool for color management. It offers several color
palettes that help make graphics distinctive and readable.

Fig. 2.27: RColorBrewer

6. Plotly: It is an R package for creating interactive, web-based graphics with the help
of JavaScript. It is an open-source library.

Fig. 2.28: Plotly

7. SunburstR: It is an R package for sunburst charts, a special type of visualization that
shows hierarchical data as concentric rings. It is customizable.

Fig. 2.29: SunburstR


8. RGL: It is used to produce interactive 3-D plots. It contains high-level graphics
commands along with basic commands. It renders 3-D visualizations using OpenGL.

Fig. 2.30: RGL

9. Dygraphs: It is an R interface to the dygraphs JavaScript charting library. It provides
rich facilities for charting time series data in R.

Fig. 2.31: Dygraphs

Self-Assessment Questions -4

7. _____________ is a graphics and data visualization package that originated from the
Trellis graphics package.
8. ____________ produces interactive 3-D plots.


8. ACTIVITIES

Activity A

Create a dataset of cancer patients with an appropriate feature set and draw inferences
using data visualization techniques.

………………………………………………………………………………………………………………………………………

Activity: Identify the major symptoms of cancer and its stages, and create a dataset with
a minimum of 100 cases. Visualize the data and draw inferences.

Activity B

Visualize the Titanic dataset using data visualization commands in R.

………………………………………………………………………………………………………………………………………

Activity: Use the Titanic dataset available on Kaggle.


9. SUMMARY
Data visualization is a key technique in data analysis. Data is a very important asset of a
company, but repetitive and useless data wastes storage and can lead to wrong decisions.
Many tools are available for data visualization, such as Tableau, Microsoft Power BI,
Python, Microsoft Excel, MongoDB, RStudio and many more. The visualization techniques
are more or less the same across these tools. The main challenge in data visualization is
to identify the right technique for a given dataset. For example, a pie chart suits data
expressed as percentages of a whole, while a box plot shows the spread and skewness of
the data. Advanced plotting techniques require additional libraries to be added to R and
make it possible to draw sound inferences from the data. Before undertaking any data
visualization in R, however, the basic commands of the platform need to be understood.
This unit discussed basic R commands, essential statistical commands, and basic and
advanced data visualization techniques.

10. GLOSSARY

Braided Graph: a novel visualization technique where filled areas are sorted in depth order.

Bullet Graphs: used for comparing the performance of primary measures with other
measures.

Error Bars: used to identify estimated error in data.

Marimekko Charts: used to visualize categorical data over a pair of variables.


11. STUDY NOTES & DID YOU KNOW

Did You Know? About 65% of the brands use infographics for marketing. 84% of the people
accepted infographics as a powerful tool. Infographics are the fourth most used type of
content marketing. Infographics can boost website traffic by 12%.

Did You Know? 80% of the health care market players invested in big data analytics and
artificial intelligence driven by market demand.

12. CASE STUDY

Collect the retinopathy database from your nearby health care center and visualize the
positive and negative cases.

Q1. Identify the percentage of positive cases in your area.

………………………………………………………………………………………………………

Q2. Identify the percentage of negative cases in your area.

………………………………………………………………………………………………………


13. TERMINAL QUESTIONS

Short Answer Type Questions:

1. Briefly discuss data visualization tools.
2. What is the use of a box plot in data visualization?
3. What is the importance of R for data visualization in data science?

Long Answer Type Questions:

1. Explain the various basic commands in R.
2. Explain the R commands for basic data visualization.
3. Explain the R commands for advanced data visualization.

14. ANSWERS-SELF-ASSESSMENT QUESTIONS


1. names()
2. read.csv()
3. summary()
4. Identifying the standard deviation of the data
5. A line graph
6. Histogram
7. Lattice
8. RGL


15. TERMINAL ANSWER KEY

Short Answer Type Questions Answer Key

1. Briefly discuss data visualization tools.

Ans: A data visualization tool is software that can be used to visualize data, which is
very important for data analysis. Many such tools are available in the market, and they
share some common features. The most popular data visualization tools are Tableau,
Microsoft Excel, Microsoft Power BI, Dundas BI, Jupyter, Zoho Reports, Google Charts,
Visual.ly, RAW, IBM Watson, Sisense, Plotly, Datawrapper, FusionCharts, QlikView,
Infogram, ChartBlocks, D3.js, Chart.js, Chartist.js, Sigma.js, and Polymaps.

2. What is the use of box plot in data visualization?

Ans: Box plots are used to visualize the spread of data. Values that lie beyond the
lower and upper boundary values of the plot are known as outliers; they are often
excluded or examined separately because they can distort the mean of the data. The
median is marked in the middle of the box, and the quartiles divide the full data set
into four parts. The difference between the upper and lower quartiles is known as the
Inter Quartile Range (IQR).
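
The quartiles, the IQR and the outliers can be computed directly. The following is a minimal sketch with made-up values. It assumes the common convention, also the default in R's boxplot(), that a value counts as an outlier when it lies more than 1.5 times the IQR beyond the lower or upper quartile; the variable names are illustrative.

```r
# Made-up values for illustration; 40 is far from the rest
values <- c(12, 15, 14, 10, 13, 16, 14, 15, 11, 40)

q1  <- unname(quantile(values, 0.25))   # lower quartile
q3  <- unname(quantile(values, 0.75))   # upper quartile
iqr <- IQR(values)                      # same as q3 - q1

# Boundary values under the 1.5 * IQR convention
lower <- q1 - 1.5 * iqr
upper <- q3 + 1.5 * iqr
outliers <- values[values < lower | values > upper]
print(outliers)

# boxplot.stats() reports the outliers that boxplot() would draw
print(boxplot.stats(values)$out)

# The outlier pulls the mean upward but barely affects the median
print(c(mean = mean(values), median = median(values),
        mean_without_outliers = mean(values[!values %in% outliers])))
```

boxplot.stats() bases its boundaries on hinges, which can differ slightly from the quartiles returned by quantile(), so the two approaches may occasionally disagree on borderline values.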

3. What is the importance of R for data visualization in data science?

Ans: Several basic and advanced tools are available for data visualization in R, so it
is easy to visualize both raw data and data that has been processed statistically. R is
a strong statistical tool for data visualization.

Long Answer Type Questions Answer Key

1. Explain the various basic commands and statistical commands in R.

Ans:

The basic commands and statistical commands are as follows:

1. print() – Prints a value to the console
2. c() – Combines manually entered values into a vector
3. data() – Loads a built-in dataset (often into a data frame)
4. dim() – Gets or sets the dimensions of an object such as a data frame
5. names() – Lists the names of the variables in a data frame
6. View() – Opens a spreadsheet-style viewer showing the contents of a data frame
7. str() – Displays the internal structure of an R object
8. read.csv() – Reads a .csv file into a data frame
9. read.table() – Reads a file in table format into a data frame
10. mean() – Computes the mean of the data
11. median() – Computes the median of the data
12. summary() – Gives a summary of the data
13. var() – Computes the variance of the data
14. sd() – Computes the standard deviation of the data
15. quantile() – Computes the quantiles of the data
2. Explain the basic data visualization using R.

Ans:

1. Bar Plot, barplot(): It compares a numerical value across categories.
2. Box Plot, boxplot(): It shows the location, spread and skewness of the data.
3. Scatter Plot, plot(): It shows the relationship between two variables in a data set.
4. Histogram, hist(): It summarizes continuous or discrete data measured on an
interval scale by grouping the values into bins.
5. pnorm(), qnorm(): pnorm() is the cumulative distribution function of the normal
distribution. qnorm() is its inverse and finds the boundary value for a given
probability.
6. Line Graph, plot(): It plots the relationship between two variables as connected
line segments.
7. Pie Chart, pie(): It plots the percentage distribution of the data.
8. Stacked Bar Graphs: Bar charts use horizontal or vertical bars to show numerical
comparisons between categories; a stacked bar graph further divides each bar into
segments, one per sub-category.


3. Explain advanced data visualization using R.

Ans:

Data visualization is an important technique for identifying trends in data. Different
types of plots support different inferences. Advanced plotting in R requires additional
packages to be loaded.

1. ggplot2: It is a plotting package for building complex plots from data in a data
frame. It offers a programmatic interface and produces clear, readable graphics.
2. Lattice: It is a graphics and data visualization package derived from the Trellis
graphics package. It is suited to multivariate data. A basic plot can be produced
first and then refined using the package's further options.
3. Highcharter: It is an R wrapper for the Highcharts JavaScript library and its
modules. It is highly flexible and customizable, and it has a powerful API for
producing charts from data.
4. Leaflet: It is an open-source JavaScript library used to create dynamic online maps.
Maps are built up layer by layer.
5. RColorBrewer: It is an important tool for color management. It offers several color
palettes that help make graphics distinctive and readable.
6. Plotly: It is an R package for creating interactive, web-based graphics with the help
of JavaScript. It is an open-source library.
7. SunburstR: It is an R package for sunburst charts, a special type of data
visualization. It is customizable.
8. RGL: It is used to produce interactive 3-D plots. It contains high-level graphics
commands along with basic commands. It renders 3-D visualizations using OpenGL.
9. Dygraphs: It is an R interface to the dygraphs JavaScript charting library. It
provides rich facilities for charting time series data in R.


16. CONCEPT MAP

17. REFERENCES
L. S. Hovhan, 7 Key Benefits of Interactive Data Visualization, October 2020, [online]
Available: https://infogram.com/blog/7-key-benefits-of-interactive-data-visualization/.

Likhitha Ravi, E. Kauffmann, J. Peral, D. Gil, A. Ferrández, R. Sellers and H. Mora, "A
framework for Big Data Analytics in Commercial Social Networks: A Case Study on Sentiment
Analysis and Fake Review Detection for Marketing Decision-Making", Industrial Marketing
Management, vol. 90, pp. 523–537, 2020.

K. Govindan and H. Gholizadeh, "Robust Network Design for Sustainable-Resilient Reverse
Logistics Network using Big Data: A Case Study of end-of-life vehicles", Transportation
Research Part E: Logistics and Transportation Review, vol. 149, pp. 102279, 2021.

E-References

https://onlinecourses.nptel.ac.in/noc22_mg09/

NOC | Essentials of Data Science With R Software - 1: Probability and Statistical Inference
(nptel.ac.in)

NPTEL

https://onlinecourses.nptel.ac.in/noc19_ma33


Unit 3
Introduction to R SHINY
Table of Contents

SL No   Topic                               Fig No / Table / Graph   SAQ / Activity
1       Introduction                        -                        -
2       Visual Analytics                    -                        1
3       Introduction to R-Shiny             -                        2
4       What is R-Shiny?                    -                        3
5       Creating basic app using R Shiny    1 to 9                   4
6       Summary                             -                        -
7       Activity                            -                        -
8       Glossary                            -                        -
9       Study Notes and Did You Know        -                        -
10      Case Study                          10 to 19                 -
11      Terminal Questions                  -                        -
12      Self-Assessment Answers             -                        -
13      Terminal Questions Answers          -                        -
14      Concept Map                         20                       -
15      References                          -                        -


1. INTRODUCTION

In the developing field of visual analytics, interactive visual interfaces are used to facilitate
analytical reasoning. The fundamental concept is the combination of exceptional human
abilities for visual information exploration with enormous computing power to create a
potent environment for knowledge discovery.

Other visual analytics tools, such as Flourish, Infogram, and D3, tend either to be
expensive to use or to lack the sophistication needed to work alongside more complex
statistical analysis, such as dynamical modeling.

R Shiny is a free and open-source package for the R programming language, and as such, it
is integrated with R's enormous array of statistical, numerical, and computational
capabilities.


2. VISUAL ANALYTICS

The use of complex methods and tools to evaluate data using graphical representations of
the information is known as visual analytics. Users can see patterns and gain useful insights
by viewing the data as graphs, charts, and maps. Organizations can improve their data-driven
decisions thanks to these insights.

Benefits:

Share findings and monitor progress: Organize and share key performance indicators across
an organisation by using interactive reports and dashboards.

Take action more quickly: When working with data sets in a visual format, users can
comprehend data insights much more quickly.

Easier data exploration: Self-service analytics solutions that let users engage with data
in a visual context enable them to find hidden links and patterns without requiring
assistance from IT.

Encourages data literacy: Making data easy to use and comprehend makes data analytics
more accessible and involves more individuals inside a company.

Difference between visual analytics and data visualization

The term "data visualisation" typically refers to the graphical representation of data, or
representing data in bubble charts, heat maps, and other visuals to aid in understanding
patterns, relationships, trends, and other important insights in datasets. The term "visual
analytics" describes the use of an analytics tool to carry out in-depth analysis of large,
complicated datasets while enabling users to interact and explore dynamic visuals.

Visual Analytics Best Practices

• Define goals
• Integrate and manage the data
• Simplify visualizations
• Get Inspired


Examples:

• Marketing: By enabling marketers to see and comprehend each stage of the customer
life cycle, visual analytics helps them increase ROI.
• Supply Chain: By displaying KPIs and enabling interactive exploration, big data visual
analytics can assist supply chain managers in quickly discovering relationships across
complicated, divergent data sources.
• Sales: The clear, organised presentation of sales data helps sales managers to boost
revenue, enhance forecasting, and spot important patterns.
• Finance: A loan manager at a consumer bank can investigate how various geographic
areas, products, and loan officers fare over time and determine which factors have the
most influence on revenue and profits.
• IT: Data analytics and visualisation can be used by IT administrators to better predict
future technology requirements and spot underutilized systems and applications.

Self-Assessment Questions - 1

1. Users can see patterns and gain useful insights by viewing the data as __________,
__________ , and ________.

2. The ______________ describes the use of an analytics tool to carry out in-depth
analysis of large, complicated datasets.


3. INTRODUCTION TO R-SHINY
Imagine being able to turn your data science analysis into a web application. The R Shiny
package allows you to do exactly that.

Using the R Shiny framework, you can quickly turn your data research into a web app and
create apps that your company can use in a matter of hours, not weeks or months.

Shiny is an R package that makes it possible to create interactive web applications that
run R code in the background. With Shiny, you can create dashboards, embed interactive
charts in R Markdown documents, and host standalone applications on a website.
Additionally, you can add HTML widgets, JavaScript actions, and CSS themes to your Shiny
applications.

The core functionality of the Shiny web framework is the ability to gather input values from
a web page, make those inputs readily available to the application in R, and have the output
values from the R code posted back to the web page. In its most basic form, a Shiny
application has two parts: a user-interface definition and a server function that performs
the calculations.
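
The input-to-output flow described above can be sketched as a very small app. This is a minimal illustration under our own assumptions rather than an example taken from the unit: the input ID "name", the output ID "greeting", and the greeting text are hypothetical names chosen for the sketch.

```r
library(shiny)

# User interface: one text input and one text output.
ui <- fluidPage(
  textInput(inputId = "name", label = "Your name", value = "world"),
  textOutput(outputId = "greeting")
)

# Server function: reads input$name and writes output$greeting.
server <- function(input, output, session) {
  output$greeting <- renderText({
    paste0("Hello, ", input$name, "!")
  })
}

# Combine both parts into a single app object.
app <- shinyApp(ui = ui, server = server)

# Launch the app in a browser only when R is used interactively.
if (interactive()) runApp(app)
```

Typing into the text box changes input$name, which causes Shiny to re-run the renderText() expression and post the new greeting back to the page.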

TOOLS

Installing the following tools is necessary:

• R: Available from https://cran.r-project.org/bin/windows/base/ (Windows build).
• RStudio: We strongly advise you to download RStudio 1.4.
• The shiny package: Install the latest stable release with install.packages("shiny").
• The tidyverse package: Install it with install.packages("tidyverse").
• A modern web browser: It's best to use Google Chrome.

Self-Assessment Questions - 2
3. ______________ can be done using R Shiny for data science.
4. There are 2 phases of R-Shiny Applications (True/False).


4. WHAT IS R SHINY?

Shiny is an R package that makes it simple to create dynamic web applications directly from
R. You can create dashboards, embed apps in R Markdown documents, or host standalone apps
on a website.

Installation:

A package must be installed before it can be used in R. The command
install.packages("packagename") does this. To install the Shiny package, type:

install.packages("shiny")

FEATURES

• Create straightforward web applications without JavaScript.
• Shiny applications are automatically "live" in the same way that spreadsheets are:
changes to an input result in an automatic update of the outputs.
• Attractive default UI style based on Twitter Bootstrap.
• Works in any R environment.
• Display plots, tables, and printed output from R objects using pre-made, customizable
output widgets.

Self-Assessment Questions - 3
5. We install Shiny by using the command ____________________________________.
6. R-Shiny helps to create straightforward web apps without ___________________.


5. CREATING BASIC R SHINY APP


A Shiny app is saved in a file with the .R extension, just like any other R script. The app
structure has three parts:

1. a user interface object (ui)
2. a server function
3. a call to the shinyApp() function

The user interface (ui) object controls the app's overall design and layout, including
elements such as radio buttons, panels, and selection boxes.

UI OBJECT

The fluidPage() function is used to construct the layout of the default app's user
interface. The fluidPage layout adapts automatically to changes in browser size. Inside
fluidPage(), a sidebarLayout() divides the page into two sections: the sidebarPanel(),
which houses the app's input (the slider), and the mainPanel(), which houses the histogram
output in the example app. This is a reasonably standard layout for a Shiny app; note that
nothing prevents us from adding further elements before or after the sidebarLayout(), such
as the titlePanel(). Each page component is passed as an argument to the function that
contains it, with fluidPage() at the top level.

There are also more advanced and more flexible layout options, such as navbarPage(), which
creates a page with a navigation bar. A user interface can also be created entirely from
scratch using HTML, CSS, etc.
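
The layout described above can be sketched as follows. The IDs "bins" and "distPlot" come from the text; the page title and the slider's range and starting value are assumptions modelled on the default template app that RStudio generates, so treat them as illustrative.

```r
library(shiny)

ui <- fluidPage(
  # Optional element placed before the sidebar layout.
  titlePanel("Old Faithful Geyser Data"),

  sidebarLayout(
    # Left-hand section: holds the app's input.
    sidebarPanel(
      sliderInput(inputId = "bins", label = "Number of bins:",
                  min = 1, max = 50, value = 30)
    ),
    # Right-hand section: placeholder for the histogram output.
    mainPanel(
      plotOutput(outputId = "distPlot")
    )
  )
)
```

Note how the nesting of the function calls mirrors the nesting of the page: the slider is an argument to sidebarPanel(), which is an argument to sidebarLayout(), which is an argument to fluidPage().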

The web application's logic is located in the server function. Each ...Output() function in
the user interface has a corresponding render...() function. The R code that creates the
object we wish to render is passed as the first argument to the render...() function. For
instance, the code to create a histogram sits inside the renderPlot() call of the example
app. Since this must be a single expression, it is usually wrapped in curly braces ({ }).


Fig.1 Implementation of titlePanel()

The value of the slider defined in the user interface (whose input ID is "bins") is
available in the server function as input$bins.

Fig.2 Implementation of sliderInput()

A SERVER FUNCTION

The web application's logic is located in the server function. Each ...Output() function in
the user interface has a corresponding render...() function. The R code that creates the
object we wish to draw is passed as the first argument to the render...() function. For
instance, the code to create a histogram sits inside the renderPlot() call of the sample
app. Since this must be a single expression, it is usually wrapped in curly braces ({ }).
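
A server function of the kind described above might look like the sketch below. The names output$distPlot and input$bins follow the text; the use of the built-in faithful dataset, and the colours and labels, are assumptions borrowed from RStudio's default template rather than details confirmed by this unit.

```r
library(shiny)

server <- function(input, output, session) {
  # renderPlot() pairs with plotOutput("distPlot") in the user interface.
  # The curly braces wrap several statements into a single expression.
  output$distPlot <- renderPlot({
    # faithful is a dataset bundled with R (an assumption for this sketch).
    x <- faithful$waiting

    # input$bins holds the current value of the slider.
    bins <- seq(min(x), max(x), length.out = input$bins + 1)

    hist(x, breaks = bins, col = "darkgray", border = "white",
         main = "Histogram of waiting times",
         xlab = "Waiting time to next eruption (min)")
  })
}
```

Passing this function together with a ui object to shinyApp() completes the app.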

Fig.3 code for plotting


REACTIVITY: Responding to events in Shiny

Shiny uses reactive programming. This means that when something changes (such as the
slider being adjusted), everything that depends on it is updated automatically. The
distPlot graph is the only component of the sample app that depends on the slider; as a
result, if we change the slider, the graph is redrawn.

The graph also depends on several other, less obvious aspects of the app, such as the
window size. Because Shiny is aware of this dependency, the graph is redrawn whenever the
window is resized.
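
The dependency tracking described above can be made explicit with a reactive expression. The following is a sketch under our own assumptions: the reactive bin_breaks and the extra text output binCount are hypothetical additions that do not appear in the sample app, and the faithful dataset is again used as a stand-in.

```r
library(shiny)

server <- function(input, output, session) {
  # A reactive expression: recomputed only when input$bins changes.
  bin_breaks <- reactive({
    x <- faithful$waiting
    seq(min(x), max(x), length.out = input$bins + 1)
  })

  # Depends on bin_breaks(), so it is redrawn whenever the slider moves.
  output$distPlot <- renderPlot({
    hist(faithful$waiting, breaks = bin_breaks(), main = NULL,
         xlab = "Waiting time (min)")
  })

  # A second output that depends on the same reactive expression.
  output$binCount <- renderText({
    paste("Number of bins:", length(bin_breaks()) - 1)
  })
}
```

Neither output is told to update explicitly. Shiny records that both read bin_breaks(), which in turn reads input$bins, and refreshes them when the slider value changes.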

SAMPLE CODE OF A GAME:

UI:

The first page can be built using simple HTML features like:

Fig.4 sample code for image insertion

The second panel, which will house all of our visualisation work, will then be defined. We
must specify our main content and sidebar content before combining them in our second
tabPanel().

We'll add a select widget to our sidebar so the user may choose the Y variable for our plot.
This select widget will be given the input ID "y_var", and we'll use it later to modify the
plot in our server.R file. We reference an output named "plot" in our main content but
define it afterwards, also in server.R.

Set select_values = colnames(data) to supply the choices for selectInput().
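
The sidebar and main content for the second panel could be assembled along the following lines. This is a sketch, not the code behind the figure: the built-in mtcars data frame stands in for the tutorial's dataset, the input ID is written as y_var, and the object names sidebar_content, main_content, and second_panel, as well as the panel titles, are hypothetical.

```r
library(shiny)

# Stand-in data: mtcars replaces the dataset used in the walkthrough.
data <- mtcars

# The column names become the choices offered by the select widget.
select_values <- colnames(data)

sidebar_content <- sidebarPanel(
  selectInput(
    inputId  = "y_var",
    label    = "Y variable",
    choices  = select_values,
    selected = "mpg"
  )
)

# Placeholder for the plot; it is filled in later by the server function.
main_content <- mainPanel(
  plotOutput("plot")
)

second_panel <- tabPanel(
  "Visualization",
  titlePanel("Explore the data"),
  sidebarLayout(sidebar_content, main_content)
)
```

Building the sidebar and main content as separate objects keeps the tabPanel() call short and makes each piece easier to change on its own.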

Fig.5 sample code

Design the second panel also using HTML features.

Fig.6 panel using HTML

Finally we combine the pages in the navbar object.
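
Combining the pages might look like the sketch below. The panel contents are placeholders, and the app title "Data Explorer" and the panel names are assumptions made for illustration only.

```r
library(shiny)

# Placeholder panels; in the walkthrough these hold the real page content.
intro_panel  <- tabPanel("Introduction", p("A short description of the app."))
second_panel <- tabPanel("Visualization", plotOutput("plot"))
about_panel  <- tabPanel("About", p("Notes on the data source."))

# navbarPage() places each tabPanel behind an entry in the top navigation bar.
ui <- navbarPage(
  "Data Explorer",
  intro_panel,
  second_panel,
  about_panel
)
```

The first argument is the title shown in the navigation bar, and every following tabPanel() becomes one page of the app.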


Fig.7 combined pages in navbar object

SERVER:

1. Load data files and libraries.
2. Gather data.
3. Create graphs and charts using the server function.

The server function, created in the server.R file, accepts the input values specified by
the UI and assigns values to the 'output' object, as shown below.

To match the plotOutput("plot") placeholder we wrote in our UI main panel, we assign
output$plot in our server function. To initialise it, we call renderPlot(), which
constructs the plot.

Fig.8 output for function


The main step is incorporating the variables supplied by the user. We created the UI input
variable y_var to choose which variable appears on the y-axis (in a vertical bar plot, it
will show as the x-axis). This variable, which you can refer to as input$y_var, is used to
organise, label, and show data in your plot.

The connections have been made, so you may view and use your visualisation now.
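
A server function that ties these pieces together might look like the following sketch. It assumes the input ID is y_var and the output ID is plot, uses the built-in mtcars data frame as a stand-in for the tutorial's dataset, and draws the bars with base R's barplot(); the reactive named selected is a hypothetical helper, not part of the original code.

```r
library(shiny)

# Stand-in data: mtcars replaces the dataset used in the walkthrough.
data <- mtcars

server <- function(input, output, session) {
  # The column chosen in the select widget; req() waits until a value exists.
  selected <- reactive({
    req(input$y_var)
    data[[input$y_var]]
  })

  # Matches plotOutput("plot") in the user interface.
  output$plot <- renderPlot({
    barplot(selected(), names.arg = rownames(data), las = 2,
            cex.names = 0.6, ylab = input$y_var)
  })
}
```

Because the plot reads input$y_var through selected(), choosing a different variable in the sidebar redraws the bars and relabels the axis without any further code.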


Fig.9 demonstration of mario kart 8 driver

Self-Assessment Questions - 4
7. The server phase in R-Shiny loads data files and libraries, gathers data, and
_________________________________________________________.
8. _____________________ features are used to design the UI interface.


6. SUMMARY

In the developing field of visual analytics, interactive visual interfaces are used to facilitate
analytical reasoning. The fundamental concept is the combination of exceptional human
abilities for visual information exploration with enormous computing power to create a
potent environment for knowledge discovery.

Shiny is an R package that makes it possible to create interactive web applications that
run R code in the background. With Shiny, you can create dashboards, embed interactive
charts in R Markdown documents, and host standalone applications on a website. In
addition, you can add HTML widgets, JavaScript actions, and CSS themes to your Shiny
applications.

7. ACTIVITY

Take the gapminder data, examine it, and then build an entertaining app with R Shiny to
present the dataset.

The dataset can be downloaded from Kaggle.

8. GLOSSARY

R-Shiny: An open-source R package that offers an elegant and robust web framework for
creating online apps.

Visual Analytics: The use of advanced tools and procedures to evaluate datasets using
visual representations of the data.


9. STUDY NOTES & DID YOU KNOW


• Shiny is also available for Python, where it joins web app tools like Streamlit and Dash.
• At the time of writing, Shiny for Python was in alpha and was expected to undergo
significant changes over the following months. However, it can already be tested to see
how it performs.

10. CASE STUDY


Let's look at this case study in order to better understand R Shiny.

Here, we examine data from the Consumer Product Safety Commission's (CPSC) National
Electronic Injury Surveillance System (NEISS).

Fig.10 importing modules

Fig.11 To load data


Fig.12 Sample data in the dataset

Fig.13 Pairing of two more data for context


Fig.14 Exploration of the data

Fig.15 Summary


Fig.16 Estimated number of injuries according to age of both male and female

Fig.17 Prototype


Fig.18 Resulting App

Fig.19 initial version of the NEISS exploration app


11. TERMINAL QUESTIONS

Short Type Questions:

1. What is R-Shiny?
2. What is reactivity?

Long Type questions:

1. What is Visual Analytics?
2. What is R-Shiny? Explain the two phases of R Shiny.

12. SELF ASSESSMENT ANSWERS

1. graphs, charts, maps
2. Visual Analytics
3. Web app
4. True
5. install.packages("shiny")
6. JavaScript
7. Creates graphs and charts using the server function.
8. HTML


13. TERMINAL QUESTIONS ANSWERS


Short Type Questions

1. What is R-Shiny?

Shiny is an R package that makes it possible to create interactive web applications that
run R code in the background. With Shiny, you can create dashboards, embed interactive
charts in R Markdown documents, and host standalone applications on a website.
Additionally, you can add HTML widgets, JavaScript actions, and CSS themes to your
Shiny applications.

2. What is reactivity?

Shiny uses reactive programming. This means that when something changes (such as the
slider being adjusted), everything that depends on it is updated automatically. The
distPlot graph is the only component of the sample app that depends on the slider; as a
result, if we change the slider, the graph is redrawn.

Long Type Questions

1. What is Visual Analytics?

The use of complex methods and tools to evaluate data using graphical representations
of the information is known as visual analytics. Users can see patterns and gain useful
insights by viewing the data as graphs, charts, and maps. Organizations can improve
their data-driven decisions thanks to these insights.

Benefits:

Share findings and monitor progress: Organize and share key performance indicators
across an organisation by using interactive reports and dashboards.

Take action more quickly: When working with data sets in a visual format, users can
comprehend data insights much more quickly.

Easier data exploration: Self-service analytics solutions that let users engage with data
in a visual context enable them to find hidden links and patterns without requiring
assistance from IT.


Encourages Data Literacy: Data analytics becomes more accessible by making data easy
to use and comprehend, involving more individuals inside a company.

Difference between visual analytics and data visualization: The term "data visualisation"
typically refers to the graphical representation of data, or representing data in bubble
charts, heat maps, and other visuals to aid in understanding patterns, relationships,
trends, and other important insights in datasets. The term "visual analytics" describes
the use of an analytics tool to carry out in-depth analysis of large, complicated datasets
while enabling users to interact with and explore dynamic visuals.

Visual Analytics Best Practices

• Define goals
• Integrate and manage the data
• Simplify visualizations
• Get Inspired

A few examples are:

Marketing: By enabling marketers to see and comprehend each stage of the customer life
cycle, visual analytics helps them increase ROI.

Supply Chain: By displaying KPIs and enabling interactive exploration, big data visual
analytics can assist supply chain managers in quickly discovering relationships across
complicated, divergent data sources.

Sales: The clear, organised presentation of sales data helps sales managers to boost
revenue, enhance forecasting, and spot important patterns.

Finance: A loan manager at a consumer bank can investigate how various geographic
areas, products, and loan officers fare over time and determine which factors have the
most influence on revenue and profits.

2. What is R-Shiny? Explain the two phases of R Shiny.


WHAT IS R SHINY?

Shiny is an R package that makes it simple to create dynamic web applications directly
from R. You can create dashboards, embed apps in R Markdown documents, or host
standalone apps on a website.

Installation: A package must be installed before it can be used in R. The command
install.packages("packagename") does this. To install the Shiny package, type:

install.packages("shiny")

FEATURES

• Create straightforward web applications without JavaScript.
• Shiny applications are automatically "live" in the same way that spreadsheets are:
changes to an input result in an automatic update of the outputs.
• Attractive default UI style based on Twitter Bootstrap.
• Works in any R environment.
• Display plots, tables, and printed output from R objects using pre-made, customizable
output widgets.

UI OBJECT

The fluidPage() function is used to construct the layout of the default app's user
interface. The fluidPage layout adapts automatically to changes in browser size. Inside
fluidPage(), a sidebarLayout() divides the page into two sections: the sidebarPanel(),
which houses the app's input (the slider), and the mainPanel(), which houses the
histogram output in the example app. This is a reasonably standard layout for a Shiny
app; note that nothing prevents us from adding further elements before or after the
sidebarLayout(), such as the titlePanel(). Each page component is passed as an argument
to the function that contains it, with fluidPage() at the top level.

There are also more advanced and more flexible layout options, such as navbarPage(),
which creates a page with a navigation bar. A user interface can also be created entirely
from scratch using HTML, CSS, etc.


The web application's logic is located in the server function. Each ...Output() function
in the user interface has a corresponding render...() function. The R code that creates
the object we wish to render is passed as the first argument to the render...() function.
For instance, the code to create a histogram sits inside the renderPlot() call of the
example app. Since this must be a single expression, it is usually wrapped in curly
braces ({ }).

The value of the slider defined in the user interface (whose input ID is "bins") is
available in the server function as input$bins.

A SERVER FUNCTION

The web application's logic is located in the server function. Each ...Output() function
in the user interface has a corresponding render...() function. The R code that creates
the object we wish to draw is passed as the first argument to the render...() function.

For instance, the code to create a histogram sits inside the renderPlot() call of the
sample app. Since this must be a single expression, it is usually wrapped in curly
braces ({ }).


14. CONCEPT MAP

Fig.20 Mind map of R Shiny

15. REFERENCES

• https://mastering-shiny.org/basic-case-study.html
• https://www.analyticsvidhya.com/blog/2021/05/build-interactive-models-with-r-shiny/
• https://shiny.rstudio.com/articles/build.html
• https://www.qlik.com/us/data-visualization/visual-analytics


Unit 4
Dashboard Design using R-Shiny
Table of Contents

SL No   Topic                                  Fig No / Table / Graph   SAQ / Activity
1       Introduction                           -                        -
1.1     Objectives                             -                        -
2       Dashboard design using R-Shiny         -                        1
3       Creation of R-Shiny dashboard          1 to 20                  2, 3, 4
4       Adding features in R-Shiny dashboard   21 to 51                 5
5       Summary                                -                        -
6       Glossary                               -                        -
7       Concept Map                            -                        -
8       Study notes and did you Know           -                        -
9       Case study                             53 to 54                 -
10      Terminal Questions                     -                        -
11      Self-Assessment answer                 -                        -
12      Terminal Questions Answers             -                        -
13      References                             -                        -


1. INTRODUCTION

The R-Shiny framework is a package from RStudio that makes it easy to build interactive
web applications straight from R. R offers powerful tools for analysis and for data
manipulation (wrangling), as well as advanced packages for forecasting and statistical
modeling, and Shiny makes all of these available to a web application. Through such web
applications, a user can visualize data such as metadata and bibliographic data, and
create efficient data reports. These visualization tools can stimulate other users to
create open repositories and to connect regional, national, or international repository
networks. Shiny translates the user's R code into the HTML, JavaScript, and CSS needed to
display the application on the web.

1.1 Objectives

After studying this unit, you should be able to:

❖ Discuss the dashboard design using R-Shiny.
❖ Discuss the creation of R-Shiny dashboard.
❖ Identify the features in the R-Shiny dashboard.


2. DASHBOARD DESIGN USING R-SHINY

2.1 Dashboard

Dashboards are tools that offer current information while employing images to explain the
narratives underlying the data. They assist decision-makers in understanding the
connections in complex, massive data. They display images in a useful arrangement that
makes it easier for the organization to understand and appreciate the data.

2.2 Dashboard Design using R-Shiny

Shiny dashboards give users access to a full web application framework from inside the R
environment. You can quickly turn your R work, such as analyses, visualizations, and
machine learning models, into web applications that benefit companies. End users can use
the result as a complete application without any prior knowledge of R, so you can deliver
a comprehensive, user-friendly, and interactive product that enhances your business
operations.

You may use Shiny as a dashboard development platform to access a variety of R packages
for data research, including the tidyverse. For the visualization of data and models, you
can access advanced graphical features. Add responsiveness and interactivity by embedding
these visuals in Shiny dashboards. This can be done through the interfaces that R provides
to JavaScript-based charting libraries.

Careful attention to the design of a Shiny dashboard, together with the use of functions,
modules, packages, and rapid prototyping, makes the code easier to organise and maintain.
The result is simpler source code control and smaller, more manageable dashboard
components.
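
One way to obtain such smaller components is a Shiny module, which packages a piece of UI and its server logic behind a single ID. The sketch below is an illustration under our own assumptions: the module names summaryUI and summaryServer, the IDs "left" and "right", and the use of the built-in mtcars data are all hypothetical.

```r
library(shiny)

# Module UI: NS(id) prefixes every input and output ID with the module's ID.
summaryUI <- function(id) {
  ns <- NS(id)
  tagList(
    selectInput(ns("column"), "Column", choices = names(mtcars)),
    verbatimTextOutput(ns("stats"))
  )
}

# Module server: the logic for one component, reusable with any data frame.
summaryServer <- function(id, data) {
  moduleServer(id, function(input, output, session) {
    column_mean <- reactive(mean(data[[input$column]]))
    output$stats <- renderPrint(summary(data[[input$column]]))
    column_mean  # returned so that the calling app can reuse the value
  })
}

# The same component is used twice without any ID clashes.
ui <- fluidPage(summaryUI("left"), summaryUI("right"))

server <- function(input, output, session) {
  summaryServer("left", mtcars)
  summaryServer("right", mtcars)
}

app <- shinyApp(ui, server)
```

Because each module instance lives in its own namespace, a dashboard can be assembled from several small, independently tested pieces.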

Self-Assessment Questions - 1
1. __________ environment is used in R-Shiny.
2. __________ assist decision-makers in understanding the connections in complex,
massive data.


3. CREATION OF R-SHINY DASHBOARD

If you need to summarize or display a lot of information on a single webpage so that
everything can be read in one window, a dashboard can be a smart option. Shiny dashboards
provide access to a full web application framework within the R environment. A Shiny
application project is a directory containing an R script saved as app.R. The script
contains code that defines the user interface object and the server function. Both are
supplied as arguments to the shinyApp() function, which builds a Shiny app object, either
a web app or a dashboard.

To ensure that your Shiny apps are not only user-friendly but also offer a pleasurable
experience for your users, follow the 7 steps listed below.

• Begin with the cause.
• Use paper and a pen.
• Use prepared solutions.
• Get motivated.
• Prioritize data storytelling.
• Test first, then repeat the process.
• Look past UI.

Self-Assessment Questions – 2
3. To display a lot of information on a single webpage we use ___________
4. The script is made of code that describes _________ and ____________

3.1 Begin With The Cause:

Start by considering the following issues while choosing the best UX design for your Shiny
app:

• What was the primary impetus behind developing the app?
• Who utilizes the services?
• What are the users going to be able to do with your app? What is the company's
vision?
• How have they accomplished it thus far? Are there any tools or processes to which
they are already accustomed?

Prior to building the Shiny app, it's critical to know the answers to these questions in order
to ensure that the interface will support the main functionality. Knowing the users' identities
will enable you to highlight the important information and conceal or even skip the rest. The
best way to provide the results—in a table, as a downloadable file, or as a graph—depends
entirely on what the users need to complete. Observing how users now complete that task
will help you better understand the entire process or possibly identify a potential
competitive advantage.

Use paper and a pen:

It's a wise idea to sketch out your ideas before you begin creating your app. The
explanation is straightforward: redesigning wireframes is significantly simpler than
altering UI code.

Although there are many tools available (such as Figma) for creating wireframes and
mockups, it is perfectly acceptable to start with simply a pen and paper. The UI components
can easily be removed and moved about the page to observe how they work together.

Don't stick with the first design; try different ones, get input from the intended audience, and
refine the design.

Use prepared solutions:

There are many tried-and-true packages that can give your Shiny app a polished appearance.
You may utilize Appsilon's shiny.fluent to create business applications, especially for
environments that lean heavily toward Microsoft.


Get motivated:

Try finding some inspiration if you're having trouble with the page layout. Visit websites
you enjoy to see how they handle similar features and what makes them easy to navigate.
You can browse Shiny demos if you want more application examples. Then visit websites that
irritate you, or on which you can never seem to find the information you seek, and try to
work out the underlying issue that is hurting the user experience.

If you require a more "formal" strategy, follow the established principles of UX design.

Prioritise data storytelling:

R is an excellent choice for data processing and analysis, so the majority of Shiny apps
include data visualization. Although data visualization design is a large subject, there
are a few crucial points to take into account to get us started.

4 quick data visualization tips:

Type of graph

Use a line graph if the information is primarily concerned with changes over time. Use a bar
chart when you want to demonstrate how the levels of various categories differ from one
another (and don't forget to always start the bar chart at 0!).

Colors

Choose your colors carefully: too many colors make a graph cluttered and hard to read.

Axes

Adjust the axis range to fit the data; otherwise, it will be difficult to see the variation
in the data.

Labels

Decide whether the precise values matter. If they do, label the data points clearly on the
graph. If the relationship between the series matters more, you can hide the labels to make
the graph easier to read.

Test first, then repeat the process:

The best way to determine whether your Shiny app meets user expectations is through
in-depth user interviews. These are extended sessions in which users are asked to complete
a number of tasks with the product. In this way, you can determine whether there are any
persistent issues with the navigation or with usability in general.

The great news is that the app layout can be tested at any point throughout development,
even before it begins. One can manually alter the "screens" as the user "performs an action"
in the app using wireframes or mockups. Just keep in mind not to put off testing until the last
minute. Rebuilding the UI when app development is complete will be expensive.

Look past UI:

It's simple to overlook the fact that the user experience of an app includes more than just
how it appears and functions. Check out the following items to get your Shiny app off to a
good start:

• Is the design of the app responsive?


• When a user encounters an error message, do they understand what went wrong?
• Is the user notified about the app's status, such as when they are waiting for a
calculation and wondering when it will be finished?

Remember that this is only a starting point. The more complex the app and the user flow,
the more UX touch-points there are to consider beyond the visual side of the user
interface.
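
Two of the checks above (clear error messages and status feedback) can be sketched with Shiny's validate()/need() and withProgress() helpers. This is only an illustration: the input called n and the simulated slow step are assumptions, not part of the course material.

```r
library(shiny)

ui <- fluidPage(
  numericInput("n", "Sample size (hypothetical input):", value = 50),
  plotOutput("hist")
)

server <- function(input, output) {
  output$hist <- renderPlot({
    # Show a readable message instead of a raw R error
    validate(need(isTRUE(input$n > 0),
                  "Please enter a sample size greater than zero."))
    # Tell the user that a calculation is running
    withProgress(message = "Calculating...", value = 0, {
      for (i in 1:5) {
        Sys.sleep(0.1)      # stand-in for a slow computation step
        incProgress(1 / 5)
      }
    })
    hist(rnorm(input$n), main = "Simulated data")
  })
}

app <- shinyApp(ui, server)
if (interactive()) runApp(app)
```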

3.2 Creation Of Simple Dashboard:

Installation:

shinydashboard requires Shiny 0.11 or later. To install it, run the command shown in Fig. 1.

Fig.1. Installing r-Shiny
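
The figure is not reproduced in this text. The standard CRAN installation is most likely what it shows; the sketch below adds a check so the package is only installed when missing (the repository URL is an assumption, any CRAN mirror works).

```r
# Install shinydashboard from CRAN only if it is not already available
if (!requireNamespace("shinydashboard", quietly = TRUE)) {
  install.packages("shinydashboard", repos = "https://cloud.r-project.org")
}
library(shinydashboard)
```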

Basics:

A dashboard is composed of a body, a sidebar, and a header. Here is the dashboard page's
user interface at its most basic.

Fig.2. Basic syntax for dashboard.

Blank dashboard:

Using the shinyApp() function, you may immediately view it at the R console. (This code can
also be used to create a single-file app.)

Fig.3.Blank dashboard program.

Fig.4.Output of above program.
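
Since the figures are not reproduced here, the following is a minimal sketch of a blank dashboard: a page made of an empty header, sidebar, and body, passed to shinyApp() together with an empty server function. The app is stored in a variable and launched only in an interactive session.

```r
library(shiny)
library(shinydashboard)

# The three parts of a dashboard page, all left empty
ui <- dashboardPage(
  dashboardHeader(),
  dashboardSidebar(),
  dashboardBody()
)

# A server function with no logic yet
server <- function(input, output) { }

app <- shinyApp(ui, server)
if (interactive()) runApp(app)
```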

Self-Assessment Questions – 3
5. A dashboard is composed of ________ , ________ , and ___________
6. A blank dashboard can be created with the _________ function.

Basic dashboard:

The blank dashboard is obviously not very helpful. We'll need to include elements that are
functional. We can include content-filled boxes in the body.

Fig.5. Basic Program for dashboard.

Fig.6.Output of the above program.
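
A basic dashboard of this kind might look like the sketch below, which places a plot and a slider in two boxes. The output name plot1, the input name slider, and the random data are illustrative assumptions.

```r
library(shiny)
library(shinydashboard)

ui <- dashboardPage(
  dashboardHeader(title = "Basic dashboard"),
  dashboardSidebar(),
  dashboardBody(
    # Boxes need to be placed in a row (or column)
    fluidRow(
      box(plotOutput("plot1", height = 250)),
      box(
        title = "Controls",
        sliderInput("slider", "Number of observations:", 1, 100, 50)
      )
    )
  )
)

server <- function(input, output) {
  set.seed(122)
  histdata <- rnorm(500)   # illustrative data

  output$plot1 <- renderPlot({
    hist(histdata[seq_len(input$slider)], main = "Histogram")
  })
}

app <- shinyApp(ui, server)
if (interactive()) runApp(app)
```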

Next, we can add content to the sidebar. For this example, we'll add menu items that behave
like tabs. They work similarly to Shiny's tabPanels: when you click a menu item, a
different set of content is displayed in the main body.

Two tasks must be completed. First, add menuItems with the appropriate tabNames to the
sidebar.

Fig.7. Creating Sidebar.

Second, in the body, add tabItems with the corresponding values for tabName:

Fig.8. Creating a Body of the dashboard.
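
The two steps can be sketched as follows. The tab names "dashboard" and "widgets" match the menu items mentioned in the text; the tab contents are placeholders.

```r
library(shiny)
library(shinydashboard)

# Step 1: sidebar menu items, each with a tabName
sidebar <- dashboardSidebar(
  sidebarMenu(
    menuItem("Dashboard", tabName = "dashboard"),
    menuItem("Widgets", tabName = "widgets")
  )
)

# Step 2: tabItems in the body with matching tabName values
body <- dashboardBody(
  tabItems(
    tabItem(tabName = "dashboard", h2("Dashboard tab content")),
    tabItem(tabName = "widgets", h2("Widgets tab content"))
  )
)

ui <- dashboardPage(dashboardHeader(title = "Tabs"), sidebar, body)
app <- shinyApp(ui, function(input, output) { })
if (interactive()) runApp(app)
```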

The default display, which also appears when the menu item "Dashboard" is clicked:

Fig.9. Default display of dashboard menu.

And this is what appears when you click "Widgets":

Fig.10. Widgets menu.

3.3 Boxes:

Boxes are the main building blocks of dashboard pages. A basic box can be created with the
box() function, and the contents of the box can be (almost) any Shiny UI content.

Boxes can have titles and header bar colors, set with the title and status arguments.

Fig.11. Program for boxes.

Fig.12. Dashboard Page of boxes.
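
As a sketch of the title and status arguments, the body below holds two boxes with different header colors. The output and input names are illustrative.

```r
library(shiny)
library(shinydashboard)

body <- dashboardBody(
  fluidRow(
    # status sets the color of the header bar
    box(title = "Histogram", status = "primary",
        plotOutput("plot1", height = 250)),
    # solidHeader fills the whole header with the status color
    box(title = "Inputs", status = "warning", solidHeader = TRUE,
        sliderInput("slider", "Number of observations:", 1, 100, 50))
  )
)
```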

tabBox:

Use a tabBox if you want a box with tabs for displaying different sets of content.

Fig.13. Implementation of tabBox code-1

Fig.14. Implementation of tabBox code-2

Fig.15. Dashboard of tabBox.
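
A minimal tabBox sketch is shown below. Giving the tabBox an id (here the assumed name tabset1) makes the currently selected tab available on the server as input$tabset1.

```r
library(shiny)
library(shinydashboard)

body <- dashboardBody(
  fluidRow(
    tabBox(
      title = "First tabBox",
      id = "tabset1", height = "250px",
      tabPanel("Tab1", "First tab content"),
      tabPanel("Tab2", "Second tab content")
    )
  ),
  fluidRow(verbatimTextOutput("selected"))
)

server <- function(input, output) {
  # Echo the name of the tab that is currently selected
  output$selected <- renderText(input$tabset1)
}

ui <- dashboardPage(dashboardHeader(title = "tabBoxes"), dashboardSidebar(), body)
app <- shinyApp(ui, server)
if (interactive()) runApp(app)
```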

infoBox:

An infoBox is a special kind of box used to display simple numeric or text values, together
with an icon.

In the example, the first row of infoBoxes uses the default setting fill = FALSE, while the
second row uses fill = TRUE. Because infoBox content is usually dynamic, shinydashboard
provides the helper functions infoBoxOutput and renderInfoBox.

Fig.16. Implementation of infoBox code-1.

Fig.17. Implementation of infoBox code-2.

Fig.18. Dashboard of infoBox.
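
The sketch below combines a static infoBox with a dynamic one in each row. The button id count and the displayed numbers are assumptions made for illustration.

```r
library(shiny)
library(shinydashboard)

ui <- dashboardPage(
  dashboardHeader(title = "Info boxes"),
  dashboardSidebar(),
  dashboardBody(
    # First row: default appearance (fill = FALSE)
    fluidRow(
      infoBox("New Orders", 10 * 2, icon = icon("credit-card")),
      infoBoxOutput("progressBox")
    ),
    # Second row: solid background (fill = TRUE)
    fluidRow(
      infoBox("New Orders", 10 * 2, icon = icon("credit-card"), fill = TRUE),
      infoBoxOutput("progressBox2")
    ),
    actionButton("count", "Increment progress")
  )
)

server <- function(input, output) {
  # Dynamic boxes are rebuilt whenever the button is clicked
  output$progressBox <- renderInfoBox({
    infoBox("Progress", paste0(25 + input$count, "%"),
            icon = icon("list"), color = "purple")
  })
  output$progressBox2 <- renderInfoBox({
    infoBox("Progress", paste0(25 + input$count, "%"),
            icon = icon("list"), color = "purple", fill = TRUE)
  })
}

app <- shinyApp(ui, server)
if (interactive()) runApp(app)
```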

ValueBoxes:

ValueBoxes are similar to infoBoxes but have a somewhat different appearance. The code in
Fig. 19 creates these valueBoxes. As with the infoBoxes above, some of these valueBoxes are
static and some are dynamic.

Fig.19. Implementation of valueBoxes code.

Fig.20. Dashboard of valueBoxes.
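
A comparable sketch for valueBoxes, with one static and one dynamic box, follows. Names and numbers are again illustrative assumptions.

```r
library(shiny)
library(shinydashboard)

ui <- dashboardPage(
  dashboardHeader(title = "Value boxes"),
  dashboardSidebar(),
  dashboardBody(
    fluidRow(
      # Static valueBox: value first, then subtitle
      valueBox(10 * 2, "New Orders", icon = icon("credit-card")),
      # Dynamic valueBox, filled in by the server
      valueBoxOutput("progressBox")
    ),
    actionButton("count", "Increment progress")
  )
)

server <- function(input, output) {
  output$progressBox <- renderValueBox({
    valueBox(paste0(25 + input$count, "%"), "Progress",
             icon = icon("list"), color = "purple")
  })
}

app <- shinyApp(ui, server)
if (interactive()) runApp(app)
```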

Self-Assessment Questions – 4
7. A simple box can be created using ________ function.
8. Boxes design can be classified as ________ , ________ and ___________

4. ADDING FEATURES IN R-SHINY DASHBOARD


To understand how the parts of a dashboard work together, we first need to understand how a
Shiny UI is built and how it relates to the HTML of a web page.

The HTML tag functions in Shiny, such as div() and p(), return objects that can be rendered
as HTML. For instance, when you run the following command at the R console, it prints
HTML:

Fig.21. HTML tag in R-Shiny.
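
For example, the following sketch builds a div tag and prints it. The class name and text are arbitrary.

```r
library(shiny)

# A tag object; printing it shows the HTML it will render as
tag <- div(class = "my-class", "Div content")
print(tag)
# <div class="my-class">Div content</div>

# as.character() returns the same HTML as a string
html <- as.character(tag)
```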

A Shiny app's UI is built from these pieces of HTML. The shinydashboard package provides a
set of functions designed to create HTML that will generate a dashboard. If you copy the UI
code for a dashboard page (above) and paste it into the R console, it will print out the
HTML for the dashboard.

4.1. Structure Overview

Three components make up a basic shinydashboard: a header, a sidebar, and a body.

Let's look at a straightforward dashboard. The title is contained in dashboardHeader(),
dashboardSidebar() contains a sidebarMenu(), and the output is contained in
dashboardBody().

The dashboardPage() function's six key components are as follows (the controlbar and footer
arguments come from the shinydashboardPlus extension package, not from shinydashboard
itself):

• Skin
• Header
• Sidebar
• Body
• Controlbar
• Footer

Fig.22. Sales Revenue Dashboard Application - UI interface.

We'll now examine each of the six elements that make up a shinydashboard.

• skin()

The skin is the color theme. If a light skin is chosen, the background of the sidebar will
be light. It is easy to select the look you like, depending on the type of app you build.
The plot colors should complement the skin you select for your application. The list of
available skins is provided below.

Fig.23. List of skin tones.
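
In plain shinydashboard, the skin is chosen with the skin argument of dashboardPage(), which accepts "blue" (the default), "black", "purple", "green", "red", or "yellow". A minimal sketch:

```r
library(shiny)
library(shinydashboard)

# Select the black skin instead of the default blue
ui <- dashboardPage(
  skin = "black",
  dashboardHeader(title = "Skin demo"),
  dashboardSidebar(),
  dashboardBody()
)
```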

• header()

A header can contain a title and dropdown menus. Here's an illustration:

Fig.24. Creating title in header.

The dropdownMenu() function creates the dropdown menus. There are three types of menu
(messages, notifications, and tasks), and each must be populated with its own type of item.

Message menus

A messageItem in a message menu requires values for from and message. You can also control
the icon and a notification time string. The time string can be any text.

Fig.25. Message Menu.
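
A header with a title and a message menu might be written as in the sketch below. The senders, messages, and time strings are made up for illustration.

```r
library(shiny)
library(shinydashboard)

header <- dashboardHeader(
  title = "My Dashboard",
  dropdownMenu(
    type = "messages",
    # from and message are required; icon and time are optional
    messageItem(from = "Sales Dept", message = "Sales are steady this month."),
    messageItem(from = "New User", message = "How do I register?",
                icon = icon("question"), time = "13:45"),
    messageItem(from = "Support", message = "The new server is ready.",
                icon = icon("life-ring"), time = "2014-12-01")
  )
)
```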

Dynamic content

In most cases, you'll want to make the content dynamic. That means the HTML content is
generated on the server side and sent to the client for rendering. In the UI code, you
would use dropdownMenuOutput:

Fig.26. Dynamic content.

On the server side, you would generate the complete menu inside renderMenu, as in:

Fig.27. Implementation of RenderMenu.
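
The UI and server halves of a dynamic message menu can be sketched together as follows. The messageData data frame is a hypothetical stand-in for whatever source supplies the messages in a real app.

```r
library(shiny)
library(shinydashboard)

# Hypothetical message data; in practice this could come from a database
messageData <- data.frame(
  from = c("Admin", "Support"),
  message = c("Server restarted.", "Ticket closed."),
  stringsAsFactors = FALSE
)

ui <- dashboardPage(
  # Placeholder that the server fills in
  dashboardHeader(dropdownMenuOutput("messageMenu")),
  dashboardSidebar(),
  dashboardBody()
)

server <- function(input, output) {
  output$messageMenu <- renderMenu({
    # Build one messageItem per row of the data frame
    msgs <- apply(messageData, 1, function(row) {
      messageItem(from = row[["from"]], message = row[["message"]])
    })
    # .list passes a ready-made list of items to the menu
    dropdownMenu(type = "messages", .list = msgs)
  })
}

app <- shinyApp(ui, server)
if (interactive()) runApp(app)
```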

Notification menus

Fig.28. Notification menus.

A notification menu contains notificationItems, each holding a text notification. You can
also control the icon and the status color.

Fig.29. Implementation of Notification menus.
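
A notification menu could be sketched as below. The notification texts are invented for illustration.

```r
library(shiny)
library(shinydashboard)

notifications <- dropdownMenu(
  type = "notifications",
  notificationItem(text = "5 new users today", icon = icon("users")),
  # status controls the color of the icon
  notificationItem(text = "12 items delivered", icon = icon("truck"),
                   status = "success"),
  notificationItem(text = "Server load at 86%", icon = icon("bell"),
                   status = "warning")
)

header <- dashboardHeader(title = "Notifications", notifications)
```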

Task menus

Fig.30. Task menus.

Task items show a progress bar and a text label. You can also choose the color of the bar.

Fig.31. Implementation of Task menus.
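
A task menu follows the same pattern, using taskItem with a completion value between 0 and 100. The task names and values below are illustrative.

```r
library(shiny)
library(shinydashboard)

tasks <- dropdownMenu(
  type = "tasks", badgeStatus = "success",
  # value is the percentage shown by the progress bar
  taskItem(value = 90, color = "green", "Documentation"),
  taskItem(value = 17, color = "aqua", "Project X"),
  taskItem(value = 75, color = "yellow", "Server deployment")
)

header <- dashboardHeader(title = "Tasks", tasks)
```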

Disabling the header

You can prevent a header bar from appearing by using the following command:

Fig.32. Disabling the header.
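
In shinydashboard this is done with the disable argument of dashboardHeader():

```r
library(shinydashboard)

# The header is still created, but it is hidden in the rendered page
header <- dashboardHeader(disable = TRUE)
```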

• sidebar()

Usually, a sidebar is used for rapid navigation. It may also have Shiny inputs like sliders
and text inputs, as well as menu items that function similarly to tabs in a tabPanel.

Fig.33. sidebar

Sidebar menu items and tabs

Links in the sidebar can be utilized similarly to Shiny's tabPanels. In other words, when you
click a link, the dashboard's body will change to show new content. Here is an illustration of
a basic tabPanel:

Fig.34. Sidebar menu tabs.

The main body's content changes when the user selects one of the menu items:

Fig.35. Sidebar menu items.

Menu items are placed inside sidebarMenu(). To connect a menuItem with a tabItem, make sure
that their tabName values match.

Fig.36. Implementation of sidebar.

A menuItem can do more than control tabs: if you supply a value for href, it links to
external content. By default, these external links open in a new browser tab or window;
the newtab argument lets you change this behavior.

Fig.37. Program for menuItem.
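
As a sketch, the second menu item below links to the shinydashboard documentation site and, because newtab = FALSE, opens in the current tab. The choice of link is an assumption for illustration.

```r
library(shiny)
library(shinydashboard)

sidebar <- dashboardSidebar(
  sidebarMenu(
    # Controls a tab inside the app
    menuItem("Dashboard", tabName = "dashboard"),
    # Links to external content; a menuItem takes tabName or href, not both
    menuItem("Documentation", icon = icon("file"),
             href = "https://rstudio.github.io/shinydashboard/",
             newtab = FALSE)
  )
)
```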

Bookmarking and restoring selected tabs

As of version 0.14, Shiny can bookmark and restore an application's state. In a
shinydashboard app, you must call sidebarMenu() with an id in order to bookmark and restore
the currently selected tab. For instance:

Fig.38. Bookmarking and restoring selected tabs.
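
A minimal sketch of a bookmarkable dashboard follows. The id "tabs" is an arbitrary name. Note that for bookmarking the UI must be written as a function of the request, and bookmarking has to be enabled when the app is created.

```r
library(shiny)
library(shinydashboard)

# For bookmarking, the UI is a function that takes the request
ui <- function(request) {
  dashboardPage(
    dashboardHeader(title = "Bookmark demo"),
    dashboardSidebar(
      # The id lets Shiny save and restore the selected tab
      sidebarMenu(
        id = "tabs",
        menuItem("Dashboard", tabName = "dashboard"),
        menuItem("Widgets", tabName = "widgets")
      ),
      bookmarkButton()
    ),
    dashboardBody(
      tabItems(
        tabItem(tabName = "dashboard", h2("Dashboard tab content")),
        tabItem(tabName = "widgets", h2("Widgets tab content"))
      )
    )
  )
}

server <- function(input, output, session) { }

app <- shinyApp(ui, server, enableBookmarking = "url")
if (interactive()) runApp(app)
```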

Dynamic content

A sidebar menu can be generated dynamically with renderMenu and sidebarMenuOutput.

Here is an illustration of an app with a server-generated sidebar.

Fig.39. Dynamic content
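
Such an app can be sketched as follows; the output name menu and the menu item label are illustrative.

```r
library(shiny)
library(shinydashboard)

ui <- dashboardPage(
  dashboardHeader(title = "Dynamic sidebar"),
  # Placeholder for a menu that is built on the server
  dashboardSidebar(sidebarMenuOutput("menu")),
  dashboardBody()
)

server <- function(input, output) {
  output$menu <- renderMenu({
    sidebarMenu(
      menuItem("Menu item", icon = icon("calendar"))
    )
  })
}

app <- shinyApp(ui, server)
if (interactive()) runApp(app)
```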

Inputs in the sidebar

A sidebar can also contain ordinary inputs, like sliderInput and textInput.

Fig.40. Inputs in the sidebar.

A sidebarSearchForm, which is visible at the top in the image above, is another special
kind of input available in shinydashboard. It is essentially a specially formatted text
input, together with an actionButton that looks like a magnifying glass (the icon can be
changed with the icon argument).

Fig.41. Implementation of sidebarSearchForm.
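
A sidebar with a search form and two ordinary inputs might look like the sketch below. The ids searchText and searchButton are assumed names; on the server they are available as input$searchText and input$searchButton.

```r
library(shiny)
library(shinydashboard)

sidebar <- dashboardSidebar(
  # Text input plus a magnifying-glass button
  sidebarSearchForm(textId = "searchText", buttonId = "searchButton",
                    label = "Search..."),
  sliderInput("slider", "Slider:", 1, 20, 5),
  textInput("text", "Text input:")
)
```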

Disabling the sidebar

If you don't want a sidebar, you can disable it with:

Fig.42. Disabling the sidebar.
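
In shinydashboard this is done with the disable argument of dashboardSidebar():

```r
library(shinydashboard)

# The page is rendered without a visible sidebar
sidebar <- dashboardSidebar(disable = TRUE)
```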

• controlbar()

Fig.43. Example of controlbar().

dashboardControlbar() builds a right-hand sidebar container.

Fig.44. Implementation of controlbar().
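
The sketch below assumes that the controlbar comes from the shinydashboardPlus package, whose dashboardPage() accepts a controlbar argument; the slider inside it is only an example of content.

```r
library(shiny)
library(shinydashboard)
library(shinydashboardPlus)  # assumption: source of dashboardControlbar()

ui <- shinydashboardPlus::dashboardPage(
  header = shinydashboardPlus::dashboardHeader(title = "Controlbar demo"),
  sidebar = shinydashboardPlus::dashboardSidebar(),
  body = dashboardBody(),
  # Right-hand sidebar that can hold inputs or other UI elements
  controlbar = dashboardControlbar(
    sliderInput("obs", "Number of observations:", 1, 100, 50)
  )
)

app <- shinyApp(ui, function(input, output) { })
if (interactive()) runApp(app)
```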

• The Output Interface

body()

Fig.45. Implementation of Output interface.

The figure below shows the user interface of a Stock Market Forecasting application.

Fig.46. Stock Market Forecasting Application - Interface UI.

These templates for stock symbols include the stock, most recent trade price, price change
and percentage price change, return date, and volume.

Fig.47. The Output Templates.

The charts that follow are those for individual stock symbols. By using the Study button,
the user can select another ticker to explore. We will learn how to use gradient colours
for these plots. ggplot2 is used to make a static plot and Plotly to build an interactive
plot.

Fig.48. The Individual Output.

The final graph is a line graph that compares all the stock symbols.

Fig.49. Stock Symbols Comparison.

• The Footer

footer()

Fig.50. The Application Footer.

Let's look at yet another approach to adding a footer to this programme.

Fig.51. Implementation of footer().
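
One way to add a footer, assuming the shinydashboardPlus package is used, is its dashboardFooter() function, which takes left and right content. The footer texts below are placeholders.

```r
library(shiny)
library(shinydashboard)
library(shinydashboardPlus)  # assumption: source of dashboardFooter()

ui <- shinydashboardPlus::dashboardPage(
  header = shinydashboardPlus::dashboardHeader(title = "Footer demo"),
  sidebar = shinydashboardPlus::dashboardSidebar(),
  body = dashboardBody(),
  # Text shown at the bottom left and bottom right of the page
  footer = dashboardFooter(
    left = "Left footer text (placeholder)",
    right = "Right footer text (placeholder)"
  )
)

app <- shinyApp(ui, function(input, output) { })
if (interactive()) runApp(app)
```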

Self-Assessment Questions – 5
9. The __________ function creates the dropdown menus.
10. ____________ and ___________ enable the dynamic generation of a sidebar menu.

5. SUMMARY
In summary, business analytics and statistics are greatly aided by dashboards. Through a
wide range of sources, sizes, and types of data, dashboards offer a window into the
understanding and tracking of business indicators. Dashboards make it easier for users to
collaborate and make decisions. Data scientists don't need to spend much time presenting
the findings because the data story is easily understood by the general public. Additionally,
Shiny dashboards' availability on web and mobile platforms increases users' accessibility
and mobility.

R-based data specialists can easily incorporate Shiny into their development workflow.
Shiny dashboards are adaptable, dynamic, and simple to tailor to each customer's unique
requirements. This is partly because the web framework supports web technologies including
HTML, CSS, SCSS, JavaScript, and others. Shiny also makes it possible to use modularized
code and functions, prototype ideas quickly, and manage dashboards easily as smaller
components.

6. GLOSSARY
Dashboard: All of the user data is shown visually on a dashboard. Although it has a wide
range of applications, its main purpose is to present information quickly, like KPIs. The
information for a dashboard often comes from a linked database and is shown on its own
page.

R-Shiny: The open source R package Shiny offers an elegant and robust web framework for
creating web apps. With Shiny, you can turn your analyses into interactive web
applications without needing to know HTML, CSS, or JavaScript.

Shinydashboard: An R package built on top of Shiny that makes it simple to create
dashboards directly from R. Dashboards are widely used because they help businesses draw
conclusions from the data they already have.

7. CONCEPT MAP

Fig.52: Anatomy of a shiny app.

8. STUDY NOTES & DID YOU KNOW

Did you know?: Shiny dashboards give you access to a full web application framework within
the R environment. You can quickly turn your R work, including analyses, visualizations,
and machine learning models, into web applications that benefit companies.

9. CASE STUDY
Shiny dashboards in healthcare:

The following Shiny app was created by Christian Luz "in the setting of a 1339-bed academic
tertiary referral hospital to process the data of over 180,000 admissions." Users of the
software can filter patients using one of 17 distinct criteria. Antimicrobial resistance,
microbiological tests, and their application can all be researched by users. The
investigation's findings can be quickly classified and stratified "to compare predefined
patient groups based on specific patient attributes."

Fig.53: Dashboard for healthcare.

Voter profile:

The Voters Profiles dashboard is a classic illustration of R Shiny in elections. It offers access
to graphs, maps, and a novel way to look into the voting profiles in the 2014 elections in
Brazil. It received honourable mention status in the 2019 Shiny competition.

Government representatives and others can readily assess the Brazilian elections of 2014
thanks to this dashboard. The results for both the first and second rounds are available by
state and city. The number of votes cast for each contender is also displayed as a bar graph.

Votes for governors and senators are displayed on the second tab, State level, while votes for
the president are displayed on the first tab, Federal level.

Fig.54: Dashboard for Voter Profile.

10. TERMINAL QUESTIONS
Short Answer Type:

1. List the 7 basic steps that one should consider when designing a dashboard.
2. Write the basic syntax for creating a dashboard.
3. Explain the skin component of dashboardPage() function.
4. Discuss the footer of dashboardPage() function.

Long Answer Type:

1. Write a program for creating a tabBox in a dashboard.
2. Write a program for dynamic content, notification menus, and task menus.

11. SELF-ASSESSMENT ANSWERS

1. R
2. Dashboards.
3. Dashboards.
4. user interface object and the server function
5. body, a sidebar, and a header.
6. shinyApp()
7. box()
8. tabBox, infoBox, valueBox
9. dropdownMenu()
10. renderMenu and sidebarMenuOutput

12. TERMINAL QUESTIONS ANSWERS


1. The 7 basic steps are:

• Begin with the cause


• Use paper and a pen.
• Use prepared solutions.
• Get motivated
• Prioritise data storytelling
• Test first, then repeat the process.
• Look past UI
2.

3.

skin()

The colour theme is the skin. The backdrop of the sidebar will be light if the skin is light.
Depending on the type of app you build, it is simple to select the appearance you like. The
plot colour should complement the skin you select for your application. The list of skin
tones is provided below.

4.

footer()

Let's look at yet another approach to adding a footer to this programme.

Long Answer Type:

1. The primary components of dashboard pages are boxes. With the box() function, a
simple box may be built, and (most) any Shiny UI element can be used as its content.

tabBox:

Use a tabBox if you need a container to just have tabs for showing various content sets.

2.

Dynamic content

Notification menus

A notification menu contains notificationItems, each holding a text notification. You can
also control the icon and the status color.

Task menus

13. REFERENCES
• https://appsilon.com/dashboards-in-rshiny/
• https://appsilon.com/dashboards-in-rshiny/#using
• https://rstudio.github.io/shinydashboard/structure.html
• https://bookdown.org/loankimrobinson/rshinybook/stock-front-footer.html

MASTER OF BUSINESS ADMINISTRATION


SEMESTER 3

DADS304
VISUALIZATION

Unit 5 : Creating Advanced Dashboard and Visualization 1


DADS304: Visualization Manipal University Jaipur (MUJ)

Unit 5
Creating Advanced Dashboard and
Visualization
Table of Contents

1 Introduction (Fig. 1)
   1.1 Objectives
2 Creation of Advanced Interactive R-Shiny Dashboard (Figs. 2 to 19; SAQ 1)
3 Addition of Advanced Visualizations in R-Shiny Dashboard (Figs. 20 to 32; SAQ 2)
4 Activity
5 Summary
6 Glossary
7 Concept Map (Figs. 33 to 34)
8 Study Notes & Did You Know
9 Case Study (Figs. 35 to 36)
10 Terminal Questions
11 Self-Assessment Answers
12 Terminal Questions Answers
13 References

1. INTRODUCTION

What is a dashboard?

Dashboards are tools that present up-to-date information, using visuals to explain what the
data shows. They help decision-makers understand the relationships in large, complex data.
They arrange visuals in a useful layout that makes it easier for the organization to
understand and appreciate the data.

Shiny

The RStudio PBC team created the open source R package known as Shiny. To offer a
beautiful and simple web framework for creating online apps in R, RStudio created Shiny. R
users can build amazing apps, interactive maps, and dashboards using Shiny. And to
construct it, you don't need significant web development abilities!

Why Do We Need a Dashboard?

• Timely information - Condense and organize important information, helping to identify
the most effective answers to the questions that arise in decision-making.
• Access critical insights - Follow trends in sales data and KPIs drawn from different data
sources.
• Increase performance - Provide a window for trend visualization and business metric
tracking.
• Customization and scale - Adapt to the requirements of particular user roles,
departments, or a complete organization. Boost operational effectiveness while preserving
brand coherence.
• Collaboration - Create a user interface for multi-user collaboration. Encourage
collaboration among teams.

Why do we build dashboards in R Shiny?

Shiny dashboards give you access to a full web application framework within the R
environment. You can quickly turn your R work, including analyses, visualizations, machine
learning models, and more, into web applications that benefit organizations. End users can
use the result as a complete application without knowing R. This lets you deliver a
comprehensive, user-friendly, and interactive product that improves how you conduct
business.

Custom Shiny dashboards

The dashboard may be easily customized using unique HTML, CSS, SCSS, Javascript, and
other languages thanks to Shiny's web framework. With other BI software suites, it is
impossible to develop a distinctive, customized dashboard with this level of flexibility.
Include elements like colors, logos, typefaces, and others that better reflect your company.

Cost

Compared with competitors like Power BI and Tableau, Shiny is open source and less
expensive. On the Appsilon blog, you can examine a detailed comparison of Shiny to
Power BI and Shiny to Tableau.

Open Source and Accessibility

Using Shiny as a dashboard development platform gives you access to a variety of R packages
for data analysis, including the Tidyverse. For the visualization of data and models, you
can use advanced graphical features. You can add reactivity and interaction by embedding
these graphics in Shiny dashboards, using R interfaces to JavaScript-based charting
libraries.

Fig 1: R Shiny Dashboard for Healthcare

1.1 Objectives

After studying this unit, you should be able to:

❖ Create an advanced interactive R-Shiny dashboard
❖ Add advanced visualizations to an R-Shiny dashboard

2. CREATION OF ADVANCED INTERACTIVE R-SHINY DASHBOARD

R Shiny is a fantastic tool for quickly producing attractive and practical dashboards, and
it is not too difficult to master. Deploying more sophisticated solutions, however, faces
two significant obstacles.

First, without a basic understanding of CSS and JavaScript, implementing custom dashboard
designs can prove challenging. Second, putting the app into production also calls for more
advanced expertise. This is not much of a problem when an app is never used by more than a
couple of people at once. Making it accessible to a few hundred users, however, takes
additional steps.

Steps involved in creating a Shiny dashboard

• Data sources
• Data transformation
• Shiny App structure
• Visualizations
• Dashboard deployment on the shinyapps.io website server.

Let us get to know each topic in detail with an example

1. Data Source:-

In this study, two datasets will be used.

➔ Data from the Graduate Employment Survey is available at data.gov.sg/ges.

Fig 2: Graduate employment dataset

➔ Users can find the dataset for grads by university here: data.gov.sg/gra.

Fig 3: Grads Dataset by university

As previously indicated, the analysis's primary metrics are:

• Rate of Employment
• Gross Monthly Income

The R packages used are

Fig 4: R packages used in creating the dashboard

2. Data transformation

Fig 5: Workflow of Data Transformation

In the code given below (DataWrangling.R), we load the dataset, remove NAs, fill missing
values, alter university names to make them more readable, and then store the data frames
in .rds format, which is much quicker to read in the subsequent steps.

Fig 6: Code

If you look at the school names in the dataset with unique(data_e$school), some of them
need cleaning, for instance:

Fig 7: Code

Numerical transformation of factor variables:

Fig 8: Numerical transformation

Lastly, save the cleaned data frame in .rds format for use in the Shiny app:

Fig 9: Cleaning Data
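
The transformation steps can be sketched on a small made-up data frame. The column names, school names, and values below are hypothetical stand-ins for the Graduate Employment Survey data, and the output path uses a temporary folder.

```r
# Hypothetical stand-in for the raw survey data
raw <- data.frame(
  university = c("National University of Singapore",
                 "Nanyang Technological University", NA),
  school = c("Faculty of Law",
             "College of Business (Nanyang Business School)",
             "School of Computing"),
  gross_monthly_median = factor(c("4500", "3200", "4000")),
  stringsAsFactors = FALSE
)

# Remove rows with a missing university
clean <- raw[!is.na(raw$university), ]

# Shorten a long school name to make it more readable
clean$school <- gsub("College of Business (Nanyang Business School)",
                     "Nanyang Business School", clean$school, fixed = TRUE)

# Convert a factor to numbers: go through character, not straight to numeric
clean$gross_monthly_median <- as.numeric(as.character(clean$gross_monthly_median))

# Save in .rds format and read it back, as the Shiny app would
rds_path <- file.path(tempdir(), "employment_data.rds")
saveRDS(clean, rds_path)
restored <- readRDS(rds_path)
```

Converting with as.character() first matters because calling as.numeric() directly on a factor returns the internal level codes rather than the original values.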

3. Shiny App structure

To run a Shiny App you need to install the package & import the library:

Fig 10: Packages

Shiny apps have two crucial parts: the front-end user interface (ui.R) and the back-end
server (server.R).

To design the front end of our web application, we use the ui.R file.

This is a basic illustration of our dashboard's user interface:

Fig 11: UI Dashboard

Fig 12: Code for front end

Fig 13: Server code

Fig 14: Result for the code

4. Visualizations

The majority of visualizations are created using Plotly, and Kable is used to present the
data tables.

Fig 15: Visualization code

Result:

Fig 16: Monthly median Salary
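
A Plotly chart is added to a Shiny app with plotlyOutput() in the UI and renderPlotly() in the server. The sketch below uses made-up salary figures, and a plain renderTable() stands in for the Kable table used in the original app.

```r
library(shiny)
library(plotly)

# Hypothetical median salaries; the real app reads the cleaned .rds file
salary <- data.frame(
  university = c("University A", "University B", "University C"),
  gross_monthly_median = c(3500, 3300, 3800)
)

ui <- fluidPage(
  plotlyOutput("salary_plot"),
  tableOutput("salary_table")
)

server <- function(input, output) {
  # Interactive bar chart of median salary per university
  output$salary_plot <- renderPlotly({
    plot_ly(salary, x = ~university, y = ~gross_monthly_median, type = "bar") %>%
      layout(yaxis = list(title = "Gross monthly median income"))
  })
  # Simple table of the same data
  output$salary_table <- renderTable(salary)
}

app <- shinyApp(ui, server)
if (interactive()) runApp(app)
```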

5. Deployment of the dashboard on the https://www.shinyapps.io/ server.

The steps to deploy the dashboard are as follows, assuming that server.R and ui.R are
located in the same folder.

1. Run the following commands in R:

install.packages('rsconnect')
library(rsconnect)
2. Create an account on shinyapps.io. Please be aware that all of your apps on shinyapps.io
will use your account name as the domain name.
3. Obtain the website-generated token when you log in to shinyapps.io:

Fig 17: Website generated

4. Configure your account so that the rsconnect package can use it. On the website, go to
the token page and click the Show button. A popup displays the full
rsconnect::setAccountInfo command with the correct arguments for your account. Copy this
command to your clipboard, paste it into the RStudio console, and press Enter.

Fig 18: Setting up account

5. Deploy the app. Use the following code to deploy your application:

library(rsconnect)

deployApp()

You can also use the Publish button in RStudio (with ui.R or server.R open).

Fig 19: Publish

Once the deployment is complete, congratulations: your first Shiny app has been published.
You can publish the app again after making modifications to the server.R or ui.R files.
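
The account setup and deployment steps can be combined in a small helper. This is a sketch under stated assumptions: the function name and the three environment variable names are hypothetical, chosen so that the token and secret are not written into the script. setAccountInfo() and deployApp() are the rsconnect functions named in the steps above.

```r
library(rsconnect)

deploy_if_configured <- function(app_dir = getwd()) {
  # Hypothetical environment variables holding your shinyapps.io credentials
  account <- Sys.getenv("SHINYAPPS_ACCOUNT")
  token   <- Sys.getenv("SHINYAPPS_TOKEN")
  secret  <- Sys.getenv("SHINYAPPS_SECRET")

  # Do nothing if any credential is missing
  if (!all(nzchar(c(account, token, secret)))) {
    message("Credentials not set; skipping deployment.")
    return(invisible(FALSE))
  }

  # Register the account, then upload the folder containing ui.R and server.R
  setAccountInfo(name = account, token = token, secret = secret)
  deployApp(appDir = app_dir)
  invisible(TRUE)
}
```

Calling deploy_if_configured("path/to/app") from the console deploys the app in that folder once the three variables are set.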

Self-Assessment Questions - 1
1. The majority of visualizations are created using _________, and __________ is used to
present the data tables.
2. Codes to deploy your application _____________ and _____________.

3. ADDITION OF ADVANCED VISUALIZATIONS IN R-SHINY DASHBOARD


Data Visualization

The amount of data in today's world is growing exponentially, making it impossible to tell
stories without visualizations. Although specialized tools such as Tableau, QlikView, and
d3.js are available, nothing can replace a modeling and statistics tool with strong
visualization capabilities. Such a tool is especially helpful for exploratory data analysis
and feature engineering, and this is where R is a great help.

R provides a good selection of built-in functions and libraries (such as ggplot2, leaflet,
and lattice) for creating visualizations and presenting data. This section gives a brief
explanation of the basic and advanced visualizations in an R-Shiny dashboard.

Basic Visualization

➔ Histogram

➔ Bar Chart

➔ Line Chart

Advanced Visualization

➔ Heat Map

➔ Map Visualization

➔ Correlogram

BASIC VISUALIZATION

1. HISTOGRAM

The most popular graph for representing continuous data is a histogram. It is a bar plot
in which each bar counts the number of observations that fall within one interval (bin).
When the intervals have unequal widths, the bar height is the frequency divided by the
interval width, so that the area of each bar stays proportional to its count.
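The relationship between bar height, frequency, and interval width can be checked directly in base R. The following is a minimal sketch; the bin edges are chosen only for illustration and are not taken from the course material.

```r
# Eruption durations from the built-in faithful dataset,
# binned with deliberately unequal intervals (illustrative edges).
x <- faithful$eruptions
h <- hist(x, breaks = c(1.5, 2, 2.5, 3.5, 4.5, 5.5), plot = FALSE)

# Frequency density: count in each bin divided by the bin width.
freq_density <- h$counts / diff(h$breaks)

# hist() stores the same quantity divided by the sample size in h$density.
scaled <- freq_density / length(x)

# With freq = FALSE the bar heights are densities, so bar areas sum to 1.
plot(h, freq = FALSE, main = "Density histogram", xlab = "Eruption time (min)")
```

With equal-width bins the two scalings differ only by a constant, which is why the default count histogram looks the same as the density version in that case.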


Example: The code below wraps the histogram app in a function called binner, which
takes a vector named var. The function passes var on to shinyApp, which starts an app
that displays a histogram of var.

Fig 20 : Code to implement Histogram
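The code in Fig 20 is not reproduced in this text. The following is a minimal sketch of what such a binner function might look like, assuming the shiny package is installed; the slider name "n", its range, and the colors are illustrative assumptions.

```r
# Sketch of a histogram app wrapped in a function (assumed reconstruction).
# Shiny functions are namespace-qualified, so defining binner() does not
# require the package to be attached.
binner <- function(var) {
  shiny::shinyApp(
    ui = shiny::fluidPage(
      # Slider that controls the suggested number of bins.
      shiny::sliderInput("n", "Number of bins", min = 5, max = 50, value = 20),
      shiny::plotOutput("hist")
    ),
    server = function(input, output) {
      # The histogram is redrawn whenever the slider value changes.
      output$hist <- shiny::renderPlot(
        hist(var, breaks = input$n, col = "skyblue", border = "white",
             main = "Histogram", xlab = "Value")
      )
    }
  )
}

# The same binning can be inspected without starting an app.
h20 <- hist(faithful$eruptions, breaks = 20, plot = FALSE)
```

Calling the function with a numeric vector at the console launches the app, as in the calls shown under Output.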

Output:

binner(faithful$eruptions)

Fig 21: Histogram 1


binner(iris$Sepal.Length)

Fig 22: Histogram 2

2. BAR CHART

Bar charts display data using rectangular bars whose lengths are proportional to the
values of the variables. Bar charts are made in R using the function barplot(). R can
produce both vertical and horizontal bars in a bar chart, and each bar can be given a
distinct color.
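As a minimal sketch of the barplot() options described above, the following draws the same values as vertical and as horizontal bars with one color per bar. The sales figures and region names are hypothetical and serve only as sample data.

```r
# Hypothetical sales figures used only for illustration.
sales <- c(North = 12, South = 7, East = 9)
bar_cols <- c("steelblue", "tomato", "darkseagreen")

# Vertical bars, one colour per bar.
mids_v <- barplot(sales, col = bar_cols,
                  main = "Sales by region", ylab = "Units")

# The same data as horizontal bars.
mids_h <- barplot(sales, col = bar_cols, horiz = TRUE,
                  main = "Sales by region", xlab = "Units")
```

barplot() returns the bar midpoints, which is useful when labels or lines need to be added on top of the bars.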


Example:

Fig 23 : Code to implement Bar chart

Output:

Fig 24 : Bar chart


3. LINE CHART

A line graph, also known as a line plot or a line chart, connects individual data points
with a line. Line graphs are most often used to show how values change over time.

Example: The code illustrates how to give each line on the chart a distinct color.

Fig 25 : Code to implement different color in line chart-1

Fig 26 : Code to implement different color in line chart-1
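The code in the figures is not reproduced in this text. A minimal base-R sketch of the same idea, with two hypothetical series drawn in different colors, might look like this:

```r
# Hypothetical monthly values for two series.
months <- 1:6
series_a <- c(5, 7, 6, 9, 11, 10)
series_b <- c(3, 4, 6, 5, 8, 9)

# A shared y-axis range so that both lines fit in the plot.
y_limits <- range(c(series_a, series_b))

# First series in blue; the second is added with lines() in red.
plot(months, series_a, type = "o", col = "blue", ylim = y_limits,
     xlab = "Month", ylab = "Value", main = "Two series over time")
lines(months, series_b, type = "o", col = "red")
legend("topleft", legend = c("Series A", "Series B"),
       col = c("blue", "red"), lty = 1, pch = 1)
```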


Output:

Fig 27 : Line Chart

ADVANCED VISUALIZATION

1. HEAT MAP

A heatmap is a graphical depiction of data that uses color coding to illustrate different
values. Although heatmaps can be used for many different types of analytics, they are
most frequently used to display user behavior on certain webpages or web page
layouts.

Example: The iris dataset is used to draw the heat map.


Fig 28 : Code to implement different color in Heat map
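Since the code in Fig 28 is not reproduced here, the following is a minimal sketch using base R's heatmap() on the four numeric columns of iris. The palette and scaling choices are assumptions, not necessarily those used in the figure.

```r
# Numeric measurements only; the Species column is dropped.
m <- as.matrix(iris[, 1:4])

# Scale each column so that variables with different ranges are comparable,
# and colour the cells with a 25-step heat palette.
hm <- heatmap(m, scale = "column", col = heat.colors(25),
              main = "iris measurements")
```

heatmap() also reorders rows and columns by hierarchical clustering and returns the resulting ordering invisibly.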

Output:

Fig 29 : Heat Map

2. MAP VISUALIZATION

Through map visualisation, spatially pertinent data is examined, shown, and presented.
This type of data expression is more comprehensible and transparent. The distribution
or percentage of data in each area can be seen visually.

Example: The dataset counties.rds, compiled with the UScensus2010 R package,
contains demographic information for every county in the United States.


Fig 30 : Code to implement Map visualization
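The code in Fig 30 is not reproduced in this text. One step that any such map needs is turning a percentage for each area into a fill color. The sketch below shows that step in base R with hypothetical regions and values; a real app would read the percentages from counties.rds and pass the colors to a map-drawing function.

```r
# Hypothetical regional percentages (stand-in for the county data).
regions <- data.frame(
  name = c("A", "B", "C", "D"),
  pct  = c(4, 37, 62, 95)
)

# Five shades from light to dark green.
shades <- colorRampPalette(c("white", "darkgreen"))(5)

# Bin each percentage into one of five equal-width classes on 0 to 100.
bins <- cut(regions$pct, breaks = seq(0, 100, by = 20), include.lowest = TRUE)

# Look up the fill colour for each region from its class.
regions$fill <- shades[as.integer(bins)]
```

Darker shades then mark the areas with higher percentages, which is what makes the distribution visible at a glance.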

Output:

Fig 31 : Map Visualization


3. CORRELOGRAM

A correlogram is a graphical representation of a correlation matrix. It is useful for
highlighting the most strongly correlated variables in a data table. In this graph,
correlation coefficients are coloured according to their values. The correlation matrix
can also be reordered according to how strongly the variables are related.

Example: The iris dataset is used to draw the correlogram.
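A correlogram can be sketched with base R alone. The example below is an illustration under stated assumptions rather than the code behind Fig 32: it computes the correlation matrix of the four numeric iris columns, reorders it by hierarchical clustering, and draws it as colored tiles.

```r
# Correlation matrix of the numeric iris columns.
corr <- cor(iris[, 1:4])

# Reorder variables so that strongly related ones sit next to each other.
ord <- hclust(as.dist(1 - abs(corr)))$order
corr_ord <- corr[ord, ord]

# Colour scale from blue (-1) through white (0) to red (+1).
pal <- colorRampPalette(c("blue", "white", "red"))(21)
n <- ncol(corr_ord)

# image() draws z[i, j] at (x[i], y[j]); rows are flipped so row 1 is on top.
image(1:n, 1:n, t(corr_ord[n:1, ]), zlim = c(-1, 1), col = pal,
      axes = FALSE, xlab = "", ylab = "", main = "Correlogram of iris")
axis(1, at = 1:n, labels = colnames(corr_ord), cex.axis = 0.7)
axis(2, at = 1:n, labels = rev(rownames(corr_ord)), cex.axis = 0.7)
```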

Fig 32 : Correlogram

Self-Assessment Questions - 2
3. _______ provides a sufficient selection of built-in functions and libraries (such as
ggplot2, leaflet, and lattice).
4. A ________ is represented as a correlogram.


4. ACTIVITY
Activity A

In the Chinese city of Wuhan, incidences of severe respiratory sickness started to be reported
in December 2019. These were brought on by a novel coronavirus, and the illness is now
generally known as COVID-19. Midway through January, the number of COVID-19 cases
began to increase more swiftly, and the virus soon expanded outside of China. Since then,
this story has quickly developed, and every day we are presented with unsettling stories
about the outbreak's current situation.

These headlines can be challenging to understand on their own. How quickly is the virus
spreading? What is being done to control the disease? How does the current situation differ
from past epidemics?

……………………………………………………………………………………………………………………………………….

The objective is to develop a visualization that updates data such as the number of cases,
the number of deaths, and the most affected countries for the selected mapping date.

5. SUMMARY
R-based data specialists can easily incorporate Shiny into their development workflow.
Shiny dashboards are adaptable, dynamic, and simple to tailor to each customer's unique
requirements. This is partly because the web framework supports web technologies
including HTML, CSS, SCSS, JavaScript, and others. Writing R code to plot graphs
repeatedly can become tiresome, and building an interactive visualization for storytelling
is challenging. Shiny addresses both issues by making it quick to build interactive charts
in R. It also makes it possible to use modularized code and functions, prototype ideas
quickly, and manage dashboards as smaller components.


6. GLOSSARY

Data wrangling - Eliminating errors and integrating complex data sets to make them more
accessible and understandable.

Correlogram - Displays the correlations between each pair of variables.

7. CONCEPT MAP

Fig 33 : MAP TO CONNECT UI TO THE SERVER

Fig 34 :ANATOMY OF SHINY APPLICATION


8. STUDY NOTES & DID YOU KNOW

Study Notes – Examples of Shiny apps include the Covid Tracker and RadaR, an R-based
interactive tool for rapid analysis of diagnostic and antimicrobial patterns.

Study Notes – With the help of the shiny package, you can quickly create a user interface (UI)
and use R code to update the plots and analyses that are displayed to the user in response to
their selection of various UI options.

9. CASE STUDY

a) MRI images in Shiny

The use of R-Shiny Visualizations in medical imaging has significant positive effects on
the healthcare industry. A significant study in this field is Big Data Analytics in
Healthcare, which was published in BioMed Research International. The study lists
several common imaging methods, such as computed tomography, computed
radiography, mammography, and magnetic resonance imaging (MRI). The disparity in
these images' modality, resolution, and dimensions is addressed using a variety of
techniques. To enhance image quality, more effectively extract data from photos, and
offer the most accurate interpretation, many more are currently being developed. By
learning from prior cases and then suggesting better treatment options, the deep-
learning based algorithms improve diagnostic accuracy.

Using only R Shiny mechanisms, this software interactively visualizes 3D MRI scans.


Fig 35 : 3-D Brain MRI

Fig 36 : 4-D Brain MRI


10. TERMINAL QUESTIONS

Short Answers Type:

1. Draw a workflow diagram for data transformation.


2. Define Histogram.

Long Answers Type:

1. Describe the steps involved in creating a dashboard with code.


2. Explain any two of the advanced visualizations with the code using R-Shiny.

11. SELF ASSESSMENT ANSWERS


1. Plotly and Kable
2. library(rsconnect) and deployApp()
3. R Programming
4. Correlation matrix

12. TERMINAL QUESTION ANSWERS


Short Answer Type:

1.


2. The most popular graph for representing continuous data is a histogram. It is a bar plot
in which each bar counts the number of observations that fall within one interval (bin).
When the intervals have unequal widths, the bar height is the frequency divided by the
interval width, so that the area of each bar stays proportional to its count.

Long Answer Type:

1. Steps involved in creating a Shiny dashboard:


• Data sources
• Data transformation
• Shiny App structure
• Visualizations
• Dashboard deployment on the shinyapps.io website server.

Let us get to know each topic in detail with an example

1. Data Source:-

In this study, two datasets will be used.

→ Data from the Graduate Employment Survey is available at data.gov.sg/ges.


→ Users can find the dataset for grads by university here: data.gov.sg/gra.

As previously indicated, the analysis's primary metrics are:

• Rate of Employment
• Gross Monthly Income

2. Data transformation

Workflow of Data Transformation


In the code given below (DataWrangling.R), we load the dataset, remove NAs, fill missing
values, rename universities to make them more readable, and then store the data frames
in .rds format, which is much quicker to read in the subsequent steps.
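The wrangling steps listed above can be sketched in base R as follows. The data frame, its column names, and the choice of the median as fill value are hypothetical stand-ins, since the original DataWrangling.R is not reproduced in this text.

```r
# Hypothetical stand-in for the survey data; column names are assumptions.
raw <- data.frame(
  school = c("College of Engineering ", "School of Business", NA),
  employment_rate = c(91.5, NA, 88.0),
  stringsAsFactors = FALSE
)

# Remove rows that have no school name at all.
clean <- raw[!is.na(raw$school), ]

# Tidy the names (here: strip stray whitespace).
clean$school <- trimws(clean$school)

# Fill the remaining missing values with the column median.
med <- median(clean$employment_rate, na.rm = TRUE)
clean$employment_rate[is.na(clean$employment_rate)] <- med

# Store the cleaned data frame in .rds format and read it back.
path <- file.path(tempdir(), "clean.rds")
saveRDS(clean, path)
reloaded <- readRDS(path)
```

Reading the .rds file back returns the data frame with its column types intact, which is one reason the format is convenient for a Shiny app that loads data at start-up.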

There are certain names that need cleaning if users look at the names of the school
unique(data e$school) in the dataset, for instance:

Numerical transformation of factor variables:

Numerical transformation
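The original code for this step is not shown in the text. A common pitfall when converting factor variables to numbers in R is illustrated below with hypothetical salary values: calling as.numeric() directly on a factor returns the internal level codes, not the values.

```r
# Hypothetical salaries that were read in as a factor.
f <- factor(c("3100", "2800", "3600"))

# Direct conversion returns the level codes (positions in sorted levels).
wrong <- as.numeric(f)

# Converting to character first recovers the actual numbers.
right <- as.numeric(as.character(f))
```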

Lastly, save the cleaned data frame for Shiny with the .rds extension:

Cleaning Data

3. Shiny App structure

You must install the package and import the library in order to start a Shiny app:


Packages

Shiny apps have two crucial parts: the front-end user interface (ui.R) and the back-end
server (server.R).

To design the front end of our web application, we use the ui.R file.

This is a basic illustration of our dashboard's user interface:

Code for front end


Server code
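The front-end and server code from the original figures is not reproduced in this text. As a minimal sketch of how the two parts fit together, assuming the shiny package is installed, the hypothetical make_app() function below builds a one-plot app; the input name "column" and the use of the faithful dataset are illustrative choices.

```r
# Sketch of a minimal Shiny app; make_app() is a hypothetical helper.
make_app <- function(data = faithful) {
  # Front end: a drop-down to pick a variable and a placeholder for the plot.
  ui <- shiny::fluidPage(
    shiny::titlePanel("Minimal dashboard"),
    shiny::selectInput("column", "Variable", choices = names(data)),
    shiny::plotOutput("plot")
  )

  # Back end: redraw the histogram whenever the selection changes.
  server <- function(input, output) {
    output$plot <- shiny::renderPlot(
      hist(data[[input$column]], main = input$column, xlab = input$column)
    )
  }

  shiny::shinyApp(ui = ui, server = server)
}
```

In a two-file project, the ui object would live in ui.R and the server function in server.R; combining them in one function here only keeps the sketch self-contained.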


Result for the code

4. Visualizations

The majority of visualizations are created using Plotly, and Kable is used to present the
data tables.


Visualization code

Result:

Monthly median Salary

5. Deployment of the dashboard on the https://www.shinyapps.io/ server.

The steps to deploy the dashboard are as follows, assuming that server.R and ui.R are
located in the same folder.


1. Run the following commands in R:

install.packages('rsconnect')

library(rsconnect)

2. Create an account on shinyapps.io. Please be aware that all of your apps on shinyapps.io
will use your account name as the domain name.
3. Obtain the website-generated token when you log in to shinyapps.io:

Website generated

4. Configure your account for use with the rsconnect package. On the website's token
page, click the Show button. A popup will appear displaying the complete command,
with the correct arguments for the rsconnect::setAccountInfo function, to set up your
account. To use this command, copy it to your clipboard, paste it into RStudio's
command line, and press Enter.

Setting up account


5. Deploy the app. Use the following code to deploy your application:

library(rsconnect)

deployApp()

Alternatively, in RStudio, click the Publish button (available when ui.R or server.R is open).

Publish

Once the deployment is complete, your first Shiny app is live. You can publish the app
again after making modifications to the server.R or ui.R files.

2. ADVANCED VISUALIZATION
1. HEAT MAP

A heatmap is a graphical depiction of data that uses color coding to illustrate different
values. Although heatmaps can be used for many different types of analytics, they are
most frequently used to display user behavior on certain webpages or web page
layouts.

Example: The iris dataset is used to draw the heat map.


Code to implement different color in Heat map

Output:

Heat Map


2. MAP VISUALIZATION

Through map visualisation, spatially pertinent data is examined, shown, and presented.
This type of data expression is more comprehensible and transparent. The distribution
or percentage of data in each area can be seen visually.

Example: The dataset counties.rds, compiled with the UScensus2010 R package,
contains demographic information for every county in the United States.

Code to implement Map visualization


Output:

Map Visualization

13. REFERENCE

• https://appsilon.com/how-i-built-an-interactive-shiny-dashboard-in-2-days-without-any-experience-in-r/
• https://towardsdatascience.com/end-to-end-dashboard-in-r-shiny-app-64c40d0351d8
• https://avikarn.com/2020-05-26-correlation_shiny/
• https://shiny.rstudio.com/gallery/covid19-tracker.html


Unit 6
Introduction to Tableau

Table of Contents

1 Introduction
1.1 Learning Objectives
2 Tableau Features
3 What is Data Visualization?
4 History of Tableau
5 Getting Started with Tableau
5.1 Installation Tableau Desktop
6 Tableau File Types
7 Data: Joining and Blending
7.1 Data Connection with Data Sources
7.2 How to Load and display File in Tableau?
7.3 Simple Demo Graph Representation
8 Glossary
9 Terminal Questions
10 Summary
11 Concept map
12 Answers
13 Reference


1. INTRODUCTION
Tableau is a data visualization tool or business intelligence tool/software which analyzes
and displays data in a chart or report easily. It is very easy to use because it does not require
any coding skill.

Users can build and distribute an interactive and shareable dashboard, which shows the
trends, variations, and density of the data in the form of graphs and charts and tables.
Tableau can connect to files, relational and Big Data sources to acquire and preprocess data.
The software allows combining data from multiple sources and real-time collaboration,
which makes it unique. It is used by businesses, academic researchers, and many
government organizations for visual data analysis. It has also been positioned as a leader
in the Gartner Magic Quadrant for Business Intelligence and Analytics Platforms.

Tableau offers numerous appealing and distinctive features that make it a top tool for data
visualization. You can quickly get the answers to crucial queries thanks to its robust data
finding and exploration application. Tableau’s drag and drop interface makes it simple to
explore various views, merge numerous databases, and visualize any type of data. It doesn’t
need any difficult scripting. Anyone who is familiar with the business issues can solve them
by visualizing the pertinent facts. Sharing results with others is as simple as publishing to
Tableau Server after analysis.

1.1 Learning Objectives


❖ Data visualization basics.
❖ Tableau software and its applications in different industries.


2. TABLEAU FEATURES
Tableau offers solutions for various departments, industries, and data environments. The
following distinctive features allow Tableau to handle a variety of scenarios.

• Speed of Analysis – Since it does not call for a high level of programming knowledge,
any user with access to data can begin using it to extract value from the data.
• Tableau is self-sufficient in that it doesn’t require a complex software setup. Most
customers use the desktop version, which is simple to install and has all the
functionality required to begin and finish data analysis.
• The user explores and analyzes the data using visual tools including colors, trend
lines, charts, and graphs. Almost everything can be done by drag and drop, so there
is very little script to write.
• Integrate Diverse Data Sets – Tableau enables you to combine many relational,
semi-structured, and unstructured data sources in real time without incurring high
upfront integration costs.
• Tableau operates on all types of devices where data flows, regardless of architecture.
As a result, the user does not have to be concerned with specific hardware or software
requirements to use Tableau.
• Real-Time Collaboration – Tableau can filter, sort, and discuss data on the fly and
embed a live dashboard in portals like Salesforce or a SharePoint site. You can save
your view of the data and let colleagues subscribe to your interactive dashboards, so
that they see the most recent data simply by refreshing their web browser.
• Tableau Server provides a single place to manage all of the organization’s published
data sources. In one place, you can delete data sources, change permissions, add tags,
and manage schedules. Extract refreshes can be easily scheduled and managed on the
data server.


3. WHAT IS DATA VISUALIZATION?


Data visualization uses visual components like graphs, charts, and maps to graphically
portray quantitative information and data. Data visualization turns both huge and small data
sets into graphics that are simple for people to comprehend and process. Data outliers,
patterns, and trends can be easily understood with data visualization tools. The tools and
technology for data visualization are essential in the realm of big data because they allow for
the analysis of enormous amounts of data.

Top Data Visualization Tools


Tableau: Tableau offers a desktop application for visual analytics. If you don't want to
install Tableau software on your desktop, a server option lets you visualize your reports
online and on mobile devices. For those who want the server solution but don't want to
set it up themselves, a cloud-hosted service is also available. Citrix, Pandora, and Barclays
are some of Tableau's clients.

Infogram: With Infogram, even non-designers can produce powerful data visualizations for
marketing reports, infographics, social media posts, maps, dashboards, and more. Infogram
is a fully featured drag-and-drop visualization tool. The following file types can be used to
export finished visualizations: .PNG, .JPG, .GIF, .PDF, and .HTML. Additionally, interactive
visualizations are feasible and ideal for integrating into websites and applications.
Additionally, Infogram provides a WordPress plugin that streamlines the process of
integrating visualizations for WordPress users.

Power BI: A business intelligence (BI) platform called Microsoft Power BI gives non-
technical business people the means to gather, analyze, visualize, and share data. With its
strong interaction with other Microsoft products, Power BI is a versatile self-service tool that
requires little initial training. Its user interface is intuitive for Excel users.

ChartBlocks: ChartBlocks claims that data can be imported from "anywhere" using its API,
including live feeds. Although the vendor says that importing data from any source takes
only a few clicks, it is bound to be more difficult to use than programs that have automatic
modules or extensions for particular data sources. The final visualization can be heavily
customized, and the chart-building wizard helps users select the right data for their charts
before importing it. Designers can build almost any type of chart, and the output is
responsive, which is a major benefit for those who wish to embed charts into websites that
are likely to be viewed on several devices.

Datawrapper: Datawrapper was developed for adding charts and maps to news stories.
The charts and maps it produces are interactive and can be embedded on news websites.
However, its data sources are limited, and the main approach is to copy and paste data
into the tool. Once data has been imported, charts can be produced with a single click.
It supports a variety of visualizations, including column, line, and bar charts, election
donuts, area charts, scatter plots, choropleth and symbol maps, and locator maps. The
final visuals resemble those found on websites like the New York Times or Boston Globe.
In fact, publications like Mother Jones, Fortune, and The Times use its charts.

Plotly: plotly.py is an interactive, open-source, browser-based graphing library for Python.
It is a high-level, declarative charting library built on top of plotly.js, which ships with
more than 30 chart types, including financial charts, scientific charts, 3D graphs, and more.
plotly.py is MIT-licensed. Plotly graphs can be viewed in standalone HTML files and
Jupyter notebooks, or integrated into Dash applications.

RAW: RAW, also referred to as RawGraphs, operates on delimited data such as TSV or CSV
files. It acts as a bridge between spreadsheets and data visualization. Despite being a web-
based application, RawGraphs offers strong data protection and offers a variety of
unconventional and traditional layouts.

Data Visualization's Value


Because of how the human brain processes information, data visualization is crucial. It is
more pleasant to study graphs and charts than spreadsheets and reports when visualizing
vast amounts of complex data sets.

Data visualization is an efficient and rapid way to communicate ideas to everyone. It also
lets you try out a different scenario by making a small adjustment.


SELF-ASSESSMENT QUESTIONS – 1

1. Tableau is a coding language for software. True/False


2. Tableau was introduced in the year of ______________________.

4. HISTORY OF TABLEAU
Tableau was launched in 2003 by Pat Hanrahan, Christian Chabot, and Chris Stolte, who
were associated with Stanford University. The fundamental motivation for its creation was
to make the database industry interactive and comprehensive. Tableau debuted at a time
when Cognos, Microsoft Excel, and Business Objects were already well-known products.

The main features that led Tableau Software to achieve success are:
• VizQL is the language that powers it, increasing the flexibility to pull data from any
source.
• Provide the user with the ability to alter Tableau reports using a variety of visualization
tools.
• The drag-and-drop method can be used to create any complex graphs or maps.
• Tableau data visualizations can be embedded in multiple platforms.
• Real-time data analysis and visualization are both possible.


5. GETTING STARTED WITH TABLEAU


Installation Tableau Desktop
Download the free Personal Edition of Tableau Desktop from the Tableau website. To
download, you must first register with your details. After downloading, installation is
straightforward: accept the licensing agreement and specify the installation target folder.
The whole setup procedure is detailed in the steps and screenshots that follow.

1. Start installation and click on ”run” when asked

Fig 1.1: Starting installation

2. Accept the License Agreement

Fig 1.2: Accept license


3. Start Trial

Fig 1.3: Start trial

4. Provide Your Details

Fig 1.4: Details


5. Registration Complete

Fig 1.5: Installation done

6. Verify the Installation

Fig 1.6: Verification


6. TABLEAU FILE TYPES


After data processing, Tableau’s output can be saved in a variety of formats and then
delivered across several platforms. The different file types are identified by their
extensions. Their format depends on how they are produced and how they are used.

Fig 1.7: Types of Tableau files

SELF-ASSESSMENT QUESTIONS – 2

3. What are the different Tableau files?


4. Tableau was founded by______________________.


7. DATA: JOINING AND BLENDING


7.1 Data Connection with Data Sources
All readily available and widely used data sources can be connected to Tableau. It can
connect to text files, Excel files, PDF files, and more. Using its ODBC connector, it can also
establish connections to various databases. Tableau can connect to servers and web
connectors.

Tableau’s native connectors can connect to the following types of data sources:
• File Systems: Such as Microsoft Excel, CSV, etc.
• Cloud Systems: Such as Google BigQuery, Windows Azure, etc.
• Relational System: Such as Microsoft SQL Server, Oracle, DB2, etc.
• Other Sources: It uses ODBC.

Fig 1.8: Different import options

7.2 How to Load and display File in Tableau?


The steps to load and display data are as follows:
• Step 1: Select dashboard from the new dashboard option.
• Step 2: Right-click on any open tab in the workbook.
• Step 3: Select New Dashboard from the menu.


Fig 1.9: Different Views of data

7.3 Simple Demo Graph Representation

Fig 1.10: Bar Graph Comparison

Tableau Desktop
Data from several sources may be connected using Tableau Desktop to create dashboards,
stories, and workbooks. You can publish the workbooks on the Tableau website and share
all the insights with other users using the Tableau Desktop.

Without writing any code, a Tableau Desktop user can query the datasets directly. You
only need to choose a visualization, such as a chart, table, graph, or map, and select the
columns you wish to include. Additionally, Tableau Desktop creates dashboards that
combine numerous views from various data sources.

Tableau Public
This Tableau version was created with budget-conscious users in mind. The word
"Public" indicates that the workbooks created cannot be saved locally. They must be
stored on Tableau's public cloud, where anyone can access and view them.

Files stored in the cloud have no privacy, so anyone can view and download them. This
version is best suited to people who want to learn Tableau and to those who wish to share
their data with the world.

Tableau Online
Its functionality is comparable to that of Tableau Server, but the data is kept on
cloud-hosted servers that are managed by the Tableau group.

The data that is made available via Tableau Online can be stored indefinitely. Over 40 cloud-
hosted data sources, including Hive, MySQL, Spark SQL, Amazon Aurora, and many more, are
directly connected via Tableau Online.

To publish to Tableau Server or Tableau Online, workbooks must first be created in
Tableau Desktop. Tableau Server and Tableau Online can also access data from web
applications such as Google Analytics and Salesforce.com.

Tableau Server
The software is used specifically to share workbooks and visualizations produced in
the Tableau Desktop application across the organization. To share dashboards on
Tableau Server, you must first publish your work from Tableau Desktop. Only authorized
users will have access to the work once it has been uploaded to the server.

Authorized users don't necessarily need to have Tableau Server installed on their computers.
They merely need the login information in order to examine reports using a web browser.
Tableau Server's high level of security is advantageous for efficient and speedy data
exchange.

The organization's administrator has complete control over the server. Both the software
and the hardware are maintained by the organization.

Tableau Reader
We may view the visualizations and workbooks made using Tableau Desktop or Tableau
Public using the free utility Tableau Reader. Filtering the data is possible, but changes and
editing are limited. Tableau Reader has no security because anyone may use it to read
workbooks.

If you share a dashboard you have built, the recipient must have Tableau Reader to view
the file.

SELF-ASSESSMENT QUESTIONS – 3

5. Tableau is _______________Software?
6. Tableau is used to find insight from data. True/False
7. Tableau integrates _______________data sources.

8. GLOSSARY
• Visualization: finding insight from data
• Import data: Load the data into tableau for processing

9. TERMINAL QUESTIONS
1. What are the five basic Features of any data visualization Software?
Tableau offers solutions for various departments, industries, and data environments. The
following distinctive features allow Tableau to handle a variety of scenarios.

• Speed of Analysis – Since it does not call for a high level of programming knowledge,
any user with access to data can begin using it to extract value from the data.
• Tableau is self-sufficient in that it doesn’t require a complex software setup. Most
customers use the desktop version, which is simple to install and has all the
functionality required to begin and finish data analysis.
• The user explores and analyzes the data using visual tools including colors, trend
lines, charts, and graphs. Almost everything can be done by drag and drop, so there
is very little script to write.
• Integrate Diverse Data Sets – Tableau enables you to combine many relational,
semi-structured, and unstructured data sources in real time without incurring high
upfront integration costs.
• Tableau operates on all types of devices where data flows, regardless of architecture.
As a result, the user does not have to be concerned with specific hardware or software
requirements to use Tableau.
• Real-Time Collaboration – Tableau can filter, sort, and discuss data on the fly and
embed a live dashboard in portals like Salesforce or a SharePoint site. You can save
your view of the data and let colleagues subscribe to your interactive dashboards, so
that they see the most recent data simply by refreshing their web browser.
• Tableau Server provides a single place to manage all of the organization’s published
data sources. In one place, you can delete data sources, change permissions, add tags,
and manage schedules. Extract refreshes can be easily scheduled and managed on the
data server.
2. Explain Different types of files used in tableau.
After data processing, Tableau’s output can be saved in a variety of formats and then
shared across several platforms. The file extension identifies each file type. The formats
differ depending on how the files are produced and what they are used for.

3. Explain Joining and blending in tableau?


All readily available and widely used data sources can be connected to Tableau. It can
connect to text files, Excel files, PDF files, and more. Using its ODBC connector, it can also
establish connections to various databases. Tableau can connect to servers and web
connectors.

Tableau’s native connectors can connect to the following types of data sources:
• File Systems: Such as Microsoft Excel, CSV, etc.
• Cloud Systems: Such as Google BigQuery, Windows Azure, etc.
• Relational Systems: Such as Microsoft SQL Server, Oracle, DB2, etc.
• Other Sources: Any other source that can be reached through ODBC.
4. Explain Tableau and Other Visualization tools.
Data visualization uses visual components like graphs, charts, and maps to graphically
portray quantitative information and data. Data visualization turns both huge and small data
sets into graphics that are simple for people to comprehend and process. Data outliers,
patterns, and trends can be easily understood with data visualization tools. The tools and
technology for data visualization are essential in the realm of big data because they allow for
the analysis of enormous amounts of data.

Top Data Visualization Tools


• Tableau: Tableau offers a desktop application for visual analytics. If you don't want to
install Tableau software on your desktop, a server option lets you view
your reports online and on mobile devices. For those who want the server solution
but don't want to set it up themselves, a cloud-hosted service is also available. Citrix,
Pandora, and Barclays are some of Tableau's clients.
• Infogram: With Infogram, even non-designers can produce powerful data
visualizations for marketing reports, infographics, social media posts, maps,
dashboards, and more. Infogram is a fully featured drag-and-drop visualization tool.
Finished visualizations can be exported in the following file types:
.PNG, .JPG, .GIF, .PDF, and .HTML. Interactive visualizations are also possible and
ideal for integrating into websites and applications. Additionally, Infogram provides a
WordPress plugin that streamlines the process of integrating visualizations for
WordPress users.
• Power BI: A business intelligence (BI) platform called Microsoft Power BI gives non-
technical business people the means to gather, analyze, visualize, and share data. With
its strong interaction with other Microsoft products, Power BI is a versatile self-service
tool that requires little initial training. Its user interface is intuitive for Excel users.

• ChartBlocks: According to ChartBlocks, data may be loaded using their API from
"everywhere," even live streams. Even though they claim that it only takes a few clicks
to import data from any source, it is undoubtedly more difficult to use than other
programs that have automatic modules or extensions for particular data sources. The
final representation produced by the software can be heavily customized, and the chart
construction wizard assists users in selecting the ideal data for their charts before
importing the data. A major benefit for data visualization designers who wish to embed
charts into websites that are likely to be viewed on several devices is that designers
may construct almost any type of chart, and the output is responsive.
• Datawrapper: Datawrapper was developed to make it easy to include charts and maps in
news reports. The maps and charts it produces can be embedded on news websites and are
interactive. However, they only have a few data sources, and the main approach is to
copy and paste data into the program. Charts can be produced after data has been
imported with only one click. They use a variety of visualizations, including choropleth
and symbol maps, column, line, and bar charts, election donuts, area charts, scatter
plots, and locator maps. The final visuals resemble those that may be found on websites
like the New York Times or Boston Globe. In fact, magazines like Mother Jones, Fortune,
and The Times use their charts.
• Plotly: plotly.py is an interactive, open-source, browser-based graphing library for
Python. It is a high-level, declarative charting library built on
top of plotly.js, which comes with more than 30 chart types, including
financial charts, scientific charts, 3D graphs, and more. Plotly is MIT-licensed software.
Plotly graphs can be viewed in standalone HTML files, Jupyter notebooks, or Dash
applications. The Plotly team can be contacted for consulting, dashboard creation, app
integration, and feature requests.
• RAW: RAW, also referred to as RawGraphs, operates on delimited data such as TSV or
CSV files. It acts as a bridge between spreadsheets and data visualization. Despite being
a web-based application, RawGraphs offers strong data protection and offers a variety
of unconventional and traditional layouts.

10. SUMMARY
Tableau offers numerous appealing and distinctive features that make it a top tool for data
visualization. You can quickly get the answers to significant queries because of its robust
data finding and exploration application. Any data may be visualized using Tableau's drag
and drop interface.

Explore various perspectives, and even seamlessly integrate numerous databases. It doesn't
need any difficult scripting. Anyone who is familiar with the business issues can solve them
by visualizing the pertinent facts. Sharing results with others is as simple as publishing to
Tableau Server after analysis.

11. CONCEPT MAP

[Concept map: Tableau, with branches Tableau Types, Tableau for Visualization, Tableau File Types, and Tableau Features]

11. ANSWERS
Self-Assessment Questions
1. False
2. 2003
3. Workbooks, Bookmarks, Packaged Workbooks
4. Chris Stolte
5. data visualization
6. True
7. Multiple

12. REFERENCE
• Practical Tableau: 100 Tips, Tutorials, and Strategies from a Tableau Zen Master
• Tableau Your Data! Fast and Easy Visual Analysis with Tableau Software


Unit 7
Editing, Building Views and Formatting

Table of Contents

1 Introduction
1.1 Learning Objective
2 Edit Data Sources and Extracts (Figs 1-8; SAQ 1)
3 Replace and Metadata (Figs 9-15; SAQ 2)
4 Building Views and Formatting Views (Figs 16-29; SAQ 3)
5 Case Studies
6 Terminal Questions (Figs 30-33)
7 Answers
8 Summary
9 Glossary
10 Concept Map
11 Reference

1. INTRODUCTION
In Tableau Cloud or Tableau Server, you can edit a view if the Edit button is visible when
you open it. Depending on your permissions and access level, you can:
• Edit a workbook that has already been published, including its worksheets for views,
dashboards, and stories.
• Create and edit a new worksheet from a published data source.
• Edit an existing workbook and add worksheets, either on the web or by opening the
workbook in Tableau Desktop.
• Connect to several published data sources while editing. See Connect to Published Data
Sources while Web Editing for more information.

1.1 Learning Objective


❖ Edit Data sources and Extracts
❖ Replace and Metadata
❖ Building Views and Formatting Views

2. EDIT DATA SOURCES AND EXTRACTS


Tableau can connect to all of the most popular data sources. Tableau's native connectors can
connect to the data sources listed below.
• File formats such as CSV, Excel, and others.
• Relational databases such as Oracle, SQL Server, DB2, and others.
• Cloud computing platforms such as Windows Azure, Google BigQuery, and others.
• Other ODBC-enabled sources

Live Connection
The Connect Live feature is used to analyse data in real time. Tableau connects to a real-time
data source and continues to read the data in this case. As a result, the analysis result is up
to the second, and the most recent changes are reflected in the result. However, the source
system is burdened because it must continue to send data to Tableau.

In-Memory
Tableau can also process data in-memory by caching it in memory and not connecting to the
source while analysing it. Of course, the amount of data cached will be limited by the amount
of memory available.
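
The difference between the two connection modes can be sketched in a few lines of Python. This is only an illustration, not Tableau code: the classes `LiveConnection` and `InMemoryConnection` and the list `sales` are hypothetical stand-ins for a data source and the two ways of reading it.

```python
class LiveConnection:
    """Re-reads the source on every query, so results are always current."""

    def __init__(self, source):
        self.source = source  # hypothetical: a list standing in for a database table

    def total(self):
        return sum(self.source)


class InMemoryConnection:
    """Copies the source once and answers every query from the cached copy."""

    def __init__(self, source):
        self.cache = list(source)  # snapshot taken when the connection is opened

    def total(self):
        return sum(self.cache)


sales = [100, 250]
live = LiveConnection(sales)
cached = InMemoryConnection(sales)

sales.append(50)  # the source changes after both connections were opened

live_total = live.total()      # includes the new row
cached_total = cached.total()  # still reflects the snapshot
print(live_total, cached_total)
```

The live connection pays for its freshness by touching the source on every query, while the cached copy is cheaper to query but can become stale and is bounded by available memory.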

Combine Data Sources


Tableau can connect to multiple data sources at once. For example, you can define multiple
connections in a single workbook to connect to a flat file and a relational source. This is used
in data blending, which is a very unique Tableau feature.

Tableau data extraction creates a subset of data from the data source. This is useful for
improving performance by using filters. It also aids in applying Tableau features to data that
may not be available in the data source, such as finding distinct values in the data. The data
extract feature, on the other hand, is most commonly used to create an extract to be stored
on the local drive for offline access by Tableau.

The steps for working with data extracts are as follows:

1. Creating an Extract
To extract data, select Data > Extract Data from the menu. The dialog provides several
options, such as limiting the number of rows extracted and deciding whether to
aggregate data for dimensions.

2. Applying Extract Filters
To extract a subset of data from the data source, create filters that return only the
relevant rows. For example, when extracting the Sample Superstore data set, choose Select
from list in the filter option and check the boxes for the values you want to pull from
the source.

3. Adding New Data to the Extract
To add more data to an existing extract, use the Data > Extract > Append Data from File
option. Browse to the data file and click OK to finish. The number and
type of columns in the file must match the existing data.

4. Extract History
You can check the history of an extract to see how many times it has been run and when.
To do so, go to the Data > Extract > History menu.
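
The four steps above can be imitated in plain Python to make the logic concrete. This is a rough sketch under stated assumptions, not Tableau's implementation: the helper names `create_extract` and `append_to_extract`, the `history` list, and the sample rows are all invented for illustration.

```python
from datetime import datetime


def create_extract(rows, keep=None, limit=None):
    """Return a subset of rows: apply an optional filter, then an optional row limit."""
    selected = [row for row in rows if keep is None or keep(row)]
    return selected[:limit] if limit is not None else selected


def append_to_extract(extract, new_rows):
    """Append rows only if their columns match those of the existing extract."""
    if extract and new_rows:
        expected = set(extract[0].keys())
        for row in new_rows:
            if set(row.keys()) != expected:
                raise ValueError("columns of the new data do not match the extract")
    return extract + new_rows


# Hypothetical source data standing in for a table such as Sample Superstore.
source = [
    {"Region": "East", "Sales": 120},
    {"Region": "West", "Sales": 300},
    {"Region": "East", "Sales": 80},
    {"Region": "South", "Sales": 45},
]
history = []  # one entry per extract operation: (action, timestamp, row count)

# Steps 1 and 2: extract with a filter and a row limit.
extract = create_extract(source, keep=lambda r: r["Region"] == "East", limit=35)
history.append(("extract", datetime.now(), len(extract)))

# Step 3: append new rows whose columns match the extract.
extract = append_to_extract(extract, [{"Region": "East", "Sales": 60}])
history.append(("append", datetime.now(), len(extract)))

# Step 4: review the history.
for action, when, count in history:
    print(action, when.isoformat(timespec="seconds"), count)
```

Appending a row with a different set of columns raises an error in this sketch, which mirrors the requirement that the appended file match the existing data.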

Practical steps:
You can change the workbook's data source at any time during your
analysis.

How to edit the data source


1. From the Data menu, choose a data source and then select Edit Data Source.
2. Make the necessary modifications to the data source on the data source page.

Fig 7.1: Edit Data Source

You might edit the data source in order to:


1. Combine (join) your data
2. Connect to a custom SQL query (Tableau Desktop)
3. Use a stored procedure (Tableau Desktop)

How to Extract Data


1. Go to the sheet tab,
2. Click on Data
3. Select Extract Data

Fig 7.2: Edit Data Source from data

Here, you can extract data from a single table or many tables. You can also extract data based
on the number of rows.

Fig 7.3: Extract specific data from dataset

You can also choose how many rows of the dataset to extract, as shown in
the figure below.

Fig 7.4: Extract 35 rows from dataset

Fig 7.5: Save the Extracted Data file

After extraction, the data source symbol changes to show that an extract is in use.

Fig 7.6: Extracted data with original data

Fig 7.7: Use Extracted Data

You can now choose Extract and then History to view the extract history.

Fig 7.8: History of Extract

SELF-ASSESSMENT QUESTIONS – 1

1. A powerful feature of Tableau is _____________.


2. _______________is the Tableau File Extension.

3. REPLACE AND METADATA


Tableau captures the metadata details of the data source, such as the columns and data types,
after connecting to it. This is what is used to generate the dimensions, measures, and
calculated fields that are used in views. You can browse the metadata and modify some of its
properties to meet your specific needs.

• Examining the Metadata:


Tableau displays all possible tables and columns in a data source after connecting to it.

• Changing the Data Type


If necessary, you can change the data type of some of the fields. Tableau may fail to recognize
the data type from the source depending on the nature of the source data. In such cases, you
can change the data type manually.

• Renaming and Hiding


The renaming option allows you to change the names of the columns. You can also hide a
column so that it does not appear in your data view. You can access these options by
selecting the data type icon in the metadata grid.

• Column Alias
Each column in the data source can be given an alias, which aids in understanding the
column's nature.
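
The metadata operations described above (changing a data type, renaming, hiding, and aliasing) can be modelled with a small Python sketch. The `Field` class and the column names `ord_dt`, `amt`, and `row_id` are hypothetical; they only illustrate what each operation changes and do not reflect how Tableau stores metadata.

```python
from dataclasses import dataclass


@dataclass
class Field:
    name: str
    data_type: str       # for example "string", "number", or "date"
    hidden: bool = False
    alias: str = ""

    def label(self):
        """Name shown to the user: the alias if one is set, otherwise the column name."""
        return self.alias or self.name


# Hypothetical metadata captured after connecting to a source.
fields = {
    "ord_dt": Field("ord_dt", "string"),
    "amt": Field("amt", "number"),
    "row_id": Field("row_id", "number"),
}

fields["ord_dt"].data_type = "date"    # change a data type that was detected wrongly
fields["amt"].alias = "Sales Amount"   # give a column a more descriptive alias
fields["row_id"].hidden = True         # hide a column from the data view

# Rename a column: move the entry to the new key and update its name.
fields["order_date"] = fields.pop("ord_dt")
fields["order_date"].name = "order_date"

visible = [f.label() for f in fields.values() if not f.hidden]
print(visible)
```

None of these operations touch the underlying data; they only change how the columns are described and presented.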

Practical steps with example:


Drag and drop fields from the data window onto the worksheet's cards and shelves to create
a data view.

Instead of creating views manually by dragging and dropping fields, you can use Show Me
to build them automatically. You might do this:
1) To understand the data better.
2) To save time.

Fig 7.9: View of data

Fig 7.10: Data tab to replace Data Source

Fig 7.11: Replace Data Source

Fig 7.12: Select other Data Source

As the figure shows, the fields on rows and columns are no longer recognized after the
data source is changed.

Fig 7.13: Rows and Columns not Identified

You can now fix the references as required by going to the affected
dimension or measure.

Fig 7.14: Change the references

Fig 7.15: After Changing the references

SELF-ASSESSMENT QUESTIONS – 2

3. Can Data Blending be used to perform all kinds of joins? YES/NO


4. Is it possible to aggregate dimensions? YES/NO

4. BUILDING VIEWS AND FORMATTING VIEWS


A custom data view is used to supplement standard data views with additional features,
allowing the view to provide different types of charts for the same underlying data. You can,
for example, drill down a dimension field that is part of a pre-defined hierarchy to obtain
additional values of the measures at a different granularity. The following are some of the
most commonly used and important custom data views provided by Tableau.

Drill Down View


You usually need to know the result of analysis for the next or previous level of aggregation
for dimension fields that are part of a hierarchy. For example, if you know the result for a
quarter, you may be curious about the results for each month in that quarter, and you may
even require the result for each week. This is an example of drilling down to a finer level of
granularity from existing dimensions. Right-click a table header and select Drill Down from
the context menu to drill down and up for individual dimension members in a hierarchy.
Consider a bar chart where the dimension category is in the columns shelf and the measure
Sales is in the rows shelf.
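
Drilling down amounts to re-aggregating the same measure at a finer level of the hierarchy. The following Python sketch shows the idea with a two-level Quarter to Month hierarchy; the `aggregate` helper and the sample rows are assumptions made for illustration only.

```python
from collections import defaultdict


def aggregate(rows, level):
    """Sum the Sales measure for each member of the given hierarchy level."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[level]] += row["Sales"]
    return dict(totals)


# Hypothetical rows with a Quarter > Month hierarchy.
rows = [
    {"Quarter": "Q1", "Month": "Jan", "Sales": 100},
    {"Quarter": "Q1", "Month": "Feb", "Sales": 150},
    {"Quarter": "Q1", "Month": "Mar", "Sales": 50},
    {"Quarter": "Q2", "Month": "Apr", "Sales": 200},
]

by_quarter = aggregate(rows, "Quarter")  # coarse level

# Drill down into Q1: keep only its rows and aggregate one level lower.
q1_rows = [row for row in rows if row["Quarter"] == "Q1"]
q1_by_month = aggregate(q1_rows, "Month")

print(by_quarter)
print(q1_by_month)
```

The monthly values for Q1 add up to the quarterly total, which is a quick consistency check when moving between levels.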

Swapping Dimensions
By swapping the positions of the dimensions, you can create a new view from an existing
one. This has no effect on the values of the measures, but it does change their position.
Consider a view for analyzing Profit for each year for each segment and product category.
You can drag the vertical line at the end of the category column to the segment column by
clicking and dragging it. The following screenshot depicts this action.

Data Joining
Data joining is a common requirement in all types of data analysis. You may need to join data
from multiple sources or from different tables within the same source. Tableau includes the
ability to join tables by using the data pane found under Edit Data Source in the Data menu.

Data blending
Tableau's Data Blending feature is extremely powerful. It is used when you want to analyze
related data from multiple data sources in a single view. Consider the following scenario:
Sales data is stored in a relational database, and Sales Target data is stored in an Excel
spreadsheet. To compare actual sales to target sales, you can blend the data using
common dimensions to access the Sales Target measure. The two sources used in data
blending are called the primary and secondary data sources. A left join is formed between
them, keeping all data rows from the primary source and only the matching data rows
from the secondary data source.
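
The left-join behaviour of blending can be sketched in Python as follows. This is a simplified model, not Tableau's engine: the `blend` function is hypothetical, and it assumes that the secondary source is first aggregated on the common dimension before being matched to the primary rows.

```python
from collections import defaultdict


def blend(primary, secondary, key, measure):
    """Left join: keep every primary row and attach the matching secondary measure."""
    # Assumption: the secondary source is aggregated on the linking field first.
    aggregated = defaultdict(int)
    for row in secondary:
        aggregated[row[key]] += row[measure]

    blended = []
    for row in primary:
        merged = dict(row)
        merged[measure] = aggregated.get(row[key])  # None when there is no match
        blended.append(merged)
    return blended


# Hypothetical primary source (relational database) and secondary source (Excel sheet).
sales = [
    {"Region": "East", "Sales": 500},
    {"Region": "West", "Sales": 300},
    {"Region": "North", "Sales": 120},
]
targets = [
    {"Region": "East", "Target": 450},
    {"Region": "West", "Target": 350},
    {"Region": "South", "Target": 200},
]

result = blend(sales, targets, "Region", "Target")
for row in result:
    print(row)
```

In this sketch North keeps its sales but has no target, and South, which exists only in the secondary source, does not appear at all. That is the defining behaviour of a left join.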

Practical steps are as follows:


After altering the reference, the names of the columns and rows are identified, and the graph
is displayed.

Fig 7.16: Graph enabled

You can place multiple fields on the Columns shelf, depending on the visualization.

Fig 7.17: Multiple attributes on the axis

In addition to this, you can make a folder for a certain kind of work. For instance, you can
make a folder for time-related information.

Fig 7.18: Create a Folder for Specific attributes

You can also change the data type, as shown below.

Fig 7.19: Change the type of data

Self-Assessment Questions - 3
5. What Type of Join Is Used in Blending?
You can now drag several measures and dimensions (attributes) to the axis in columns or
rows.
When you select an attribute on an axis, Tableau recommends in the Show Me options the
types of visual that fit the data best, whether a table, pie chart, bar graph, and so on.
To change the color of the visuals, select Color from the default properties.
You can select an aggregation type (min, max, sum, or count) for the visuals.
To add labels to the visuals, drag the relevant attributes to Text on the Marks card.

6. Explain types of data sources supported in tableau.


Live Connection
The Connect Live feature is used to analyse data in real time. Tableau connects to a real-time
data source and continues to read the data in this case. As a result, the analysis result is up
to the second, and the most recent changes are reflected in the result. However, the source
system is burdened because it must continue to send data to Tableau.
In-Memory
Tableau can also process data in-memory by caching it in memory and not connecting to the
source while analysing it. Of course, the amount of data cached will be limited by the amount
of memory available.
Combine Data Sources
Tableau can connect to multiple data sources at once. For example, you can define multiple
connections in a single workbook to connect to a flat file and a relational source. This is used
in data blending, which is a very unique Tableau feature.

7. Explain types of metadata operations supported in tableau?


Tableau captures the metadata details of the data source, such as the columns and data types,
after connecting to it. This is what is used to generate the dimensions, measures, and
calculated fields that are used in views. You can browse the metadata and modify some of its
properties to meet your specific needs.

• Examining the Metadata:


Tableau displays all possible tables and columns in a data source after connecting to it.

• Changing the Data Type


If necessary, you can change the datatype of some of the fields. Tableau may fail to recognize
the data type from the source depending on the nature of the source data. In such cases, we
can manually change the data type.

• Renaming and concealment


The renaming option allows you to change the names of the columns. You can also hide a
column so that it does not appear in your data view. By selecting the data type icon in the
metadata grid, you can access these options.

• Column Alias
Each column in the data source can be given an alias, which aids in understanding the
column's nature.

8. Explain view in tableau and which type of views are supported in tableau
A custom data view is used to supplement standard data views with additional features,
allowing the view to provide different types of charts for the same underlying data. You can,
for example, drill down a dimension field that is part of a pre-defined hierarchy to obtain
additional values of the measures at a different granularity. The following are some of the
most commonly used and important custom data views provided by Tableau.

1. Drill Down View


You usually need to know the result of analysis for the next or previous level of aggregation
for dimension fields that are part of a hierarchy. For example, if you know the result for a
quarter, you may be curious about the results for each month in that quarter, and you may
even require the result for each week. This is an example of drilling down to a finer level of
granularity from existing dimensions. Right-click a table header and select Drill Down from
the context menu to drill down and up for individual dimension members in a hierarchy.
Consider a bar chart where the dimension category is in the columns shelf and the measure
Sales is in the rows shelf.

2. Swapping Dimensions
By swapping the positions of the dimensions, you can create a new view from an existing
one. This has no effect on the values of the measures, but it does change their position.
Consider a view for analyzing Profit for each year for each segment and product category.
You can drag the vertical line at the end of the category column to the segment column by
clicking and dragging it. The following screenshot depicts this action.

Fig 7.20: Change color of visuals

Fig 7.21: Change Aggregation type

Fig 7.22: Changes saved to Data Sources

Fig 7.23: Add label to visuals

You can view a map for different regions here. Use the Show Me option to change the visual type.

Fig 7.24: Map Visual for Data

Fig 7.25: Horizontal bar Visual for Data

By clicking on the label and selecting "Show Mark Label" you may alter the label's size and
color as desired.

Fig 7.26: Horizontal bar Visual for Data

By clicking on the color and selecting "font and shading" you may alter the font's type and
color as desired.

Fig 7.27: Changing color of visuals

By clicking on the filter and selecting Top, you can keep only the top 3 or 5 records for
the visual, according to a condition.
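
A Top N filter can be thought of as "aggregate, rank, then keep the first N". The Python sketch below illustrates this with a sum aggregation; the `top_n` helper and the sample data are hypothetical and are not part of Tableau.

```python
from collections import defaultdict


def top_n(rows, dimension, measure, n):
    """Sum the measure per dimension member and return the n largest totals."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row[measure]
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Hypothetical sales rows.
rows = [
    {"Category": "Furniture", "Sales": 200},
    {"Category": "Technology", "Sales": 500},
    {"Category": "Furniture", "Sales": 100},
    {"Category": "Office Supplies", "Sales": 150},
    {"Category": "Toys", "Sales": 50},
]

top3 = top_n(rows, "Category", "Sales", 3)
print(top3)
```

Changing the aggregation (for example to a count or a maximum) or the value of `n` changes which members survive the filter, just as changing the condition does in the filter dialog.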

Fig 7.28: Filter Selection

Fig 7.29: Filter Selection for TOP records

5. CASE STUDIES
J.P. Morgan Chase & Co. is a leading multinational bank and financial services company based
in the USA. It is the largest investment bank in the US and sixth largest in the world.

The challenge
The amount of data produced by the company grew as the business expanded through
successful mergers and acquisitions. This led to
the need for a robust self-service data governance and analytics solution. Initially, JPMC had
an IT-owned analytics set-up, which it planned to change into a business-owned system.
The IT department used standard tools like Excel and SQL Server for data analytics and
reporting. But these tools eventually proved inefficient, as data replication caused
confusion and obscurity in data governance.

After this, the company switched to BI tools Cognos and Business Objects, but those tools too
were not able to meet the company’s requirement. What the company needed was a data
governance tool taking care of business aspects like data access, data analysis, IT governance,
and business priorities of the team.

Implementation
JPMC deployed Tableau as its core self-service data governance tool at an enterprise level.
With the help of its Center of Excellence (COE) team, JPMC eased Tableau
adoption for its users by recruiting a team of 8 trainers who trained 1200 new developers
and analysts to work on Tableau. The training program involved online as well as classroom
learning sessions. The initial user base was 400 Tableau Server users in 2011, which has
grown into a family of 30,000 users today.

Skilled Tableau business users work in teams across different departments and sectors of
JPMC. Currently, there are about 500 teams using Tableau for analytical and data governance
purposes.

The changes
Deploying Tableau for data governance in JP Morgan Chase was a successful step as it
brought many positive changes.

• The marketing operations team could analyze customer data to track customer journey
and preferences which helped them in deciding website design, promotional materials
and launching new products like Chase mobile app.
• Financial and branch managers used Tableau apps to analyze customer data in order
to provide better customer banking experience.
• Tableau has given self-service analytical capabilities to a whole lot of people taking care
of different operations such as traders, risk analysts, compliance team, operations
analysts, sales analysts, etc.
• JPMC was able to reduce manual reporting time. Earlier, the team took months to create
reports but with the help of Tableau, detailed reports were made in weeks. Tableau has
saved a lot of the company’s valuable time and has shifted the focus from report
generation to analyzing the reports, gaining meaningful insights into data and efficient
decision-making.
• Tableau has enabled JPMC to establish stronger customer relationships by integrating
customer’s data with the line of business aspects such as products, marketing, services
and creating common data sets. Thus, Tableau acts as a front-end tool to maintain
customer relations.
• The marketing teams at JPMC can analyze population data using Tableau to determine
optimal targets to launch new campaigns.
• Customer satisfaction is monitored by analyzing customer activities and behavior using
call center metrics and website analytics.
• JPMC’s retail branches also use Tableau dashboards to gain a better understanding of
the market and improve their business.
• Tableau successfully created a bridge between IT and business by providing apps and
dashboards for risk analysis and compliance data usage. It also operates in line with
government regulations.

6. TERMINAL QUESTIONS
1. How to extract Data in Tableau?
Tableau creates a subset of data from the data source using data extraction. Because filters
can be applied, this helps to improve performance. It also helps in applying Tableau features
to data that may not be available in the data source, such as finding distinct values in the
data. Most commonly, though, the data extract feature is used to produce an extract that is saved
to the local drive for offline access by Tableau.

By selecting Data > Extract Data from the menu, data can be extracted. It offers a variety of
options, including imposing constraints on the number of rows to be extracted and choosing
whether to aggregate data for dimensions. The Extract Data option is displayed on the
following screen.

You can design filters that will only return the pertinent rows if you want to extract a subset
of data from the data source. Create an extract using the Sample Superstore data set as our
example. Select the list under the filter option, then check the box next to the value you want
to pull from the source when the data is pulled.

2. How to edit metadata in Tableau?


Tableau records the metadata information of the data source, including the columns and
their data types, after connecting to the data source. The dimensions, measures, and
calculated fields used in views are produced using this. You can look through the metadata
and modify some of its properties according to your needs.

When Tableau connects to a data source, it displays all the tables and columns that are
present in the source. To check the metadata, consider the source "Sample - Coffee shop."
Select "Connect to a data source" from the Data menu. Browse for the "Sample - Coffee shop"
MS Access file. Drag the Product table onto the data canvas. The following screen, which
displays the column names and their data types, appears after selecting the file.

3. Explain Building views and formatting views in tableau.


4.

Unit 7: Editing, Building Views and Formatting 29


DADS304: Visualization Manipal University Jaipur (MUJ)

Fig 7.30: Graph enabled

5. You can take multiple column in the column based on visualization.

6.

Fig 7.31: Multiple attributes on the axis

7. In addition to this, you can make a folder for a certain kind of work. For instance, you
can make a folder for time-related information.


8.

Fig 7.32: Create a Folder for Specific attributes

9. You can also change the data type of a field, as shown below.

10.

Fig 7.33: Change the type of data


7. ANSWERS
Self-Assessment Questions
1. Data blending
2. twbx
3. YES
4. No
5. Left join

8. SUMMARY
Tableau can connect to all the most popular data sources. Its native connectors support the
following: file types, including Excel, CSV, and others; relational databases, including
Oracle, SQL Server, DB2, and others; and cloud platforms, including Windows Azure, Google
BigQuery, and others.

9. GLOSSARY
• Visualization: finding insight from data
• Import data: Load the data into Tableau for processing
• Replace Data: Change the data source

10. CONCEPT MAP

[Concept map: "Changes in Tableau views" branches into Editing Views, Building Views, and
Formatting Views]

Unit 7: Editing, Building Views and Formatting 32


DADS304: Visualization Manipal University Jaipur (MUJ)



Unit 8
Mapping, Sorting & Filters

Table of Contents

SL No   Topic                                                    Fig No / Table / Graph                     SAQ / Activity
1       Introduction                                             -                                          -
1.1     Objectives                                               -                                          -
2       Latitude and Longitude                                   -                                          1
3       Steps to simple map Geographical data with Tableau       1, 2, 3, 4, 5                              2
4       Steps to format maps                                     6, 7, 8, 9                                 3
5       Method to use custom geocoding feature within tableau    10, 11, 12, 13, 14                         4
6       Data Blending                                            -                                          -
7       Sorting                                                  15, 16, 17, 18, 19, 20, 21, 22, 23, 24     5
8       Sorting across multiple dimension                        25, 26, 27, 28, 29, 30, 31, 32, 33, 34     6
9       Steps to create and use filters                          35, 36, 37, 38, 39, 40                     -
10      Quick Filters                                            41, 42                                     7
11      Important points while using Filters                     43, 44, 45, 46, 47                         -
12      Self Assessment Questions                                -                                          8
13      Summary                                                  -                                          -
14      Glossary                                                 -                                          -
15      Caselet                                                  -                                          -
16      References                                               -                                          -
17      Conceptual Map                                           48                                         -


1. INTRODUCTION
Maps are a powerful Tableau feature that can show geographical data together with data
insights at a single glance. Plotting any data requires coordinate points. The coordinates are
chosen so that one number represents the vertical position and the other represents the
horizontal position. A common choice of coordinates for mapping is latitude, longitude and
elevation. To specify a location on a two-dimensional map, Tableau requires a map projection.
The map image provides the background, and the coordinates are plotted on top of it. This unit
also explains how to exclude certain values, or a range of values, for a particular field, and
how to sort data.

1.1 Learning Objectives


After studying this unit, you will be able to:

❖ Explain features of Tableau


❖ Map, sort and filter within Tableau
❖ Plot latitude and longitude


2. LATITUDE AND LONGITUDE


Latitude is the angular distance, in degrees, minutes and seconds, of a point north or south
of the equator, whereas longitude is how far east or west the point is from the prime
meridian. If the data set contains latitude and longitude fields, Tableau can plot them on a
map automatically. If it instead contains geographical fields such as city or country, Tableau
determines their coordinates from those fields and plots them on the map using the generated
values.
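
Coordinates recorded in degrees, minutes and seconds usually have to be converted to signed
decimal degrees before they can be stored in latitude and longitude fields. The following
Python sketch shows one way to do that conversion. It is only an illustration and not part of
Tableau; the function name dms_to_decimal and the sample values are assumptions made for this
example.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus a hemisphere letter to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    value = degrees + minutes / 60 + seconds / 3600
    # By convention, south latitudes and west longitudes are negative.
    return -value if hemisphere in ("S", "W") else value


# Sample values chosen purely for illustration.
latitude = dms_to_decimal(26, 54, 0, "N")
longitude = dms_to_decimal(30, 30, 0, "W")
print(latitude, longitude)
```

The sign convention matters when plotting: a west longitude entered as a positive number would
place the point in the wrong hemisphere.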

SELF-ASSESSMENT QUESTIONS – 1

1. Data values available for visualization are ________________________

3. STEPS TO SIMPLE MAP GEOGRAPHICAL DATA WITH TABLEAU


Step 1: Tableau automatically recognizes any field containing geographical data. There is a
globe icon next to these geo fields. Tableau also generates latitude and longitude coordinates
for this geographical data and lists them in the Measures area. Right-click the field, select
Geographic Role, and choose the closest match for the geographical information that the field
contains.


Fig 8.1 MAP GEOGRAPHICAL DATA WITH TABLEAU

Step 2: The choices include Area Code, CBSA, City, Congressional District, Country, County,
State and Zip Code; choose the role that matches the content of the field. To plot the states,
double-click the State geo field.

Fig 8.2 MAP GEOGRAPHICAL DATA WITH TABLEAU

Step 3: Tableau plots the field on a map by automatically placing the generated Longitude and
Latitude fields on the Columns and Rows shelves respectively. Alternatively, place these fields
manually by dragging Longitude and Latitude from the Measures area to the Columns and Rows
shelves. At the bottom right, a warning indicates 16 unknown locations. Double-click this
warning.


Fig 8.3 MAP GEOGRAPHICAL DATA WITH TABLEAU

Step 4: Tableau does not recognize these locations, so click Edit Locations. The problem is
that the default country is set to India, whereas the data set belongs to the US. Change India
to US and click OK. All of the states are then recognized and plotted on the map.


Fig 8.4 MAP GEOGRAPHICAL DATA WITH TABLEAU

Step 5: By default, Tableau uses a symbol map for geographical data, with a circle mark
indicating each location. You can change the circle to another shape by choosing a different
mark type, and you can change the color as well. Drag Sales to the map. The sales values now
appear on the map as a color gradient. Switch the mark type to Filled Map. To make analysis
easier, label the states by dragging State onto Label. Florida here has the highest sales,
compared with Tennessee, which has lower sales.

Fig 8.5 MAP GEOGRAPHICAL DATA WITH TABLEAU

SELF-ASSESSMENT QUESTIONS – 2

2. The best features of Tableau include all of the following except _____________.


3. By default, Tableau displays measures over time as a ____________.


4. STEPS TO FORMAT MAPS


STEP 1: Go to the Map menu and click Map Options. Set the options to allow users to pan and
zoom, or even to search the map. By default, all Tableau maps are grayscale within the map
layers. Set the style to format the map as dark or light, and control the washout here as well.

Fig 8.6 Formatting maps

STEP 2: Change the style to dark, depending on how you would like to display the data on your
storyboard. The washout adds transparency, which you control by moving the slider. Select
whether to repeat the background. When the Repeat Background option is selected, the
background map may show the same area more than once. The map layers allow you to mark
points of interest. You can choose to include coastlines or state borders.


Fig 8.7 Formatting maps

STEP 3: Another useful feature of these layers is the data layer. Tableau comes with a set of
predefined data layers that show census information. Choose the Per Capita Income data layer,
select By State, and pick a color scheme. The data layer is added to the map, along with a
legend that explains the colors.


Fig 8.8 Formatting maps

STEP 4: Change the filled map back to symbols by choosing Circle on the Marks card, and use
Size so that the size of each circle represents sales. Per capita income remains color-coded
in the background. Change the color of the symbols for readability: click Sales, choose Edit
Colors, change the color to orange, and click Apply. The data is now much easier to read and
interpret.

Fig 8.9 Formatting maps

SELF-ASSESSMENT QUESTIONS – 3

4. Data can be visualized using___________


5. Data visualization is also an element of the broader _____________.


5. METHOD TO USE CUSTOM GEOCODING FEATURE WITHIN TABLEAU


STEP 1: Tableau requires a set of geographic roles to geocode your data automatically and
create maps. Tableau automatically recognizes countries, cities, zip codes, and so on. If
geographic data does not fit a built-in role, you need to create a new role and assign it to
the geo data field. For example, Tableau does not recognize a street address as a geographic
role, so no built-in role matches the street addresses of the stores. Create a custom
geographic role by preparing a CSV file that maps latitude and longitude to the store
addresses across these cities.

Fig 8.10 CUSTOM GEOCODING WITHIN TABLEAU

STEP 2: Using the list of store addresses, find the latitude and longitude of each address.
Several websites, such as latlong.net, can help with this task. The result is a CSV file
containing the stores with their latitude and longitude. Note that the file must be saved with
the .csv extension. The name of the new role (the geocode) should be the column header.


STEP 3: The Latitude and Longitude column names must be spelled correctly, and note that they
are case sensitive. Also make sure to include at least one decimal place when specifying the
latitude and longitude values. If the new role extends an existing geographical hierarchy
within Tableau, make sure that your import file contains a column for each level of that
hierarchy. For example, if there is an existing hierarchy of country, state and city, the CSV
file must contain those columns and then the new geographical role that is being added. Here
is a list of built-in hierarchies and the order in which they should be organized in your
import file. To add a new hierarchy in Tableau, create multiple import files, each file
representing a level in the new hierarchy. For example, suppose you have revenue by train
station. The sales will need to be organized by station, country, region and city.
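
The rules in Steps 2 and 3 can be sketched as a small Python script that builds the import
file. This is a minimal illustration under stated assumptions, not an official Tableau
template: the store records are hypothetical placeholder values, and the exact header names
expected for the existing hierarchy levels (written here as Country, State and City) as well as
the capitalized spelling of Latitude and Longitude are assumptions that should be checked
against your version of Tableau.

```python
import csv
import io

# Hypothetical store records: (country, state, city, store address, latitude, longitude).
stores = [
    ("United States", "Florida", "Miami", "100 Example St", 25.77, -80.19),
    ("United States", "Tennessee", "Nashville", "200 Sample Ave", 36.16, -86.78),
]


def build_geocoding_csv(rows):
    """Return CSV text: existing hierarchy columns first, then the new role, then coordinates."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # "Store Address" is the new geographic role, so it is used as the column header.
    writer.writerow(["Country", "State", "City", "Store Address", "Latitude", "Longitude"])
    for country, state, city, address, lat, lon in rows:
        # Fixed-point formatting guarantees at least one decimal place.
        writer.writerow([country, state, city, address, f"{lat:.4f}", f"{lon:.4f}"])
    return buffer.getvalue()


csv_text = build_geocoding_csv(stores)
print(csv_text)
```

Writing csv_text to a file with the .csv extension, in a folder that holds no unrelated files,
would give an import file of the shape described above.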

Fig 8.11 CUSTOM GEOCODING FEATURE WITHIN TABLEAU

STEP 4: To create a new hierarchy from these geographical roles, you need to create multiple
import files, each representing a level in the new hierarchy. The current example instead adds
a new role to an existing hierarchy. To import the CSV file into Tableau, click Map, choose
Geocoding, and then choose Import Custom Geocoding. Select the folder that contains the CSV
file for your new geographical role. In this case, the CSV file is located in the CSV subfolder
of the Datasets main folder.


Fig 8.12 CUSTOM GEOCODING FEATURE WITHIN TABLEAU

STEP 5: Note that all the files in the CSV folder are imported, so make sure that the folder
contains only the files relevant to the workbook. Click Import, and the custom geocoding data
is imported into the workbook. This takes a couple of minutes; once the process is complete,
the new geographical role becomes available. To assign the newly created geographic role,
right-click the field and select the role. The new role created here is Store Address. To map
this data, double-click Store Address.


Fig 8.13 CUSTOM GEOCODING FEATURE WITHIN TABLEAU

STEP 6: The data is plotted automatically using the latitude and longitude entered in the CSV
file. Add Sales so that the size of each circle represents the sales at the exact store
location. This kind of mapping is particularly useful when deciding whether to open new store
locations or close existing ones, based on the sales demand or the demographics of the
surrounding area.


Fig 8.14 CUSTOM GEOCODING FEATURE WITHIN TABLEAU

SELF-ASSESSMENT QUESTIONS – 4

6. _____________ is used to query and edit graphical settings


7. ____________ groups values of a variable into larger bins


6. DATA BLENDING
Another option for mapping location data that Tableau cannot geocode automatically is data
blending. Data blending works well if you are adding a single level of geographical
information with a latitude and a longitude. You can use any data source, unlike custom
geocoding, where you can use only text files. However, data blending does not allow you to add
the locations as new roles or create new geographical hierarchies, nor does it let you reuse
them in other workbooks. These are possible only with the Custom Geocoding option.

7. SORTING
In Tableau, you can either start analyzing the data right away or start exploring by asking
questions of your data and finding answers to them. There are multiple ways to sort data.

STEP 1: The easiest way is to use the Quick Sort option on the axis. Hovering over the axis
label brings up the Sort icon. A single click sorts the bars in descending order, and a second
click switches to ascending order. A third click clears the sort and restores the original
state. There is also an option on the Sort menu to clear sorts.

Fig 8.15 Sorting data within Tableau


Profit is not on the axis, so there is no Quick Sort option for it. Instead, right-click the
pill to sort by this measure. Choose a manual sort so that 70,000 appears first.

Fig 8.16 Sorting data within Tableau

STEP 2: Use the same approach as Quick Sort by clicking the Sort option next to the dimension.
The first click sorts in descending order, the second in ascending order, and the third clears
the sort. Note that continuous pills can be sorted using the Quick Sort option, but do not
have a Sort option in the pill drop-down.


Fig 8.17 Sorting data within Tableau

STEP 3: Discrete pills do have a Sort option in the pill drop-down, which provides the full
set of sorting options; Quick Sort is rather limited in the flexibility it offers. Click the
pill to open the drop-down menu, then click Sort. You can sort alphabetically, manually, or by
a particular field.


Fig 8.18 Sorting data within Tableau

STEP 4: Choose Profit as the field by which to sort Category. The items in the view are sorted
by their profit values, even though Profit is not present in the view. You can change the
aggregation type here and specify the sort order. Click Apply. The categories are now sorted by
profit. You can also reorder categories manually by dragging their headers up or down in the
bar chart.
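
Sorting a dimension by an aggregated field that is not shown in the view, as in Step 4, amounts
to totalling that field per category and ordering the categories by the totals. The following
Python sketch illustrates the idea on a handful of hypothetical rows; the data and the helper
name order_by_aggregate are assumptions made for this example and do not describe how Tableau
implements sorting internally.

```python
from collections import defaultdict

# Hypothetical rows: (category, sales, profit).
rows = [
    ("Handbags", 500, 40),
    ("Dining", 300, 90),
    ("Shoes", 700, 60),
    ("Dining", 200, 30),
    ("Handbags", 100, 10),
]


def order_by_aggregate(rows, value_index, descending=True):
    """Order categories by the SUM of the chosen column (1 = sales, 2 = profit)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[0]] += row[value_index]
    return sorted(totals, key=lambda category: totals[category], reverse=descending)


by_profit = order_by_aggregate(rows, value_index=2)
by_sales = order_by_aggregate(rows, value_index=1)
print("Sorted by profit:", by_profit)
print("Sorted by sales: ", by_sales)
```

The two orderings differ, which is why a bar chart of sales can look unsorted when its
categories are ordered by profit.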

Fig 8.19 Sorting data within Tableau

STEP 5: Items can also be reordered in a legend. Color the bars by Category and drag the
entries in the legend directly to order them. For example, move Dining to the first position
and Handbags to the second. The bars are reordered accordingly.


Fig 8.20 Sorting data within Tableau

7.1 Sorting Across Multiple Dimensions


STEP 1: Drag Category and Department to the view, and drag Sales to the Rows shelf. The sorting
here depends on the order of the blue pills on the Rows or Columns shelf. The default sort
always applies to the innermost dimension. Change the order of these pills to sort by a
different dimension.


Fig 8.21 Sorting data across multiple dimensions

STEP 2: Note that the sorting changes once the order of these pills changes. Click Sales to
sort; this sorts sales in descending order. Note that for the Men's department, the sales for
Accessories are lower than those for Active Wear, yet Accessories appears above Active Wear in
the view even though the view is sorted in descending order. This is because Tableau takes
into account the sales for Accessories across all departments. Because total sales for
Accessories across all departments are greater than those for Active Wear, Tableau orders the
bars this way. This is where a combined field can help.


Fig 8.22 Sorting data across multiple dimensions

STEP 3: Select Department and Category, right-click, and choose Create Combined Field.
Department and Category are combined into a single field. Drag this field to the view along
with Sales, and sort by sales. The bars are now sorted accordingly.
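
The difference between the nested sort in Step 2 and the combined-field sort in Step 3 can be
reproduced on a few hypothetical rows. The Python sketch below is an illustration of the
behaviour described above, not Tableau's actual implementation; the figures are invented for
the example.

```python
from collections import defaultdict

# Hypothetical (department, category, sales) rows.
rows = [
    ("Men", "Accessories", 100),
    ("Men", "Active Wear", 300),
    ("Women", "Accessories", 900),
    ("Women", "Active Wear", 200),
]

# Nested sort: each category is ranked by its total sales across ALL departments.
category_totals = defaultdict(int)
for _, category, sales in rows:
    category_totals[category] += sales
nested = sorted(rows, key=lambda r: (r[0], -category_totals[r[1]]))

# Combined field: each (department, category) pair is ranked on its own sales.
combined = sorted(rows, key=lambda r: -r[2])

print("Nested sort:")
for row in nested:
    print("  ", row)
print("Combined-field sort:")
for row in combined:
    print("  ", row)
```

In the nested result, Accessories is listed above Active Wear within Men even though its sales
there are lower, because its overall total is larger. The combined-field result orders every
department and category pair strictly by its own sales.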


Fig 8.23 Sorting data across multiple dimensions

STEP 4: You can also sort these pills in other ways. For example, sort Department by sales and
Category by profit. The view is then sorted by sales for Department and by profit for
Category.

Fig 8.24 Sorting data across multiple dimensions

SELF-ASSESSMENT QUESTIONS – 5

8. The complexity of the sorting algorithm measures the ________ as a function of the
number n of items to be sorted.
9. The complexity of the bubble sort algorithm is __________


8. STEPS TO CREATE AND USE FILTERS


Filtering involves deciding what should be kept in and what should be excluded from a view,
whether by category, date range, location, or a minimum value.

STEP 1: Filters help you exclude certain values, or a range of values, for any field within
your view. The easiest way to add a filter is to drag and drop a dimension onto the Filters
shelf. Drag and drop Department, select Men and Women, and click Apply. The view is now
filtered for Men and Women.

Fig 8.25 Steps to create filters

STEP 2: Take a closer look at the options presented when filtering discrete or continuous
dimensions. Drag Category to the Filters shelf. A dialog box opens with several tabs. The first
tab lists the values that can be filtered on. Select all of them or pick the values you are
interested in. There is also a search option, which is especially useful when the list of
values is too long to scroll through. Filter for Handbags, Luggage and Shoes, and click Apply.


Fig 8.26 Steps to create filters

STEP 3: The second tab is Wildcard. The Wildcard feature is very useful when you want to
include or exclude items that match a certain pattern. For example, you could find all
corporate contact emails, or eliminate personal addresses such as Gmail or Hotmail from your
list. Enter a wildcard pattern that matches the category Shoes and click Apply. The view is
filtered to show only Shoes, because that is the string that matches the wildcard pattern just
entered. There is also an option to exclude.


Fig 8.27 Steps to create filters

STEP 4: Choose Exclude, and instead of returning the values that match the pattern, the filter
prevents them from appearing in the view. Click Apply. The view now shows Luggage and
Handbags, and Shoes no longer appears.
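
The include and exclude behaviour of a wildcard filter can be sketched in Python as simple
string matching. This is an illustration only: the function name wildcard_filter, the mode
names, the sample email addresses, and the choice to ignore letter case are all assumptions
made for this sketch rather than a description of Tableau's internals.

```python
def wildcard_filter(values, pattern, mode="contains", exclude=False):
    """Keep (or, with exclude=True, drop) the values that match the pattern."""
    needle = pattern.lower()
    tests = {
        "contains": lambda v: needle in v.lower(),
        "starts_with": lambda v: v.lower().startswith(needle),
        "ends_with": lambda v: v.lower().endswith(needle),
        "exact": lambda v: v.lower() == needle,
    }
    matches = tests[mode]
    # A value is kept when its match result differs from the exclude flag.
    return [v for v in values if matches(v) != exclude]


categories = ["Handbags", "Luggage", "Shoes"]
print(wildcard_filter(categories, "sho", "starts_with"))
print(wildcard_filter(categories, "sho", "starts_with", exclude=True))

# Hypothetical addresses: drop personal Gmail accounts, keep everything else.
emails = ["a@corp.example", "b@gmail.com", "c@hotmail.com"]
print(wildcard_filter(emails, "@gmail.com", "ends_with", exclude=True))
```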

Fig 8.28 Steps to create filters


STEP 5: You can also filter for items that satisfy a particular condition. For example, list
all categories with sales greater than 1.5 million. Pick Sales and enter "greater than 1.5
million" as the criterion. Click Apply. The view now shows only those categories with sales
greater than 1.5 million. In this case, Handbags has sales of less than 1.5 million and hence
no longer appears in the view.

Fig 8.29 Steps to create filters

STEP 6: The Load button brings in the range of values that the field contains, and the formula
option lets you use calculations. The next tab is Top. It allows you to filter based on rank,
for either the top end or the bottom end of a field. Click Apply.
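
The condition filter of Step 5 and the Top filter of Step 6 both work on aggregated values. The
Python sketch below shows the logic on hypothetical sales rows; the helper names and figures
are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical (category, sales) rows.
rows = [
    ("Handbags", 900_000),
    ("Handbags", 400_000),
    ("Luggage", 1_000_000),
    ("Luggage", 800_000),
    ("Shoes", 2_000_000),
]


def totals_by_category(rows):
    """Sum sales per category."""
    totals = defaultdict(int)
    for category, sales in rows:
        totals[category] += sales
    return dict(totals)


def condition_filter(rows, minimum):
    """Keep categories whose total sales are greater than the minimum."""
    return {c: t for c, t in totals_by_category(rows).items() if t > minimum}


def top_n(rows, n):
    """Return the n categories with the highest total sales."""
    totals = totals_by_category(rows)
    return sorted(totals, key=totals.get, reverse=True)[:n]


print(condition_filter(rows, 1_500_000))
print(top_n(rows, 2))
```

With these figures, Handbags totals 1.3 million and so is dropped by the condition filter, in
the same way as in the example above.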


Fig 8.30 Steps to create filters

STEP 7: The view is filtered for the top five states by sales amount. Next, look at filtering
on measures or continuous dimensions. Drag Sales to the Filters shelf to use a measure as a
filter. When you drag a measure, you must choose the aggregation type. Choose Sum and click
Next. Specify the range by indicating a lower cut-off and an upper cut-off. The default limits
come from the database, that is, from the range of values that the field contains. In this
case, restrict the range to 5 million and click Apply.


Fig 8.31 Steps to create filters

STEP 8: Going back to the options, there are At Least and At Most, which let you specify just
a lower or just an upper limit. There is also a Special option that lets you filter on null or
non-null values. When you drag a date field to the Filters shelf, a menu pops up asking how to
filter the date. You can pick a relative date or a range of dates; the years, quarters, or
months of the date; or just specific dates.


Fig 8.32 Steps to create filters

STEP 9: Filtering on years, quarters, or months works just like filtering on dimensions, so
focus on the relative date and the range of dates. Choose Relative Date and click Next. As the
name suggests, Relative Date lets you pick, for example, the last three years, or set ranges
such as month to date. By default the anchor is dynamic and is set to the current date. You can
change it at any time to a static date of your choice.

Fig 8.33 Steps to create filters

STEP 10: Pick the last one year of data and click Apply. The view is refreshed to bring back
data for just the one year prior to 2015. The range of dates works the same way as ranges work
for measures: specify a start date and an end date.
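
The idea of an anchor date can be made concrete with a short Python sketch. It treats "last N
years" as a rolling window that ends on the anchor, which is a simplifying assumption: Tableau
may define its relative periods in terms of calendar units, so the sketch illustrates the
concept rather than reproducing Tableau's exact results. The function names and dates are
invented for this example.

```python
from datetime import date


def last_n_years(dates, n, anchor):
    """Keep dates in the rolling n-year window that ends on the anchor date."""
    try:
        start = anchor.replace(year=anchor.year - n)
    except ValueError:
        # The anchor is 29 February and the target year is not a leap year.
        start = anchor.replace(year=anchor.year - n, day=28)
    return [d for d in dates if start < d <= anchor]


def month_to_date(dates, anchor):
    """Keep dates from the first day of the anchor's month up to the anchor."""
    return [d for d in dates
            if (d.year, d.month) == (anchor.year, anchor.month) and d <= anchor]


def date_range(dates, start, end):
    """Keep dates between start and end, inclusive, like a range filter on a measure."""
    return [d for d in dates if start <= d <= end]


orders = [date(2013, 1, 10), date(2014, 7, 1), date(2015, 6, 1), date(2015, 6, 20)]
static_anchor = date(2015, 6, 15)  # a static anchor; a dynamic one would be date.today()

print(last_n_years(orders, 1, static_anchor))
print(month_to_date(orders, static_anchor))
print(date_range(orders, date(2014, 1, 1), date(2014, 12, 31)))
```

Replacing static_anchor with date.today() gives the dynamic behaviour, in which the same filter
returns different rows as time passes.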


Fig 8.34 Steps to create filters

SELF-ASSESSMENT QUESTIONS – 6

10. ________________ in business intelligence allows huge data and reports to be read in a
single graphical interface.
11. ______________function is used to create a horizontal bar chart.


9. QUICK FILTERS
STEP 1: Quick filters are a great way to add interactivity to your views. Dragging a field to
the Filters shelf is the easiest way to filter. To show a quick filter, right-click the
dimension, in this case the Department field, and click Show Filter. The quick filter is added
to the right side of the view.

Fig 8.35 Steps to create quick filters

STEP 2: To edit the filter, either its appearance or how it works, click the arrow within the
quick filter. A menu pops up. There is an option to apply this filter to either the current
worksheet or all worksheets. You can also format the filter to modify the font size, colors
and mode.


Fig 8.36 Steps to create quick filters

STEP 3: The Customize option can be used to select all, to hide or display the Search button,
or to include an Apply button. Click Show Apply; an Apply button now appears on the filter.
There is also an option to hide or show the title, or to edit it. Choose to edit the title.

Fig 8.37 Steps to create quick filters


STEP 4: Edit the title of this filter to "Select Department". The title is updated
accordingly. You can customize the display as a single-value list, a drop-down, a slider, or a
multiple-value list. There is also an option to use a wildcard filter display. Right at the
end, there is an option to display either only relevant values or all values from the
database.

Fig 8.38 Steps to create quick filters

STEP 5: Cascading filters are a set of filters in which the contents of one quick filter are
affected by the selection in a previous filter. Consider choosing a city within a particular
state. Select a state and then click the City drop-down. A long list appears, making it
difficult for the user to search and select; only the cities relevant to the selected state
should show up. To achieve this, choose Only Relevant Values from the filter's drop-down menu.

Choose Only Relevant Values for both the Department filter and the Category filter.


Fig 8.39 Steps to create quick filters

STEP 6: Filter Department for just Men and Women. Categories for the Home department no longer
appear in the Category filter, because only relevant values are shown. Cascading filters of
this kind are a great way to tidy up long lists of values, to make filters intuitive for the
end user, and to make views truly interactive. One thing to note, however, is that performance
might suffer, given that queries have to go back to the database to pull the relevant data, in
this case the categories. When using quick filters in a dashboard or a story, their placement
depends on the space available, and their usage should be in line with the actual purpose of
the dashboard.
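
The effect of showing only relevant values in a cascading filter can be sketched in Python as
a lookup of the child values that occur together with the selected parent values. The rows and
the helper name relevant_values are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical rows linking each category to its department.
rows = [
    {"Department": "Men", "Category": "Shoes"},
    {"Department": "Men", "Category": "Accessories"},
    {"Department": "Women", "Category": "Handbags"},
    {"Department": "Home", "Category": "Dining"},
    {"Department": "Home", "Category": "Luggage"},
]


def relevant_values(rows, parent_field, selected, child_field):
    """Return the child values that occur together with the selected parent values."""
    return sorted({row[child_field] for row in rows if row[parent_field] in selected})


all_categories = sorted({row["Category"] for row in rows})
relevant = relevant_values(rows, "Department", {"Men", "Women"}, "Category")

print("All values:     ", all_categories)
print("Relevant values:", relevant)
```

Each change to the parent selection requires the relevant list to be recomputed, which is the
source of the performance cost mentioned above.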


Fig 8.40 Steps to create quick filters

10. IMPORTANT POINTS WHILE USING QUICK FILTERS


STEP 1: As useful as quick filters are in adding interactivity to your view, they can affect
performance, because after you add a quick filter Tableau has to run a query against the
database to determine which values to display for the selected dimension. With more filters,
the performance overhead goes up. Dashboard filter actions can be a better option than quick
filters: they do not generate additional queries and can filter as many views on the dashboard
as you want. So, unless quick filters are necessary to convey the story of your data, use them
sparingly. There is a third way to create filters, from within the view itself: select a mark
or a group of marks and use the Keep Only or Exclude option.


Fig 8.41 Steps to create quick filters

STEP 2: Keep Only filters the view down to the selected marks. The same technique can be used
by clicking a header to filter for just that header value. Removing the filter is as easy as
dragging the pill off the shelf.


Fig 8.42 Steps to create quick filters

SELF-ASSESSMENT QUESTIONS – 7

12. ___________ plot is also described as a five-number summary plot.


13. A DataFrame in pandas is ______________

11. ADDING ELEMENTS TO FILTER SHELF


STEP 1: Consider an example of record-level filtering. Build a view with Department, Category,
State, City, and Sales. At the city level, drag Sales to the Filters shelf and choose All
Values to see how record-level filtering happens. Then do the same with an aggregate filter:
drag Sales to the Filters shelf and choose Sum. Use the same value of 20,000 here.

Fig 8.43 Adding elements to filter shelf

STEP 2: Tableau now filters out all the marks in the view, at the city level, that do not meet
this criterion. Many more marks appear in the view although the filter value is the same. This
is because an aggregate filter is used, so Tableau filters at the lowest level of granularity
in the view, here the city level, rather than on individual records. You can also filter at
the data source level. To create a data source filter, right-click the data source and choose
Edit Data Source Filters.
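
The difference between a record-level filter and an aggregate filter with the same threshold
can be seen on a few hypothetical rows. The Python sketch below is an illustration only; the
cities, figures and function names are assumptions made for this example.

```python
# Hypothetical (city, sales) records; several records can belong to one city.
rows = [
    ("Austin", 12_000),
    ("Austin", 15_000),
    ("Boston", 25_000),
    ("Boston", 3_000),
    ("Chicago", 8_000),
]


def record_level(rows, minimum):
    """Test every record first, then total whatever survives per city."""
    totals = {}
    for city, sales in rows:
        if sales >= minimum:
            totals[city] = totals.get(city, 0) + sales
    return totals


def aggregate_level(rows, minimum):
    """Total sales per city first, then test each city's total."""
    totals = {}
    for city, sales in rows:
        totals[city] = totals.get(city, 0) + sales
    return {city: total for city, total in totals.items() if total >= minimum}


print("Record-level filter:", record_level(rows, 20_000))
print("Aggregate filter:   ", aggregate_level(rows, 20_000))
```

With the same threshold of 20,000, the aggregate filter keeps more cities, because a city whose
individual records are all small can still have a large total.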


Fig 8.44 Adding elements to filter shelf

STEP 3: Note that you will not see these filters on the Filters shelf, because they are
applied to the data source; they therefore filter data in all sheets and are not restricted to
any particular sheet. There is one more type of filter, called a context filter. In general,
the filters on the Filters shelf in Tableau are completely independent of each other. A
context filter is useful if you want to apply filters in a particular order. For example,
filter for all items that have a sales value greater than 50,000 for the state of California.
First apply the condition filter for sales: drag Product ID to the Filters shelf and filter
for sales greater than 50,000.


Fig 8.45 Adding elements to filter shelf

STEP 4: Next, drag State to the Filters shelf and filter for California. Items with sales of
less than 50,000 also show up, although they were supposedly filtered out in the previous
step. This is because the condition filter is applied before the state filter, and what
appears is the California sales of all items whose total sales exceed 50,000. This is where a
context filter can help: it is logically applied before any other filters that are present. A
context filter forces each query to use a subquery restricted to the items in context. In this
case, apply a context filter for the state of California.


Fig 8.46 Adding elements to filter shelf

STEP 5: The condition of sales greater than 50,000 is applied second. You therefore see only
items with sales greater than 50,000 for California in the view. Context filters are gray in
color, and any subsequent filters are applied on top of the context filter. Although context
filters are applied before any other dimension or measure filter, note that extract filters
and data source filters, if used, are executed before context filters.
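
The effect of filter order in Steps 3 to 5 can be reproduced in Python on a few hypothetical
rows. The sketch below models the behaviour described above, in which the condition filter is
evaluated on all states unless the state filter is added to context; the product IDs, figures
and function names are assumptions made for this example.

```python
# Hypothetical (product_id, state, sales) rows.
rows = [
    ("P1", "California", 30_000),
    ("P1", "Texas", 40_000),
    ("P2", "California", 60_000),
    ("P3", "Texas", 90_000),
]


def sum_by_product(rows):
    """Total sales per product."""
    totals = {}
    for product, _, sales in rows:
        totals[product] = totals.get(product, 0) + sales
    return totals


def without_context(rows, state, minimum):
    """Condition filter sees every state; the state filter is applied independently."""
    qualifying = {p for p, total in sum_by_product(rows).items() if total > minimum}
    in_state = [r for r in rows if r[1] == state and r[0] in qualifying]
    return sum_by_product(in_state)


def with_context(rows, state, minimum):
    """State filter runs first (in context); the condition sees only that state's rows."""
    in_state = [r for r in rows if r[1] == state]
    return {p: total for p, total in sum_by_product(in_state).items() if total > minimum}


print("Without context:", without_context(rows, "California", 50_000))
print("With context:   ", with_context(rows, "California", 50_000))
```

Without the context filter, product P1 appears with California sales of 30,000 because its
total across both states exceeds the threshold. With the state filter in context, only P2
remains.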


Fig 8.47 Adding elements to filter shelf

12. SELF ASSESSMENT QUESTIONS

SELF-ASSESSMENT QUESTIONS – 8

14. For creating variable size bins we use __________

SELF ASSESSMENT QUESTIONS - ANSWERS


1. Table Calculation
2. Data is more small and fit
3. Lines
4. use of common graphics
5. data presentation architecture (DPA) discipline
6. par()
7. stem()
8. running time
9. O(n)
10. Dashboard


11. barh( y )
12. box and whisker plot
13. 2 dimensional data structure
14. Calculated Fields

13. TERMINAL QUESTIONS


1. What is TABLEAU?
Tableau is a powerful and fast data visualization tool used in the Business Intelligence (BI)
industry. It turns raw data into an easily understandable format, which makes analysis of the
data faster. Visualizations can be combined into dashboards, and these visual representations
of data can be easily understood by employees at different levels of an organization.

2. What is data visualization?

Data visualization means the graphical representation of data or information. We can use
visual objects like graphs, charts, bars, and more. Data visualization tools provide an
accessible way to see and understand the data easily.

3. List out Tableau File Extensions.

The following are some of the file extensions used in Tableau:
• Tableau Workbook (.twb)
• Tableau Data Extract (.tde)
• Tableau Datasource (.tds)
• Tableau Packaged Datasource (.tdsx)
• Tableau Bookmark (.tbm)
• Tableau Map Source (.tms)
• Tableau Packaged Workbook (.twbx): a zip file containing the .twb and external files
• Tableau Preferences (.tps)

4. What is the latest version of Tableau Desktop?
As of 7 September 2021, the latest version of Tableau Desktop was 2021.3.


5. Give a brief description of the Tableau dashboard.

A Tableau dashboard is a group of views that allows you to compare different types of data
simultaneously. Worksheets and dashboards are connected: any modification to the data is
reflected directly in the dashboard. It is an efficient way to visualize and analyze the data.

6. Define Pages Shelf in Tableau.

The Pages shelf breaks a view into a series of pages, displaying a different view on each
page. This lets you analyze how a specific field affects the rest of the data in the view.

Define shelves and sets.

Shelves: Every worksheet in Tableau has shelves such as Columns, Rows, Marks, Filters, and
Pages. By placing fields on shelves, we build the structure of the visualization. We can
control the marks in the view by including or excluding data.

Sets: A set is a custom field that defines a subset of the data based on a condition. Data is
grouped together based on that condition, and the fields responsible for the grouping are
known as sets. For example: students with grades of more than 70%.

7. Explain the limitation of context filters in Tableau.

Whenever a context filter is set, Tableau generates a temporary table that must be refreshed
each time the view is triggered. If the context filter is changed, the temporary table has to
be recomputed, so performance decreases.


13. SUMMARY
Let us recapitulate the important concepts discussed in this unit:
• To analyze your data geographically, plot your data on a map in Tableau.
• The unit explains when and why you should use a map to visualize your data.
• Based on a measure used in the view, items can be sorted in a table.
• In a table, sorting can reveal relationships between dimensions by controlling the order
in which they appear.
• By using Tableau filters, you can minimize the size of the data, clean up underlying data,
remove irrelevant dimension members, and set measures or date ranges.

14. GLOSSARY
Let us have an overview of the important terms mentioned in the unit:
Latitude: It is the angular distance in degrees, minutes and seconds of a point north or south
of the equator.

Longitude: It is the angular distance of a point east or west of the prime meridian.

Data Blending: A technique that can be used in Tableau to map location data that cannot be
automatically geocoded.

Filtering: Filtering involves deciding what should be kept in and excluded from a view, for
example by category, date range, location, or a minimum value.


15. CASELET
1. Consider the dashboard that shows order quantity, average sales, and average profit
for customers. There are three views in it. In each view, a different data source is used
as the primary data source, but they all share one field: Customer Name. Filter the view
by Customer Name. This is an interesting dashboard with a lot of great information, but
you might want to update all of the views in the dashboard at the same time by the
customer you’re analyzing. For example, maybe you want to see the average sales,
profit, and number of orders you’ve received from one of your customers, Aaron
Riggs. To do so, we can filter all three data sources on the Customer Name field.



17. CONCEPTUAL MAP

Fig 8.48


Unit 9
Other Features

Table of Contents
1 Introduction
1.1 Learning Objectives
2 Groups
2.1 Steps to Calculate Groups in Tableau
3 Calculated Fields and Parameters
4 Sets
5 Trends
6 Reference Lines and Forecast
7 Creating Calculated Fields
8 Table Calculations
9 Glossary
10 Caselet
11 Conceptual Map


1. INTRODUCTION
A Tableau group combines multiple members of a dimension into a single member in order to
create a higher-level category. Grouping members of a dimension automatically creates a new
dimension whose name is the original field name followed by (group). The original dimension
is not altered by Tableau.

1.1 Learning Objectives


After studying this unit, you will be able to:

❖ Explain features of Tableau
❖ Explain groups and calculated fields
❖ Describe the steps to create sets and calculated fields


2. GROUPS
A group is used to combine members in a field. For example, by grouping 'Furniture' and
'Office Supplies', you can aggregate their values and show the combined value in a
visualization. The procedure for grouping data in Tableau is as follows.

2.1 Steps to Calculate Groups in TABLEAU


Step 1: Right-click on the dimension ‘Category’. Click on ‘Create’ option. Select ‘Group’
option.

Fig 9.1 Creating groups in Tableau

Step 2: The ‘Create Group’ window opens. Type a name for the group, select the members to be
grouped, and click the ‘Group’ button.


Fig 9.2 Creating groups in Tableau

Step 3: In the Edit Group window, Tableau creates a group containing ‘Furniture’ and ‘Office
Supplies’. Click OK to create the group.


Fig 9.3 Creating groups in Tableau

Step 4: A group named Category (Group) is created and added to the dimension list. It can be
used to visualize the grouped members of the field.

The following image shows the result of creating the group: the sum of sales is visualized
for furniture and office supplies combined.

Fig 9.4 Creating groups in Tableau
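
What the group does to the aggregation can be sketched in a few lines of Python. This is an
illustration only: the sales figures and the group label are invented, and the code is not
Tableau's implementation.

```python
# Hypothetical (category, sales) rows; the amounts are invented.
sales_rows = [
    ("Furniture", 120.0),
    ("Office Supplies", 80.0),
    ("Technology", 200.0),
    ("Furniture", 30.0),
]

# The group maps several members to one higher-level member.
# The label "Furniture & Office Supplies" is an assumed name for the group.
category_group = {
    "Furniture": "Furniture & Office Supplies",
    "Office Supplies": "Furniture & Office Supplies",
}


def sum_by_group(rows, mapping):
    """Sum sales per group; members that are not grouped keep their own name."""
    totals = {}
    for category, amount in rows:
        label = mapping.get(category, category)
        totals[label] = totals.get(label, 0.0) + amount
    return totals


print(sum_by_group(sales_rows, category_group))
```

The original rows are left unchanged, which corresponds to the statement that Tableau does
not alter the original dimension.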

SELF-ASSESSMENT QUESTIONS – 1

1. Data Values available for the visualization______________


2. ______________is not a Trend Line model


Answers
1. Table calculation
2. Binomial trend line

3. CALCULATED FIELDS AND PARAMETERS


Parameters are dynamic values that replace constant values in calculations, filters, or
reference lines. Consider a case where you want to generate a report each month that shows
the number of employees earning more than INR 1,00,000 per month. With a parameter, the
salary threshold can be changed without editing the calculation itself.

Step 1: In the Data pane, right-click a field in which you want to create a parameter and then
select Create -> Parameter.

Step 2: Give the field a name and provide an optional comment so as to explain your
parameter

Step 3: Provide the type of data that it accepts

Step 4: Provide a current value, which will also be the default value of the parameter

Step 5: Provide a specific display format on the parameter control

Step 6: Based on the option selected for ‘Allowable Values’, you must provide the values to
the parameter defined

Step 7: When all the above steps are completed, click OK to complete the process. The newly
created parameter will be listed on the Parameters section (i.e., the bottom of the Data pane).
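
The idea behind a parameter can be sketched in Python as a function argument with a default
(current) value. The employee names and salaries here are invented, and `salary_threshold`
is a hypothetical name standing in for the parameter created in the steps above.

```python
# Hypothetical (employee, monthly salary in INR) records.
employees = [("Asha", 95000), ("Ravi", 120000), ("Meera", 150000)]


def count_above(staff, salary_threshold=100000):
    """Count employees earning more than the threshold.

    salary_threshold plays the role of the parameter: it has a default
    (current) value and can be changed without touching the calculation.
    """
    return sum(1 for _name, salary in staff if salary > salary_threshold)


print(count_above(employees))          # uses the default value of the parameter
print(count_above(employees, 130000))  # same calculation, different parameter value
```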

Using parameters in calculations is as simple as dragging them from the Data pane and
dropping them on the calculation editor (either replacing a particular part of the formula or
at a new location). To create a second parameter with the same configuration as the first,
right-click the first parameter and click Duplicate, then rename the copy. In this example,
the two parameters are named Placeholder 1 Selector and Placeholder 2 Selector. From the
Analysis menu, select Create Calculated Field to open the calculation editor, define the
calculated field that uses the first parameter, and click OK to close the editor. Then create
a duplicate calculated field with the same configuration for the second parameter.


Fig 9.5 Creating calculated fields and parameters in Tableau

After the previous stages, it is simple to set up the view, as we can drag and drop these
calculated fields onto the shelves. Drag Placeholder 2 to Columns and Placeholder 1 to Rows.
Drag the Customer Name field to Detail and the Region field to Color. Right-click the first
parameter in the Parameters section of the Data pane and select "Show Parameter Control",
then do the same for the other parameter. The Tableau Desktop view is now ready. Using the
parameter controls, you can select the data that will be displayed on the X and Y axes, and
explore different combinations of the values offered by the parameters.
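
The placeholder technique can be sketched in Python as follows. This is a minimal
illustration under assumptions: the measures Sales, Profit, and Quantity, the customer rows,
and the function names are hypothetical, since the text does not list the values offered by
the two selector parameters.

```python
# Hypothetical customer-level data; the measure names are assumptions.
customers = [
    {"name": "Customer A", "region": "East", "Sales": 500.0, "Profit": 50.0, "Quantity": 5},
    {"name": "Customer B", "region": "West", "Sales": 300.0, "Profit": -20.0, "Quantity": 8},
]

ALLOWED_MEASURES = ("Sales", "Profit", "Quantity")


def placeholder(row, selector):
    """Return the measure named by the parameter value.

    This imitates a calculated field that switches between measures
    depending on the current value of a selector parameter.
    """
    if selector not in ALLOWED_MEASURES:
        raise ValueError(f"unknown measure: {selector}")
    return row[selector]


def scatter_points(rows, placeholder_1_selector, placeholder_2_selector):
    """Placeholder 2 is on Columns (x axis) and Placeholder 1 is on Rows (y axis)."""
    return [
        (placeholder(r, placeholder_2_selector), placeholder(r, placeholder_1_selector))
        for r in rows
    ]


print(scatter_points(customers, "Profit", "Sales"))
```

Changing either selector argument changes what is plotted on the corresponding axis without
rebuilding the view, which is the purpose of the parameter controls.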

SELF-ASSESSMENT QUESTIONS – 2

3. How do you find the field is continuous in tableau?


4. What percent of total profits do the top 10 customer by Sales represent?


Answers
3. Green colour
4. 5.03%

4. SETS
There are two types of sets: dynamic sets and fixed sets. The members of a dynamic set
change when the underlying data changes. Dynamic sets can only be based on a single
dimension.

9.4 STEPS INVOLVED IN CREATING DYNAMIC SETS


Step 1: Right-click a dimension in the Data pane and choose Create > Set.

Step 2: Set up your set in the Create Set dialog box. The following tabs can be used to
configure your set.

Step 3: General: Use the General tab to choose one or more values that will be taken into
account when calculating the set.

Step 4: Alternatively, select the Use all option to always take all members into account,
even when members are added or removed.


Fig 9.6 Creating dynamic sets

Step 5: Condition: Use the Condition tab to define rules that determine which members to
include in the set. For instance, specify a condition based on total sales that only includes
products with sales greater than $100,000.


Fig 9.7 Creating dynamic sets

Step 6: Top: Use the Top tab to define limits on what members to include in the set. For
example, specify a limit that is based on total sales that only includes the top 5 products
based on their sales. When finished, click OK.
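
The Condition and Top tabs can be imitated with two small Python functions. The product
names and sales totals are invented, and the function names are hypothetical. Because the
set is recomputed from the data each time, its membership changes when the data changes,
which is what makes it dynamic.

```python
def set_by_condition(totals, minimum):
    """Members whose total sales exceed the minimum (like the Condition tab)."""
    return {name for name, total in totals.items() if total > minimum}


def set_by_top(totals, n):
    """The top n members ranked by total sales (like the Top tab)."""
    ranked = sorted(totals, key=totals.get, reverse=True)
    return set(ranked[:n])


# Hypothetical total sales per product.
product_sales = {"Laptop": 250000, "Phone": 180000, "Tablet": 90000, "Mouse": 15000}

print(set_by_condition(product_sales, 100000))
print(set_by_top(product_sales, 3))

# When the underlying data changes, recomputing the set changes its members.
updated_sales = dict(product_sales, Monitor=120000)
print(set_by_condition(updated_sales, 100000))
```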

4.1 Steps Involved In Creating Fixed Sets


Step 1: In the visualization, select one or more marks (or headers) in the view. Right-click
the mark(s) and select Create Set. In the Create Set dialog box, type a name for the set.


Fig 9.8 Creating fixed sets

Step 2: By default, the set includes the members listed in the dialog box. Alternatively,
select the option to exclude these members, in which case the set contains every member that
is not selected. To remove a dimension that should not be taken into account, click the red
"x" button that appears when you hover over its column heading. To remove an individual row
that should not be part of the set, click the red "x" button that appears when you hover over
that row.


Fig 9.9 Creating fixed sets

Step 3: Finally click OK.

SELF-ASSESSMENT QUESTIONS – 3

5. Power BI is a product of______________


6. Who is the parent company of Tableau?

Answers
5. Microsoft
6. Salesforce

5. TRENDS
Trend lines are used to show whether a certain trend in a variable is likely to continue. By
observing the trend in two variables at once, it is also possible to assess the correlation
between them. There are numerous mathematical techniques for drawing trend lines. Tableau
offers five model types: Linear, Logarithmic, Exponential, Power, and Polynomial. To
construct a trend line, Tableau needs a time dimension and a measure field.

Step 1: Drag the measure Sales to the Rows shelf and the dimension Order Date to the Columns
shelf. Select a line chart as the chart type. Under the Analysis menu, navigate to Trend
Lines. Clicking the Trend Lines option shows the available types of trend line. Select the
linear model, as displayed in the screenshot below.


Fig 9.10 Creating trends

Step 2: After completing the above procedure, the trend lines are displayed. The P-value and
R-squared value are also displayed, along with the mathematical equation for the relationship
between the fields.


Fig 9.11 Creating trends

Step 3: Right-click on the chart and select the option Describe Trend Line to get a detailed
description of the Trend Line chart. It shows the coefficients, intercept value, and the
equation. These details can also be copied to the clipboard and used in further analysis.

Fig 9.12 Creating trends
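
To see where the coefficients, the intercept, and the R-squared value of a linear trend line
come from, the following Python sketch fits a straight line by ordinary least squares. The
monthly sales figures are invented, and the sketch is not Tableau's own implementation; in
particular, it does not compute the P-value.

```python
def linear_trend(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variation in y explained by the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical data: month index and sales.
months = [1, 2, 3, 4, 5]
monthly_sales = [10.0, 12.0, 15.0, 19.0, 24.0]

slope, intercept, r_squared = linear_trend(months, monthly_sales)
print(f"Sales = {slope:.2f} * Month + {intercept:.2f}, R-squared = {r_squared:.3f}")
```

An R-squared value close to 1 indicates that the line describes the data well.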

6. REFERENCE LINES AND FORECAST


A reference line in Tableau is simply a line that gets drawn on a chart that represents another
measure or point of reference. Tableau Reference lines can be useful in providing context to
the related chart. For example, a line showing the median will visually show the difference
of each mark in the chart relative to the median.

Step 1: From Dimensions, drag Year to Rows.

Step 2: From Dimensions, drag Genre to Rows to the right of the Year pill.

Step 3: From Measures, drag Worldwide Gross Amount to Columns.

Step 4: Hover over the Worldwide Gross Amount axis, and click on the sort icon once to sort
the bars in descending order.


Step 5: On the sidebar, click on the Analytics tab to activate it.

Step 6: Under Custom, drag Reference Line from the Analytics tab and drop it onto the Table
placeholder.

Step 7: Set the following for this reference line:

• Set Scope to Entire Table
• Under Line, set Value to SUM(Worldwide Gross Amount) and the aggregation to Average
• Under Line, set Label to Custom with Average <Value> as the text; note that you can use
the > button to insert values
• Under Formatting, set Line to a thick, dark red line
• Under Formatting, set Fill Above to None
• Under Formatting, set Fill Below to None

Fig 9.13 Reference lines and forecast

Step 8: Click OK when done.

Step 9: Under Custom, drag Reference Line again from the Analytics tab and this time drop
it onto Pane.
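
The difference between the two scopes can be sketched in Python. The gross amounts are
invented, and the sketch assumes that each Year forms one pane because Year is the outer
field on Rows; it only shows which values each average is computed over.

```python
# Hypothetical SUM(Worldwide Gross Amount) per (year, genre) bar.
gross = {
    (2010, "Action"): 300.0,
    (2010, "Comedy"): 100.0,
    (2011, "Action"): 500.0,
    (2011, "Comedy"): 300.0,
}


def table_average(values):
    """Scope = Entire Table: one average over every bar in the view."""
    return sum(values.values()) / len(values)


def pane_averages(values):
    """Scope = Per Pane: one average per year (assumed to be one pane each)."""
    by_year = {}
    for (year, _genre), amount in values.items():
        by_year.setdefault(year, []).append(amount)
    return {year: sum(amounts) / len(amounts) for year, amounts in by_year.items()}


print(table_average(gross))   # a single reference line across the whole table
print(pane_averages(gross))   # a separate reference line in each pane
```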


SELF-ASSESSMENT QUESTIONS – 4

7. By default, Tableau displays measures over time as a ____________.


8. Tableau File Extension is_______________

ANSWERS
7. Line
8. .twbx

7. CREATING CALCULATED FIELDS


Step 1: Create the calculated field
1. In a worksheet in Tableau, select Analysis > Create Calculated Field.
2. In the Calculation Editor that opens, give the calculated field a name.

In this example, the calculated field is called Profit Ratio.

Step 2: Enter a formula


1. In the Calculation Editor, enter a formula.

This example uses the following formula:

SUM([Profit])/SUM([Sales])

Formulas use a combination of functions, fields, and operators. When finished, click OK.

The new calculated field is added to the Data pane. If the new field computes quantitative
data, it is added to Measures. If it computes qualitative data, it is added to Dimensions.

You are now ready to use the calculated field in the view.
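
The formula SUM([Profit])/SUM([Sales]) divides the aggregated profit by the aggregated
sales. The Python sketch below, which uses two invented orders, shows that this is not the
same as averaging the ratio of each row. The function names are hypothetical.

```python
# Hypothetical order rows.
orders = [
    {"profit": 10.0, "sales": 100.0},
    {"profit": 90.0, "sales": 300.0},
]


def profit_ratio(rows):
    """Equivalent of SUM([Profit]) / SUM([Sales]): aggregate first, then divide."""
    return sum(r["profit"] for r in rows) / sum(r["sales"] for r in rows)


def mean_of_row_ratios(rows):
    """Divide row by row, then average: a different quantity."""
    return sum(r["profit"] / r["sales"] for r in rows) / len(rows)


print(profit_ratio(orders))        # 100 / 400
print(mean_of_row_ratios(orders))  # average of 10/100 and 90/300
```

The aggregated version weights each order by its sales, which is usually what a profit ratio
for a category or region is meant to express.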


Fig 9.14 Creating calculated fields

SELF-ASSESSMENT QUESTIONS – 5

9. The icon associated with the field that has been grouped is a ________.
10. Tableau was introduced in the year of________

ANSWERS
9. paper clip
10. 2003


8. TABLE CALCULATIONS
Quick table calculations allow you to quickly apply a common table calculation to your
visualization using the most typical settings for that calculation type. This section
demonstrates how to apply a quick table calculation to a visualization using an example.

The following quick table calculations are available in Tableau for you to use:
• Running total
• Difference
• Percent difference
• Percent of total
• Rank
• Percentile
• Moving average
• YTD total
• Compound growth rate
• Year over year growth
• YTD growth
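
A few of these calculations can be sketched in Python to show what they compute over the
values already in the view. The sales values are invented, and the moving average window
(the previous two values plus the current one) is an assumption made for this sketch.

```python
def running_total(values):
    """Cumulative sum along the table."""
    totals, current = [], 0.0
    for value in values:
        current += value
        totals.append(current)
    return totals


def difference(values):
    """Change from the previous value; the first mark has no previous value."""
    return [None] + [b - a for a, b in zip(values, values[1:])]


def percent_of_total(values):
    """Each value as a fraction of the sum of all values."""
    total = sum(values)
    return [value / total for value in values]


def moving_average(values, previous=2):
    """Average of the current value and up to `previous` values before it."""
    averages = []
    for i in range(len(values)):
        window = values[max(0, i - previous): i + 1]
        averages.append(sum(window) / len(window))
    return averages


yearly_sales = [100.0, 200.0, 300.0, 400.0]  # hypothetical SUM(Sales) per year
print(running_total(yearly_sales))
print(difference(yearly_sales))
print(percent_of_total(yearly_sales))
print(moving_average(yearly_sales))
```

All four functions work only on the list of values passed in, just as a table calculation
works only on the values in the view.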

Step 1: Open Tableau Desktop and connect to the Sample-Superstore data source, which
comes with Tableau.

Step 2: Navigate to a new worksheet.

Step 3: From the Data pane, under Dimensions, drag Order Date to the Columns shelf.

Step 4: From the Data pane, under Dimensions, drag State to the Rows shelf.

Step 5: From the Data pane, under Measures, drag Sales to Text on the Marks Card.

Step 6: From the Data pane, under Measures, drag Profit to Color on the Marks Card.

Step 7: On the Marks card, click the Mark Type drop-down and select Square.


Fig 9.15 Creating table calculations

SELF-ASSESSMENT QUESTIONS – 6

11. Sets can be created on Measures. ________


12. For creating variable size bins we use _________

ANSWERS
11. True
12. Calculated Fields

TERMINAL QUESTIONS
1. What is a Calculated Field, and How Will You Create One?
A calculated field is used to create new (modified) fields from existing data in the data source.
It can be used to create more robust visualizations and doesn’t affect the original dataset.

For example, let’s calculate the “average delay to ship.”

The data set considered here has information regarding order date and ship date for four
different regions. To create a calculated field:


1. Go to Analysis and select Create Calculated Field.
2. A calculation editor pops up on the screen. Provide a name for the calculated field:
Shipping Delay.
3. Enter the formula: DATEDIFF('day', [Order Date], [Ship Date])
4. Click on OK.
5. Bring Shipping Delay to the view.
6. Repeat steps 1 to 5 to create a new calculated field ‘Average Shipping Delay’ using the
formula: AVG(DATEDIFF('day', [Order Date], [Ship Date]))
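
The two calculated fields can be mirrored in Python with the standard datetime module. The
order and ship dates are invented, and the function names are hypothetical; the sketch only
shows the arithmetic behind the day difference and its average.

```python
from datetime import date

# Hypothetical (order date, ship date) pairs.
shipments = [
    (date(2023, 1, 1), date(2023, 1, 4)),
    (date(2023, 1, 10), date(2023, 1, 15)),
]


def shipping_delay(order_date, ship_date):
    """Number of days between order and shipment, like a day-level date difference."""
    return (ship_date - order_date).days


def average_shipping_delay(rows):
    """Average of the row-level delays, like AVG over the date difference."""
    delays = [shipping_delay(order_date, ship_date) for order_date, ship_date in rows]
    return sum(delays) / len(delays)


print([shipping_delay(o, s) for o, s in shipments])
print(average_shipping_delay(shipments))
```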
2. Is There a Difference Between Sets and Groups in Tableau?
A Tableau group is one-dimensional: it creates a higher-level category from lower-level
category members. Tableau sets can be defined by conditions and can combine members across
multiple dimensions and measures.

Example: Sub-categories can be grouped by category.

Top sales and profit can be clubbed together for different categories by creating a set:
1. Continuing with the above example of Sets, select the Bottom Customers set where
customer names are arranged based on profit.
2. Go to the ‘Groups’ tab and select the top five entries from the list.
3. Right-click and select the create group option.
4. Similarly, select the bottom five entries and create their group. Hide all the other
entries.
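3. What is a Parameter in Tableau? Give an Example.
A parameter is a dynamic value that a user can select, and it can replace constant values in
calculations, filters, and reference lines. For example, a parameter can hold the salary
threshold in a report that counts the employees earning more than a chosen amount per month.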

4. How Can You Schedule a Workbook in Tableau after Publishing It?


1. When you’re signed in to Tableau Server, go to Content > Data Sources or Content >
Workbooks, depending on the type of content you want to refresh.
2. Select the checkbox for the data source or workbook you want to refresh, and then
select Actions > Extract Refresh.
3. In the Refresh Extracts dialog, select Schedule a Refresh, and complete the following
steps:


1. Select the schedule you want.
2. If available, specify whether you want a full or incremental refresh.

4. What Are the Different Types of Tableau?
The different types of Tableau are Desktop, Prep, Online, and Server.

5. How do I make sets in Tableau?

Sets are custom fields that let you compare and ask questions about a subset of data. To make
a set on a dimension, right-click the dimension in the Data pane and choose Create > Set. On
the General tab, select the members used to define the set. On the Condition tab, you can set
the conditions for membership. On the Top tab, you can choose the top N members based on any
field. When a set is used, the data is split into two parts, "in" and "out" of the set, based
on the user's conditions.

6. How do I use groups in calculated fields in Tableau?


You can make a group by right-clicking on a field in the data pane and choosing "Create" >
"Group." Then, you can select the fields you want to group under the "General" tab and set
the criteria for grouping under the "Conditions" tab. Then, right-click on this group and
choose "create," "set," and "create a calculated field." You can then use this group as a set in
this calculated field.

7. What does "view in tableau" mean?


The term "view" refers to how data from a source is shown in a worksheet. A view can be a
plot, a chart, a graph, or even a table. These views are then put together on a dashboard to
tell a single story and show how they all fit together.


9. GLOSSARY
Let us have an overview of the important terms mentioned in the unit:
Group: A group is used to combine members in a field.

Sets: Sets are custom fields that define a subset of data based on some conditions.

10. CASELET
Lenovo designs, develops, manufactures, and markets products such as PCs, laptops, tablets,
mobile phones, and servers. Today, Lenovo operates in 160 countries.

The challenge
Creating reports in Excel was tiresome and required a team of 8 to 10 people to adapt them
for other divisions and regions. In addition, the analytics team spent six to seven hours
creating one weekly report, so creating 30 or more reports was an overwhelming task.

With Tableau, much less time is spent creating reports than when they were created manually.
Teams were able to deliver reports much faster, sometimes even reporting on a daily or hourly
basis.

The time saved is used in carrying out analysis and drawing insights from the information.

Implementation
Initially, Lenovo deployed an eight-core instance of Tableau Server which they quickly scaled
up to a 16-core server. The executives introduced Tableau in Lenovo India as a means to
analyze and govern data on a selected set of business use cases and scenarios to help in
decision making. But sooner than they realized, Tableau became an integral part of the
company’s functioning. The company experienced a cultural shift where the approach to
business and growth was more data-centric and data-driven.

The change
• Lenovo India’s BI Analytics & Visualization team created an interactive and flexible
Tableau sales dashboard for departments to use it for ad-hoc analysis and reporting.


• Lenovo has over 55,000 employees and a customer base spread across 160 plus
countries. More than 10,000 users access Tableau dashboards.
• Tableau has increased Lenovo’s efficiency by 95%, with approximately 3000 users using it
in about 28 countries so far.
• Lenovo’s e-commerce team was able to analyze customer engagement patterns to
improve brand perception and increase revenues.
• The human resource department converted 100 static reports into dynamic and
interactive Tableau dashboards which gave users and analysts a new perspective into
solving matters.
• The team is easily able to connect to data sources like Amazon Web Services and
Hortonworks Hadoop Hive. Along with this, a wholesome analysis is possible through
a dashboard where data is integrated from more than 30 data sources such as social
media, customer surveys, retailer websites, online shopping sites, etc.
• Lenovo supports self-service analytics where every user can conduct an individual
analysis on the set of data concerning their domain of activity and suiting their site
roles. Every Tableau user has identification credentials stored in the local identity store
of Tableau using which they can access and work on Tableau dashboards using single
sign-on process.
• Lenovo experienced lucrative growth in e-commerce by using Tableau to analyze
customer experience with data fetched from Lenovo’s unified customer intelligence
platform, LUCI Sky.



11. CONCEPTUAL MAP


Unit 10
Level of Detail (LOD)

Table of Contents
1 Introduction
1.1 Learning Objectives
2 Aggregating dimensionalities
3 Steps to calculate aggregating dimensionalities
4 INCLUDE level of detail expressions
5 Steps to calculate level of detail expression
6 EXCLUDE level of detail expression
7 Nested LOD
8 Summary
9 Glossary
10 Caselet
11 Conceptual Map


1. INTRODUCTION
INCLUDE level of detail expressions compute values using the dimensions specified in the
formula in addition to whatever dimensions are already present in the view. They are useful
when you want to calculate at a finer level of detail in the database and then re-aggregate
and show the result at a coarser level in your view. Because they depend on the view, adding
or removing dimensions from the view changes the values of fields based on INCLUDE level of
detail expressions.

1.1 Learning Objectives


After studying this unit, you will be able to:

❖ Explain features of Tableau
❖ Explain INCLUDE level of detail expressions
❖ Describe the steps to calculate INCLUDE level of detail expressions


2. AGGREGATING DIMENSIONALITIES
This section describes how to aggregate at dimensionalities other than the view level. Table
calculations can be used to roll data up to a higher level of aggregation, but this approach
is a little long-winded and also slows down performance. Additionally, table calculations are
limited to the values in the table or view. Level of detail expressions can be helpful in this
situation. LOD, or level of detail, is a Tableau capability that can calculate values for
dimensions that do not appear in the view or table, using simple formulas.

A table calculation is generated exclusively from the result of a query, whereas an LOD
expression is generated as part of the query sent to the database. How an LOD expression
works in Tableau therefore depends on the context. That context is determined by the filters
and by the level of detail in the view, that is, the dimensions on Rows, Columns, Color, Size,
Detail, and so on. An example would be State and the sum of Sales. If State is dropped onto a
view, the sum of sales gives the total of all transactions for each state. Now add Department
to this view. The sum of sales then gives the sales for each state by department. The more
dimensions there are in the view, the more granular and less aggregated the results are, so
the level of detail of the visualization varies with the dimensions present. However,
dimensions placed on the Filters or Pages shelf do not change the level of detail of the view;
they only modify the data. An example of how an LOD expression can help is illustrated below.
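
How the dimensions in the view set the level of detail can be sketched in Python. The
transactions and the function name are invented for illustration; the same rows are summed
at three different levels of detail.

```python
# Hypothetical transactions: (state, department, sales).
transactions = [
    ("California", "Furniture", 100.0),
    ("California", "Technology", 200.0),
    ("Texas", "Furniture", 50.0),
    ("Texas", "Furniture", 25.0),
]


def sum_sales(rows, dimensions):
    """Sum sales grouped by the given column indexes (the dimensions in the view).

    More dimensions give more groups, so the result is more granular.
    """
    totals = {}
    for row in rows:
        key = tuple(row[i] for i in dimensions)
        totals[key] = totals.get(key, 0.0) + row[2]
    return totals


print(sum_sales(transactions, ()))      # no dimension: one overall total
print(sum_sales(transactions, (0,)))    # State only
print(sum_sales(transactions, (0, 1)))  # State and Department
```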


3. STEPS TO CALCULATE AGGREGATING DIMENSIONALITIES


Suppose we want to show the total sales value for all states against each state's own value,
in order to compare each state's total with the overall sales. The two values are at
different levels of granularity. Table calculations can be used to display them, but let us
look at how LOD expressions can help here. A simple expression has been created for this
purpose.

Fig 10.1 total sales value for all states against each of the state values

Step 1: Create an LOD expression that sums up the sales irrespective of the level of detail in
the view, which in this case is State, and replicates the result for every state in the view. The
overall sales therefore appear against each state's sales value. Now consider a different
scenario where we want to see the maximum yearly sales for each state. In this case,
the year is not present in the view, so we cannot use a table calculation for this purpose, but
we can still use an LOD. A new LOD has been defined for this case.
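
The first scenario in Step 1, the overall sales repeated against each state, can be sketched in Python as follows. In Tableau this would typically be a table-scoped expression along the lines of `{ FIXED : SUM([Sales]) }`; the field name and the sample figures here are assumptions.

```python
# Hypothetical sales already aggregated to the view level (one value per state).
sales_by_state = {"California": 500.0, "Texas": 300.0, "Florida": 200.0}

# Equivalent of a table-scoped FIXED expression: one total, independent of State.
overall_sales = sum(sales_by_state.values())

# The overall total is replicated on every state row, which makes
# comparisons such as "share of overall sales" straightforward.
table = [
    (state, sales, overall_sales, sales / overall_sales)
    for state, sales in sales_by_state.items()
]

for state, sales, overall, share in table:
    print(f"{state:<12}{sales:>8.0f}{overall:>8.0f}{share:>8.0%}")
```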

Step 2: Calculate the maximum yearly sales by state. Go ahead and drag this to the table.


Fig 10.2 maximum yearly sales by state

Step 3: The maximum yearly sales are shown on each state row. Although the year
is not present in this table or view, the level of detail in the calculation is finer than that
of the view. Tableau therefore aggregates the results as needed and displays the
maximum yearly sales as a single value for each state. Having created these level of detail
expressions, we now look at the options available within Tableau for writing them. There are
three options: FIXED, INCLUDE and EXCLUDE.
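
The maximum yearly sales calculation can be sketched in Python as a two-stage aggregation: first sum the sales per state and year, then take the maximum per state. One plausible Tableau form is `MAX({ INCLUDE YEAR([Order Date]) : SUM([Sales]) })`; that expression, the field names and the sample rows are assumptions rather than the exact calculation used in the figure.

```python
from collections import defaultdict

# Hypothetical order rows: (state, year, sales)
orders = [
    ("Texas", 2014, 100.0),
    ("Texas", 2014, 50.0),
    ("Texas", 2015, 120.0),
    ("Ohio", 2014, 70.0),
    ("Ohio", 2015, 90.0),
]

# Stage 1: aggregate at the finer level (state, year), even though
# year is not part of the view.
yearly_sales = defaultdict(float)
for state, year, sales in orders:
    yearly_sales[(state, year)] += sales

# Stage 2: roll the yearly totals back up to the view level (state)
# by keeping the largest yearly total for each state.
max_yearly_sales = {}
for (state, _year), total in yearly_sales.items():
    max_yearly_sales[state] = max(max_yearly_sales.get(state, total), total)

print(max_yearly_sales)
```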

Step 4: All three options are used to alter the scope of the expression. It helps to
understand the syntax of these LOD expressions. First comes
the scope, which can be FIXED, INCLUDE or EXCLUDE. Then come the dimensions, followed by
the aggregate expression for the field. The dimensions are what the LOD acts upon, and the
aggregate expression is the calculation that is required, such as the minimum of sales or
the sum of sales.


Step 5: The whole expression must be enclosed within braces, that is,
curly brackets. Multiple dimensions are separated by commas to aggregate at
multiple levels of detail. FIXED computes the value using the specified dimensions without
reference to any other dimensions in the view.
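
To make the syntax concrete, here is a small Python helper that assembles the text of an LOD expression from its three parts. The helper name `lod_expression` and the field names are hypothetical; it only builds a string in the scope, dimensions, aggregate order described above and is not part of Tableau.

```python
VALID_SCOPES = {"FIXED", "INCLUDE", "EXCLUDE"}


def lod_expression(scope, dimensions, aggregate):
    """Build LOD expression text: { SCOPE [dim1], [dim2] : AGGREGATE }."""
    scope = scope.upper()
    if scope not in VALID_SCOPES:
        raise ValueError(f"scope must be one of {sorted(VALID_SCOPES)}")
    # Multiple dimensions are separated by commas.
    dims = ", ".join(f"[{name}]" for name in dimensions)
    # The whole expression is wrapped in curly braces.
    return f"{{ {scope} {dims} : {aggregate} }}"


expression = lod_expression("fixed", ["State", "Category"], "SUM([Sales])")
print(expression)
```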

Fig 10.3 Syntax of LOD

Imagine a scenario where we have all our customers, the dates on which they were acquired and
their sales amounts, and we would like to see whether there is a relationship between the time
when a customer was acquired and that customer's contribution to sales. For this purpose we can
use the FIXED scope to define the LOD expression. Right-click, choose Create Calculated Field and
name the field Acquisition Date.


Fig 10.4 Acquisition date for years

Step 1: Define a level of detail expression that calculates the minimum order date for each
customer ID, then click OK. Now build the visualization. Drag and drop the order
year onto the Columns shelf.

Step 2: Drag and drop Sales onto the Rows shelf. Next, drag the customer acquisition
date to Color and change the mark type to Bar.

Step 3: This type of cohort analysis shows which customer groups, or cohorts, have
made the larger contribution to sales. Most of our purchases are repeat purchases by
customers who were acquired in 2014. A few more customers were acquired in 2015,
and they contributed to purchases again in 2016. However, the number of
customers acquired in 2015 is not as high as in 2014, and the same holds for 2016. With this
level of detail expression, it is clear that most of the purchases are
made by customers who were acquired in 2014.
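
The cohort calculation can be sketched in Python. The acquisition date corresponds to a Tableau expression of the form `{ FIXED [Customer ID] : MIN([Order Date]) }`; the field names and the sample orders below are assumptions for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical orders: (customer_id, order_date, sales)
orders = [
    ("C1", date(2014, 3, 1), 100.0),
    ("C1", date(2015, 6, 1), 60.0),
    ("C2", date(2015, 2, 1), 40.0),
    ("C1", date(2016, 1, 5), 30.0),
    ("C2", date(2016, 4, 9), 20.0),
]

# FIXED on customer: the earliest order date per customer,
# regardless of which dimensions are in the view.
acquired = {}
for customer, order_date, _sales in orders:
    acquired[customer] = min(acquired.get(customer, order_date), order_date)

# View: order year on columns, acquisition year on color, SUM(Sales) on rows.
cohort_sales = defaultdict(float)
for customer, order_date, sales in orders:
    cohort_sales[(order_date.year, acquired[customer].year)] += sales

for (order_year, cohort_year), sales in sorted(cohort_sales.items()):
    print(order_year, cohort_year, sales)
```

Each bar segment in the chart corresponds to one (order year, acquisition year) pair in `cohort_sales`.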

Step 4: The sales contribution made by customers acquired in 2015 is not as high as that of
customers acquired in 2014. Similarly, in 2016 the repeat purchases are again higher for the 2014
customers. This kind of analysis is especially useful when we are trying to find out whether
there is a correlation between when a customer was acquired and that customer's purchase
patterns.

Now that we have seen how these level of detail scopes work and
when they can be used, it is also important to understand the order in which Tableau executes
them, so that we know which one to use when. FIXED level of detail
expressions are computed after context filters and before dimension filters,
whereas INCLUDE and EXCLUDE level of detail expressions are computed after dimension filters and
before measure filters. So, when using FIXED level of detail expressions, remember to
turn any dimension filter that should not be ignored into a context filter.
If we prefer not to use context filters, the
expressions need to be rewritten using the EXCLUDE or INCLUDE keyword.
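
The effect of this ordering can be sketched in Python. The `run_pipeline` function below is a simplified model written for this unit, not Tableau's actual engine: it applies context filters, computes a FIXED per-state total, and only then applies dimension filters. The rows and filter functions are made up.

```python
from collections import defaultdict

# Hypothetical rows: (state, department, sales)
ROWS = [
    ("Texas", "Men", 100.0),
    ("Texas", "Women", 150.0),
    ("Ohio", "Men", 80.0),
]


def run_pipeline(rows, context_filter, dimension_filter):
    """Return {state: (sales visible in the view, FIXED per-state total)}."""
    in_context = [r for r in rows if context_filter(r)]        # 1. context filters

    fixed_total = defaultdict(float)                           # 2. FIXED LOD
    for state, _dept, sales in in_context:
        fixed_total[state] += sales

    visible = [r for r in in_context if dimension_filter(r)]   # 3. dimension filters

    view_total = defaultdict(float)                            # 4. view aggregation
    for state, _dept, sales in visible:
        view_total[state] += sales
    return {state: (view_total[state], fixed_total[state]) for state in view_total}


def is_men(row):
    return row[1] == "Men"


def keep_all(row):
    return True


# Department = Men as an ordinary dimension filter: FIXED ignores it.
as_dimension_filter = run_pipeline(ROWS, keep_all, is_men)
# The same filter added to context: FIXED respects it.
as_context_filter = run_pipeline(ROWS, is_men, keep_all)

print(as_dimension_filter)
print(as_context_filter)
```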

SELF-ASSESSMENT QUESTIONS – 1

1. For creating a stemplot we use__________________.


4. INCLUDE LEVEL OF DETAIL EXPRESSIONS


A Level of Detail expression (also called an LOD expression) lets you compute values at both the
data source level and the visualization level. In addition, LOD expressions give you greater
control over the level of granularity at which a value is computed. They can be computed at a
more granular level (INCLUDE), a less granular level (EXCLUDE), or an entirely independent level
(FIXED).

5. STEPS TO CALCULATE LEVEL OF DETAIL EXPRESSION

Fig 10.5 Sales per state of the year 2014


Fig 10.6 Sales per state of the year 2014 as Excel data

The INCLUDE level of detail keyword creates an expression
that is less aggregated and more granular by adding the dimension specified in the
expression to the visualization level of detail. The requirement here is to display the average
sales order amount per state or region. In Figure 10.5, we have a map that shows the sales per
state for the year 2014. The amount shows up as 7699 for Connecticut.
Let us verify whether this value is correct using the same data in an Excel sheet. The steps
required to create and use an LOD expression in Tableau are listed below.

Step 1: In Figure 10.6, taking the total sales amount per order ID for Connecticut and then
averaging over the orders gives 16,525. By default, however, the calculation is computed at the
dimensionality defined by the dimensions present in the view. So the value shown on the map is
actually the average sales over all the line items or rows belonging to the state, since State is
the dimension in the view.

Step 2: The granularity of the data is the product ID and not the order. The average value for
the data provided in Figure 10.6 is 7699 because the average is computed at the product ID level
for the state. The requirement, however, is for the sales amount to be aggregated up to the
order level for the state and then averaged over the order IDs belonging to the state. An
INCLUDE level of detail expression can help in this situation.

Step 3: The next step is to define the INCLUDE level of detail expression. Select Create
Calculated Field from the right-click menu and name it Average Sales Order.

Fig 10.7 Average sales order

Step 4: The level of detail expression computes the sum of sales per order ID, and the
average of these per-order values is then taken. Finally, click OK.
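
The difference between the two averages can be reproduced with a small Python sketch. The expression being imitated would look something like `AVG({ INCLUDE [Order ID] : SUM([Sales]) })` in Tableau; the field names and the three line items are assumptions, chosen so that the two averages differ.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical line items for one state: (order_id, sales)
line_items = [("O1", 100.0), ("O1", 300.0), ("O2", 200.0)]

# Plain AVG(Sales) at the state level averages over line items.
average_per_line_item = mean([sales for _order_id, sales in line_items])

# INCLUDE Order ID: first sum the sales within each order...
order_totals = defaultdict(float)
for order_id, sales in line_items:
    order_totals[order_id] += sales

# ...then average those order totals back up to the state level.
average_per_order = mean(list(order_totals.values()))

print(average_per_line_item, average_per_order)
```

The two results differ whenever an order has more than one line item, which is the same effect that separates 7699 from 16,525 in the Connecticut example.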

Step 5: Returning to the view, add Average Sales Order, the newly created level of
detail expression, alongside the original sales figure from the map.

Fig 10.7 Average Sales per state for the year 2014


Step 6: Now change the aggregation of this field to Average.

Step 7: On examining the average sales order amount for Connecticut, the
displayed value is 16,525, as expected.

We have now built two views, each with
an average calculated at a different level of detail. The second view, which uses the level of
detail expression, now has the correct value, aggregated first at the state and order ID level,
although the view itself is still at the state level. The level of detail expression thus extends
Tableau's calculation language by introducing the capability to define the level at which
aggregations should happen.

SELF-ASSESSMENT QUESTIONS – 2

2. The icon associated with the field that has been grouped is a ______________
3. It is possible to disable the highlight option for the entire workbook. Yes/No


6. EXCLUDE LEVEL OF DETAIL EXPRESSION


EXCLUDE level of detail expressions declare dimensions that need to be omitted from the view
level of detail. These expressions are useful for calculations that are done at a less granular
level than the view. They can also be used to remove a dimension from
another level of detail expression. Consider a scenario where we want to do a comparative
analysis to understand how each state is performing compared to a particular reference state.
First, create a parameter so that the choice of reference state is
dynamic. To create a parameter, right-click on State and choose Create Parameter. Retain
the default values and click OK.

STEP 1: A State parameter has now been created. Right-click on it and select Show
Parameter Control. The State parameter control is available on the right, so the state can be
chosen dynamically as we navigate through the visualization. Next, define a calculated field to
compute the sales for the chosen state, which is needed for the comparative analysis:
right-click on State and choose Create Calculated Field.

Fig 10.8 Calculating the state parameter


STEP 2: Name this calculated field Reference State.

STEP 3: Define the condition as follows: if State is equal to the chosen parameter, then
return the sales value; else return zero. Close the condition with END.

Fig 10.9 Calculate the sales value

STEP 4: This field now holds the sales value for the state chosen in the
parameter. Click OK. Next, define the sales for the reference state using an EXCLUDE LOD so that
State, the dimension on the rows, is excluded from the sales total. Once this is done, this value
is repeated across all states, and we can then easily calculate the difference between
each state's value and that of the state chosen from the parameter dropdown. Right-click on
State and choose Create Calculated Field. Name this calculated field Exclude State Value.


Fig 10.10 Define sales for the state using Exclude LOD

STEP 5: Define the LOD expression so that it excludes State and sums up the sales value of the
reference state, that is, the state chosen in the parameter.

STEP 6: After defining the EXCLUDE expression, click OK. Now compute the difference between
each state's sales and the reference state's sales value. Right-click on Sales and choose
Create Calculated Field. Name it Difference in Sales and define it as the sum of Sales minus the
sum of the Exclude State Value. Finally, build a view with the state, the sales and the
difference amount for the comparison analysis.

Fig 10.11 Graphical representation of each state with the sales amount

STEP 7: Filter for the year 2015 and click Apply.


STEP 8: Drag state and the sales value initially.

STEP 9: Track the difference in Sales amount.

STEP 10: Color these difference amounts.

STEP 11: The resulting comparative analysis graph shows that California and
Florida are doing well compared to Texas, while all other states are lagging behind.

Fig 10.12 comparative analysis

STEP 12: The exclude expression can be used to repeat a value across all the states so that
we can easily calculate the difference between each state and the state chosen from the
parameter dropdown.
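
The steps above can be sketched in Python. The sketch assumes the EXCLUDE expression has roughly the form `{ EXCLUDE [State] : SUM([Reference State]) }`; the field names, the reference state and the sales figures are illustrative assumptions.

```python
# Hypothetical sales per state (the view level of detail).
sales_by_state = {
    "California": 500.0,
    "Florida": 350.0,
    "Texas": 300.0,
    "Ohio": 120.0,
}

reference_state = "Texas"  # value chosen in the parameter control

# Conditional field: the sales value for the reference state, zero elsewhere.
reference_value = {
    state: (sales if state == reference_state else 0.0)
    for state, sales in sales_by_state.items()
}

# EXCLUDE State: sum the conditional field across all states, so the
# same reference total is available on every state row.
reference_sales = sum(reference_value.values())

# Difference in sales between each state and the reference state.
difference = {
    state: sales - reference_sales for state, sales in sales_by_state.items()
}

for state, diff in difference.items():
    print(f"{state:<12}{diff:>8.0f}")
```

Without the EXCLUDE step the conditional field would be zero on every row except the reference state, so the subtraction would not be meaningful for the other states.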

SELF-ASSESSMENT QUESTIONS – 3

4. Effective tables and charts for data visualization can be designed using _____________.


7. NESTED LOD
If our business requirement is more complicated, we may need more than one layer of
level of detail calculation. We could start with the visualization level of detail, then
have an inner part that uses an INCLUDE expression to produce a more granular result.
This could then be wrapped in an EXCLUDE or a FIXED expression so that the inner result is
aggregated back to the outer level of detail. Finally, the calculation level of detail is
resolved to match the level of detail of the visualization. Level of detail
expressions of this kind are commonly referred to as nested level of detail expressions.

A business is interested in seeing how many orders per state end up being unprofitable.

Step 1: Create a nested level of detail expression that first calculates how many orders are not
profitable using the INCLUDE keyword. Right-click and choose Create Calculated Field. Name the
field Number of Unprofitable Orders.

Step 2: Write an expression that looks for negative values and uses the INT function to convert
the false values to zero and the true values to one. Use the FIXED keyword to
calculate the orders at the same level of detail as State, which is specified here. Click OK.
The number of unprofitable orders is now defined.

Fig 10.13 Number of unprofitable orders

Step 3: Create one more calculation for the percentage of unprofitable orders.


Fig 10.14 Percentage of unprofitable orders

Step 4: Divide the total number of unprofitable orders by the number of order IDs. Click
OK. Choose Percentage from the default properties for this field.
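
The nested calculation can be sketched in Python as an inner aggregation per order followed by an outer aggregation per state. The sample line items and the assumption that an order is unprofitable when its summed profit is negative are illustrative; the exact Tableau expression in Fig 10.13 is not reproduced here.

```python
from collections import defaultdict

# Hypothetical line items: (state, order_id, profit)
line_items = [
    ("California", "O1", -5.0),
    ("California", "O1", 2.0),
    ("California", "O2", 10.0),
    ("Colorado", "O3", 4.0),
    ("Colorado", "O4", -1.0),
    ("Colorado", "O5", 6.0),
    ("Colorado", "O6", 3.0),
]

# Inner level: profit per order (finer than the view).
order_profit = defaultdict(float)
for state, order_id, profit in line_items:
    order_profit[(state, order_id)] += profit

# Outer level: per state, count orders and unprofitable orders.
# int(True) is 1 and int(False) is 0, mirroring the INT trick in Step 2.
unprofitable_orders = defaultdict(int)
order_count = defaultdict(int)
for (state, _order_id), profit in order_profit.items():
    unprofitable_orders[state] += int(profit < 0)
    order_count[state] += 1

pct_unprofitable = {
    state: unprofitable_orders[state] / order_count[state] for state in order_count
}

print(pct_unprofitable)
```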

Step 5: Drag and drop State and the percentage onto the view, and change the chart type to a treemap.

Fig 10.15 Percentage and change

Step 6: Replace the size with the order ID count.

Step 7: The size in this view is controlled by the total number of orders for each
state, and the color is controlled by the percentage of unprofitable orders. Here California has
a high percentage of unprofitable orders. Colorado also has a
considerable number of orders, but its percentage of unprofitable orders
is lower than California's.

SELF-ASSESSMENT QUESTIONS – 4

5. _______________ includes the data values available for visualization.


6. ____________ is the current version of Tableau.

Answers
1. stem()
2. Paperclip
3. Yes
4. Data-ink ratio
5. Table calculation
6. 2020.3

Terminal Questions and Answers


1. What is Tableau?
• Tableau is a business intelligence software.
• It allows anyone to connect to the respective data.
• Visualizes and creates interactive, shareable dashboards.
2. What are Measures and Dimensions?
Measures are the numeric metrics or measurable quantities of the data, which can be
analyzed by dimension tables. Measures are stored in a table that contains foreign keys
referring uniquely to the associated dimension tables. The table supports data storage at the
atomic level and thus allows a larger number of records to be inserted at one time. For
instance, a Sales table can have product key, customer key, promotion key and items sold,
referring to a specific event.

Dimensions are the descriptive attribute values for multiple dimensions of each attribute,
defining multiple characteristics. A dimension table, referenced by a product key from
the fact table, can consist of product name, product type, size, color, description, etc.


3. What is the difference between .twb and .twbx extension?


• A .twb is an XML document which contains all the selections and layout you have
made in your Tableau workbook. It does not contain any data.
• A .twbx is a ‘zipped’ archive containing a .twb and any external files such as extracts
and background images.
4. How many maximum tables can you join in Tableau?
You can join a maximum of 32 tables in Tableau.

5. What are the different connections you can make with your dataset?
We can either connect live to our data set or extract data onto Tableau.
• Live: Connecting live to a data set leverages its computational processing and storage.
New queries will go to the database and will be reflected as new or updated within
the data.
• Extract: An extract will make a static snapshot of the data to be used by Tableau’s
data engine. The snapshot of the data can be refreshed on a recurring schedule as a
whole or incrementally append data. One way to set up these schedules is via the
Tableau server.

The benefit of Tableau extract over live connection is that extract can be used anywhere
without any connection and you can build your own visualization without connecting to
database.

6. What are shelves?


They are named areas to the left and top of the view. You build views by placing fields onto
the shelves. Some shelves are available only when you select certain mark types.

7. What are sets?


Sets are custom fields that define a subset of data based on some conditions. A set can be
based on a computed condition, for example, a set may contain customers with sales over a
certain threshold. Computed sets update as your data changes. Alternatively, a set can be
based on specific data points in your view.


8. What is Tableau Data Server?


Tableau Server acts as a middleman between Tableau users and the data. Tableau Data Server
allows you to upload and share data extracts, preserve database connections, as well as reuse
calculations and field metadata. This means any changes you make to the data-set, calculated
fields, parameters, aliases, or definitions, can be saved and shared with others, allowing for
a secure, centrally managed and standardized dataset. Additionally, you can leverage your
server’s resources to run queries on extracts without having to first transfer them to your
local machine.

9. How to create a calculated field in Tableau?


• Click the drop down to the right of Dimensions on the Data pane and select “Create >
Calculated Field” to open the calculation editor.
• Name the new field and create a formula.

10. What is the difference between a tree map and heat map?
A heat map can be used for comparing categories with color and size. With heat maps, you
can compare two different measures together.


8. SUMMARY
Let us recapitulate the important concepts discussed in this unit:
• How to use level of detail expressions to aggregate at dimensionalities other than
the view level.
• When and why you should use an LOD to visualize your data.
• The INCLUDE level of detail expression.
• The EXCLUDE level of detail expression.
• The use of nested LODs.

9. GLOSSARY
Let us have an overview of the important terms mentioned in the unit:
LOD: Level of Detail expressions (also known as LOD expressions) allow you to compute
values at the data source level and the visualization level.

Fixed LOD: FIXED level of detail expressions compute a value using the specified dimensions,
without reference to the dimensions in the view.

INCLUDE LOD: INCLUDE level of detail expressions compute values using the specified dimensions
in addition to whatever dimensions are in the view.

10. CASELET
1. Get a Single Aggregate.
2. Isolate a Specific Value from a range of values
3. Synchronize Chart Axes

References and Suggested Reading


1. Joshua N. Milligan, Learning Tableau 2022: Create effective data visualizations, build
interactive visual analytics, and improve your data storytelling capabilities, 5th Edition.
2. Visual Analytics with Tableau, Paperback, 31 May 2019.


12. CONCEPTUAL MAP

Fig 10.16


MASTER OF BUSINESS ADMINISTRATION


SEMESTER 3

DADS304
VISUALIZATION

Unit 11: Dashboard and Story Telling 1



Unit 11
Dashboard and Story Telling

Table of Contents
SL Topic Fig No / SAQ / Page No
No Table / Activity
Graph
1 Introduction
3
1.1 Learning Objectives
2 Aggregating dimensionalities 1, 2, 3, 4, 5, 6,
4-12
7, 8, 9
3 Steps to calculate aggregating dimensionalities 10, 11, 12, 13 1 13-16
4 INCLUDE Level of detailed expressions 14, 15, 16, 17 2 17-21
5 Steps to calculate level of detail expression 20 21-23
6 Exclude level of detail expression 3 24-27
7 Nested lod 28
8 Summary 28
9 Glossary 28
10 Caselet 29
11 Conceptual Map 30


1. INTRODUCTION
A dashboard is a visual display of the most critical information needed to achieve one or
more business objectives which fits entirely on a single computer screen so that all
information can be monitored in a single glance. A story on the other hand, contains multiple
sheets or dashboards that are combined together to convey a particular business story which
shows how various facts or incidents are connected and what can be done or could have been
done to improve a business outcome. In this chapter, we are going to look at the retail
dashboard and story that have been built using the retail data set.

1.1 Learning Objectives


After studying this unit, you will be able to:

❖ Explain the features of dashboards and stories
❖ Explain the steps to build dashboards
❖ Explain the steps to build a story


2. DASHBOARD AND STORY USING TABLEAU


This section describes dashboards and stories in Tableau. Dashboards are a way to
display a wide range of data visually. A dashboard conveys different, yet related information
in an easy-to-digest format. In many cases, this includes key performance indicators (KPIs)
or other important business metrics that stakeholders need to understand quickly. Due to
their highly customizable nature, dashboards are useful across a wide range of industries
and verticals. Data of all kinds can be included in these reports with varying date ranges to
help you understand: what happened, why it happened, what may happen, and what you
should do. Additionally, dashboards can be understood by those who aren't as familiar with
the data since they use visuals like tables, graphs, and charts.

STEP 1: The dashboard gives a comprehensive view of the sales trend for the last two years,
followed by the sales broken down by state and the top-performing cities in terms of
sales.

Fig 11.1 Comprehensive view of sales in last 2 years

STEP 2: The profits for 2014 and 2015 are displayed, followed by the profits broken
down by state and the top-performing cities by profit. All these views are built to be
interactive: they allow us to filter to a particular month, a year or even a particular
state. For instance, if we click on the year 2015, all the views filter to show only the data
corresponding to 2015. Clicking outside the selection returns the dashboard to its original
state. In the same single view we can also find the top five products for a particular
state, or the overall sales by category and customer profile, for example whether more
purchases are being made by male or female customers.
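
The interactive behaviour described here, where one selection filters every view on the dashboard, can be modelled with a short Python sketch. The `dashboard` function, the row layout and the figures are assumptions made for illustration; in Tableau the same effect is configured through filter actions rather than code.

```python
from collections import defaultdict

# Hypothetical rows: (year, state, city, sales)
rows = [
    (2014, "Texas", "Austin", 100.0),
    (2014, "Ohio", "Columbus", 60.0),
    (2015, "Texas", "Austin", 90.0),
    (2015, "Texas", "Dallas", 40.0),
]


def total_by(rows, index):
    """Sum sales grouped by the column at position `index`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[index]] += row[3]
    return dict(totals)


def dashboard(rows, selected_year=None):
    """Recompute every view from the same filtered rows."""
    filtered = [r for r in rows if selected_year is None or r[0] == selected_year]
    return {
        "sales_by_state": total_by(filtered, 1),
        "sales_by_city": total_by(filtered, 2),
    }


everything = dashboard(rows)                   # nothing selected
only_2015 = dashboard(rows, selected_year=2015)  # user clicks on 2015

print(everything)
print(only_2015)
```

Clearing the selection corresponds to calling `dashboard(rows)` again, which restores the original state of all views.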

Fig 11.2 Retail dashboard

STEP 3: A dashboard gives us a great deal of information, and its interactive capability allows
us to derive insights and answer questions that might come up as we review it.
With dashboards we can also have multiple tabs, each representing a different dimension of
the same business information or other aspects of the business that contribute to business
performance or outcomes. On the second tab, Sales by Category, we have the sales
broken down by category and subcategory, along with the delivery status, both
overall and for the various locations the goods are shipped from. This view is again
interactive, allowing us to filter by year or month.

Fig 11.3 Interactive view with filters

STEP 4: Clicking on an individual category allows us to monitor the delivery
status for that particular category. By clicking on various months, years or
categories, we can see whether there is a problem with a particular location or category and take
any timely decision that is necessary to resolve it. Lastly, we have a tab
that displays the sales broken down by customer profile, or rather gender in this case.
This again gives a comprehensive view of how gender influences sales, both regionally
and by category. So we have seen a dashboard that contains all the critical information within
a single view or set of tabs, allowing users to monitor their business performance effectively.

Fig 11.4 Overall sales by category and customer profile

STEP 5: Each tab that we saw is a single view of a particular business function
or area that the user is interested in analyzing and deriving insights from. In the dashboard, we
saw how various visualizations help the user analyze the data. In a story, we
present our findings from the analysis to the end user in a conclusive manner.
Different facts are knit together to form a story that conveys why a particular scenario
happened and what could have been done to prevent something bad from happening, or
that makes a conclusive recommendation based on the facts. From an analysis
done using the retail data set, we have seen that sales went down in August and
September of 2015. Our story is targeted at presenting our findings related to
this decline in sales: why it happened and what could have been done to prevent
it.


Fig 11.5 Sales by category and sub-category

STEP 6: The story that we have built starts by displaying the sales trend across 2014 and
2015 for the women, men and home categories. The first tab of the view shows how sales took a
huge dip in August and September of 2015, especially for the women category. Comparing this with
the previous year, we see that this is not a seasonal dip, as the previous year's sales were
fine during the same period; this is exactly what is conveyed in this initial story
view. We then move on to the next tab, where we look in further detail at which
women's categories experienced this decline. Here we see that the clothing and
handbag category sales went down significantly in both of these months.


Fig 11.6 Building story

STEP 7: Next, look at the customer sentiment for these months. Here we see
a significant negative sentiment factor associated with June and July of 2015.
Analyzing these sentiments by clicking on these months shows further that much of it is
due to delayed or bad service. So we move on and analyze the delivery status for these
particular months. Here we notice that the clothing and handbag shipments
in particular were delayed during June and July due to bad weather conditions in Chicago.
This in turn led to the negative sentiment on social media that we saw in the previous tab,
and it affected the sales for the next two months, as customers were not keen on ordering
given that the shipments were delayed.


Fig 11.7 Different facts are knit together to form the story

Fig 11.8 Sales by category analysis


STEP 8:
If this had been addressed sooner, and the company had explained to customers the
reasons for the delay, that this was a temporary problem and that it was being
actively worked on, the negative impact on sales could have been avoided. The
story puts all these facts together and illustrates the steps that could have been taken to
improve the outcome.

Fig 11.9 This Story contains all facts and recommendations for an improved business
outcome


3. BUILDING DASHBOARDS
Once the views have been built, the next logical step is to use a dashboard or a story
to present them. To create a dashboard, we can either click on the new dashboard
icon next to the add sheet icon, or use the Dashboard menu on the top menu bar.

Step 1: The dashboard workspace opens up. A Dashboard pane opens on the left
with the dashboard properties, in place of the Data pane. This pane displays all available
sheets that can be used in the dashboard.

Fig 11.10 Building dashboard and story

Step 2: The first thing to do while building a dashboard is to set its size in the Size
drop-down. There are various options: Automatic, Exactly, Range, various portrait
and landscape presets, and also an option for iPad. With Automatic, the dashboard
resizes itself to fill the available area. Exactly, as the name suggests, gives us
the flexibility to set a fixed size at which the dashboard is displayed. Then there is
the Range option, which sets the limits within which the dashboard
can expand or shrink. It is very important to plan ahead and set the size so that we do not
end up redesigning the layout at a later point. Next, add the views to the dashboard. This is as
easy as dragging and dropping a view onto the dashboard layout. When we drag Sheet 1 in, it
takes up the entire space. Go ahead and drag in the next view.
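
The three sizing behaviours can be summarised with a small Python sketch. The function `dashboard_size`, its parameter names and the pixel values are hypothetical and only model the behaviour described above; they are not Tableau settings or API calls.

```python
def dashboard_size(mode, window, exact=None, minimum=None, maximum=None):
    """Return the (width, height) at which the dashboard is displayed.

    mode    -- "automatic", "exactly" or "range"
    window  -- (width, height) of the area available on screen
    exact   -- fixed (width, height), used by "exactly"
    minimum -- smallest allowed (width, height), used by "range"
    maximum -- largest allowed (width, height), used by "range"
    """
    width, height = window
    if mode == "automatic":
        # Fill whatever area is available.
        return (width, height)
    if mode == "exactly":
        # Always the same size, whatever the window is.
        return exact
    if mode == "range":
        # Follow the window, but stay within the limits.
        return (
            min(max(width, minimum[0]), maximum[0]),
            min(max(height, minimum[1]), maximum[1]),
        )
    raise ValueError(f"unknown size mode: {mode}")


window = (2000, 500)
print(dashboard_size("automatic", window))
print(dashboard_size("exactly", window, exact=(1000, 800)))
print(dashboard_size("range", window, minimum=(800, 600), maximum=(1600, 1200)))
```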

Fig 11.11 Adding views to the dashboard

Step 3: Once a view has been dragged and dropped, a checkmark icon appears next to it,
indicating that the view is used by the dashboard. The options on the left panel are
available to format the dashboard layout: a horizontal container, a vertical
container, images, web pages, text and a blank box can all be added to the dashboard.
Before we dig deeper into the containers, note the two kinds of layout. Tiled objects are
arranged in a single-layer grid that adjusts in size based on the total size of the dashboard
and the objects around them, whereas
floating objects can be layered on top of other objects and have a fixed
custom position and size. Drag the views using the tiled layout option. The objects are
arranged next to each other using a horizontal layout container for both the top and the
bottom row. Now change these views to use the floating layout.


Fig 11.12 Sales trend and sales by state

Step 4: The second view is now layered on top of the first view. It is also possible to switch
a view between tiled and floating by choosing the Floating option from its shortcut menu.
Change the first view from tiled to floating as well, so that both views are floating. The
Layout section on the left lists all the items that have been added to the dashboard. The
order of floating items can be changed by dragging and dropping them to a different place in
this item hierarchy; tiled items can never be reordered in this way. Change the Sales Trend
view from tiled to floating and position the views on top of each other. The order in which
the items are arranged in the Layout section determines which of the two views is displayed
in front.

Step 5: Move Sales by State to the top of the list. The order has changed, and the Sales
Trend item automatically goes to the back. Only floating items can be reordered by dragging
them within the hierarchy; tiled items can never be reordered. Next are the position and size
options. The position specifies, in pixels, where a floating object is placed, and the size
determines its width and height in the dashboard. We want the Sales by State view to be placed
in the top left corner, so specify the Y position as zero for Sales by State (the X position
must also be zero for the view to sit in the corner).
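The effect of item order, position and size on floating objects can be sketched as follows. This is a conceptual model only, assuming that the first item in the list is the one drawn in front; the class FloatingObject and the helper functions are hypothetical names, and the pixel values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class FloatingObject:
    name: str
    x: int       # distance from the left edge, in pixels
    y: int       # distance from the top edge, in pixels
    width: int
    height: int

    def contains(self, px, py):
        return self.x <= px < self.x + self.width and self.y <= py < self.y + self.height


def topmost_at(layout, px, py):
    """Return the name of the front-most object covering the point (px, py).

    The layout list is ordered front to back (an assumption of this sketch).
    """
    for obj in layout:
        if obj.contains(px, py):
            return obj.name
    return None


def bring_to_front(layout, name):
    """Return a new layout with the named object moved to the top of the list."""
    front = [obj for obj in layout if obj.name == name]
    rest = [obj for obj in layout if obj.name != name]
    return front + rest


layout = [
    FloatingObject("Sales Trend", x=0, y=0, width=600, height=400),
    FloatingObject("Sales by State", x=0, y=0, width=500, height=300),
]
before = topmost_at(layout, 100, 100)
layout = bring_to_front(layout, "Sales by State")
after = topmost_at(layout, 100, 100)
print(before, "->", after)
```

Both objects start at position (0, 0), the top left corner, so only the list order decides which one is visible where they overlap.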


Step 6: The Sales by State view has moved to the top left corner. The height and width of the
floating object can also be controlled. Below these settings is the Show Title option, which
toggles the display of the title at both the sheet level and the dashboard level.

Fig 11.13 Sales by state

SELF-ASSESSMENT QUESTIONS – 1

1. _____________ is an important component in a dashboard.


2. Click on _____________ icon to build a dashboard.


4. FORMATTING DASHBOARDS

Fig 11.14 Sales by Dept and category

To see why formatting matters, consider a dashboard in which the top view, the top ten cities
by state, is filtered to show only Los Angeles. The views stay fixed where they were
originally placed, although there is now enough room for the bottom view, the sales per
category and department view, to move up and use the free space. Now consider a dashboard in
which the same two views are placed inside a vertical layout container, and apply the same
filter. The second view at the bottom automatically moves up into the empty space, which also
eliminates the need for a scrollbar. Container objects thus help to create a seamless
experience for users by repositioning and resizing the dashboard objects whenever necessary.

Step 1: First, there is the horizontal layout container. It groups worksheets and any other
components of the dashboard from left to right, and it lets us edit the height of all the
objects placed within it at the same time. Then there is the vertical layout container. It
groups worksheets and other dashboard components from top to bottom, and it lets us edit the
width of all the objects placed within it at the same time.
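The difference between the two container types can be sketched in a few lines of Python. This is an illustration only: the function arrange is a hypothetical name, and splitting the space evenly between the children is an assumption of the sketch, not a statement about how Tableau sizes objects.

```python
def arrange(kind, names, width, height):
    """Place child objects inside a layout container.

    Returns a dict mapping each name to (x, y, width, height).
    A horizontal container lays children out left to right with a shared height;
    a vertical container stacks them top to bottom with a shared width.
    """
    boxes = {}
    count = len(names)
    for index, name in enumerate(names):
        if kind == "horizontal":
            child_width = width // count
            boxes[name] = (index * child_width, 0, child_width, height)
        elif kind == "vertical":
            child_height = height // count
            boxes[name] = (0, index * child_height, width, child_height)
        else:
            raise ValueError(f"unknown container kind: {kind}")
    return boxes


print(arrange("horizontal", ["Sales Trend", "Sales by State"], 1000, 400))
print(arrange("vertical", ["Sales Trend", "Sales by State"], 1000, 400))
```

In the horizontal case every child has the same height, and in the vertical case every child has the same width, which matches the single height or width setting described in Step 1.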


Fig 11.15 Top ten cities by state

Suppose we want one view at the top and two views at the bottom. In this case, place a
vertical container at the top of the dashboard layout and add a horizontal container at the
bottom. Drop the view that should appear at the top into the vertical container, and the two
views that should appear at the bottom into the horizontal container. This is now the layout
for the dashboard. So, depending on which views are to be placed and where, we decide whether
horizontal or vertical containers are required.

Step 2: A layout container always has a blue border when it is highlighted. Layout containers
generally sit in the background of the views, which makes them hard to select. An easy way to
select any particular element on the dashboard, and a layout container in particular, is to
click the element in the Layout section, especially when there are multiple or nested
container objects.

Another easy way to select a layout container is as follows. First, select a view inside the
container (it is shown with a gray border). Then use the drop-down arrow at its top right
corner and choose Select Layout Container. This highlights the layout container in which that
view has been placed. Next, add a web page to the dashboard. The Web Page object embeds any
web page directly within the dashboard; it is in fact a live connection that displays the
details of the page whenever the dashboard is open. Suppose there is a finance dashboard and
we are interested in seeing the daily stock market status within it. This can be done by
adding a Web Page object with the URL of that particular web page.

Fig 11.16 Formatting Dashboard

Step 3: First, add a horizontal container at the bottom. Drag the Web Page object onto this
horizontal container, enter the address of the Yahoo Finance page and click OK. This brings in
the current stock market status from the Yahoo web page and displays it within the object. To
add an image, click the Image object and drag it to the location where the image is needed.
Go ahead and add a logo for this dashboard by picking the PNG logo file that has already been
saved on the desktop. Once the image is added, it can be centered or fitted by choosing the
corresponding option from its menu. Right-clicking the image also gives the option to set a
URL, which makes the image link to a web page.

Step 4: Go ahead and create the sample dashboard for this session using the views that have
already been created.


Step 5: Start by adding these views: Sales Trend, Sales by State and Top Ten Cities by
State. Then add a horizontal container for the dashboard title.

Step 6: Drop a text object into it to add the dashboard title. Drag one more horizontal
container onto the dashboard layout space.

Step 7: Arrange the three views that have been dropped onto the workspace in this container.
Expand the size so that all three views fit into the first row within this container.

Step 8: Adjust the size by clicking the arrow mark on the horizontal container, and set the
size of the views as well. To format the dashboard, click Dashboard and choose Format.

Step 9: Change the default dashboard shading to another color if preferred. The dashboard
title, font, alignment, shading and so on can all be changed within these Format Dashboard
options. Then format the layout containers. Finally, change the color to a medium gray.

Fig 11.17 Sample Dashboard

Step 10: The colors and formatting of a view must be changed directly within the sheet. This
can be done by going back to the sheet or by right-clicking the view within the dashboard and
choosing Format.


SELF-ASSESSMENT QUESTIONS – 2

3. The distribution of a single continuous measure can be visualized using ___________ plot
and ___________.

5. ADDING INTERACTIVITY TO DASHBOARDS


Adding interactivity gives users more power and makes their analysis much easier. Dashboard
actions are the elements within Tableau that add context and interactivity to the data in a
dashboard. Various actions are available: click Dashboard and choose Actions. First, there are
highlight actions. A highlight action calls attention to specific points of interest by
highlighting or coloring certain marks and dimming the others. The easiest way to highlight is
to manually select a particular mark or a group of marks.

Fig 11.18 Interactivity to dashboards

In Fig 11.18, the male category is selected, and therefore only the male category is
highlighted in all the other views. Here a single item has been highlighted, but it is
possible to highlight a single item or multiple items within the legend. This option can be
enabled or disabled from the toolbar at the top. The steps to add interactivity to a
dashboard are illustrated below.


Step 1: To highlight based on custom criteria, the criteria have to be defined. This is done
through advanced highlight actions. On a worksheet, click Worksheet and choose Actions. In the
Actions dialog box, click the Add Action button and select Highlight. Add a name and then
select the trigger option. The action can be run on Hover, Select or Menu. Hover is especially
useful in a dashboard: when the mouse rests over a mark in one view, the action is run and the
corresponding marks in all other views are highlighted.

Fig 11.19 Highlight actions

Step 2: The Select option highlights marks based on a click action: when we click a mark, the
action is run. Finally, there is the Menu option, which lets us run the highlight by
right-clicking a mark and selecting an option from the context menu. Next, select the target
sheets.


Fig 11.20 Adding Highlight actions

Step 3: Select the source sheet as Sales by State year and select the target sheets as Sales by
State, Sales Trend and Top Ten Cities by State.
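What a highlight action does with its target sheets can be sketched as follows. This is a conceptual illustration, not Tableau's implementation; the function run_highlight_action, the Gender field and the sales figures are all hypothetical.

```python
def run_highlight_action(field, selected_value, target_sheets):
    """Return, for every target sheet, whether each mark is highlighted or dimmed.

    A mark is highlighted when its value for the chosen field matches the
    value selected in the source sheet; all other marks are dimmed.
    """
    states = {}
    for sheet_name, marks in target_sheets.items():
        states[sheet_name] = [
            "highlighted" if mark.get(field) == selected_value else "dimmed"
            for mark in marks
        ]
    return states


# Made-up marks for two target sheets.
targets = {
    "Sales Trend": [
        {"Gender": "Male", "Sales": 120},
        {"Gender": "Female", "Sales": 150},
    ],
    "Sales by State": [
        {"Gender": "Female", "Sales": 80},
        {"Gender": "Male", "Sales": 60},
    ],
}

result = run_highlight_action("Gender", "Male", targets)
print(result)
```

Whether the function is called on hover, on a click or from a menu corresponds to the trigger chosen in the Actions dialog box; the highlighting logic itself stays the same.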

Step 4: Complete the rest of the dashboard offline so that it can be reviewed further. Next,
check whether the layout suits the final presentation or device. The overall size can be
specified in the Dashboard menu, including whether it should be automatic, that is, whether
the dashboard should resize to fit the window in which it is displayed.

Step 5: Check whether any unwanted items are present so that they can be cleaned up.

Step 6: Remember that shading affects the dashboard objects themselves and not the views
within them.


6. BUILDING STORIES
A story is a sequence of worksheets or dashboards that conveys information in a narrative
manner. It shows how the facts are connected, provides context and makes a compelling case
through a storyline, so that the business decision-making process becomes much easier. Each
individual sheet in a Tableau story is called a story point. A story is not a static
presentation: it has a live connection to the data and changes as the underlying data changes.
Stories work well as a presentation medium because they make the narrative flow easier to
present to the audience. When we click the New Story tab, a new layout window opens with a
left panel that lists the worksheets and dashboards that can be added to the story.

6.1 Steps to Create Story Using Tableau


Step 1: Choose whether to display or hide the navigator buttons, specify the size, and choose
whether to show or hide the title. Choose a size from the predefined sizes.

Step 2: Choose a desktop size. To edit the title, double-click it and enter the report title.

Step 3: Add the sheet that should be displayed as the first story point, for example Sales
Trend. A sheet can also be customized within the story point by selecting a particular range
of marks, sorting it in a particular order or filtering it on a particular field, and these
changes are retained as part of the story. Any customization made on the story point is not
automatically carried back to the original sheet. However, edits made in the original sheet
are carried forward to the story points.
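The one-way relationship between a sheet and its story points can be pictured with a small model. This is a conceptual sketch under the assumption that a story point stores only its own customizations on top of the original sheet; the class StoryPoint and the settings used here are hypothetical and do not describe Tableau's internals.

```python
# The original worksheet and its current settings (made-up values).
sheets = {"Sales Trend": {"sort": "none", "filter": "all years"}}


class StoryPoint:
    def __init__(self, sheet_name, caption):
        self.sheet_name = sheet_name
        self.caption = caption
        self.overrides = {}  # customizations kept only inside the story

    def customize(self, **changes):
        # Changes are stored on the story point, never on the original sheet.
        self.overrides.update(changes)

    def render(self, all_sheets):
        # Start from the current state of the original sheet,
        # then apply the story point's own customizations on top.
        return {**all_sheets[self.sheet_name], **self.overrides}


point = StoryPoint("Sales Trend", caption="Sales over time")
point.customize(filter="2021 only")

# An edit in the original sheet is carried forward to the story point ...
sheets["Sales Trend"]["sort"] = "descending"
rendered = point.render(sheets)
print(rendered)

# ... but the story point's filter never reached the original sheet.
print(sheets["Sales Trend"]["filter"])
```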

Step 4: To format the story pane, click Story and choose Format. Here we can change the
shading, set a color and a transparency level, set the navigator shading, set the alignment,
the font and so on. To add a caption to a story point, double-click its caption and enter the
story point name.

Step 5: To create a new story point, click the new blank point and then drag another sheet
onto it.


Step 6: To delete a story point, click the delete icon that appears next to it. To rearrange
story points, drag a story point and drop it in the required position. To duplicate a story
point, click the duplicate icon; this creates a copy of the story point with a separate
caption. To navigate between story points, click them directly or use the back and forward
buttons when there are a lot of story points.

Step 7: If the navigator buttons are not required, deselect the option that shows them. Each
story point appears as a separate tab. Once the story is created, view or present it using the
presentation mode at the top, which gives a magnified view of the story.

SELF-ASSESSMENT QUESTIONS – 3

4. Can a URL action on a dashboard object be directed to open a web page inside the
dashboard instead of opening the system's web browser? Yes/No

Answers
1. Reporting requirements
2. Dashboard
3. Box plot and histogram
4. Yes, with the use of a Web Page object
5. Sum
6. Countd

TERMINAL QUESTIONS
1. Give a brief description of the Tableau dashboard.
A Tableau dashboard is a collection of views that lets you compare different types of data
simultaneously. Worksheets and dashboards are connected: any modification to the data in a
worksheet is directly reflected in the dashboards that use it. A dashboard is an efficient way
to visualize and analyze data.

2. Define the Pages shelf in Tableau.

The Pages shelf breaks a view into a series of pages and displays a different view on each
page. With this feature, you can analyze how a specific field affects the rest of the data in
the view.


3. Define the story in Tableau.

A story is a sheet that contains a sequence of worksheets and dashboards used to convey
insights from data. A story can be used to show the connection between facts and outcomes
that impact the decision-making process. A story can be published on the web or presented to
an audience.

4. Give an overview of facts and dimensions.

Facts are numeric measures of data and are stored in fact tables. A fact table holds the
values that are analyzed along the dimensions, and it has foreign keys that link it to the
dimension tables.

Dimensions are descriptive attributes of data and are stored in dimension tables. For
example, a customer's information such as name, number, and email is stored in a dimension
table.
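The relationship between a fact table and a dimension table can be tried out with Python's built-in sqlite3 module. This is a minimal sketch; the table names, column names and rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension table: descriptive attributes of each customer.
    CREATE TABLE dim_customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT,
        email       TEXT
    );

    -- Fact table: numeric measures plus a foreign key to the dimension.
    CREATE TABLE fact_sales (
        sale_id     INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES dim_customer (customer_id),
        amount      REAL
    );

    INSERT INTO dim_customer VALUES
        (1, 'Asha', 'asha@example.com'),
        (2, 'Ravi', 'ravi@example.com');

    INSERT INTO fact_sales VALUES
        (10, 1, 250.0),
        (11, 1, 100.0),
        (12, 2, 75.0);
    """
)

# Aggregate the measure (amount) by a dimension attribute (name).
rows = conn.execute(
    """
    SELECT c.name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_customer AS c ON c.customer_id = f.customer_id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()
print(rows)
```

The query sums the numeric fact for each value of the descriptive attribute, which is the same pattern as placing a measure and a dimension on a view.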

5. State some ways to improve the performance of Tableau


• Use an Extract to make workbooks run faster
• Reduce the scope of data to decrease the volume of data
• Reduce the number of marks on the view to avoid information overload
• Try to use integers or Booleans in calculations as they are much faster than strings
• Hide unused fields
• Use Context filters
• Reduce filter usage and use some alternative way to achieve the same result
• Use indexing in tables and use the same fields for filtering
• Remove unnecessary calculations and sheets
6. Explain different connection types in Tableau?
There are 2 connection types available in Tableau.

Extract: An extract is a snapshot of data that is taken from the data source and stored in
the Tableau repository. The snapshot can be refreshed periodically, either fully or
incrementally, and the refresh can be scheduled in Tableau Server.

Live: A live connection queries the data source directly, so data is fetched straight from
the tables and is up to date and consistent. However, this can also reduce access speed.
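The practical difference between the two connection types can be sketched with two small classes. This is a conceptual model only; LiveConnection, ExtractConnection and the sample rows are hypothetical and are not part of any Tableau API.

```python
import copy


class LiveConnection:
    """Reads the source every time, so it always sees current data."""

    def __init__(self, source):
        self.source = source

    def rows(self):
        return self.source


class ExtractConnection:
    """Works on a snapshot that changes only when it is refreshed."""

    def __init__(self, source):
        self.source = source
        self.refresh()

    def refresh(self):
        # A full refresh: copy everything from the source again.
        self.snapshot = copy.deepcopy(self.source)

    def rows(self):
        return self.snapshot


source_table = [{"region": "West", "sales": 100}]
live = LiveConnection(source_table)
extract = ExtractConnection(source_table)

# New data arrives in the source after the extract was taken.
source_table.append({"region": "East", "sales": 250})
before_refresh = (len(live.rows()), len(extract.rows()))

extract.refresh()
after_refresh = len(extract.rows())
print(before_refresh, after_refresh)
```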


7. Explain how many types of filters are available in Tableau?

Filters are used to provide the correct information to viewers after removing unnecessary
data. There are various types of filters available in Tableau.

▪ Extract Filters – Extract filters are used to apply filters on extracted data from the
data source. For this filter, data is extracted from the data source and placed into the
Tableau data repository.
▪ Datasource Filters – Datasource filters are similar to extract filters in that they
restrict the data at the data source level. The difference is that they work with both
live and extract connections.
▪ Context Filters – Context filters are applied on the data rows before the other filters
in the view, such as dimension and measure filters. They are limited to views, but they
can be applied on selected sheets. They define Aggregation and Disaggregation of data in
Tableau.
▪ Dimension Filters – Dimension filters are used to apply filters on dimensions in
worksheets. Dimension filters are applied through the top or bottom conditions,
formula, and wildcard match.
▪ Measure Filters – Measure filters are applied to the values present in the measures.
8. Differentiate between Tiled and Floating in dashboards?
In a tiled layout, items do not overlap, and the layout adjusts according to the dashboard
size. In a floating layout, items can be layered on top of other objects, and they can have
fixed positions and sizes.

9. State the components of the dashboard?


The dashboard consists of 5 components.
• Web: it consists of a web page embedded in the dashboard.
• Horizontal component: it is a horizontal layout container in which we can add
objects.
• Vertical component: it is a vertical layout container in which we can add objects.
• Image Extract: it allows you to upload an image to the dashboard from a computer.
• Text: it is a small Wordpad where we can format and edit the text.


8. SUMMARY
Let us recapitulate the important concepts discussed in this unit:
• How to build dashboards, format them and add interactivity to them.
• When and why a dashboard or a story should be used to visualize data.
• The steps involved in creating a dashboard.
• The steps involved in creating a story.

7. GLOSSARY
Let us have an overview of the important terms mentioned in the unit:
Dashboard: A dashboard is a collection of several views, letting you compare a variety of data
simultaneously.

Story: In Tableau, a story is a sequence of visualizations that work together to convey
information.


8. CASELET
1. Finance: Understanding and sharing variances
Given the rapid pace of changes in the economy and business operations, it can be harder
than ever to manage budgets, expenses and forecasts. The finance team’s dashboards
provide up-to-date results and enable the team to change forecasts as uncertainty decreases.
By sharing this dashboard with leaders across the business, the finance team supports
strategic decision-making and resource allocation.

The COVID-19 pandemic has forced human resources departments to ask new questions
around employee safety and working from home. The Tableau people analytics team used
data to inform their response: They enriched their data with publicly available information
and by gathering new data from an employee survey.

References and Suggested Reading


1. Joshua N. Milligan, Learning Tableau 2022: Create effective data visualizations, build
interactive visual analytics, and improve your data storytelling capabilities, 5th Edition.
2. Visual Analytics with Tableau, Paperback, 31 May 2019.


9. CONCEPTUAL MAP

Fig 11.21



Unit 12: Power BI - Connecting To Data Using Power Query 1



Unit 12
Power BI - Connecting To Data Using Power Query
1. INTRODUCTION
MS Excel can be used to convert raw data into meaningful visualizations. Although Excel has
its limitations, there are add-ins that make it an excellent tool for data visualization.

1.1 Learning Objective:


❖ Understand the various add-ins in the Excel BI toolkit to import, model, prepare, and
visualize data.


Connecting to Excel Files.


We need to learn how to source, load, and modify data before we move on to visualizing it.
This requires an understanding of:
• Data discovery
• Data loading
• Data modification

Data Discovery
Data discovery is the process of finding data by connecting to various kinds of data sources,
which include:
• Relational data
• Structured data
• Semi-structured data
• Unstructured data

Data Loading
Data loading is the process of loading data from data sources into Excel for analysis.

Data Modification
Data modification covers how to:
• Modify data and filter it.
• Join separate data structures.

The combined application of data discovery, data loading and data modification is also called
ETL (Extract, Transform, Load).
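The three ETL stages can be sketched with Python's standard library. This is a minimal illustration, not how Excel performs these steps; the sample data, the column names and the rule of dropping rows without a bill amount are assumptions made for the example.

```python
import csv
import io

# Made-up source data; the second patient has no bill amount.
RAW_CSV = """patient_id,department,bill
1,Cardiology,1200
2,Orthopaedics,
3,Cardiology,800
"""


def extract(text):
    """Extract: read the raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop incomplete rows and convert text to numbers."""
    cleaned = []
    for row in rows:
        if not row["bill"]:
            continue  # filter out rows with a missing bill
        cleaned.append(
            {
                "patient_id": int(row["patient_id"]),
                "department": row["department"],
                "bill": float(row["bill"]),
            }
        )
    return cleaned


def load(rows, target):
    """Load: append the cleaned rows to the target (here, a plain list)."""
    target.extend(rows)
    return target


worksheet = load(transform(extract(RAW_CSV)), [])
print(worksheet)
```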

In earlier versions of Excel (before 2016), Power Query was an add-in used to discover,
access, and consolidate information from various sources. In updated versions it is known as
Get & Transform, which helps with the process of collecting, combining, and refining data
sources. The four phases of Power Query are:
• Connect
• Transform
• Combine
• Load

Fig 1: Excel File

Figure 1 shows a blank Excel worksheet. Power Query (also known as Get & Transform in
Excel) helps to import or connect to external data, shape that data as required, and then
load the query into Excel to create charts and reports.

Let’s look at the process of connecting to data. Using the Get & Transform utility, we can
connect to file data such as Excel, CSV, XML, text and JSON files, as shown in figure 2.


Fig 2: File Data Connection

Fig 3: Database Connection

Figure 3 shows that we can also connect to databases such as a SQL Server database, Microsoft
Access database, Oracle database, Sybase database, SAP HANA database, and so on. Similarly,
connections are possible to the different Azure storage services and databases.


Online services such as a SharePoint Online list, Microsoft Exchange, Facebook, and
Salesforce can also be connected to as a new query.

Fig 4: Data from Online Services

Other sources, such as a web page, a SharePoint list, an OData feed, a Hadoop file, Active
Directory and ODBC, can also be used to connect to external data.

Fig 5: Data from Other Services


Connecting to File Data


Let’s look at the process of connecting to a file, here an Excel workbook.

First, select the New Query option in the Data tab, click From File and then From Excel
Workbook, and select the workbook to open, as shown in figure 6.

Fig 6: Connecting to Data from Workbook

Here we are considering the hospital patient data.


Fig 7: File Data

When the file shown in figure 7 is selected for import into Excel, a navigator window opens.
Select the master data option and click the Load button.


Fig 8: Navigator Window

The Load button has two sub-options, Load and Load To. The Load option imports the data into
the current worksheet, whereas the Load To option allows us to create a new worksheet for the
dataset.

Import the dataset into a new workbook by selecting the Load To option and then selecting the
Only Create Connection option.


Fig 9: Load to Option

This creates a connection that is displayed in the workbook queries area of the Excel sheet,
as shown in figure 10.

Fig 10: Connection for the Master Data

Now, to import the patient transaction details, follow the same steps as for figure 8, select
the patient transaction data and load it. Figure 11 shows the transaction data loaded into
the worksheet.


Fig 11: Patient Transaction Data

The navigator window shown in figure 11 is useful for taking a quick look at the available
data models that are connected or loaded to the data model.

In the navigator window, we can also use the peek pop-up feature, which previews the data
when we hover over a data model; the scroll bar can be used to view the data in the model,
as shown in figure 12.


Fig 12: Peek Pop Up Feature

Peek pop-up allows for data discovery without loading unnecessary data. Note that each load
is stored as an individual query and it appears on the navigator window as it is being
connected to other data sources.
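The idea of previewing data without loading all of it can be sketched in Python. This only illustrates the principle and is not how the peek pop-up is implemented; the function peek and the sample rows are hypothetical.

```python
import csv
import io
from itertools import islice


def peek(stream, n=3):
    """Return the header and only the first n data rows of a CSV stream."""
    reader = csv.reader(stream)
    header = next(reader)
    return header, list(islice(reader, n))


# Made-up patient data standing in for a much larger source.
sample = io.StringIO(
    "patient_id,name,city\n"
    "1,A,Jaipur\n"
    "2,B,Delhi\n"
    "3,C,Pune\n"
    "4,D,Mumbai\n"
    "5,E,Agra\n"
)

header, rows = peek(sample)
print(header)
print(rows)
```

Because the rows are read one at a time, the preview stops after the requested number of rows instead of reading the whole source.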

In the navigator window, right-clicking a data model gives access to many options that allow
us to copy, edit, load, or perform other such operations, as shown in figure 13.

Fig 13: Options Available to Make Changes to the Data


Connecting to Data from Database


To load data from a database:

Fig 14: Data Collection from Database

Select the required database option as shown in figure 14.

For example, consider a SQL Server database. Selecting the SQL Server database option opens a
window in which we enter the server name and, if required, details of any specific query or
relationship columns, as shown in figure 15.


Fig 15: SQL Server Database Connection

Connecting to XML File

Fig 16: Data from File Option


From the From File options, choose the XML option. Select the patients.xml file, which
contains the patients' data in XML format, and import the data as shown below in figure 17.

Fig 17: Patients.xml File


Fig 18: Patients Data from the XML File

Select the book option and load the data as shown in figure 18. This loads the data from the
XML file into the Excel data model, as shown in figure 19.
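Turning an XML file into rows and columns can be sketched with Python's standard library. The XML structure below is invented for illustration, since the actual layout of patients.xml is not shown in this unit; the element and attribute names are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical content standing in for a patients XML file.
XML_TEXT = """
<patients>
  <patient id="P001">
    <name>Asha</name>
    <city>Jaipur</city>
  </patient>
  <patient id="P002">
    <name>Ravi</name>
    <city>Delhi</city>
  </patient>
</patients>
"""

root = ET.fromstring(XML_TEXT)

# Flatten every <patient> element into one table row.
rows = [
    {
        "id": patient.get("id"),
        "name": patient.findtext("name"),
        "city": patient.findtext("city"),
    }
    for patient in root.findall("patient")
]

for row in rows:
    print(row)
```

Each repeating element becomes a row and each child element or attribute becomes a column, which is the shape a worksheet or data model needs.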

Fig 19: Patients Data Loaded into the Excel Data model


Figure 19 also shows the query connection that has been established to the XML schema, on
the right side of the Excel sheet in the workbook queries window.

Connecting to Web and Social Media Files:


Advanced data discovery can be done by looking beyond the data sources already available and
connecting to external data. Consider connecting to an external web source for a list of
hospitals in a specific region (for which patient data is available).

To connect to a web source, select the New Query option in the Data tab and, under From Other
Sources, select the From Web option as shown in figure 20.

Fig 20: Connecting to WEB Source

After the steps shown in figure 20, a pop-up window appears on the worksheet, in which we
enter the URL in the empty field as shown in figure 21.


Fig 21: Connecting to Web Using URL

For this scenario we are loading the hospital data for the California region, as shown in
figure 21. Click OK to go to the navigator window, where the HTML table element is loaded as
shown in figure 22.

Fig 22: HTML Element


Figure 23 shows a data table that consists of 3 columns (maps, name, city). Since we do not
require the maps column, we can remove it using the query editor.

Fig 23: Data Table

To remove the maps column, click Edit, which opens the query editor window where any
unnecessary columns can be removed.


Fig 24: Query Editor

Select the maps column and click the Remove Columns option in the top menu, as shown in
figure 24. This deletes the maps column, leaving only the name and city columns, which are
required for the data model.
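The effect of removing a column can be sketched in Python. This is only an illustration of the transformation, not the query editor itself; the function remove_columns and the hospital rows are made up for the example.

```python
def remove_columns(rows, unwanted):
    """Return a copy of the table without the unwanted columns."""
    return [
        {column: value for column, value in row.items() if column not in unwanted}
        for row in rows
    ]


# Hypothetical rows with the three columns described in the text.
hospitals = [
    {"maps": "map link 1", "name": "Hospital A", "city": "Los Angeles"},
    {"maps": "map link 2", "name": "Hospital B", "city": "San Diego"},
]

trimmed = remove_columns(hospitals, {"maps"})
print(trimmed)
```

The original table is left unchanged and a new table is returned, much as a query step changes what is loaded without altering the source web page.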

Alternate Method to Connect to Web Data:


An alternate method is to use a formula in the query editor to connect to a web source. To
open the query editor, select the Blank Query option under From Other Sources in the New
Query ribbon of the Data tab, as shown in figure 25.


Fig 25: To Open Editor Window

This will open a blank query window where we need to enter the formula in the empty field
provided. Enter the formula as shown in figure 26.

Fig 26: Formula to Connect to Web Data

Note that the above shown formula is case sensitive.


Once the formula is entered, the HTML table structure is loaded, and on expanding the data
column we can see the various attributes of the data table, as shown in figure 27.

Fig 27: Data Loaded from the Formula

Figure 27 lists the attributes available in the data table. Here we can choose which data
to display. As we do not require the maps data, uncheck the maps attribute and select OK.

This displays only the name and city data, as shown in figure 28.

Fig 28: Displayed Data for Selected Attributes

Connecting to Online Data Sources:

Consider a scenario where we need to analyse the social media sentiment of patients. For
this, we need to connect to the patients' Facebook page. For the purpose of this demo, we
have created a Facebook page, as seen in figure 29.

Fig 29: Demo Facebook Page

Figure 29 shows the demo page, VV Hospital Training, created for this scenario. From it we
can extract data such as posts, comments and likes. To connect to this Facebook data, click
New Query and, under Online Services, select ‘From Facebook’, as shown in figure 30.

Fig 30: Connecting to Facebook Data

This step opens a Facebook window that requests an object or a connection in the Facebook
graph. The Facebook graph is the primary way for any external app to talk with Facebook.
Using the Facebook graph, we can access all the required information that Excel can handle.
To find the required object id, open the Facebook page in the browser and copy the number
that appears after the page name at the end of the URL, as shown in figure 31.

Fig 31: URL of Demo Page

The highlighted value in figure 31 is the object id of the demo page, which we need in order
to establish the connection from Excel.

Fig 32: Facebook Connection Window

Figure 32 shows the pop-up window that appears after the step shown in figure 30. Enter the
object id obtained from the URL and choose posts as the connection. Clicking OK retrieves
all the information related to the page, as shown in figure 33.

Fig 33: Retrieved Information from the Demo Page

Clicking Edit opens the query editor, where we can use a column header to see the additional
information that is available for that attribute, as shown in figure 34.

Fig 34: Comments Data Attributes

Figure 34 shows the data attributes that can be extracted from the comments data. By
clicking on ok the data from the comments will be added to the table as shown in figure 35.

Fig 35: Extracted Data from the Comments Column

The steps above show how to connect to an online Facebook source to extract data. The same
data can also be extracted using a formula in a new blank query (refer figure 25). For this,
choose New Query from the Data tab, click From Other Sources and then Blank Query, which
opens the power query editor window.

The below figure 36 shows the formula used to connect to the Facebook id.

Fig 36: Formula to Connect to Facebook

Once the power query editor window opens, enter the formula to extract the posts for the
Facebook graph id. On pressing Enter, a similar window appears with all the data from that
page. The comments data can be extracted using the same steps as in figure 34, and the
resulting data will be the same as shown in figure 35. Here we have discussed examples of
connecting to various data sources.

Summary
In this topic we discussed,
• How to load data into the Excel model
• Various methods to connect and load data into the Excel sheet.
• Various methods to connect to Web data source.

Unit 13
Merging and Appending Data Sources
Table of Contents

SL No   Topic                Fig No / Table / Graph   SAQ / Activity
1       Introduction         1 to 26                  1
2       Summary              -                        -
3       Terminal Questions   -                        -
4       Answers              -                        -

1. INTRODUCTION
In Unit 12 we discussed connecting to various data sources. Now let’s understand how to
merge data from different data sources into a single data source in order to visualize the
result.

Consider two sheets, patient master data and patient experience data, connected from the
file data and the XML data. To merge these, right click on the data model and select the
merge option, then choose the sheet or data source to be merged, which in this case is the
patient experience data. The steps are shown below.

Fig 1: Step 1 Merge Data

Fig 2: Step 2 Selecting Merge Data

Fig 3: Step 3 Merging Data Join Element

Choose left outer as the join kind and then select the common join element, as shown in
figure 3. In this case, the common column to join on is the case number. After selecting
the common join element, click ‘OK’. The data editor window opens, showing the columns from
both data sheets in a single window, as shown in figure 4.

Fig 4: Step 4 Merged Data Sheet

The columns from the second sheet are combined into a single column called the new column.
Click the expand icon and then select the data fields required for the new merged data
source. Since some data columns repeat, choose only the required data. In this case, the
required fields are ‘wait time for check in’, ‘wait time in waiting room’, ‘wait time for
physician’, ‘wait time at check out’, ‘total wait time’ and ‘patient satisfaction rating’,
as shown in figure 5. Then click OK to display the selected data, as shown in figure 6.
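
To make the left outer join concrete, here is a minimal Python sketch of the same idea:
every row of the first table is kept, and only the selected fields of the matching row in
the second table are added. The sample values and the function name left_outer_join are
hypothetical, and the sketch assumes that each case number appears at most once in the
second table.

```python
def left_outer_join(left, right, key, keep):
    """Keep every left row and add the `keep` fields of the matching right row.

    Left rows without a match receive None for the added fields.
    Assumption: `key` is unique in `right` (the first match wins otherwise).
    """
    lookup = {}
    for row in right:
        lookup.setdefault(row[key], row)

    merged = []
    for row in left:
        match = lookup.get(row[key])
        extra = {field: (match[field] if match else None) for field in keep}
        merged.append({**row, **extra})
    return merged


# Hypothetical sample rows; the real sheets contain more columns.
patient_master = [
    {"case number": 101, "city": "Boston"},
    {"case number": 102, "city": "Chicago"},
    {"case number": 103, "city": "Denver"},
]
patient_experience = [
    {"case number": 101, "total wait time": 42, "patient satisfaction rating": 4},
    {"case number": 103, "total wait time": 65, "patient satisfaction rating": 2},
]

hospital_performance = left_outer_join(
    patient_master,
    patient_experience,
    key="case number",
    keep=["total wait time", "patient satisfaction rating"],
)
for row in hospital_performance:
    print(row)
```

Case 102 has no experience record, so it still appears in the result, with empty values in
the added fields. This is the behaviour that distinguishes a left outer join from an inner
join.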

Fig 5: Data Fields to Display

Fig 6: Required Data Fields

Now we have the required fields from the data sources that we wanted to merge. Next, select
the close and load option at the top left of the window. The merged data source is loaded
into the workbook query window and contains the data from both the master data file and the
experience data file, as shown in figure 7.

Fig 7: Loaded Merge Data Source

Once this is done, right click on the merged file and rename it as needed. Here the merged
data source is named hospital performance data, which will be used to visualize the data,
as shown in figure 8.

Fig 8: Renamed Merged Data Source

Editing and Transforming Data:


Now let’s understand how to modify and transform the data to match any required
specifications.

First, we need to shape the dataset by deleting unnecessary columns, adding specific
calculated columns and filtering the data to our requirements. To do this, right click on
the merged data source that we renamed in figure 8 and click Edit. The query editor window
opens, where we can perform any required data modification or cleansing technique to
prepare the dataset for visualization.

Let’s begin with renaming the existing columns. This can be done either by double clicking
on the column name to change it directly, or by right clicking on the column name and
selecting the rename option.

To delete an existing column, right click on the column name and choose the “remove”
option, which deletes the column from the worksheet.

Note: the query editor keeps track of each transformation performed in the applied steps
window. All the transformations that have been applied make up the new query. None of these
actions change the original source data; Excel records each step that is performed, takes a
snapshot of the result and brings it back to the workbook.

To speed up the development of a complex data transformation process, we can filter or
remove data to reduce the sample dataset. The ‘Remove Rows’ option in the top menu offers
operations such as removing top rows, bottom rows, alternate rows, duplicates and errors.
Removing such records makes the development of the data transformation process faster and
easier.

Appending the Data:

When the data comes from two different data sources but has a similar structure, we can use
the append option, which combines rows from two or more tables with similar data
structures.

Fig 9: Append Option

Select the required data model, then right click and select the append option in the
drop-down menu. This option appends two or more tables that have the same data structure,
for example when some rows appear in one table and the rest of the rows, with the same
column structure, appear in another table. In the append window, choose the primary table,
which contains the first set of rows, and the second table, which contains the second set
of rows, then press OK. This appends the rows from both data sources.
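
Whereas a merge adds columns, an append stacks rows. The following Python sketch
illustrates this with two small tables that share the same columns. The function name
append_tables and the sample rows are hypothetical, and the strict column check is a
simplifying assumption of this sketch rather than a description of how the append window
behaves.

```python
def append_tables(primary, *others):
    """Stack the rows of tables that share the same columns, primary table first."""
    columns = set(primary[0]) if primary else set()
    combined = list(primary)
    for table in others:
        for row in table:
            # Assumption of this sketch: tables must have identical column names.
            if set(row) != columns:
                raise ValueError(f"column mismatch: {sorted(row)}")
            combined.append(row)
    return combined


# Hypothetical sample tables with the same column structure.
first_table = [
    {"case number": 101, "city": "Boston"},
    {"case number": 102, "city": "Chicago"},
]
second_table = [
    {"case number": 201, "city": "Denver"},
    {"case number": 202, "city": "Seattle"},
]

all_patients = append_tables(first_table, second_table)
print(len(all_patients))
```

The result has as many rows as both tables together, and the number of columns is
unchanged.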

Fig 10: Append Window

Data Manipulation with Power BI


Consider a scenario where we sort patients by physician wait time and keep the 20 patients
with the longest wait.

First, sort the column in descending order so that the longest wait times appear at the
top.

Fig 11: Sorting the Selected Column

The patients are sorted with the longest wait times first, as shown in figure 11. Once the
data is sorted, select the keep top rows option and enter the number of rows required. For
the above scenario we need 20 rows.

Fig 12: Keep Top Rows Window

Only the top 20 records with the highest physician wait time are now displayed, as shown in
figure 13.
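
The two steps above (sort in descending order, then keep the first 20 rows) can be sketched
in Python as follows. The generated wait times and the function name keep_top_rows are
hypothetical and serve only to illustrate the technique.

```python
def keep_top_rows(records, column, n):
    """Sort by `column` in descending order and keep the first `n` rows."""
    ordered = sorted(records, key=lambda r: r[column], reverse=True)
    return ordered[:n]


# Hypothetical data: 25 patients with made-up, distinct wait times in minutes.
patients = [
    {"case number": i, "wait time for physician": (i * 7) % 31}
    for i in range(1, 26)
]

top_20 = keep_top_rows(patients, "wait time for physician", 20)
print(len(top_20), top_20[0])
```

Sorting first matters: keeping the first 20 rows of unsorted data would return an arbitrary
set of patients rather than those with the longest wait.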

Fig 13: 20 Sorted Rows

Note that all the steps performed will be added to the applied steps window in the query
settings as shown in figure 14.

Fig 14: Applied Steps Window

Here we can revert to any previous version of the records by deleting the step we need to
undo. For example, to keep a different set of top rows, bottom rows or a specified range of
records, delete the keep first rows step and all the data records will be displayed again.
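
One way to picture the applied steps window is as an ordered list of transformations that
is re-run on the untouched source data. The Python sketch below follows that idea. The
step names mirror those mentioned in the text, while the functions and sample rows are
hypothetical.

```python
source_rows = [
    {"case number": 1, "wait time for physician": 12},
    {"case number": 2, "wait time for physician": 35},
    {"case number": 3, "wait time for physician": 8},
    {"case number": 4, "wait time for physician": 27},
]


def sorted_rows(rows):
    return sorted(rows, key=lambda r: r["wait time for physician"], reverse=True)


def kept_first_rows(rows):
    return rows[:2]


def run_query(source, steps):
    """Apply each recorded step in order to a copy of the source rows."""
    rows = list(source)  # the source list itself is never modified
    for _name, step in steps:
        rows = step(rows)
    return rows


applied_steps = [("Sorted Rows", sorted_rows), ("Kept First Rows", kept_first_rows)]
top_two = run_query(source_rows, applied_steps)

# Deleting a step and re-running the query brings back all the records.
applied_steps = [s for s in applied_steps if s[0] != "Kept First Rows"]
all_rows = run_query(source_rows, applied_steps)
print(len(top_two), len(all_rows))
```

Because the source is never modified, undoing a transformation only requires removing its
step from the list.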

Most datasets contain duplicates, which results in poor data quality. Duplicate values can
be removed from the records in the query editor.

Fig 15: Data Records with Duplicates

Consider the case in figure 15, where there are duplicate records in the case number
column. The data has 37 rows, of which 3 are duplicates. To remove them, choose the case
number column and then select the remove duplicates option under the remove rows feature in
the top menu, as displayed in figure 16.

Fig 16: Delete Duplicate Option

Now the number of rows is reduced to 34, as the 3 duplicate records are deleted. Similarly,
if any other column contains duplicate records that need to be deleted, select the column
and choose the remove duplicates option, as shown in figure 16.
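
The row counts in this example (37 rows, 3 duplicates, 34 rows remaining) can be reproduced
with a short Python sketch. The case numbers are made up, and the function name
remove_duplicates is hypothetical. The sketch keeps the first occurrence of each value in
the selected column.

```python
def remove_duplicates(records, column):
    """Keep only the first row for each distinct value in `column`."""
    seen = set()
    unique = []
    for row in records:
        if row[column] not in seen:
            seen.add(row[column])
            unique.append(row)
    return unique


# Hypothetical data: 34 distinct case numbers plus 3 repeated ones (37 rows).
records = [{"case number": n} for n in range(1, 35)]
records += [{"case number": n} for n in (5, 12, 30)]

deduplicated = remove_duplicates(records, "case number")
print(len(records), len(deduplicated))
```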

Filtering Data
Let’s undo the changes made to the data records to get back the complete data. To do this,
delete the filtered rows step to obtain the original complete data.

Fig 17: Obtaining the Complete Data

Once the complete data is obtained, we can limit the data by choosing the filter option
from the drop-down arrow adjacent to each column header, as shown in figure 18.

Fig 18: Filter Option

Here we can filter out the records we do not need and display a limited set of data. For
example, if the data from Boston and Chicago is not required, we can deselect those values
and display only the required data.

Fig 19: Filter Option

Once this option is applied the data will be filtered and the data from Boston and Chicago
will not be displayed and only the remaining data will be obtained.
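
Deselecting values in the filter list amounts to excluding rows whose column value is in a
given set. A minimal Python sketch of this is shown below. Boston and Chicago come from the
example above, while the other cities and the function name exclude_values are
hypothetical.

```python
def exclude_values(records, column, unwanted):
    """Return only the rows whose `column` value is not in `unwanted`."""
    return [row for row in records if row[column] not in unwanted]


# Hypothetical sample rows.
visits = [
    {"case number": 101, "city": "Boston"},
    {"case number": 102, "city": "Chicago"},
    {"case number": 103, "city": "Denver"},
    {"case number": 104, "city": "Seattle"},
]

remaining = exclude_values(visits, "city", {"Boston", "Chicago"})
print([row["city"] for row in remaining])
```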

We can also perform this by using the text filter function.

Fig 20: Text Filter Function

Figure 20 shows the text filter option, where we can choose advanced options to filter the
data, such as equals, does not equal, begins with, does not begin with, ends with, etc.
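
Each text filter option corresponds to a simple test on the text value. The Python sketch
below maps the option names to such tests. The hospital names and the function name
text_filter are hypothetical, and the comparisons in this sketch are case sensitive.

```python
# Each text filter option maps to a test on (cell value, filter text).
TEXT_FILTERS = {
    "equals": lambda value, text: value == text,
    "does not equal": lambda value, text: value != text,
    "begins with": lambda value, text: value.startswith(text),
    "does not begin with": lambda value, text: not value.startswith(text),
    "ends with": lambda value, text: value.endswith(text),
    "contains": lambda value, text: text in value,
    "does not contain": lambda value, text: text not in value,
}


def text_filter(records, column, condition, text):
    """Keep the rows whose `column` value satisfies the named condition."""
    test = TEXT_FILTERS[condition]
    return [row for row in records if test(row[column], text)]


# Hypothetical sample rows.
hospitals = [
    {"name": "Boston General"},
    {"name": "Chicago Central"},
    {"name": "Bayview Clinic"},
]

starts_with_b = text_filter(hospitals, "name", "begins with", "B")
not_clinics = text_filter(hospitals, "name", "does not contain", "Clinic")
print([row["name"] for row in starts_with_b])
```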

When the data sheet contains date records, we can filter the data using date-specific
options such as next week, quarter, last year, etc.

Inserting Custom Columns:

Suppose we want to insert a custom column, say full name, in a dataset that has the columns
first name and last name. We need to concatenate the two columns to get the values for the
full name column. To do this, select the add custom column option from the top menu, which
opens the add custom column window, as shown in figure 21.

Fig 21: Adding Custom Column

Figure 21 shows the add custom column window. Here we name the new column full name and, in
the custom column formula, add the attributes first name and last name, concatenated with a
space. After adding the attributes, clicking OK adds the new custom column ‘full name’ to
the worksheet, as shown in figure 22.
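
A custom column is a new field computed from the existing fields of each row. The Python
sketch below shows the full name example. The patient names and the function name
add_custom_column are hypothetical.

```python
def add_custom_column(records, name, formula):
    """Return new rows with an extra column computed by `formula(row)`."""
    return [{**row, name: formula(row)} for row in records]


# Hypothetical sample rows.
patients = [
    {"first name": "Asha", "last name": "Rao"},
    {"first name": "John", "last name": "Smith"},
]

with_full_name = add_custom_column(
    patients,
    "full name",
    lambda row: row["first name"] + " " + row["last name"],  # concatenate with a space
)
print(with_full_name[0]["full name"])
```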

Fig 22: Custom Column

Cleansing and Modifying Data:

Before data can be consumed, it needs to be cleansed and modified. The query editor
automatically assigns the data type when the data is loaded from Excel or other databases,
so checking the data types is very important, especially when we plan to slice and dice the
data. When the data is loaded from CSV or text files, the query editor is unable to assign
the data types; in this case the data type needs to be assigned manually. The query editor
occasionally attempts to apply the data type on its own, which adds a new step named
‘changed type’ in the applied steps window.

Fig 23: Changing Data Type Manually

We can change the data type manually, as shown in figure 23. Select the column and right
click on the data type to open a dropdown menu listing all the data type options, such as
date, binary, text, decimal and whole number, from which we can choose the necessary data
type.

Some data types cannot be applied to certain columns. For example, if we apply the date
data type to a text field, the entire column will show errors, as displayed in figure 24.
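
The effect of applying an unsuitable data type can be sketched in Python: each value is
converted individually, and values that cannot be converted are marked as errors. The
sample rows, the date format and the function name change_type_to_date are hypothetical,
and the "Error" marker only imitates what the figure shows.

```python
from datetime import datetime


def change_type_to_date(records, column, fmt="%Y-%m-%d"):
    """Convert `column` to a date; values that cannot be converted become "Error"."""
    converted = []
    for row in records:
        try:
            value = datetime.strptime(row[column], fmt).date()
        except (TypeError, ValueError):
            value = "Error"
        converted.append({**row, column: value})
    return converted


# Hypothetical sample rows.
visits = [
    {"city": "Boston", "visit date": "2023-01-15"},
    {"city": "Chicago", "visit date": "2023-02-01"},
]

dates_ok = change_type_to_date(visits, "visit date")
city_as_date = change_type_to_date(visits, "city")
print([row["city"] for row in city_as_date])
```

Because the conversion returns new rows, discarding the converted result leaves the
original text values intact, much like deleting the changed type step.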

Fig 24: Error Display

To fix the above error, delete the changed type step from the applied steps window. This
restores the column to its original form.

Replacing the Column Values:

Replacing column values is useful when we face data quality issues. Consider data taken
from two systems where system 1 stores female abbreviated as ‘fe’ and system 2 stores it in
full as ‘female’. To maintain data consistency, we need to store the data of both systems
under the same value. To do this, select the gender column and choose the “replace values”
option in the top menu, which opens the replace values window, as shown in figure 25.

Fig 25: Replace Values Window

Here, enter the value to find and the value to replace it with, and then click OK. This
changes all the ‘female’ records in the column to ‘FE’, as displayed in figure 26.
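
Replacing values is a find-and-replace restricted to one column. The Python sketch below
replaces whole cell values that match exactly. The sample rows and the function name
replace_values are hypothetical.

```python
def replace_values(records, column, find, replace_with):
    """Replace cells in `column` that are exactly equal to `find`."""
    return [
        {**row, column: replace_with if row[column] == find else row[column]}
        for row in records
    ]


# Hypothetical sample rows combining the spellings used by two systems.
patients = [
    {"case number": 101, "gender": "FE"},
    {"case number": 102, "gender": "female"},
    {"case number": 103, "gender": "male"},
]

consistent = replace_values(patients, "gender", "female", "FE")
print([row["gender"] for row in consistent])
```

Matching the whole cell value, rather than a substring, avoids changing unrelated values
that merely contain the search text.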

Fig 26: Replaced Values for Gender Column

Changing the data type and replacing values are the fundamental concepts required to
manipulate and cleanse data in Excel.

SELF-ASSESSMENT QUESTIONS – 1

1. Which one is not a function in MS Excel?
a) SUM
b) AVG
c) MAX
d) MIN
2. Concatenation of a text can be done using
a) Apostrophe (‘)
b) Exclamation (!)
c) Hash (#)
d) Ampersand (&)
3. If cells B1 and B2 contain the values 72 and 22, and the shortcut Alt+= is pressed in
cell B3, the result obtained is:
a) 72
b) 94
c) 1584
d) 3.2727
4. Remove Duplicates is a command available in which tab of the ribbon?
a) Home
b) Insert
c) Data
d) Developer
5. Which of the following is not a component of filtering data in Excel?
a) Number Filter
b) Text Filter
c) Date Filter
d) Function Filter
6. Flash Fill can be used to separate first name and last name. Which one of the
following is the wrong way to locate Flash Fill?
a) Home > Fill >Flash Fill
b) Data>Flash Fill
c) Using the short cut key Control + E
d) View>Freeze Panes>Flash Fill

7. Which one of the following is not a data type?
a) Binary
b) Date
c) Currency
d) Integer
8. The _______________________ automatically assigns the data type when the data is
loaded from Excel or other databases.
9. When the data is loaded from __________________ the query editor is unable to
assign the data types; in this case the data type needs to be assigned manually.
10. Options like equals, does not equal, begins with, does not begin with, ends with,
contains and does not contain come under which filter command?
a) Text Filter
b) Number Filter
c) Date Filter
d) Function Filter

2. SUMMARY
In this topic we discussed,
• How to remove unnecessary columns and add specific calculated columns.
• How to remove duplicates and filter the data.
• How to transform and cleanse data so as to visualize it effectively.

3. TERMINAL QUESTIONS
Q1. Explain the need for appending data.
Q2. Why are deleting unnecessary columns, adding specific calculated columns and filtering
data important for editing and modifying data?
Q3. How does data cleansing help in data visualization?
Q4. Explain the sorting and filter functions in MS Excel.
Q5. What do you mean by data consistency? Explain this with an example.

4. ANSWERS
Self-Assessment Questions
1. Option B, AVG is not a function; the function is AVERAGE
2. Option D, Ampersand (&)
3. Option B, 94
4. Option C, Data
5. Option D, Function Filter
6. Option D, View>Freeze Panes>Flash Fill
7. Option D, Integer
8. Query Editor
9. csv or text files
10. Option A, Text Filter

Answers-Terminal Questions
Ans1= A comprehensive data appending service usually includes a data normalization
process, standardization of data, appending missing data, removing redundant data and
setting up of data automation. Here is why your business should consider investing in data
appending services.

• Cleaner data
Apart from verifying and completing your information, data appending can help you correct
typos, update information (zip codes, place names or addresses) and check up on
email/postal address errors. Data appending services can strengthen the validity of your
mailing list.

• Reduced time and effort


Wrong phone numbers, bounced email addresses and returned mail can result in waste for your
business, mainly in terms of time. In business, time is equivalent to money. Hence the more
time you spend dealing with incorrect information, the less time you will have to focus on
other activities of your business. With an updated mailing list that contains only the right
information, your team can put their time to good use, without having to correct or update
errors in the mailing list.

• Access to more information


The biggest benefit of data appending is access to information. With data appending services,
your business can find information like gender, job roles, titles, birthdays, incomes and credit
scores. Your business can also find social media handles for Twitter and LinkedIn. With
access to more information, you will be able to send more effective marketing campaigns to
your customers.

• Better segmentation
With access to more information, your business can customize its services through
segmentation. Instead of only having access to name and age, data appending services can
give you access to other critical information like income, which will help you in your
marketing efforts. For instance, if you have a product for women in their mid-30s who make
X amount of income per month, you will be able to find them through data appending.

• Minimize cost
Data appending services can help your business keep the cost down. With verified lists, you
can save on the cost of research, recruiting staff, error correction and much more.

Ans2= First, we need to shape the dataset by deleting unnecessary columns, adding specific
calculated columns, filtering data to our requirements. To do this, we need to right click on
the merged data source which we renamed and click on edit. The query editor window will
open where we can perform any of the required data modification or cleansing technique to
prepare the data set for visualization.

Let’s begin with renaming the existing columns, this can be done by either double clicking
on the column name to change the name directly or right click on the column name and select
the rename option to change the name of the column accordingly.

To delete or remove any existing columns, we need to right click on the column name and
choose the option “remove” which will delete the column from the worksheet.

Note: the query editor keeps track of each transformation performed in the applied steps
window. All the transformations that have been applied make up the new query. None of these
actions change the original source data; Excel records each step that is performed, takes a
snapshot of the result and brings it back to the workbook.

To speed up the development of any complex data transformation processes we need to filter
or remove data to reduce the sample dataset. We can use the ‘Remove Rows’ option in the
top menu where we can do numerous operations like remove top rows, bottom rows,
alternate rows, remove duplicates and errors which will help in removing certain records
which will make the development of the data transformation process faster and easier.

Ans3= Before data can be consumed, it needs to be cleansed and modified. The query editor
automatically assigns the data type when the data is loaded from Excel or other databases,
so checking the data types is very important, especially when we plan to slice and dice the
data. When the data is loaded from CSV or text files, the query editor is unable to assign
the data types; in this case the data type needs to be assigned manually. The query editor
occasionally attempts to apply the data type on its own, which adds a new step named
‘changed type’ in the applied steps window.

Fig 1: Changing Data Type Manually

We can change the data type manually as shown in the figure 1. Selecting the data and then
doing right click on the data type will provide a dropdown menu from which we can choose
the necessary data type. A pop up appears with all the data types options, be it date, binary,
text, decimal, whole number etc.

Some data types cannot be applied to certain columns for example if we apply date data type
to a text field then the entire column will show error data as displayed in figure 2

Fig 2: Error Display

To fix the above error, delete the changed type step from the applied steps window. This
restores the column to its original form.

Ans4= There are many built-in Excel tools to help with data management and the sorting and
filtering features are among the best. The filter tool gives you the ability to filter a column of
data within a table to isolate the key components you need. The sorting tool allows you to
sort by date, number, alphabetic order and more. In the following example, we will explore
the usage of sorting and filtering and show some advanced sorting techniques.

Let’s say you had the spreadsheet above and wanted to sort by price. This process is fairly
simple. You can either highlight the whole column or even click on the first cell in the column
to get started. Then you will:
• Right click to open the menu
• Go down to the Sort option – when hovering over Sort the sub-menu will appear
• Click on Largest to Smallest
• Select Expand the selection
• Click OK

The whole table has now adjusted for the sorted column. Note: when the data in one column
is related to the data in the remaining columns of the table, you want to select Expand the
selection. This will ensure the data in that row carries over with sorted column data.

In addition to sorting, you may find that adding a filter allows you to better analyse your data.
When data is filtered, only rows that meet the filter criteria will display and other rows will
be hidden. With filtered data, you can then copy, format, print, etc., your data, without having
to sort or move it first. To use a filter,

• Go to the Home ribbon, click the arrow below the Sort & Filter icon in the Editing
group and choose Filter.

OR

• Go to the Data ribbon, and then click Filter in the Sort & Filter group.

You will notice that all of your column headings now have an arrow next to the heading name.
Click on the arrow next to the heading with which you want to filter, and you will see a list
of all the unique values in that column. Check the box next to the criteria you wish to match
and click OK. Click on the arrow next to another heading to further filter the data.

To clear the filter, choose one of these options:


• Click on the Filter icon next to the heading and choose Clear Filter from “Name of
Heading”.
• Go to the Data ribbon and click the Clear icon in the Sort & Filter group.
• Go to the Home ribbon, click the arrow below the Sort & Filter icon in the Editing group
and choose Clear.

Ans5= Data that is consistent refers to data that is formatted in a consistent way. This is great
for people working with data because it means all the data can be handled in the same way.

Additionally, data consistency can also refer to data that is constant over time or some other
relationship. For example, if you have a weather dataset, it would be considered consistent
if there were no missing days/hours (depending on your metric).

In summary, data consistency is a principle that people in data science typically try to
follow and keep in mind, because it ultimately makes the process of using data much easier.

There are many instances when data needs to be transferred or processed according to a
business use case. After processing, it is possible that different tables hold different
values for the same record. This calls for a redesign with improved data consistency.

For example, consider a table with 3 columns: employee name, employee number and phone
number, with Employee Name - XYZ, Employee Number - 123 and Phone Number - 0000.785.563.
You want to use this data in another table, or perhaps this time your source of data is
different. If the new table has the same employee details, Employee Name - XYZ, Employee
Number - 123 and Phone Number - 0000.785.563, the data is consistent. If you get a
different phone number for the same employee, the data is inconsistent.

Hence, data must be validated according to certain rules using constraints, triggers,
transactions, etc. to improve data consistency.

Unit 14
Visualization with Power BI Desktop
Table of Contents

SL No   Topic                                     Fig No / Table / Graph   SAQ / Activity
1       Introduction                              -                        -
1.1     Learning Objectives                       -                        -
2       Power BI Desktop – Features               1 to 2                   -
3       Report Canvas                             3 to 4                   -
4       Exploring the Report Canvas               5 to 10                  -
5       Basic Charts and Conditional Formatting   11 to 22                 -
6       Report Filtering Options                  23 to 39                 -
7       Report Formatting Options                 40 to 44                 1
8       Terminal Questions                        -                        -
9       Self-Assessment Questions – Answers       -                        -

1. INTRODUCTION

Power BI Desktop is a Windows application that enables users to create advanced data
visualizations and business intelligence reports using data from a variety of sources. It is a
free, standalone desktop application that is part of the Power BI suite of business intelligence
tools developed by Microsoft. Power BI Desktop allows users to create custom data
visualizations, reports, and dashboards with an intuitive drag-and-drop interface. It includes
a range of built-in data connectors for a variety of data sources.

1.1 Learning Objectives

At the end of this unit, you will be able to

❖ Describe the various options available on the Power BI Desktop user interface.
❖ Explain the process to integrate data sources into Power BI Desktop.
❖ Elaborate on the process to create visualizations of the data.
❖ Explain the steps to filter and format the various graphs created.


2. POWER BI DESKTOP – FEATURES

Power BI Desktop is free software that can be downloaded and installed from the Microsoft
website. Once the app is installed and launched, a UI similar to Figure 1 appears.

Figure 1: Power BI Desktop launch page

The start-up page, or launch page, contains various sections. The ‘Get Data’ or ‘Recent
Sources’ options can be used to connect to a data source or import data from a file. The
files or reports that were previously worked on are listed as ‘quick link’ options. The
start-up page also displays links to tutorial videos, for example ‘Getting started with
Power BI Desktop’, ‘Building Reports’ and so on.

Power BI updates are released every month. The details of the latest update are available
as a link on this screen under “What’s new”. Besides this, the start-up page has links to
Forums, Blogs and Tutorials.

These options are shown every time the app is launched. To hide them at start-up, the
option “Show the screen on the start-up” can be disabled.

Once this launch page is closed, the main interface appears as shown in Figure 2.


Figure 2: Main User Interface

The main menu on the UI contains the following options:

1. File
2. Home
3. Insert
4. Modelling
5. View
6. Help


3. REPORT CANVAS

Figure 3: Report Canvas

The report canvas in Power BI is the main work area where users can create and design
reports and visualizations using data from various sources. It is the area where users can
drag and drop fields from their data sources, create charts and tables, and arrange
visualizations to create a cohesive report. Users can add new pages to their reports, and each
page has its own report canvas. The canvas can be customized to fit the user's needs,
including adding background colours or images and changing the page size.

The canvas in Power BI Desktop includes several layout tools that allow users to position
and resize their visualizations. It also includes a formatting pane that allows users to adjust
the formatting and appearance of their visualizations, including font size, colours, and other
formatting options.

Pages can be added to the report using the ‘+’ option, and deleted when no longer needed.
Alongside the report canvas there is an option to add filters, which can be applied to a
single page or to all pages. The page also contains standard, pre-defined visualisations
that can be added to the main canvas.


Once a connection is established with a dataset, the columns of the dataset are available
under Fields. These columns can be dragged and dropped into the respective attributes to
populate the chart on the canvas.

The UI also has options to display the data model view and the relationship view. The data
model view is used to preview the data. The relationship view displays the relationships
between tables and the columns they have in common.

Figure 4: Relationship view


4. EXPLORING THE REPORT CANVAS

The report view in Power BI is where you design and build the visualizations and reports
that you want to present to your audience. It offers a range of data visualization choices that
may be used to present your data in a meaningful way, and it enables you to construct custom
dashboards and reports using a drag-and-drop interface.

The workspace where you can design and generate your report in Power BI is called the
report canvas page. It is the primary location where you can format the report layout, add
and arrange visualizations, and apply filters. The canvas effectively serves as a blank page
where you can start from the beginning when creating your report and include any
visualizations you require, including maps, charts, tables, matrices, and images.

To start with a clean report canvas, click the three dots towards the right-hand corner of
the page to display the options, then click Remove to remove the content/object on the page
(if any), as shown in Figure 5.

Figure 5: Option to remove the object on the page

Figure 6 shows the basic report canvas that we get at the start when we don’t have any object
placed on the report canvas (it’s just an empty canvas).

Figure 6: Empty/New Report Canvas

This page also shows the report canvas options and the basic menus (File, Home, Insert, etc.).

To change the background of the report canvas, click the View menu and choose a background
or theme for the report canvas, as shown in Figure 7.


[Figure 7 callouts: Background Colour; Different Themes; Gridlines and Snap to grid
checkboxes; Page Option]

Figure 7: To change the background or theme of the report canvas

The report canvas settings also provide two checkboxes:

• Gridlines – Checking or unchecking the "Gridlines" checkbox shows or hides the gridlines
on the report canvas.
• Snap to Grid – Checking or unchecking the "Snap to grid" checkbox enables or disables the
snap-to-grid capability. When you move or resize visualizations with snap-to-grid
enabled, they automatically align to the closest grid line.
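
The snap-to-grid behaviour described above amounts to rounding a position to the nearest grid
line. The sketch below illustrates the idea only; the function names and the grid spacing of 10
units are assumptions made for the example and do not reflect Power BI's internal values.

```python
def snap_to_grid(value, grid_size):
    """Return the multiple of grid_size that lies closest to value."""
    return round(value / grid_size) * grid_size


def snap_position(x, y, grid_size=10):
    """Snap an (x, y) position to the closest grid intersection (grid_size is assumed)."""
    return snap_to_grid(x, grid_size), snap_to_grid(y, grid_size)


# A visual dropped at (23, 48) aligns to the nearest grid lines.
print(snap_position(23, 48))  # (20, 50)
```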


Some of the key features and components of the report view are:

• Pages: A report in Power BI can have one or more pages. Use the "New Page" button to add
pages, and the page navigation bar at the bottom of the screen to switch between pages,
as shown in Figure 7.
• Fields: This section lists the data sets or data tables. In the Fields pane you can view
and manage each data field in your dataset. You can add fields to existing visualizations,
or create new ones, by dragging fields from the Fields pane onto the canvas, as shown in
Figure 8.
• Visualizations: These are the graphs, tables, maps, and other types of data
visualizations that Power BI lets you make. To add a visualization to your report, select
it in the Visualizations pane on the right side of the screen and drag it onto the canvas,
as shown in Figure 8.

The Visualizations pane contains a Field View pane and a Field Formatting pane.

✓ Field View Pane: The entries in the fields list change based on the visualization that
is selected.
✓ Field Formatting Pane: Clicking the formatting option shows page-related settings such
as Page Size and Page Background, as shown in Figure 8. Click the dropdown arrow to
expand a setting and see its options, as shown in Figure 9.


Figure 8: Showcasing the Fields, Visualization and Filters pane

Figure 9: Options of the field formatting pane

• Filters: Filters let you customize which data is shown in your visualizations. When
creating a report, choose the visualization you want to filter, then click the
"Filters" button in the Visualizations pane, as shown in Figure 10.

To the left of the Visualizations pane is the Filters pane, which is used to filter your
visualizations based on certain conditions. It offers three options:


✓ Filters on this page: Applies filters to the current page only.
✓ Filters on all pages: Applies filters to all the pages of the report.
✓ Filters on this visual: Enabled once a visualization is added to the page; the filter
applies only to that particular visualization.

Figure 10: Options for Filters


5. BASIC CHARTS AND CONDITIONAL FORMATTING

For the demonstrations in this unit, data from the “Adventure Works” dataset is used.
AdventureWorks is a free sample database of retail sales data.

Once the data has been imported, the columns (both inherent and derived) are available on
the right of the Power BI Desktop view, as shown in Figure 11. The different charts
available are also displayed here.

Figure 11 : Visualisation options and Fields on Power BI Desktop

Once a column is selected, the most appropriate chart corresponding to it is displayed on the
view. For example, if the column ‘Total Orders’ is selected, a bar graph is selected and
displayed on the screen. The attributes of the bar plot are available on the screen as shown
in Figure 12.


Figure 12: Bar plot with attributes

The type of graph can be changed by selecting a more appropriate one in the visualisation
section. For instance, if the ‘stacked bar chart’ is selected, the graph is converted into a
horizontal chart, and the attributes displayed are those of a ‘horizontal bar chart’.

We can create a new chart by selecting the ‘New Visual’ option as shown in Figure 13. This
will create a default column chart.

Figure 13: New Visual


The third method of creating a chart is by double-clicking on the canvas. This will create a
default ‘Q&A’ chart.

Let us go back to selecting ‘Total Orders’, as shown in Figure 14. The text corresponding
to the ‘Total Count’ can be formatted using the options shown in Figure 14. For example, the
number can be displayed with comma separators by selecting the appropriate option.
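
To make the effect of this formatting concrete, the short Python sketch below shows what comma
and percentage display formats do to a raw number. The values are made up for illustration and
are not taken from the AdventureWorks data; Power BI applies the equivalent formatting through
its Format options rather than through code.

```python
total_orders = 25164   # illustrative value, not the actual dataset total
share = 0.3456         # illustrative fraction

with_commas = f"{total_orders:,}"   # thousands separator
as_percent = f"{share:.1%}"         # percentage with one decimal place

print(with_commas)  # 25,164
print(as_percent)   # 34.6%
```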

Figure 14: Formatting the text

To display the total orders subdivided by subcategory name, select the subcategory name
from the ‘Fields’ list and drag and drop it into the ‘Axis’ attribute. This displays the
chart shown in Figure 15.
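
Conceptually, placing a column on the ‘Axis’ groups the rows by that column and aggregates the
value for each group. A minimal Python sketch of that idea follows; the order rows and the
function name total_by_axis are hypothetical, and Power BI performs this aggregation internally.

```python
from collections import defaultdict

# Hypothetical order rows: (subcategory name, order quantity).
orders = [
    ("Helmets", 2),
    ("Road Bikes", 1),
    ("Helmets", 3),
    ("Jerseys", 4),
    ("Road Bikes", 2),
]


def total_by_axis(rows):
    """Sum the order quantity for each subcategory, one bar per subcategory."""
    totals = defaultdict(int)
    for subcategory, quantity in rows:
        totals[subcategory] += quantity
    return dict(totals)


print(total_by_axis(orders))
```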


Figure 15: Bar plot per sub-category

Note that there is another attribute named ‘Legend’. If the column ‘Subcategory name’ is
moved to an attribute named ‘Legend’, the chart as shown in Figure 16 appears. This is not
very easy to interpret.

Figure 16: Bar plot with Subcategory as a ‘Legend’.


The chart can be enlarged by clicking the ‘Focus Mode’ option. The chart type can be
changed by clicking the ‘Pie chart’ option, as shown in Figure 17. Note that the attributes
displayed on the screen change from those of the bar chart to those of the pie chart.

Figure 17: Pie chart

Tooltips are pieces of information that appear when the mouse pointer hovers over a visual.
Additional content can be added to a tooltip by adding columns to the respective
attribute, as shown in Figure 18.


Figure 18: Adding Columns to Tooltips

Additional Formatting can be done by using the ‘Format’ option as shown in Figure 19 below.

Figure 19: Format Options


There are general options. Also, there are options specific to the X-axis and Y-axis. For
example, to change the colour of the content, the ‘Color’ option of the y-axis can be changed
as shown in Figure 20.

Figure 20: Y-axis specific formatting options

Similarly, the alignment, font, size of the bars, inner padding between the bars and
other formatting options can be changed by setting the appropriate attributes in this section.
This can be done separately for the X-axis and the Y-axis. There is an option to “Return to
Default” settings after the changes have been made.

There is an option called the ‘zoom slider’. As shown in Figure 21, once the zoom slider is
turned on, a slider appears that controls which content appears on the screen. Figure 21
displays only the bars greater than 5K. The slider can also be used to display a range of
values.


Figure 21: Zoom slider

There are options available to change the colour of the bar chart, and the colour of the text
on the bar chart. There are options available to change the Title, the format of the title and
so on.

Figure 22: Title Formatting

Similarly, there are formatting options to modify and enhance the background, border and
so on.


6. REPORT FILTERING OPTIONS

Power BI has several filtering options that can be used to refine and analyze data. This
section covers the filtering options available on the Power BI report canvas.

To explain the filtering options, let us add some visuals to the report canvas, as shown in
Figure 23.

Here the user has added two different graphs from the visualization pane by selecting the
required fields.

Figure 23: Report Canvas Page with two Visuals

At this point, the Filters pane shows two filtering options (i.e., Filters on this page and
Filters on all pages).

If the user now selects a particular visual, the Filters pane also shows the filter options
that can be applied to the selected visual, as shown in Figure 24.


Figure 24: User has selected the visual 1

Now, let us create a new page (page 2), copy visual 2 from page 1, paste it on page 2, and
resize the graph if required.

Go back to page 1 and select the bar graph. The ‘Filters on this visual’ section of the
Filters pane now lists only Category Name and Total Orders, because these are the only two
attributes present in that graph, as shown in Figure 25.


Figure 25: Selected Visual 2 and Filters options have been enabled for the visual.

For Category Name, clicking the drop-down arrow displays the filter type option. There are
two filter types: ‘Basic Filtering’ and ‘Advanced Filtering’.

Basic filtering – The user selects the categories to be shown in the graph.

For example, in Figure 26 the user selects the Bikes and Clothing options, and the graph
displays only those categories.
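
Basic filtering is essentially a membership test: a category is kept only if it is among the
ticked values. The sketch below illustrates this in Python. The order totals are illustrative
figures chosen for the example, not the actual AdventureWorks values, and basic_filter is a
hypothetical helper name.

```python
# Illustrative totals per category (not the real dataset values).
category_orders = {"Accessories": 17000, "Bikes": 14000, "Clothing": 7000}


def basic_filter(totals, selected):
    """Keep only the categories that were ticked in the basic filter list."""
    return {name: value for name, value in totals.items() if name in selected}


print(basic_filter(category_orders, {"Bikes", "Clothing"}))
```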


Figure 26: Enabled Basic Filtering Options

Advanced filtering – This option filters the data based on whether a specific column
contains a certain string of text; the string is entered in the ‘contains’ field, as shown
in Figure 27.

Figure 27: Active page of Advanced Filtering Options

For example, in Figure 28 the user has entered Bikes in the ‘contains’ field, so the graph
shows only the Bikes data.


Figure 28: Resultant Graph after applying the Advanced Filtering for ‘Bikes’

Apart from ‘contains’, advanced filtering offers other options such as ‘does not contain’,
‘is’, ‘is not’ and ‘starts with’, as shown in Figure 29.
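
The text conditions named above can be pictured as simple string predicates. The Python sketch
below is an illustration only: the dictionary TEXT_OPERATORS and the function advanced_filter
are hypothetical names, and treating the comparison as case-insensitive is an assumption made
for the example.

```python
# Each advanced-filter condition expressed as a predicate on (text, term).
# Case-insensitive matching is an assumption made for this sketch.
TEXT_OPERATORS = {
    "contains": lambda text, term: term.lower() in text.lower(),
    "does not contain": lambda text, term: term.lower() not in text.lower(),
    "is": lambda text, term: text.lower() == term.lower(),
    "is not": lambda text, term: text.lower() != term.lower(),
    "starts with": lambda text, term: text.lower().startswith(term.lower()),
}


def advanced_filter(names, condition, term):
    """Return the names that satisfy the chosen condition for the given term."""
    matches = TEXT_OPERATORS[condition]
    return [name for name in names if matches(name, term)]


categories = ["Accessories", "Bikes", "Clothing"]
print(advanced_filter(categories, "contains", "Bikes"))
print(advanced_filter(categories, "is not", "Bikes"))
print(advanced_filter(categories, "starts with", "c"))
```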

Figure 29: Available options on show items in Advanced Filtering


Apart from this, the Filter Type field offers “Top N” and “Bottom N” options. These let
users filter their data to show only the top or bottom N items, where N is a specified
number.

Let us work on Top N filtering. Figure 30 shows the Top N filter type, where the number of
items (i.e., the value of N) is set in the “Show items” field; the filter is then applied
and the visual displays the Top N items.

Figure 30: Top N Filtering

In Figure 31(a) the number of items is set to 2 (i.e., N = 2). The data field to rank by is
then set in the “By value” field (here Order Quantity, selected from the Fields pane) and
the filter is applied, as shown in Figure 31(b).


Figure 31(a): N value set to 2 Figure 31(b): Values set to ‘Total Orders’

After the filter is applied, the graph in Figure 32 shows the top 2 product categories
based on order quantity (i.e., Bikes and Accessories).

Figure 32: Resultant Graph with Top 2 Products based on order quantity.
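
Top N and Bottom N filtering rank the items by a chosen value and keep the first N. A minimal
Python sketch of that logic is given below. The order quantities are illustrative figures (not
the actual dataset values), chosen so that the top two categories match the result described
above; top_n and bottom_n are hypothetical helper names.

```python
import heapq

# Illustrative order quantities per category (not the real dataset values).
category_orders = {"Accessories": 17000, "Bikes": 14000, "Clothing": 7000}


def top_n(totals, n):
    """Return the n (name, value) pairs with the largest values, largest first."""
    return heapq.nlargest(n, totals.items(), key=lambda item: item[1])


def bottom_n(totals, n):
    """Return the n (name, value) pairs with the smallest values, smallest first."""
    return heapq.nsmallest(n, totals.items(), key=lambda item: item[1])


print(top_n(category_orders, 2))
print(bottom_n(category_orders, 1))
```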


Now, let us look at another filter under ‘Filters on this visual’, called “Total Orders”,
which contains a subfield ‘Show items when the value’.

Clicking the ‘Show items when the value’ field displays options such as ‘is less than’, ‘is
less than or equal to’, ‘is’, ‘is not’ and ‘is greater than’, as shown in Figure 33. The
user can select any option based on the requirement.

Figure 33: Options of ‘Show items when the value’

Next to ‘Show items when the value’ there is an empty field for entering the value.

For example, if ‘Show items when the value’ is set to ‘is less than’, the empty field is
set to 5000, and the filter is applied, the graph in Figure 34 is displayed. The graph is
empty, indicating that no category has fewer than 5000 orders.


Figure 34: Empty graph indicating there is no such value where the order is less than 5000.

If the user enters 15000 (instead of 5000), the graph in Figure 35 is displayed,
indicating that Bikes and Clothing each have fewer than 15000 orders.

Figure 35: Graphs indicating that bikes and clothing are ordered less than 15000.
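
The ‘Show items when the value’ conditions correspond to ordinary numeric comparisons. The
Python sketch below reproduces the two outcomes described above using illustrative order totals
(assumed for the example, not the actual dataset values); COMPARISONS and value_filter are
hypothetical names.

```python
import operator

# Illustrative totals per category (not the real dataset values).
category_orders = {"Accessories": 17000, "Bikes": 14000, "Clothing": 7000}

# Each condition mapped to the matching comparison function.
COMPARISONS = {
    "is less than": operator.lt,
    "is less than or equal to": operator.le,
    "is": operator.eq,
    "is not": operator.ne,
    "is greater than": operator.gt,
}


def value_filter(totals, condition, threshold):
    """Keep the categories whose total satisfies the chosen condition."""
    compare = COMPARISONS[condition]
    return {name: value for name, value in totals.items() if compare(value, threshold)}


print(value_filter(category_orders, "is less than", 5000))   # nothing qualifies
print(value_filter(category_orders, "is less than", 15000))  # Bikes and Clothing
```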


Applying Filters to the Entire (this) Page:

Suppose the user has added Product Category to the data field from the Fields section. The
same filter types discussed before are available, i.e., basic filtering and advanced
filtering.

If we select an option, say Bikes, and click Apply, the change is applied to all the
visuals (graphs) present on that page, as shown in Figure 36.

Figure 36 (a): Before applying the filter of ‘Filters on this Page’


Figure 36 (b): After applying the filter of ‘Filters on this Page’.

Similarly, we can apply advanced filtering by selecting an option from ‘Show items when
the value’ and entering a value in the next field.

For example, to visualize either Bikes or Clothing data, combine two ‘contains’ conditions
with the ‘or’ operation:

Filter Type: Advanced Filtering

Show items when the value: contains

Enter Bikes in the empty field after the condition and check the ‘or’ option.

Then select ‘contains’ again, enter the second category (Clothing), and apply the filter.

The result is shown in Figure 37, where the filter has been applied to both visuals.


Figure 37: Resultant visuals after applying advanced filtering

Now look at page 2, where no changes have been applied to the visuals (as the filter was
applied to the current page only and not to all pages). To filter page 2 as well, clear the
filters (by clicking the eraser icon near the category name field) and use Filters on all
pages.

Applying Filters on all pages:

Select Filters on all pages and add ‘category name’ from the ‘Fields’ list. Here again,
basic filtering and advanced filtering are available.

Let us work with basic filtering and select Clothing and Bikes from the options. The
visuals on all pages (here, Page 1 and Page 2) are then updated to show only Clothing and
Bikes, as shown in Figure 38(a) and (b).
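
The difference between the three filter scopes can be summarised with a small conceptual model:
a visual shows only the categories that pass every filter in scope for it. The Python sketch
below is an illustration of that idea, not a description of how Power BI is implemented;
visible_categories and its parameters are hypothetical, and None stands for "no filter set".

```python
ALL_CATEGORIES = {"Accessories", "Bikes", "Clothing"}


def visible_categories(all_pages_filter=None, page_filter=None, visual_filter=None):
    """Intersect the selections made at each filter level; None means no filter."""
    visible = set(ALL_CATEGORIES)
    for selection in (all_pages_filter, page_filter, visual_filter):
        if selection is not None:
            visible &= set(selection)
    return sorted(visible)


# A filter set under 'Filters on this page' for Page 1 does not reach Page 2.
page_1 = visible_categories(page_filter={"Bikes"})
page_2 = visible_categories()

# A filter set under 'Filters on all pages' reaches every page.
every_page = visible_categories(all_pages_filter={"Bikes", "Clothing"})

print(page_1, page_2, every_page)
```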


Figure 38(a): Applied filter to page 1 Figure 38(b): Applied filter to page 2.

Next, we will see how to highlight a particular attribute.

Clear all the filters and work on Page 1. If the user selects Bikes by clicking its bar in
the bar graph, as shown in Figure 39, the Bikes bar is highlighted and the corresponding
subcategories in the second graph are highlighted too, which shows that selecting a data
point also acts as a filtering option.

Figure 39: Highlighting the Particular category and sub category in the graph.
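
Highlighting can be thought of as marking, rather than removing, the related items in the other
visual. The Python sketch below illustrates this under stated assumptions: the subcategory rows
and the function highlight are hypothetical, and the flag simply records which items would
appear emphasised while the rest remain visible.

```python
# Hypothetical rows of the second visual: (category, subcategory).
subcategories = [
    ("Bikes", "Road Bikes"),
    ("Bikes", "Mountain Bikes"),
    ("Clothing", "Jerseys"),
    ("Accessories", "Helmets"),
]


def highlight(rows, selected_category):
    """Return (subcategory, is_highlighted) for every row; nothing is removed."""
    return [(sub, category == selected_category) for category, sub in rows]


print(highlight(subcategories, "Bikes"))
```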


7. REPORT FORMATTING OPTIONS

Power BI is a well-known application for data visualization and reporting that gives you a
variety of formatting options to help you produce polished and interesting reports.

Now, let’s understand the report formatting options by considering a sample graph as shown
in figure 40.

Figure 40: Sample Report Canvas with a visual.

To change the view of the displayed visual, perform the steps below:

1. Select the visual (bar graph) by clicking on it, so that it gets highlighted. As soon as
the visual is highlighted, the field pane is updated to show all the fields present in
the visual, as shown in Figure 41.


Figure 41: Highlighted Fields in ‘Field Pane’

If the user clicks on the formatting options, they are also updated to show the options
relevant to bar graphs, such as the X-axis and Y-axis, as shown in Figure 42.

Figure 42: Updated formatting options in ‘Formatting Pane’


Suppose the user wants to change the type of the visual, for example to view the bar graph
(shown in Figure 42) as a pie chart. The user clicks the pie-chart symbol in the
visualization pane to update the graph. As soon as the graph is updated, the formatting
options are also updated, with options such as Legend, Data colors and Title. The updated
pie chart and formatting options can be seen in Figure 43.

Figure 43: Bar graph converted to the pie chart with updated formatting options.

Basically, the field pane and the formatting options are specific to the chart type the user
selects to visualize the data.

For example, if the user wants to view data as a line graph then according to that the field
pane/section and formatting options will be updated as shown in figure 44.


Figure 44: Visualizing Data as a line graph with updated formatting options view


Self-Assessment Questions -1

1. Name some of the information displayed in the startup or the launch page
2. The ___________in Power BI is the main work area where users can create and
design reports and visualizations using data from various sources.
3. The default chart selected for an integer column with continuous values is a
____________________.
4. To display a horizontal bar chart per subcategory, set the subcategory
column to the __________ attribute.
5. To control the range of values that appear on the screen the
______________option can be enabled.
6. The ____________ in Power BI is where you design and build the visualizations
and reports that you want to present to your audience.
7. Visualizations can be moved or resized with _____________ enabled, and the
visualizations will automatically align to the closest grid line.
8. The actual data sets or data tables will appear in ________ section
9. User can customize which data to be shown in his visualizations by using
________
10. Numbers appearing in a Tooltip can be formatted to include symbols like ‘,’, ‘%’
etc. - True or False


8. TERMINAL QUESTIONS

1. What are the different types of visualizations available in Power BI and how can a user
format them?
2. Explain the procedure to apply conditional formatting to the report visuals in Power
BI.
3. Elucidate the procedure to apply a filter to a specific visual or group of visuals in Power
BI.
4. How can a user use a filter to highlight specific data in a Power BI report?
5. Briefly explain the key features and components of the report view.

9. SELF-ASSESSMENT QUESTIONS – ANSWERS


1. Recent Sources connected, Getting Started Videos, Updates, Blogs, Forums, Tutorials
etc
2. report canvas
3. vertical bar chart.
4. "Axis"
5. "zoom slider"
6. Report View
7. Snap-To-Grid
8. Field
9. Filters
10. True

Terminal Questions – Answers

1. Refer section ‘Basic Charts and Conditional Formatting’
2. Refer section ‘Basic Charts and Conditional Formatting’
3. Refer section ‘Report Filtering Options’
4. Refer section ‘Report Filtering Options’
5. Refer section ‘Exploring the Report Canvas’
