DPV LabManual

The document outlines a comprehensive guide for using R Programming and Tableau, detailing objectives and manual steps for data import, cleaning, manipulation, visualization, and reporting. It includes exercises for practical application, covering topics such as data joins, advanced visualizations, and dashboard creation. Each section provides specific tasks to enhance proficiency in data analysis and visualization using these tools.

Uploaded by

ENLIGHTNING PATH

📊 List-1: Using R Programming

1. Importing and Cleaning Data


 Objective: Import data from CSV, Excel, and database sources using readr, readxl,
and DBI.
 Manual:
o Load libraries (readr, readxl, DBI).
o Read CSV/Excel files.
o Connect to a database and query data.
o Handle missing values and inconsistencies using dplyr and tidyr.
2. Selecting and Filtering Variables
 Objective: Select specific columns and filter rows based on conditions.
 Manual:
o Use select() to pick columns.
o Use filter() for row filtering with logical conditions.
3. Data Summarization and Aggregation
 Objective: Summarize and aggregate data using group_by() and summarize().
 Manual:
o Group data by a factor variable.
o Calculate summary statistics (mean, median, count).
4. Data Reshaping
 Objective: Transform data between wide and long formats.
 Manual:
o Use pivot_longer() and pivot_wider() functions.
5. Data Transformation and Recoding
 Objective: Create new variables, recode existing ones, and perform calculations.
 Manual:
o Use mutate() to create new columns.
o Use case_when() for recoding.
6. Data Exploration and Profiling
 Objective: Discover patterns and summarize data quality.
 Manual:
o Use summary(), skimr::skim() and DataExplorer package.
o Plot missing values and distributions.
7. Visualization Basics
 Objective: Create basic plots with ggplot2.
 Manual:
o Create bar charts, pie charts, histograms, scatter plots, and boxplots.
o Use facet_wrap() and facet_grid() for subplots.
8. Advanced Graphing with ggplot2
 Objective: Create violin plots, ridgeline plots, beeswarm plots, and density plots.
 Manual:
o Customize aesthetics (colors, shapes, sizes).
o Use themes and labels.
9. Data Blending with Joins
 Objective: Merge data from different sources using joins.
 Manual:
o Use left_join(), right_join(), inner_join(), and full_join().
10. Reporting
 Objective: Export cleaned and visualized data.
 Manual:
o Use write.csv(), writexl::write_xlsx().
o Save plots using ggsave().
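
As an illustration of item 4 (Data Reshaping) in the list above, here is a minimal sketch of a wide-to-long round trip with tidyr. The table, its column names, and the values are made up for the example and are not part of the lab datasets.

```r
library(tidyr)

# Hypothetical wide table: one column per quarter.
wide_demo <- data.frame(
  Product = c("Jeans", "T-Shirt"),
  Q1 = c(10, 20),
  Q2 = c(15, 25)
)

# Wide to long: the quarter columns become (Quarter, Units) pairs.
long_demo <- pivot_longer(
  wide_demo,
  cols      = c(Q1, Q2),
  names_to  = "Quarter",
  values_to = "Units"
)
print(long_demo)

# Long back to wide: one column per distinct Quarter value.
wide_again <- pivot_wider(long_demo, names_from = Quarter, values_from = Units)
print(wide_again)
```

The long form is usually the one ggplot2 and group_by() expect, while the wide form tends to be easier to read in a report.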
📈 List-2: Using Tableau
1. Connecting Data Sources
 Objective: Connect Tableau to Excel, CSV, and databases.
 Manual:
o Import retail garment shop data from Excel.
o Import student data set; handle NULL values.
2. Data Cleaning
 Objective: Perform data cleaning and transformation in Tableau.
 Manual:
o Use metadata grid.
o Rename fields and handle missing values.
o Create calculated fields for recoding.
3. Data Joins and Blending
 Objective: Combine data from multiple sources.
 Manual:
o Practice joins (inner, left, right, full) using payroll data set.
o Perform data blending in medical data set.
4. Grouping and Sets
 Objective: Create groups and sets in Tableau.
 Manual:
o Highlight desired items and group them in library data set.
o Create dynamic sets in inventory data set.
5. Basic Charts
 Objective: Create pie charts, line graphs, scatter plots.
 Manual:
o Drag and drop dimensions and measures.
o Customize colors, labels, and tooltips.
6. Combination and Dual-Axis Charts
 Objective: Create complex charts.
 Manual:
o Build combination charts.
o Build dual-axis charts for revenue vs. sales analysis.
7. Advanced Visualizations
 Objective: Create heat maps, treemaps, box plots, KPI charts, and waterfall charts.
 Manual:
o Use Show Me panel for quick generation.
o Customize marks and color palettes.
8. Bump Chart and Level of Detail (LOD) Expressions
 Objective: Visualize rankings and use LOD calculations.
 Manual:
o Create bump charts for rank trends.
o Use LOD to calculate metrics like average sales per customer.
9. Dashboards and Stories
 Objective: Combine multiple visualizations into dashboards.
 Manual:
o Add filters and parameters.
o Design interactive dashboards.
o Create stories to present insights.
10. Exporting and Sharing
 Objective: Export dashboards and share visualizations.
 Manual:
o Publish to Tableau Server/Public.
o Export to PDF or PowerPoint.
📊 List-1: 25 R Programming Lab Exercises
🔹 Data Import & Cleaning
1. Import CSV file and inspect dataset
o Use readr::read_csv() and head(), str().
2. Import Excel file and inspect dataset
o Use readxl::read_excel() and summary().
3. Import data from a database (SQLite)
o Use DBI and dplyr to connect and query.
4. Handle missing data (NA values)
o Use is.na(), na.omit(), and tidyr::replace_na().
5. Detect and remove duplicates
o Use duplicated() and distinct().
🔹 Data Manipulation & Transformation
6. Select specific columns from data
o Use dplyr::select().
7. Filter rows based on conditions
o Use dplyr::filter().
8. Create new variables with calculations
o Use mutate() to compute new fields.
9. Recode categorical variables
o Use dplyr::case_when() or forcats::fct_recode().
10. Reshape data (wide to long and vice versa)
o Use tidyr::pivot_longer() and pivot_wider().
🔹 Data Exploration
11. Generate summary statistics for numeric data
o Use summary() and skimr::skim().
12. Explore categorical variables
o Use table() and prop.table().
13. Profile data using DataExplorer
o Use DataExplorer::create_report().
🔹 Data Integration
14. Merge two datasets using joins
o Use left_join(), right_join(), inner_join(), and full_join().
15. Concatenate datasets vertically
o Use bind_rows().
🔹 Visualization with ggplot2
16. Create bar charts (categorical variables)
o Use ggplot2::geom_bar().
17. Create pie charts
o Use ggplot2 with coord_polar().
18. Create histograms and density plots
o Use geom_histogram() and geom_density().
19. Create scatter plots with regression line
o Use geom_point() and geom_smooth().
20. Create box plots for group comparisons
o Use geom_boxplot().
🔹 Advanced Visualizations
21. Create violin plots
o Use geom_violin().
22. Create ridgeline plots
o Use ggridges::geom_density_ridges().
23. Create beeswarm plots
o Use ggbeeswarm::geom_beeswarm().
24. Create heatmaps for correlation matrices
o Use ggplot2 with geom_tile() and reshape2::melt().
25. Faceted plots (multi-panel plots)
o Use facet_wrap() and facet_grid().
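
For exercise 14, the four dplyr joins differ only in which unmatched rows they keep. The sketch below uses two small hypothetical tables (the names and salaries are invented) so the row counts can be compared directly.

```r
library(dplyr)

# Hypothetical tables sharing the key Emp_ID; IDs 2 and 3 appear in both.
employees <- tibble(Emp_ID = c(1, 2, 3), Name = c("Asha", "Ravi", "Meena"))
salaries  <- tibble(Emp_ID = c(2, 3, 4), Salary = c(45000, 52000, 38000))

inner <- inner_join(employees, salaries, by = "Emp_ID")  # only matching IDs
left  <- left_join(employees, salaries, by = "Emp_ID")   # every employee
right <- right_join(employees, salaries, by = "Emp_ID")  # every salary row
full  <- full_join(employees, salaries, by = "Emp_ID")   # everything

# Compare how many rows each join keeps.
print(c(inner = nrow(inner), left = nrow(left),
        right = nrow(right), full = nrow(full)))
print(full)  # unmatched cells are filled with NA
```

Naming the key with by = avoids relying on dplyr guessing the common columns, which matters once two tables share more than one column name.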

📈 List-2: 25 Tableau Lab Exercises


🔹 Data Import & Preparation
1. Connect to an Excel file (Retail data)
o Import and preview.
2. Connect to a CSV file (Student data)
o Handle NULL values with calculated fields (IFNULL, ZN) or filters; use Data Interpreter only to tidy the file layout.
3. Connect to a database (MS Access)
o Create a connection and explore tables.
4. Use metadata grid for renaming and organizing columns
o Rename, hide, and reorder columns.
5. Create calculated fields for recoding
o Create new columns using formulas.
🔹 Data Cleaning & Transformation
6. Handle missing values with default values or filters
o Filter out NULLs or fill with zeros.
7. Filter data based on conditions
o Apply filters in data source and worksheet levels.
8. Sort and group data
o Use manual and automatic sorting.
9. Create data hierarchies (e.g., Year > Month)
o Create hierarchy in data pane.
10. Create sets (e.g., top 10 sales)
o Create static and dynamic sets.
🔹 Data Integration
11. Perform inner and outer joins
o Join Payroll dataset with Departments dataset.
12. Perform data blending
o Blend Medical dataset with external sources.
13. Combine multiple data sources (cross-database joins)
o Use Excel and database sources.
🔹 Basic Visualizations
14. Create bar charts and stacked bar charts
o Drag-and-drop measures/dimensions.
15. Create pie charts
o Use Show Me panel.
16. Create line graphs for trends
o Add time dimension on X-axis.
17. Create scatter plots for correlation analysis
o Add measures to Columns and Rows.
18. Create box plots to compare distributions
o Add dimension and measure, select box plot.
🔹 Advanced Visualizations
19. Create heatmaps
o Use color encoding on mark cards.
20. Create treemaps
o Drag measures and dimensions.
21. Create KPI charts
o Use calculated fields and shapes.
22. Create waterfall charts
o Use Gantt bars and table calculations.
23. Create bump charts (ranking)
o Create index calculations.
🔹 Level of Detail (LOD) and Dashboards
24. Use LOD calculations for aggregates
o Calculate sales per customer.
25. Build interactive dashboards and stories
o Combine multiple sheets, add filters and actions.

📊 List-1: 25 Exercises Using R Programming


This list mirrors the core topics in your lab syllabus (e.g. connecting data, handling NULL
values, joins, data blending, sets, highlighting, charts) but adapted for R:

🔹 Data Import and Preparation


1. Import retail garment shop data from CSV.
2. Connect to an Excel sheet and import student dataset.
3. Import data from a SQLite database (simulate Access).
4. Handle NULL/missing values using tidyr and dplyr.
5. Clean up payroll dataset by removing duplicates and irrelevant columns.
6. Use metadata (column names, data types) to standardize datasets.
7. Create subsets using select() and filter() functions.

🔹 Data Cleaning and Transformation


8. Recoding variables in student data (e.g. grades).
9. Create new columns in payroll dataset using mutate().
10. Aggregate and summarize salary data using group_by() and summarize().
11. Standardize data: uppercase/lowercase transformations, date formatting.
12. De-duplicate payroll data using distinct().
13. Blend two datasets (medical + department) using join functions.
14. Create data extracts (simulate Tableau extract) by saving filtered datasets.

🔹 Grouping and Sets


15. Create and edit sets (e.g. top-performing students).
16. Highlight desired items in inventory data using conditional formatting.
17. Group sales data by category and region using group_by().

🔹 Visualization
18. Create bar charts (sales by region).
19. Create pie charts (market share).
20. Create line graphs (sales trends).
21. Create scatter plots (sales vs. profit).
22. Create combination charts using ggplot2 with multiple geoms.
23. Create dual-axis plots using ggplot2.
24. Create heatmaps of sales correlation using ggplot2 and reshape2.
25. Create advanced charts (boxplots, violin plots) for salary distributions.
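
Exercise 24 could be approached as in the sketch below. It makes two assumptions: the built-in mtcars data stands in for a numeric sales table, and base R's as.table() is swapped in for reshape2::melt() to get the long format, so that the example needs only ggplot2.

```r
library(ggplot2)

# Correlation matrix of four numeric columns (mtcars is a stand-in dataset).
cor_mat <- cor(mtcars[, c("mpg", "hp", "wt", "qsec")])

# One row per matrix cell, the same shape reshape2::melt() would produce.
cor_long <- as.data.frame(as.table(cor_mat))
names(cor_long) <- c("Var1", "Var2", "Correlation")

heat <- ggplot(cor_long, aes(x = Var1, y = Var2, fill = Correlation)) +
  geom_tile(colour = "white") +
  geom_text(aes(label = round(Correlation, 2)), size = 3) +
  scale_fill_gradient2(low = "steelblue", mid = "white", high = "firebrick",
                       midpoint = 0, limits = c(-1, 1)) +
  labs(x = NULL, y = NULL, title = "Correlation heatmap") +
  theme_minimal()

# Call print(heat) to draw the plot in the active graphics device.
```

Fixing the colour scale to the range -1 to 1 keeps the colours comparable between heatmaps built from different datasets.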

📈 List-2: 25 Exercises Using Tableau


This list mirrors your lab syllabus topics explicitly:

🔹 Data Connections and Cleaning


1. Create a retail garment shop data connection in Tableau (Excel).
2. Import a student dataset and handle NULL values.
3. Connect to an MS Access database in Tableau.
4. Use metadata to rename and organize columns.
5. Clean payroll dataset (remove duplicates, filter columns).

🔹 Joins and Blending


6. Perform various join techniques in payroll dataset (inner, left, right, outer).
7. Perform data blending from multiple sources in medical dataset.
8. Combine Excel and Access data (simulate cross-database join).

🔹 Grouping and Sets


9. Create and edit sets using marks in the inventory dataset.
10. Highlight desired items in the library dataset using conditional formatting.
11. Make groups in library dataset (grouping books by genre or author).

🔹 Basic Charts
12. Create a pie chart of sales by region.
13. Create a bar chart of employee salaries by department.
14. Create a line graph of monthly sales.
15. Create a scatter plot of sales vs. profit.

🔹 Advanced Charts
16. Create a combination chart (sales and profit) using dual axes.
17. Create a dual-axis chart for sales and cost comparison.
18. Create a heatmap showing sales density.
19. Create a treemap to show market share.
20. Create a box plot comparing salaries across departments.

🔹 KPI and Analytical Visualizations


21. Create a KPI chart showing sales targets.
22. Create a waterfall chart showing profit build-up.
23. Create a bump chart to visualize rank trends over time.
24. Create Level of Detail (LOD) calculations (average sales per customer).
25. Create a dashboard combining multiple visualizations with filters.

📘 Data Preparation and Visualization Lab Manual


📊 Part A: R Programming (25 Exercises)
🔹 Dataset References
 retail_garments.csv
 student_data.xlsx
 payroll_data.sqlite
 medical_data.csv
Exercise 1: Import Retail Garment Shop Data
Objective: Import a CSV file and inspect the data.
Steps:
1. Open RStudio.
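2. Install (once) and load the readr package:
install.packages("readr")  # run once per machine
library(readr)
3. Read the file and inspect it:
retail_data <- read_csv("retail_garments.csv")
head(retail_data)  # first six rows
str(retail_data)   # column names and types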
4. Screenshot: View first few rows in the console.
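
If retail_garments.csv is not at hand yet, the import step can be rehearsed with a temporary file. This is only a sketch: the two data rows are invented, and the column types are an assumption based on the column names listed under Sample Dataset Structures.

```r
library(readr)

# Write a tiny stand-in file so the sketch runs without retail_garments.csv.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "Invoice_ID,Product,Category,Quantity,Unit_Price,Total_Sales",
  "1001,Jeans,Clothing,2,1200,2400",
  "1002,T-Shirt,Clothing,5,600,3000"
), tmp)

# Declaring column types up front surfaces parsing problems early.
retail_demo <- read_csv(
  tmp,
  col_types = cols(
    Invoice_ID  = col_integer(),
    Product     = col_character(),
    Category    = col_character(),
    Quantity    = col_integer(),
    Unit_Price  = col_double(),
    Total_Sales = col_double()
  )
)

head(retail_demo)
str(retail_demo)
problems(retail_demo)  # empty when every value parsed as declared
```

Without col_types, read_csv() guesses the types from the data, which is convenient but can hide a stray text value in a numeric column.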

Exercise 2: Import Student Data from Excel


Objective: Import Excel data and handle missing values.
Steps:
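1. Install (once) and load readxl:
library(readxl)
2. Import data:
student_data <- read_excel("student_data.xlsx")
3. Check for missing values:
sum(is.na(student_data))      # total across the whole table
colSums(is.na(student_data))  # count per column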

Exercise 3: Import Payroll Data from SQLite


Objective: Connect to a database.
Steps:
1. Install (once) DBI and RSQLite, then load DBI:
library(DBI)
2. Connect and read the table:
con <- dbConnect(RSQLite::SQLite(), "payroll_data.sqlite")
payroll_data <- dbReadTable(con, "payroll")
3. View data, then close the connection:
head(payroll_data)
dbDisconnect(con)
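
When only part of a table is needed, a query can replace dbReadTable(). The sketch below assumes an in-memory SQLite database with invented rows in place of payroll_data.sqlite, so it runs without the lab file; the RSQLite package must be installed.

```r
library(DBI)

# In-memory database standing in for payroll_data.sqlite (hypothetical rows).
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "payroll", data.frame(
  Emp_ID     = 1:3,
  Department = c("HR", "IT", "IT"),
  Salary     = c(30000, 45000, 52000)
))

dbListTables(con)  # confirm the table name before querying

# A parameterised query pulls only the rows that are needed.
it_staff <- dbGetQuery(
  con,
  "SELECT Emp_ID, Salary FROM payroll WHERE Department = ?",
  params = list("IT")
)
print(it_staff)

dbDisconnect(con)  # release the connection when finished
```

Passing the department through params rather than pasting it into the SQL string keeps the query safe if the value ever comes from user input.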

Exercise 4: Clean Payroll Data


Objective: Remove duplicates.
Steps:
library(dplyr)
payroll_data <- distinct(payroll_data)

Exercise 5: Handle NULL Values


Objective: Replace missing values.
Steps:
library(tidyr)
student_data <- replace_na(student_data, list(Gender = "Unknown", Marks = 0))
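
Before filling gaps it helps to see where they are. The following sketch uses a small invented table with the Gender and Marks columns from the exercise; it counts the missing values per column and then applies the same replace_na() call.

```r
library(tidyr)

# Hypothetical student rows with gaps in Gender and Marks.
student_demo <- data.frame(
  Student_ID = 1:4,
  Gender     = c("F", NA, "M", NA),
  Marks      = c(82, 67, NA, 91)
)

# Count missing values per column before deciding how to fill them.
na_counts <- colSums(is.na(student_demo))
print(na_counts)

# Fill each column with its own default; unlisted columns are left alone.
student_filled <- replace_na(student_demo, list(Gender = "Unknown", Marks = 0))
print(student_filled)
```

Replacing missing Marks with 0 pulls averages down, so for analysis it may be preferable to keep the NA values and pass na.rm = TRUE to the summary functions instead.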

Exercise 6: Select Specific Columns


Objective: Select columns from a dataset.
Steps:
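# Store the selection in a new object so retail_data keeps Unit_Price and Total_Sales for later exercises
retail_selected <- select(retail_data, Invoice_ID, Product, Quantity)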

Exercise 7: Filter Rows


Objective: Filter rows based on conditions.
Steps:
# Store the filtered rows in a new object so the full dataset stays available
retail_filtered <- filter(retail_data, Quantity > 5)
Exercise 8: Create New Variables
Objective: Use mutate() to add columns.
Steps:
retail_data <- mutate(retail_data, Total = Quantity * Unit_Price)
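
Exercises 6 to 8 can also be chained into one pipeline that writes to a new object. The sketch below is self-contained: the three rows are invented, and the column names follow the exercises. Note that Unit_Price has to survive the select() step, because the Total calculation depends on it.

```r
library(dplyr)

# Hypothetical rows using the column names from the exercises.
retail_demo <- tibble(
  Invoice_ID = c(1001, 1002, 1003),
  Product    = c("Jeans", "T-Shirt", "Scarf"),
  Quantity   = c(2, 8, 12),
  Unit_Price = c(1200, 600, 250)
)

# select -> filter -> mutate, leaving retail_demo itself unchanged.
bulk_orders <- retail_demo %>%
  select(Invoice_ID, Product, Quantity, Unit_Price) %>%
  filter(Quantity > 5) %>%
  mutate(Total = Quantity * Unit_Price)

print(bulk_orders)
```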

Exercise 9: Recode Variables


Objective: Recode categories using case_when().
Steps:
student_data <- mutate(student_data,
  Grade = case_when(
    Marks >= 90 ~ "A",
    Marks >= 75 ~ "B",
    TRUE ~ "C"
  )
)
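
case_when() checks its conditions from top to bottom and uses the first one that is TRUE, so the order matters. A missing mark makes every comparison NA and therefore falls through to the TRUE branch. The sketch below, on invented marks, adds an explicit branch for that case; the label "Not graded" is an assumption for illustration.

```r
library(dplyr)

# Hypothetical marks, including a boundary value and a missing value.
marks_demo <- tibble(Marks = c(95, 90, 80, 40, NA))

graded <- marks_demo %>%
  mutate(
    Grade = case_when(
      is.na(Marks) ~ "Not graded",  # handle missing marks explicitly
      Marks >= 90  ~ "A",
      Marks >= 75  ~ "B",
      TRUE         ~ "C"            # fallback for everything else
    )
  )

print(graded)
```

Without the is.na() branch, the student with a missing mark would silently receive grade "C".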

Exercise 10: Data Summarization


Objective: Use group_by() and summarize().
Steps:
# Named sales_summary so the result does not mask the base summary() function
sales_summary <- retail_data %>%
  group_by(Product) %>%
  summarize(Total_Sales = sum(Total_Sales, na.rm = TRUE))
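
Several statistics can be computed in the same summarize() call. This sketch uses invented sales rows; the output column names (Orders, Sales_Sum, Sales_Mean, Sales_Median) are chosen for the example and deliberately differ from the input column, so that later expressions still see the original values.

```r
library(dplyr)

# Hypothetical sales rows; one value is missing on purpose.
sales_demo <- tibble(
  Product     = c("Jeans", "Jeans", "T-Shirt", "T-Shirt", "T-Shirt"),
  Total_Sales = c(2400, 1200, 3000, 900, NA)
)

sales_summary <- sales_demo %>%
  group_by(Product) %>%
  summarize(
    Orders       = n(),                                 # rows per group, NA included
    Sales_Sum    = sum(Total_Sales, na.rm = TRUE),
    Sales_Mean   = mean(Total_Sales, na.rm = TRUE),
    Sales_Median = median(Total_Sales, na.rm = TRUE),
    .groups = "drop"                                    # return an ungrouped table
  )

print(sales_summary)
```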

Exercise 11-25:
(Continue with steps for joins, reshaping, plotting, data exploration, blending, advanced
ggplot (bar, pie, hist, scatter, violin, ridgeline, beeswarm), facets, saving outputs.)
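
As one possible shape for the plotting and saving exercises, the sketch below draws a faceted scatter plot with a fitted line and saves it with ggsave(). The data frame, the file name, and the choice of PDF output are assumptions for illustration.

```r
library(ggplot2)

# Hypothetical sales and profit figures for two regions.
plot_demo <- data.frame(
  Sales  = c(100, 150, 200, 250, 300, 350),
  Profit = c(12, 20, 23, 35, 38, 50),
  Region = rep(c("North", "South"), each = 3)
)

p <- ggplot(plot_demo, aes(x = Sales, y = Profit)) +
  geom_point(aes(colour = Region), size = 3) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +  # straight-line fit
  facet_wrap(~ Region) +                                      # one panel per region
  labs(title = "Profit against sales", x = "Sales", y = "Profit") +
  theme_minimal()

# ggsave() picks the output format from the file extension.
out_file <- file.path(tempdir(), "sales_vs_profit.pdf")
ggsave(out_file, plot = p, width = 6, height = 4)
```

Passing plot = p explicitly is a little safer than relying on the last plot displayed, especially in scripts that build several plots.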

📈 Part B: Tableau (25 Exercises)


🔹 Dataset References
 retail_garments.xlsx
 student_data.xlsx
 payroll_data.accdb
 medical_data.xlsx
Exercise 1: Connect Retail Data
Objective: Import Excel into Tableau.
Steps:
1. Open Tableau.
2. Connect → Excel → retail_garments.xlsx.
3. Drag worksheet to canvas.

Exercise 2: Handle NULL Values in Student Data


Objective: Clean data.
Steps:
1. Connect → Excel → student_data.xlsx.
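2. Turn on Data Interpreter if the sheet has stray titles, footers, or merged cells (it tidies the layout; it does not fill NULLs).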
3. Create calculated field: IFNULL([Gender], "Unknown").

Exercise 3: Connect MS Access Database


Objective: Connect to Access.
Steps:
1. Connect → Access → payroll_data.accdb.
2. Drag Payroll table.
Exercise 4: Rename Columns
Objective: Use metadata grid.
Steps:
1. Go to Data Source tab.
2. Rename columns.
Exercise 5: Clean Payroll Data
Objective: Filter unwanted rows.
Steps:
1. Drag Payroll to canvas.
2. Add a data source filter to exclude unwanted rows.

Exercise 6-25:
(Continue with joins, blending, groups, highlights, bar chart, pie chart, line graph, scatter, box
plot, heatmap, treemap, KPI, waterfall, bump, LOD, dashboard, export.)

📂 Sample Dataset Structures


 retail_garments.csv — Invoice_ID, Product, Category, Quantity, Unit_Price,
Total_Sales.
 student_data.xlsx — Student_ID, Name, Gender, Department, Marks, Attendance.
 payroll_data.sqlite — Payroll table with Emp_ID, Name, Department, Salary,
Bonus.
 medical_data.csv — Patient_ID, Test_Type, Result.

📸 Screenshots
For each exercise:
 R: Screenshot of RStudio console output, e.g., head() output, ggplot graph.
 Tableau: Screenshot of Data Source pane or visualization.


📊 Data Preparation and Visualization Lab Manual (Using Tableau)
🌟 Introduction to Tableau
Tableau is a powerful data visualization and business intelligence tool that allows you to
connect to various data sources, clean and prepare data, and create a wide range of interactive
charts and dashboards.

👉 Key Features:

 Connects to Excel, databases, and cloud data.


 Supports data cleaning and transformation (using Data Interpreter, calculated fields,
etc.).
 Allows creation of charts like pie, line, scatter plots, heat maps, etc.
 Easy drag-and-drop interface—no programming required.

🔧 Basic Concepts:

 Workbook: The main Tableau file where all work is saved.


 Sheet: Each worksheet is where you build individual visualizations.
 Dashboard: A collection of multiple sheets arranged together.
 Data Source: The file or database you connect to (e.g., Excel, Access).
 Dimensions: Categorical fields (e.g., Product, Region).
 Measures: Numerical fields (e.g., Sales, Profit).

📝 Step-by-Step Practical Manual


The steps below follow the order of the syllabus topics.

1️⃣ Create a Retail Garments Shop Data and Connect to an Excel Sheet and
Import Data into Tableau

Step 1.1: Prepare an Excel sheet with sample data (columns like: ItemID, ItemName,
Category, Price, QuantitySold).

 Example:

ItemID  ItemName  Category  Price  QuantitySold
1       Jeans     Clothing  1200   25
2       T-Shirt   Clothing  600    45

Step 1.2: Open Tableau Desktop → Click Connect → Microsoft Excel → Browse and
select your Excel file.

Step 1.3: Drag the desired sheet from the left pane into the canvas area to load the data.

Step 1.4: Click Sheet 1 at the bottom to start building visualizations.


2️⃣ Use Metadata and Extracts, Handle NULL Values in Student Data Set

Step 2.1: Load a student data set Excel file with fields like StudentID, Name, Marks, Gender.

Step 2.2: Click the Data Source tab → review data types and metadata.

Step 2.3: Right-click on the data source → choose Extract Data → Save extract for better
performance.

Step 2.4: Identify NULL values (they appear as blank or “Null”).

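 Handle missing values with calculated fields or filters. Data Interpreter (Excel and text files) only tidies the sheet layout; it does not fill NULLs.
 Example: ZN([Marks]) replaces a numeric NULL with 0; IFNULL([Gender], "Unknown") supplies a default for a text field.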

3️⃣ Clean Up the Data Before Use in Payroll Data Set

Step 3.1: Load the Payroll Data Set (Excel or CSV).

Step 3.2: Remove unwanted columns:

 Right-click the column header → Hide.

Step 3.3: Filter data using filters panel or data source filters.

Step 3.4: Correct data types (e.g., Salary → Number, DateOfJoining → Date).

4️⃣ Perform Various Join Techniques in Payroll Data Set

Step 4.1: Load two related tables: e.g., EmployeeMaster and SalaryDetails.

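Step 4.2: Drag the first sheet into the canvas. In Tableau 2020.2 or later, double-click it to open the physical (join) layer, then drag in the second sheet → Tableau shows a Venn diagram join icon.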

Step 4.3: Click the Venn diagram → Select Inner Join, Left Join, Right Join, or Full
Outer Join.

Step 4.4: Verify by inspecting the joined data preview.

5️⃣ Perform Data Blending from More Than One Source in Medical Data Set

Step 5.1: Load the first data source (e.g., PatientDetails.xlsx).

Step 5.2: Click Data → New Data Source → Add second data source (e.g.,
TestResults.csv).

Step 5.3: In Sheet 1, drag a field from the primary data source.
Step 5.4: Drag a field from the secondary source → Tableau automatically blends on
common fields (indicated by an orange linking icon).

Step 5.5: Adjust linking fields manually if needed.

6️⃣ Create and Edit Sets Using Marks in Inventory Data Set

Step 6.1: Load InventoryData.xlsx.

Step 6.2: Drag Category to Rows → Right-click a Category → Create Set.

Step 6.3: Name the set (e.g., “High Value Items”).

Step 6.4: To edit a set → Right-click on the set in Data Pane → Edit Set.

7️⃣ Highlight Desired Items and Make Groups in Library Data Set

Step 7.1: Load LibraryData.xlsx.

Step 7.2: Drag BookName to Rows → Select desired books using Ctrl+Click.

Step 7.3: Right-click → Group.

Step 7.4: Use the Highlighter tool (top-right) to emphasize items.

8️⃣ Working on Excel Data Through Tableau

Step 8.1: Load Excel data.

Step 8.2: Use Data Interpreter if data is messy.

Step 8.3: Explore pivoting data, renaming columns, splitting columns.

9️⃣ Connect a MS Access Database in Tableau

Step 9.1: Click Connect → Microsoft Access.

Step 9.2: Select the .accdb file.

Step 9.3: Drag desired tables and perform joins if needed.

10️⃣–22️⃣ Create Visualizations (Pie Chart, Line Graph, etc.)

Below are generic steps for all chart types:


Common Steps:

✅ Drag the appropriate dimension (e.g., Category) to Columns.


✅ Drag the appropriate measure (e.g., Sales) to Rows.
✅ Use the Show Me panel on the right to select the desired chart type (e.g., Pie, Line, Scatter
Plot).
✅ Format the chart: Colors, labels, tooltips.

For each chart:

🔹 Pie Chart: Use dimension on Color shelf, measure on Angle shelf.


🔹 Line Graph: Use Date on Columns, measure on Rows.
🔹 Scatter Plot: Use two measures (e.g., Sales vs Profit) on Columns and Rows.
🔹 Combination Chart: Dual-axis with different chart types.
🔹 Dual Axis Chart: Place two measures on Rows → right-click the second measure → Dual Axis.
🔹 Heat Map: Dimension on Rows and Columns → Color shelf with measure.
🔹 Tree Map: Use a measure on the Size shelf and a dimension on the Color or Label shelf.
🔹 Box Plot: Use dimension on Columns, measure on Rows → select Box Plot from Show
Me.
🔹 KPI Chart: Use calculated fields to define thresholds → display using shapes.
🔹 Waterfall Chart: Use Gantt Bar marks → build cumulative measures.
🔹 Bump Chart: Create a rank calculation → plot over time.
🔹 LOD Expression: Use {FIXED [Dimension] : SUM([Measure])}.
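
For readers who also work through the R exercises, the FIXED expression above has a rough dplyr counterpart: a grouped mutate() attaches the per-dimension total to every row. The sketch below is an analogy under that assumption, not a statement of how Tableau computes LOD values; the orders table and its values are invented.

```r
library(dplyr)

# Hypothetical order rows for two customers.
orders <- tibble(
  Customer = c("A", "A", "B", "B", "B"),
  Sales    = c(100, 300, 50, 50, 200)
)

# Roughly {FIXED [Customer] : SUM([Sales])}:
# the customer total is repeated on each of that customer's rows.
orders_lod <- orders %>%
  group_by(Customer) %>%
  mutate(Customer_Sales = sum(Sales)) %>%
  ungroup()
print(orders_lod)

# Average sales per customer: average the per-customer totals, one per customer.
avg_sales_per_customer <- orders_lod %>%
  distinct(Customer, Customer_Sales) %>%
  summarize(Avg = mean(Customer_Sales)) %>%
  pull(Avg)
print(avg_sales_per_customer)
```

Averaging the per-customer totals differs from averaging the raw order rows, which is the distinction an LOD expression is meant to capture.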

📂 Saving Work
 Save Tableau workbooks using .twb or .twbx (packaged) formats.
 Save extracts as .hyper files for faster performance.

🌟 Pro Tips for Beginners


✅ Save work frequently.
✅ Explore Tableau’s built-in Sample Superstore for practice.
✅ Use Tooltips for quick summaries.
✅ Right-click everywhere — most features are accessible via right-click.
✅ Use Undo and Redo liberally.
