MDS Final Task

The assignment involves analyzing and visualizing soft drink sales data for Coke and Pepsi using Quarto, focusing on panel data characteristics. Key tasks include data loading, variable renaming, creating a DateTime variable, and various visualizations related to sales trends, price changes, and buyer behavior. Additionally, students are encouraged to explore the dataset creatively to uncover new insights and document their findings.

Uploaded by

ingkarat.watt

This assignment focuses on analyzing and visualizing data related to soft drink sales,
specifically Coke and Pepsi products. Please use Quarto to report your findings.

The dataset you're working with is an example of panel data. Panel data, also known as
longitudinal data or time-series cross-sectional data, is a type of data that follows the same
subjects (such as individuals, households, firms, countries, or any other entities) over a
period of time. In your dataset, the key characteristics of panel data are evident:

1. Multiple Observations per Subject: Each entity (in this case, households or
individuals represented by PANID) is observed at multiple points in time. This is evident
from variables like year, WEEK, and MINUTE, which track the same entity over different
time periods.
2. Temporal Dimension: The data includes a time component (years and weeks),
allowing for the analysis of trends, patterns, and changes over time within each subject.
3. Cross-Sectional Dimension: The dataset also has a cross-sectional aspect, meaning it
includes multiple subjects (different PANIDs) observed at each point in time.
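
As a rough illustration of these three properties, the following R sketch counts observations per panelist and panelists per week. Only the column names PANID and WEEK come from the dataset description; the toy values are invented for illustration.

```r
# Toy panel: PANID and WEEK follow the dataset description; the values are invented.
toy <- data.frame(
  PANID = c(1, 1, 1, 2, 2, 3, 3),
  WEEK  = c(1322, 1323, 1324, 1322, 1324, 1322, 1323)
)

# Multiple observations per subject: how often does each panelist appear?
obs_per_panelist <- table(toy$PANID)

# Cross-sectional dimension: how many distinct panelists are seen in each week?
panelists_per_week <- tapply(toy$PANID, toy$WEEK, function(x) length(unique(x)))

# Temporal dimension: the range of weeks covered by the panel.
week_range <- range(toy$WEEK)

print(obs_per_panelist)
print(panelists_per_week)
print(week_range)
```

On the real data, a panelist who appears only once would show up as a count of 1 in the first table.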

Variable description:
You are given the following information on this dataset:

1. IRI_KEY: This is the masked store number, serving as a unique identifier for each store.
2. WEEK: IRI Week number, which corresponds to specific weeks in the calendar year.
3. UNITS (represented in the dataset as units): Total unit sales of the product.
4. DOLLARS (represented in the dataset as dollars): Total dollar sales of the product.
5. F (Feature): This variable indicates the type of advertising or promotional feature
associated with a product. It can take several values:

- NONE: No feature or promotion.
- FS-C: Frequent Shopper Program C, available to members only.
- C: Small ad, usually a single line of text.
- FS-B: Frequent Shopper Program B.
- B: Medium size ad.
- FS-A: Frequent Shopper Program A.
- A: Large size ad.
- FSA+: Frequent Shopper Program A+.
- A+: Also known as “Q” or “R” – indicating a retailer coupon or rebate.

In the dataset, this could correspond to the variables fa, faplus, fb, and fc, where
each variable might represent a different type of feature.

6. D (Display): This variable represents the level of in-store display for a product. It can be:

- 0: No display.
- 1: Minor display.
- 2: Major display (includes codes 1 & 2).

This corresponds to the d1 and d2 variables in the dataset.
7. PR (Price Reduction flag): Indicates whether there was a significant price reduction (1 if
the temporary price reduction is 5% or greater, 0 otherwise). This information may be
reflected in the price or price_storedata variables in the dataset, although a specific
flag for price reduction is not directly mentioned in the glimpse of the dataset.

Here is a bit more information:

9. PANID (<dbl>): Panelist identifier, representing individual households or buyers.
10. IRI_KEY (<dbl>): Masked store number, uniquely identifying each store.
11. Market_Name (<chr>): The market or geographic area name where the store is
located, represented as characters.
12. store_type (<chr>): Category or type of the store, represented as characters.
13. year (<dbl>): Year of purchase.
14. WEEK (<dbl>): IRI week number, translated to calendar weeks.
15. MINUTE (<dbl>): Time of purchase in minutes from the beginning of the week.
16. units (<dbl>): Total unit sales. Contains missing values (NA).
17. dollars (<dbl>): Total dollar sales.
18. price (<dbl>): Product price. (note: change scale)
19. pid (<dbl>): Product ID.
20. brand (<dbl>): Brand identifier.
21. decision (<dbl>): Indicates the buyer's choice, with 1 indicating selection.
22. L4 (<chr>): Likely represents a category or company name, such as "COCA COLA CO"
or "PEPSICO INC", represented as characters.
23. L5 (<chr>): Likely represents a sub-category or specific product name, such as "COKE
CLASSI", "DIET COKE", etc., represented as characters.
24. price_storedata (<dbl>): Store-specific price data.
25. d1 and d2 (<dbl>): Variables that could be related to display or promotional strategies.
26. fa, faplus, fb, fc (<dbl>): Feature-related variables, likely indicating different types of
advertisements or promotions.
27. no_choice (<dbl>): Indicates instances with no purchase decision.
28. marketshare (<dbl>): Market share of the product.
29. brandchoice (<dbl>): Indicates brand choice.
30. b1, b2, b3, b4 (<dbl>): Additional brand-related variables.
31. choice (<dbl>): Another variable related to the purchase decision.
32. u (<dbl>): A calculation of the utility of the buyer for this option (ignore this).

Tasks
1. Explanation of the Data
Task: Understand the dataset, which contains information on soft drink products,
including Coke and Pepsi, with a 'decision' variable indicating the buyer's choice.

2. Load the Data

Task: Import SAS data into R.
Hint: Use packages like haven to read SAS files in R.
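
A minimal sketch of the import step using haven::read_sas(). The file name below is an assumption (the dataset name with the usual .sas7bdat extension); replace it with the path to the file you were actually given.

```r
library(haven)

# Hypothetical file name: point this at the SAS file supplied with the assignment.
sas_path <- "dataset2_4brands.sas7bdat"

if (file.exists(sas_path)) {
  drinks_raw <- read_sas(sas_path)  # read_sas() returns a tibble
  str(drinks_raw)                   # check column names and types after import
} else {
  message("File not found: ", sas_path)
}
```

The file.exists() guard only keeps the sketch from failing when the path is wrong; in a Quarto report you would normally call read_sas() directly.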

3. Rename Variables
Task: Standardize variable names according to a style guide.
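
The assignment does not say which style guide to follow. The sketch below assumes snake_case names, as recommended by the tidyverse style guide. The helper to_snake is hypothetical and written here for illustration; it is not part of any package.

```r
# Hypothetical helper: convert arbitrary column names to snake_case.
to_snake <- function(x) {
  x <- gsub("([a-z0-9])([A-Z])", "\\1_\\2", x)  # split camelCase boundaries
  x <- gsub("[^A-Za-z0-9]+", "_", x)            # collapse other separators to "_"
  tolower(x)
}

# A few names taken from the variable description above.
old_names <- c("PANID", "IRI_KEY", "Market_Name", "store_type", "WEEK", "price_storedata")
new_names <- to_snake(old_names)
print(data.frame(old = old_names, new = new_names))
```

On a data frame, the same helper could be applied with names(df) <- to_snake(names(df)), where df stands for the imported data.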

4. Create DateTime Variable

Task: Use 'Year', 'Week', and 'Minute' to create a DateTime variable using lubridate.
Details:
Recode WEEK to reflect the week number for each year, starting with Week 1322 as
the first week of 2005. The dataset starts on January 3rd, just before 9 AM.
MINUTE is the number of minutes since the beginning of the week.
Hint: Utilize years(), weeks(), and minutes() functions from lubridate.
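
One possible reading of these details, sketched in R: treat IRI week 1322 as starting at midnight on Monday, 3 January 2005, then add whole weeks and the minutes within the week. The midnight start, the UTC time zone, the helper name build_date_time, and the toy values are all assumptions. years() is not needed under this reading because the week offset already carries across year boundaries.

```r
library(lubridate)

# Assumption: IRI week 1322 begins at midnight on Monday, 3 January 2005.
week_origin    <- ymd_hms("2005-01-03 00:00:00", tz = "UTC")
first_iri_week <- 1322

# Hypothetical helper: offset the origin by whole weeks, then by minutes
# within the week. weeks() and minutes() expect whole-number inputs.
build_date_time <- function(week, minute) {
  week_origin + weeks(week - first_iri_week) + minutes(minute)
}

# Toy rows (values invented): 535 minutes after midnight is 08:55.
toy <- data.frame(WEEK = c(1322, 1323, 1374), MINUTE = c(535, 0, 60))
toy$date_time    <- build_date_time(toy$WEEK, toy$MINUTE)
toy$week_of_year <- isoweek(toy$date_time)  # one possible recoding of WEEK within each year

print(toy)
```

If MINUTE contains fractional values in the real data, round it first or use durations such as dminutes() instead of periods.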

5. Plot Total Soft Drink Sales Over Time

Task: Create a time series plot of total expenditure on all soft drinks (use price).

6. Sales of Classic Coke in Specific Stores

Task: Plot monthly sales of classic Coke in PITTSFIELD and EAU CLAIR.
Hint: Aggregate data monthly and include proper labels.
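
A minimal sketch of the monthly aggregation, assuming snake_case column names (market_name, l5, dollars) and an existing date_time column. The toy values are invented. Whether to keep only purchased rows (for example, decision equal to 1) should be checked against the real data.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Toy stand-in for the real data (values invented for illustration only).
toy <- data.frame(
  date_time   = ymd_hms(c("2005-01-03 08:55:00", "2005-01-20 17:30:00",
                          "2005-02-02 10:00:00", "2005-01-05 12:00:00",
                          "2005-02-14 09:15:00")),
  market_name = c("PITTSFIELD", "PITTSFIELD", "PITTSFIELD", "EAU CLAIR", "EAU CLAIR"),
  l5          = "COKE CLASSI",
  dollars     = c(2.5, 3.0, 4.0, 1.5, 2.0)
)

# Truncate each timestamp to the first of its month, then sum dollars per market.
monthly_sales <- toy %>%
  filter(l5 == "COKE CLASSI", market_name %in% c("PITTSFIELD", "EAU CLAIR")) %>%
  mutate(month = floor_date(date_time, unit = "month")) %>%
  group_by(market_name, month) %>%
  summarise(total_dollars = sum(dollars, na.rm = TRUE), .groups = "drop")

# One line per market, with explicit axis and legend labels.
sales_plot <- ggplot(monthly_sales,
                     aes(x = month, y = total_dollars, colour = market_name)) +
  geom_line() +
  geom_point() +
  labs(title = "Monthly sales of classic Coke",
       x = "Month", y = "Dollar sales", colour = "Market")

print(monthly_sales)
```

Printing sales_plot in a Quarto chunk renders the figure.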

7. Sales of All Soft Drink Types

Task: Create a plot showing sales of all types of soft drinks over time (each type should
get its own color) and interpret it.

8. Price Trends for Diet Pepsi

Task: Visualize how the price of Diet Pepsi has changed over time.

9. Price and Sales Relationship

Task: Show the relationship between price and sales for all drinks, interpret the data,
and discuss price elasticity.
Hint: Consider scatter plots and correlation analysis.
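
A common way to quantify price elasticity is a log-log regression, where the slope on log(price) estimates the percentage change in units sold for a one percent change in price. The sketch below uses constructed data (not the assignment data), built so that the true elasticity is -1.5.

```r
# Constructed data: units follow a constant-elasticity demand curve.
toy <- data.frame(price = c(0.8, 1.0, 1.2, 1.5, 2.0))
toy$units <- 100 * toy$price^(-1.5)  # true elasticity is -1.5 by construction

# Simple correlation between price and units (negative for downward-sloping demand).
price_units_cor <- cor(toy$price, toy$units)

# Log-log regression: the coefficient on log(price) is the elasticity estimate.
fit <- lm(log(units) ~ log(price), data = toy)
elasticity <- coef(fit)[["log(price)"]]

print(price_units_cor)
print(elasticity)
```

On real data the fit will be noisy, and an estimate below -1 would suggest elastic demand, while one between -1 and 0 would suggest inelastic demand.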

10. Confidence Intervals for Prices

Task: Calculate confidence intervals for prices of different products using the infer
package.
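
A minimal sketch of a bootstrap percentile interval for the mean price of each product with infer, assuming snake_case column names (l5, price). The prices are simulated, and mean_price_ci is a hypothetical helper.

```r
library(dplyr)
library(infer)

set.seed(123)  # make the bootstrap reproducible

# Simulated prices (invented values); the product labels mirror the L5 examples.
toy <- data.frame(
  l5    = rep(c("COKE CLASSI", "DIET COKE"), each = 30),
  price = c(rnorm(30, mean = 1.25, sd = 0.10), rnorm(30, mean = 1.10, sd = 0.10))
)

# Hypothetical helper: bootstrap percentile interval for the mean price of one product.
mean_price_ci <- function(df) {
  df %>%
    specify(response = price) %>%
    generate(reps = 1000, type = "bootstrap") %>%
    calculate(stat = "mean") %>%
    get_confidence_interval(level = 0.95, type = "percentile")
}

# Apply the helper to each product and stack the results.
ci_by_product <- bind_rows(lapply(split(toy, toy$l5), mean_price_ci), .id = "l5")
print(ci_by_product)
```

The result has one row per product with lower_ci and upper_ci columns.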

11. Compare Mean Prices

Task: Compare the mean prices in the two stores using the infer package.
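
A minimal sketch of a permutation test for a difference in mean prices with infer. It assumes that "the two stores" means the PITTSFIELD and EAU CLAIR markets and that columns are named market_name and price; the prices are simulated.

```r
library(dplyr)
library(infer)

set.seed(2024)  # make the permutations reproducible

# Simulated prices for the two markets (invented values).
toy <- data.frame(
  market_name = factor(rep(c("PITTSFIELD", "EAU CLAIR"), each = 40)),
  price       = c(rnorm(40, mean = 1.20, sd = 0.15), rnorm(40, mean = 1.15, sd = 0.15))
)
group_order <- c("PITTSFIELD", "EAU CLAIR")  # statistic is mean(first) - mean(second)

# Observed difference in mean price.
obs_diff <- toy %>%
  specify(price ~ market_name) %>%
  calculate(stat = "diff in means", order = group_order)

# Null distribution: shuffle market labels under the hypothesis of no difference.
null_dist <- toy %>%
  specify(price ~ market_name) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "diff in means", order = group_order)

p_value <- get_p_value(null_dist, obs_stat = obs_diff, direction = "two-sided")
print(obs_diff)
print(p_value)
```

visualize(null_dist) together with shade_p_value() gives a plot of the null distribution for the report.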

12. Proportion of Pepsi Buyers

Task: Determine if the proportion of buyers choosing Pepsi products (L4) differs
between the two stores using infer.
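
A minimal sketch of a permutation test for a difference in proportions with infer, treating "PEPSICO INC" in L4 as the success level. The column names (l4, market_name), the choice of the two markets, and the simulated purchases are assumptions.

```r
library(dplyr)
library(infer)

set.seed(42)  # make the simulation and permutations reproducible

companies <- c("PEPSICO INC", "COCA COLA CO")

# Simulated purchases (invented): one row per purchase, with the company bought.
toy <- data.frame(
  market_name = factor(rep(c("PITTSFIELD", "EAU CLAIR"), each = 50)),
  l4 = factor(c(
    sample(companies, 50, replace = TRUE, prob = c(0.5, 0.5)),
    sample(companies, 50, replace = TRUE, prob = c(0.4, 0.6))
  ))
)
group_order <- c("PITTSFIELD", "EAU CLAIR")  # statistic is prop(first) - prop(second)

# Observed difference in the share of Pepsi purchases.
obs_diff <- toy %>%
  specify(l4 ~ market_name, success = "PEPSICO INC") %>%
  calculate(stat = "diff in props", order = group_order)

# Null distribution under independence of company and market.
null_dist <- toy %>%
  specify(l4 ~ market_name, success = "PEPSICO INC") %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "diff in props", order = group_order)

p_value <- get_p_value(null_dist, obs_stat = obs_diff, direction = "two-sided")
print(obs_diff)
print(p_value)
```

On the real data, decide first whether each row is a purchase or merely an offered alternative, since that changes what the proportion means.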

13. Additional exploration

Objective: For this task, you are encouraged to dive deeper into the dataset2_4brands data
and conduct your own analysis. This is an opportunity to apply your creativity and analytical
skills to uncover new insights, patterns, or trends in the data that have not been previously
explored.
Instructions:

1. Choose a Focus Area: Select a specific aspect of the data you find interesting. This
could be customer behavior, sales trends, geographical differences, or anything else
that catches your attention.
2. Formulate a Hypothesis or Question: Start with a clear hypothesis or a research
question that you want to explore. For example, "Do marketing campaigns significantly
impact the sales of a particular brand?" or "Is there a regional preference for Coke over
Pepsi?"
3. Data Analysis and Visualization: This can include creating new variables, segmenting
the data, and performing statistical tests. Use visualization tools, such as ggplot2, to
help illustrate your findings.
4. Innovative Approach: Try to think outside the box. You can combine different variables,
use advanced statistical techniques, or even merge this data with external data sources
to enrich your analysis.
5. Document Your Findings: Prepare a report or presentation that outlines your
methodology, findings, and conclusions. Include visualizations and any interesting
patterns or anomalies you discovered.
6. Reflect on the Implications: Discuss the implications of your findings. How do they
add value to understanding consumer behavior or market trends? What
recommendations would you make to a company based on your analysis?
