Module2 Analytical Tool

Uploaded by

olihu767

MKTG 631

Marketing Analytics
Module2 – R tutorial

Jinhee Huh
Marketing Analytical Tools
Data Type

• XLS and XLSX

• CSV (Comma Separated Value)

• Each row of data is stored in a text file with
a comma separating each column’s values from one another.

• Tab delimited file

• The columns of data are stored as a text file with
a TAB character between values.
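
As a rough illustration of the two text formats above, the sketch below writes a tiny made-up table to temporary files and reads it back with base R. The data frame `toy` and its columns are hypothetical and are not part of the course data.

```r
# Hypothetical toy data (not the course data set)
toy <- data.frame(id = 1:3, age = c(25, 41, 33))

# Write the same table as a CSV file and as a tab-delimited file
csv_path <- tempfile(fileext = ".csv")
tab_path <- tempfile(fileext = ".txt")
write.csv(toy, csv_path, row.names = FALSE)
write.table(toy, tab_path, sep = "\t", row.names = FALSE, quote = FALSE)

# read.csv() expects commas; read.delim() expects TAB characters
from_csv <- read.csv(csv_path)
from_tab <- read.delim(tab_path)

print(from_csv)
print(readLines(tab_path))  # the raw lines show each TAB as "\t"
```

Both calls return an ordinary data frame, so the rest of an analysis does not depend on which text format the file used.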
Open and Proprietary Programming Tools

• Open-source programming tools:

• Programming tools that are made freely available, often developed by
and for the community.
• Adapt quickly to new methodology

• Proprietary programming tools:

• Programming tools that are developed by a firm and distributed for
sale to the public.
• Expensive
• Take longer to adopt new methodology
R and Python

• R program advantages
• Developed by data scientists
• Larger number of ready-made packages for statistical analysis than Python
• Supported by an integrated development environment called RStudio
• Built-in ways to professionally visualize data
• Easy to install and set up the work environment

• R program disadvantages
• Less efficient for general computations, sometimes due to inefficiently
written packages
R Program Practice
• Let’s practice the descriptive statistics calculation, t-test, multivariate
descriptive statistics, and plot drawing.

• How can we open "Module2_Demographics.csv"?

• demo <- read.csv("your directory path/Module2_Demographics.csv")
Descriptive Statistics
• Central tendency measurement: mean, median, and mode
• Mean function: mean()
• Median function: median()
• Mode function: Mode() in DescTools package

• What if there is a missing value in a vector?

• If the vector contains a missing value, mean(variable) returns NA.
• You can compute the central tendency statistics after dropping the missing
observations by adding the na.rm=TRUE (or na.rm=T) argument.
• mean(demo$age, na.rm=T)
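
A minimal sketch of the missing-value behaviour described above, using a made-up vector rather than the course data. The helper `most_frequent()` is a hypothetical base-R stand-in for a mode function; the slides themselves point to Mode() in the DescTools package.

```r
# Hypothetical ages with one missing value
age <- c(23, 35, 35, 41, NA, 52)

mean(age)                  # NA, because one value is missing
mean(age, na.rm = TRUE)    # 37.2
median(age, na.rm = TRUE)  # 35

# Illustrative helper (not from the slides): the most frequent value(s).
# table() ignores NA by default, so no na.rm argument is needed here.
most_frequent <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[counts == max(counts)])
}
most_frequent(age)         # 35
```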
Descriptive Statistics
• Measures of variability: range, variance, and standard deviation
• Range: max() - min()
• Variance: var()
• Standard deviation: sd() or sqrt(var())

• Let’s try the following code

• max(demo$age) - min(demo$age)
• var(demo$age)
• sd(demo$age)
• sqrt(var(demo$age))
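
The same measures on a small made-up vector, as a sketch. One detail is easy to miss: R's range() returns the minimum and maximum as a pair, not their difference, which is why the slide subtracts min() from max().

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)   # hypothetical values, not the course data

range(x)              # returns the pair (min, max): 2 9
max(x) - min(x)       # range as a single number: 7
diff(range(x))        # same result: 7

v <- var(x)           # sample variance (divides by n - 1)
s <- sd(x)            # sample standard deviation
all.equal(s, sqrt(v)) # TRUE: sd() is the square root of var()
```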
Descriptive Statistics

• Frequency table
• Function: table()

• How can I make the frequency table with pre-defined bin ranges?
• table(cut(x, bin range vector))
• Let’s try the following code
• br <- c(0, 20, 30, 50, 60, 70)
• table(cut(demo$age, br))
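
A short sketch of how cut() assigns values to bins, on made-up ages. By default each bin is open on the left and closed on the right, so 20 lands in (0,20] and 30 lands in (20,30]. Values outside the break points become NA and are left out of the table.

```r
age <- c(20, 25, 30, 31, 50, 65)   # hypothetical ages
br  <- c(0, 20, 30, 50, 60, 70)    # same break points as in the slide

bins <- cut(age, br)               # factor with one level per bin
levels(bins)                       # "(0,20]" "(20,30]" "(30,50]" "(50,60]" "(60,70]"

freq <- table(bins)
print(freq)                        # counts per bin: 1 2 2 0 1
```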
In-class practice
• Use ”Module2_Demographics.csv” to answer the questions below.

• Q1. What are the mean, median, and mode of female, age, and income?
• Q2. What are the range, variance, and standard deviation of female, age, and
income?
• Q3. Create frequency tables for female, age, and income using the following bins.
• female: c(-2, -1, 0, 1, 2)
• age: c(20, 30, 50, 60, 70)
• income: c(10000, 30000, 50000, 70000, 90000)
One-Sample t-test

• Hypotheses
• 𝐻0: 𝜇=𝜇_0; 𝐻1: 𝜇≠𝜇_0
• 𝐻0: 𝜇≤𝜇_0; 𝐻1: 𝜇>𝜇_0
• 𝐻0: 𝜇≥𝜇_0; 𝐻1: 𝜇<𝜇_0

• A t-test is suitable if a variable is believed to be drawn from a normal
distribution, or if the sample size is large.
One-Sample t-test
• How can I conduct the one-sample t-test?

• t.test(demo$age, mu=0)
• Reject the null hypothesis that the average age is 0.
• t.test(demo$age, mu=40)
• Cannot reject the null hypothesis that the average age is 40.
• t.test(demo$age, mu=45, alternative="greater")
• 𝐻0: 𝜇≤𝜇_0; 𝐻1: 𝜇>𝜇_0
• Cannot reject the null hypothesis that the average age is less than or equal to 45.
• t.test(demo$age, mu=45, alternative="less")
• 𝐻0: 𝜇≥𝜇_0; 𝐻1: 𝜇<𝜇_0
• Reject the null hypothesis that the average age is greater than or equal to 45.
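
The decision step behind the statements above can be sketched as follows. The data are simulated, so the outcome says nothing about the course data set, and the 0.05 significance level is an assumption.

```r
set.seed(1)                              # make the simulated data reproducible
age <- rnorm(100, mean = 42, sd = 10)    # hypothetical sample, not the course data

res <- t.test(age, mu = 40)              # H0: mu = 40 vs H1: mu != 40
res$statistic                            # t statistic
res$p.value                              # two-sided p-value

alpha <- 0.05                            # assumed significance level
if (res$p.value < alpha) {
  print("Reject H0")
} else {
  print("Cannot reject H0")
}

# One-sided versions: same data and mu, only the alternative changes
p_greater <- t.test(age, mu = 40, alternative = "greater")$p.value
p_less    <- t.test(age, mu = 40, alternative = "less")$p.value
```

For a t-test the two one-sided p-values add up to 1, and the two-sided p-value is twice the smaller of them. That is why the same data can reject one one-sided null while failing to reject the other.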
In-class practice

• Q4. One sample t-test

• Q4-1. Is the female variable average significantly different from zero?

• Q4-2. Is the age variable average significantly different from 40?

• Tage <- t.test(demo$age, mu=40)
• Tage
• Tage$statistic
• Tage$p.value
• if (Tage$p.value < .05) print("Reject the null hypothesis")

• Q4-3. Is the income variable average significantly greater than 30000?

Two-sample t-test
• Hypotheses

• 𝐻0: 𝜇_1=𝜇_2; 𝐻1:𝜇_1≠𝜇_2

• Test the null hypothesis that the means of groups are equal.
• Test the null hypothesis that the means of variables are equal.

• 𝐻0: 𝜇_1≤𝜇_2; 𝐻1:𝜇_1>𝜇_2

• 𝐻0: 𝜇_1≥𝜇_2; 𝐻1:𝜇_1<𝜇_2

Two-sample t-test
• R function

• t.test(x, y, alternative = "two.sided", var.equal = FALSE)

• var.equal: whether the two variances are assumed to be the same or not

• alternative = "two.sided", "greater", "less"

• alternative = "greater"
• 𝐻0: 𝜇_1≤𝜇_2; 𝐻1: 𝜇_1>𝜇_2
• alternative = "less"
• 𝐻0: 𝜇_1≥𝜇_2; 𝐻1: 𝜇_1<𝜇_2

• var.equal = TRUE or FALSE (T or F)
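
A small sketch of what the var.equal argument changes, using two simulated groups (hypothetical data). With var.equal = FALSE, which is R's default, t.test() runs the Welch test. With var.equal = TRUE it runs the pooled-variance test, which has n1 + n2 - 2 degrees of freedom.

```r
set.seed(2)
x <- rnorm(50, mean = 10, sd = 2)   # hypothetical group 1
y <- rnorm(50, mean = 11, sd = 2)   # hypothetical group 2

welch  <- t.test(x, y, alternative = "two.sided", var.equal = FALSE)
pooled <- t.test(x, y, alternative = "two.sided", var.equal = TRUE)

welch$method       # names the Welch test
pooled$parameter   # degrees of freedom: 50 + 50 - 2 = 98

# One-sided test of H1: mean(x) < mean(y)
less <- t.test(x, y, alternative = "less", var.equal = TRUE)
less$p.value
```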

Two-sample t-test

• Do we need to use var.equal = FALSE or TRUE?

• var(female$age); var(male$age)

• var.test(female$age, male$age, alternative = "two.sided")

• Cannot reject the null hypothesis that the variances of female age and male age are the same.

• t.test(female$age, male$age, var.equal=T)
• Can reject the null hypothesis that the average female age and the average male age are the same (given the
equal variance assumption).
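
The code above uses `female` and `male` as separate data sets. One way they might be built, and how the F-test result can feed into var.equal, is sketched below. The data are simulated, and both the coding (1 = female, 0 = male) and the 0.05 cutoff are assumptions.

```r
set.seed(3)
# Hypothetical stand-in for the course data: female is a 0/1 indicator
demo <- data.frame(
  female = rep(c(0, 1), each = 40),
  age    = c(rnorm(40, mean = 44, sd = 9), rnorm(40, mean = 40, sd = 9))
)

# Split the rows by sex (assumes 1 = female, 0 = male)
female <- demo[demo$female == 1, ]
male   <- demo[demo$female == 0, ]

# Step 1: F test of the null hypothesis that the two variances are equal
vt <- var.test(female$age, male$age, alternative = "two.sided")

# Step 2: pick var.equal from the F-test result (0.05 is an assumed cutoff)
equal_var <- vt$p.value >= 0.05
tt <- t.test(female$age, male$age, var.equal = equal_var)
tt$p.value
```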
In-class practice
• Q5. Two-sample t-test

• Tip: For the following questions, first test whether the variances are the same, then
use the appropriate argument and parameter.

• Q5-1. Is average female income significantly different from average male income?

• Q5-2. Is average female age significantly different from average male age?
Multivariate Descriptive Statistics

• Measures of the relationship between two variables

• Covariance: cov()
• Correlation
• cor(x, y, method = c("pearson", "kendall", "spearman"))
• cor.test(x, y, method=c("pearson", "kendall", "spearman"))
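
A worked sketch on two short made-up vectors, showing how covariance and Pearson correlation relate: the correlation is the covariance divided by the product of the two standard deviations. cor.test() adds a test of the null hypothesis that the correlation is zero.

```r
x <- c(1, 2, 3, 4, 5)   # hypothetical values
y <- c(2, 4, 5, 4, 5)

cv <- cov(x, y)                        # sample covariance: 1.5
r  <- cor(x, y, method = "pearson")    # Pearson correlation

all.equal(r, cv / (sd(x) * sd(y)))     # TRUE

ct <- cor.test(x, y, method = "pearson")
ct$estimate                            # same correlation coefficient
ct$p.value                             # p-value for H0: correlation = 0
```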
In-class practice

• Q6. What are the correlations and covariances of the following pairs? Use
Pearson correlation for correlation coefficient calculation. Are the
correlation coefficients significantly different from zero?

• female and age

• female and income
• age and income
Plotting
• Histogram
• hist()
• Let’s try the following code
• hist(demo$female, main = "Female histogram", xlab="Female or not", col="blue")

• Histogram by group
• ggplot2 package
• Data visualization package.
• Lots of resources to learn about the details. Ex) https://ggplot2.tidyverse.org/
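
For the base-R hist() call shown earlier on this slide, the sketch below uses simulated ages (hypothetical data) and explicit bin edges. hist() also returns an object whose counts can be inspected, which helps when checking a plot against a frequency table.

```r
set.seed(4)
age <- round(runif(200, min = 20, max = 70))   # hypothetical ages

# Draw the histogram with 5-year bins and keep the returned object
h <- hist(age, breaks = seq(20, 70, by = 5),
          main = "Age histogram", xlab = "Age", col = "blue")

h$breaks        # bin edges: 20, 25, ..., 70
h$counts        # number of observations in each of the 10 bins
sum(h$counts)   # 200, every observation falls in some bin
```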
Plotting
• Let’s try the following code

• install.packages("ggplot2")
• library(ggplot2)

• ggplot(demo, aes(x = female)) +
geom_histogram(color = "grey30", fill = "blue") +
ggtitle("Female ggplot histogram")
Plotting
• Let’s try the following code

• ggplot(demo, aes(x = age, fill = factor(female))) +
geom_histogram(binwidth = 1)

• ggplot(demo, aes(x = age, fill = factor(female))) +
geom_histogram(binwidth = 5)

• ggplot(demo, aes(x = age, fill = factor(female))) +
geom_histogram(binwidth = 5, position = "dodge")

• ggplot(demo, aes(x = age)) +
geom_histogram(binwidth = 1, color = "grey30") +
facet_grid(female ~ .)
In-class practice

• Q7. Plots

• Q7-1. Create histograms that show the distribution of female, age, and income.

• Q7-2. Create a histogram that shows the distribution of age by sex (female or male).
Use binwidth = 5.

• Q7-3. Create a histogram that shows the distribution of income by sex (female or
male). Use binwidth = 5000.
