
ST & Spark Syllabus

The M.Sc. Data Science course on Statistical Methods for Data Science covers foundational statistical concepts, including probability, hypothesis testing, and regression analysis, with a focus on practical application in data science. Students will learn to use statistical software like R or Python for data analysis and enhance their critical thinking skills. The course consists of two modules covering applied statistics and multivariate techniques, with assessments split between university and college evaluations.

Uploaded by

Chitra Venkatesh

Programme Name: M.Sc. Data Science
Course Name: Statistical Methods for Data Science
(Semester-I)

Total Credits: 02 Total Marks: 50

University assessment: 25 College assessment: 25

Prerequisite:
Knowledge of statistics and mathematical concepts

Course outcomes:
• Students will gain a solid understanding of foundational statistical concepts, including
probability, sampling distributions, hypothesis testing, and confidence intervals. They will
learn the principles and techniques used in statistical analysis.
• Students will learn how to apply statistical methods to analyze data in the context of data
science. They will become proficient in using statistical techniques such as regression
analysis, analysis of variance (ANOVA), chi-square tests, and non-parametric tests.
• Students will gain proficiency in using statistical software and programming languages
such as R or Python to implement statistical analyses. They will learn how to write code
to perform statistical calculations, visualize data, and automate data analysis processes.
• Students will enhance their critical thinking skills and ability to solve problems using
statistical methods.

Course Code    Course Title                            Total Credits
PSDS505        Statistical Methods for Data Science    02
MODULE - I                                             02
Unit 1: Introduction to Applied Statistics
The Nature of Statistics and Inference, What is “Big Data”?, Statistical Modelling,
Statistical Significance Testing and Error Rates, Simple Example of Inference
Using a Coin, Statistics is for Messy Situations, Type I versus Type II Errors, Point
Estimates and Confidence Intervals, Variable Types, Sample Size, Statistical
Power, and Statistical Significance, The Verdict on Significance Testing, Training
versus Test Data.
Means, Correlations, Counts: Drawing Inferences: Computing z and Related
Scores, Statistical Tests, Plotting Normal Distributions, Correlation Coefficients,
Evaluating Pearson's r for Statistical Significance, Spearman's Rho: A
Nonparametric Alternative to Pearson.
Tests of Mean Differences: t-Tests for One Sample, Two Sample t-Test,
Paired-Samples t-Test.
Categorical Data: Binomial Test, Categorical Data Having More Than Two
Possibilities.
Power Analysis and Sample Size Estimation: Power for t-Tests, Power for
One-Way ANOVA, Power for Correlations.
Analysis of Variance: Fixed Effects, Random Effects, Mixed Models, Introducing
the Analysis of Variance (ANOVA), Performing the ANOVA, Random Effects
ANOVA and Mixed Models, One-Way Random Effects ANOVA.
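
As a hedged illustration of the "Simple Example of Inference Using a Coin" and "Binomial Test" topics listed above, the sketch below computes an exact two-sided binomial p-value using only the Python standard library. The function name and the 8-heads-in-10-flips data are hypothetical and are not taken from the syllabus or its reference books.

```python
from math import comb


def binomial_two_sided_p(heads: int, flips: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the null probabilities of every outcome that is no more likely
    than the observed number of heads (hypothetical helper for illustration).
    """
    # Probability of each possible head count 0..flips under the null hypothesis.
    pmf = [
        comb(flips, k) * p_null**k * (1 - p_null) ** (flips - k)
        for k in range(flips + 1)
    ]
    observed = pmf[heads]
    # Small relative tolerance so floating-point ties count as "as extreme".
    total = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, total)


# Hypothetical data: 8 heads in 10 flips of a coin assumed fair under the null.
p_value = binomial_two_sided_p(heads=8, flips=10)
print(f"p-value = {p_value:.4f}")
```

With these assumed numbers the p-value is above 0.05, so the fair-coin hypothesis would not be rejected at the conventional 5% level; a larger sample with the same proportion of heads could lead to a different conclusion, which is the link to the sample size and power topics in this unit.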

Unit 2: Multivariate Techniques
Simple and Multiple Linear Regression, Hierarchical Regression, How Forward
Regression Works.
Logistic Regression and the Generalized Linear Model, Predicting Probabilities,
Multiple Logistic Regression, Training Error Rate Versus Test Error Rate.
Multivariate Analysis of Variance (MANOVA) and Discriminant Analysis:
Multivariate Tests of Significance, Example of MANOVA, Outliers,
Homogeneity of Covariance Matrices, Linear Discriminant Function Analysis,
Theory of Discriminant Analysis, Predicting Group Membership, Visualizing
Separation.
Principal Component Analysis: Principal Component Analysis Versus Factor
Analysis, Properties of Principal Components, Component Scores, How Many
Components to Keep?
Exploratory Factor Analysis: Common Factor Analysis Model, Factor Analysis
Versus Principal Component Analysis on the Same, Initial Eigenvalues in Factor
Analysis, Rotation in Exploratory Factor Analysis, Estimation in Factor Analysis.
Cluster Analysis: k-Means Cluster Analysis, Minimizing Criteria, Example of
k-Means Clustering, Hierarchical Cluster Analysis, Why Clustering Is Inherently
Subjective.
Nonparametric Tests: Mann–Whitney U Test, Kruskal–Wallis Test,
Nonparametric Test for Paired Comparisons and Repeated
Reference Books:

1. Gupta, S. C. and Kapoor, V. K. (2000). Fundamentals of Mathematical Statistics, Tenth
Edition. Sultan Chand & Sons.
2. Johnson, R. A. and Wichern, D. W. (2002). Applied Multivariate Statistical Analysis.
Prentice-Hall, New Jersey.
3. Draper, N. R. and Smith, H. (1998). Applied Regression Analysis, Third Edition. John
Wiley.
4. Purohit, S. G., Gore, S. D. and Deshmukh, S. R. (2015). Statistics Using R, Second
Edition. Narosa Publishing House, New Delhi.
5. Daniel, W. W. Applied Non-Parametric Statistics, First Edition. Houghton Mifflin
Company, Boston.
