Data Science Viva Notes

The document provides a comprehensive overview of key concepts in data science, including Exploratory Data Analysis (EDA), data cleaning, statistical analysis, and various modeling techniques. It covers essential topics such as regression analysis, clustering, time series analysis, and machine learning fundamentals. Each concept is defined succinctly, making it a useful reference for understanding data science methodologies.

Uploaded by

shrutikurade0

Q: What is Exploratory Data Analysis (EDA)?

A: EDA is the process of exploring and visualizing data to understand its structure, patterns, and relationships before applying any model.

Q: Summary statistics means?

A: Summary statistics are basic values that describe a dataset, such as the mean, median, mode, minimum, maximum, and standard deviation.
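
As a rough illustration, the values listed above can be computed with Python's standard `statistics` module. The sample data here is made up for the example and is not from the notes.

```python
import statistics

# Made-up sample, purely for illustration
values = [2, 4, 4, 5, 7, 9, 11]

summary = {
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "mode": statistics.mode(values),
    "min": min(values),
    "max": max(values),
    "std_dev": statistics.stdev(values),  # sample standard deviation (divides by n - 1)
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Note that `statistics.stdev` gives the sample standard deviation; `statistics.pstdev` would give the population version.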

Q: Histogram displays what?

A: A histogram shows the frequency distribution of a numeric variable.

Q: Box plot?

A: A box plot shows the spread of data using median, quartiles, and outliers.

Q: How to conclude after seeing boxplot?

A: You can see whether the data is symmetric or skewed, and whether there are outliers.

Q: Whiskers means?

A: Whiskers in a boxplot extend from the box to the smallest and largest values that lie within 1.5 × IQR of the first and third quartiles. Points beyond the whiskers are plotted as outliers.
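
A minimal sketch of the 1.5 × IQR rule, assuming made-up data and the "inclusive" quartile method of Python's `statistics.quantiles`. Plotting libraries may use slightly different quartile conventions, so exact fence values can differ.

```python
import statistics

# Made-up sample with one obviously large value
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]

# Quartiles (the quartile convention is an assumption of this sketch)
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Fences: the furthest a whisker is allowed to reach
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme data points inside the fences
inside = [v for v in values if lower_fence <= v <= upper_fence]
lower_whisker, upper_whisker = min(inside), max(inside)

# Anything outside the fences is drawn as an outlier
outliers = [v for v in values if v < lower_fence or v > upper_fence]

print("whiskers:", lower_whisker, upper_whisker)
print("outliers:", outliers)
```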

Q: Scatter plot?

A: A scatter plot shows the relationship between two numeric variables.

Q: Data cleaning?

A: Data cleaning means fixing or removing wrong, incomplete, or inconsistent data.

Q: Handling inconsistencies mean?

A: It means correcting values that are wrongly formatted or mismatched in the dataset.

Q: How to apply imputation?

A: Imputation is filling missing values using mean, median, mode, or predictive models.
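
One simple way to apply this is median imputation, sketched below in plain Python. The `ages` list and the choice of the median are assumptions for illustration; the mean or mode could be substituted in the same way.

```python
import statistics

# None marks a missing value (made-up data)
ages = [25, None, 31, 40, None, 28]

# Compute the fill value from the observed entries only
observed = [a for a in ages if a is not None]
fill_value = statistics.median(observed)

# Replace each missing entry with the fill value
imputed = [fill_value if a is None else a for a in ages]
print(imputed)
```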

Q: How to remove duplicates?

A: Use tools or code (like `.drop_duplicates()` in Python) to delete repeated rows.

Q: Data transformation and feature engineering means?

A: Data transformation changes the data format or scale. Feature engineering creates new useful features for the model.

Q: Normalization means?

A: Scaling numeric features to a common range (like 0 to 1) so that features with large values do not dominate features with small values.
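
A minimal sketch of min-max scaling to the range 0 to 1. The helper name `min_max_scale` and the income figures are hypothetical, chosen only for the example.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Made-up incomes
incomes = [20_000, 35_000, 50_000, 80_000]
scaled = min_max_scale(incomes)
print(scaled)
```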

Q: Data transformation: converting categorical variables?

A: Convert them into numbers using encoding like One-Hot Encoding or Label Encoding.
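
The two encodings can be sketched in plain Python as follows. The `colors` list is made up, and the alphabetical category order is an assumption of this sketch rather than a rule.

```python
# Made-up categorical column
colors = ["red", "green", "blue", "green"]

# Fix an order for the categories (alphabetical here, by assumption)
categories = sorted(set(colors))

# Label Encoding: each category becomes one integer
label_of = {cat: i for i, cat in enumerate(categories)}
label_encoded = [label_of[c] for c in colors]

# One-Hot Encoding: each category becomes its own 0/1 column
one_hot = [[1 if c == cat else 0 for cat in categories] for c in colors]

print(categories)
print(label_encoded)
print(one_hot)
```

Label Encoding implies an order between categories, so One-Hot Encoding is usually the safer choice when the categories have no natural ranking.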

Q: Binning?

A: Binning means converting continuous data into fixed intervals or categories.
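
As an illustration, ages can be binned into named groups with the standard `bisect` module. The bin edges and labels below are hypothetical.

```python
import bisect

# Hypothetical bin edges: each edge is the lower bound of the next bin
edges = [18, 35, 60]
labels = ["child", "young adult", "adult", "senior"]

def bin_age(age):
    """Return the label of the interval that contains age."""
    return labels[bisect.bisect_right(edges, age)]

ages = [5, 18, 34, 35, 59, 72]
binned = [bin_age(a) for a in ages]
print(binned)
```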

Q: Polynomial feature creation?

A: Creating new features by raising existing numeric features to powers (like x², x³).

Q: Statistical analysis?

A: It involves using mathematical techniques to summarize, understand, and draw conclusions from data.

Q: Hypothesis testing means?

A: It uses sample data to decide whether there is enough evidence to reject a default assumption (the null hypothesis) about a population.

Q: Regression analysis?

A: A technique to study relationships between variables and predict one based on others.

Q: Linear regression model means?

A: A model that predicts an output using a straight-line relationship with input(s).
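
A minimal sketch of fitting a straight line with ordinary least squares for one input. The helper `fit_line` and the hours/scores data are made up for the example; the data lies exactly on a line so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: scores are exactly 50 + 2 * hours
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]

slope, intercept = fit_line(hours, scores)
predicted = intercept + slope * 6  # prediction for 6 hours
print(slope, intercept, predicted)
```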

Q: T-test means?

A: A test to compare the means of two groups to see if they are significantly different.

Q: Chi-square test?

A: A test to check the association between two categorical variables.

Q: P-value means?

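A: It is the probability of observing a result at least as extreme as the one obtained, assuming the null hypothesis is true. A small p-value (conventionally < 0.05) is taken as evidence against the null hypothesis, and the result is called statistically significant.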

Q: Logistic regression?

A: A model used for classification problems (like yes/no) by predicting probability.

Q: Accuracy means?

A: Accuracy is the percentage of correct predictions made by a model.

Q: Accuracy, Precision, and Recall?

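A: Accuracy: the fraction of all predictions that are correct. Precision: of the cases predicted positive, the fraction that are actually positive, TP / (TP + FP). Recall: of the actual positives, the fraction that are predicted positive, TP / (TP + FN).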
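
The three metrics can be computed by counting true/false positives and negatives. The labels below are made up for illustration (1 = positive, 0 = negative).

```python
# Made-up true labels and model predictions
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(f"accuracy={accuracy}, precision={precision}, recall={recall}")
```

Here the three values differ (0.625, 0.6, 0.75), which shows why accuracy alone can hide how a model treats the positive class.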

Q: ROC AUC curve?

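A: The ROC curve plots the true positive rate against the false positive rate across classification thresholds. AUC is the area under that curve: a score near 1 is best, and 0.5 is no better than random guessing.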

Q: Clustering means?

A: Grouping similar data points together based on features.

Q: Segmentation?

A: Dividing data into meaningful groups (like customer segments).

Q: K-means clustering?

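A: A method that divides data into 'k' clusters by assigning each point to the nearest cluster center (centroid) and updating the centroids until the assignments stop changing.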
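
A simplified one-dimensional sketch of the k-means loop, assuming fixed starting centroids so the result is reproducible. Real implementations usually pick starting centroids randomly and work on several features; the helper `kmeans_1d` is hypothetical.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Alternate between assigning points and updating centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid)
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Made-up data with two obvious groups, k = 2
points = [1, 2, 3, 10, 11, 12]
centroids, clusters = kmeans_1d(points, centroids=[1, 12])
print(centroids, clusters)
```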

Q: Difference between clustering and segmentation?

A: Clustering is the technique; segmentation is the result or goal.

Q: Churn prediction model?

A: A model that predicts which customers are likely to leave (churn).

Q: Time series analysis?

A: Analyzing data collected over time to find patterns and trends.

Q: Trend?

A: Long-term movement in data (upward or downward).

Q: Seasonality?

A: Repeating patterns at regular intervals (like monthly or yearly).

Q: Noise components?

A: Random or irregular variations in data that cannot be explained.

Q: Outliers mean?

A: Unusual values far from most of the data.

Q: ARIMA?

A: A forecasting model using past values and errors. It stands for AutoRegressive Integrated Moving Average.

Q: Forecasting means?

A: Predicting future values based on past data.

Q: Exponential smoothing?

A: A method to forecast data by giving more weight to recent observations.
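
A minimal sketch of simple exponential smoothing. The smoothing factor `alpha` = 0.5 and the sales figures are assumptions for the example; a larger `alpha` puts more weight on recent observations.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_(t-1)."""
    smoothed = [series[0]]  # start from the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

# Made-up sales figures
sales = [10, 12, 14, 16]
smoothed = exponential_smoothing(sales, alpha=0.5)

# The last smoothed value serves as the one-step-ahead forecast
forecast = smoothed[-1]
print(smoothed, forecast)
```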

Q: Anomalies?

A: Unusual or unexpected data points that don't fit the pattern.

Q: Z-score?

A: A value that shows how far a data point is from the mean, in standard deviations.
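
As an illustration, z-scores can be computed as (value - mean) / standard deviation and used to flag unusual points. The data is made up, and the cutoff of 2 is an assumption of this sketch (cutoffs of 2 or 3 are both common).

```python
import statistics

# Made-up sample with one unusually large value
values = [10, 12, 11, 13, 12, 50]

mean = statistics.mean(values)
std = statistics.pstdev(values)  # population standard deviation

# z-score: distance from the mean in units of standard deviation
z_scores = [(v - mean) / std for v in values]

# Flag points whose |z| exceeds the chosen cutoff (2 here, by assumption)
flagged = [v for v, z in zip(values, z_scores) if abs(z) > 2]
print(flagged)
```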

Q: Isolation Forest model?

A: A tree-based model that detects anomalies by isolating points with random splits; anomalies tend to be isolated in fewer splits than normal points.

Q: Profiling means?

A: Creating a summary of data to understand its structure, quality, and patterns.

Q: Correlation matrix?

A: A table showing how variables relate to each other (with values between -1 and 1).

Q: Correlation coefficient?

A: A number that shows the strength and direction of the relationship between two variables.
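
A minimal sketch of the Pearson correlation coefficient, the most common such number. The helper `pearson_r` and the data are made up; the two examples are chosen to give a perfect positive and a perfect negative relationship.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up data
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 6, 8, 10]   # rises with hours
errors = [10, 8, 6, 4, 2]   # falls with hours

r_positive = pearson_r(hours, scores)
r_negative = pearson_r(hours, errors)
print(r_positive, r_negative)
```

The sign gives the direction of the relationship and the magnitude gives its strength, with 0 meaning no linear relationship.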

Q: ML, AI, and Deep Learning?

A: AI is the broad field. ML is a part of AI that learns from data. Deep Learning is a type of ML using neural networks.

Q: Supervised and Unsupervised learning?

A: Supervised: learns with labeled data (has answers). Unsupervised: finds patterns from unlabeled data.
