AMAZON DATA SCIENCE INTERVIEW
𝗪𝗵𝗮𝘁 𝗶𝘀 𝘃𝗮𝗿𝗶𝗮𝗻𝗰𝗲 𝗶𝗻 𝗮 𝗺𝗼𝗱𝗲𝗹?
Variance in a model refers to how much the model's
predictions change when trained on different
subsets of the data. It captures the sensitivity of the
model to variations in the training data.
High variance means the model is very sensitive to
the specific data it was trained on. This results in
large fluctuations in predictions when exposed to
different datasets, even if they are similar. High
variance is typically associated with overfitting.
Low variance means the model's predictions are
stable, even when trained on different datasets.
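One way to see this concretely: retrain the same model on bootstrap resamples of the data and measure how much its predictions move. Below is a minimal sketch, assuming scikit-learn; the synthetic dataset and tree settings are purely illustrative:

```python
# Measure model variance: train the same model on different bootstrap
# samples and look at how much its predictions change.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor
from sklearn.utils import resample

X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=0)
X_test = X[:50]  # fixed points to evaluate predictions on

preds = []
for seed in range(20):
    X_boot, y_boot = resample(X, y, random_state=seed)  # new training subset
    model = DecisionTreeRegressor(max_depth=None, random_state=0)  # deep tree -> high variance
    preds.append(model.fit(X_boot, y_boot).predict(X_test))

# Variance of predictions across retrainings, averaged over the test points.
print("prediction variance:", np.var(np.array(preds), axis=0).mean())
```

A deep, unpruned tree will typically show much higher prediction variance here than a shallow tree (e.g. max_depth=2) trained the same way.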
𝗜𝘀 𝗮 𝗱𝗲𝗰𝗶𝘀𝗶𝗼𝗻 𝘁𝗿𝗲𝗲 𝗺𝗼𝗱𝗲𝗹 𝗯𝗲𝘀𝘁 𝗳𝗼𝗿 𝗽𝗿𝗲𝗱𝗶𝗰𝘁𝗶𝗻𝗴 𝗶𝗳 𝗮
𝗯𝗼𝗿𝗿𝗼𝘄𝗲𝗿 𝘄𝗶𝗹𝗹 𝗽𝗮𝘆 𝗯𝗮𝗰𝗸 𝗮 𝗽𝗲𝗿𝘀𝗼𝗻𝗮𝗹 𝗹𝗼𝗮𝗻? 𝗛𝗼𝘄 𝘄𝗼𝘂𝗹𝗱
𝘆𝗼𝘂 𝗲𝘃𝗮𝗹𝘂𝗮𝘁𝗲 𝗽𝗲𝗿𝗳𝗼𝗿𝗺𝗮𝗻𝗰𝗲 𝗼𝗳 𝘁𝗵𝗲 𝗺𝗼𝗱𝗲𝗹?
This would depend on the data and on the model's actual performance.
Decision trees can be a good starting point: they are interpretable, handle non-linear relationships, and require minimal pre-processing.
However, financial datasets tend to be imbalanced (defaults are rare relative to repayments), so assessing performance on a wide variety of classification metrics like precision, recall, and F1-score matters more than accuracy alone.
From here, the interviewer might have follow-ups, so make sure you understand the metrics very well, specifically precision, recall, F1, AUC-ROC, and AUC-PR. A sketch of computing them is below.
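A minimal sketch with scikit-learn; the dataset is a synthetic stand-in for loan data, and the 10% positive (default) rate and model settings are illustrative assumptions:

```python
# Evaluate a loan-repayment classifier on an imbalanced dataset
# with several metrics, not just accuracy.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import (precision_score, recall_score, f1_score,
                             roc_auc_score, average_precision_score)

# Synthetic stand-in for loan data: ~10% positive (default) class.
X, y = make_classification(n_samples=5000, weights=[0.9], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

clf = DecisionTreeClassifier(max_depth=5, random_state=42).fit(X_tr, y_tr)
y_pred = clf.predict(X_te)
y_prob = clf.predict_proba(X_te)[:, 1]  # scores for threshold-free metrics

print("precision:", precision_score(y_te, y_pred))
print("recall:   ", recall_score(y_te, y_pred))
print("F1:       ", f1_score(y_te, y_pred))
print("AUC-ROC:  ", roc_auc_score(y_te, y_prob))
print("AUC-PR:   ", average_precision_score(y_te, y_prob))  # PR-curve summary
```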
𝗪𝗵𝗮𝘁 𝘄𝗼𝘂𝗹𝗱 𝘆𝗼𝘂 𝗱𝗼 𝗶𝗳 20% 𝗼𝗳 𝘁𝗵𝗲 100,000 𝘀𝗼𝗹𝗱 𝗹𝗶𝘀𝘁𝗶𝗻𝗴𝘀 𝗮𝗿𝗲 𝗺𝗶𝘀𝘀𝗶𝗻𝗴 𝘀𝗾𝘂𝗮𝗿𝗲 𝗳𝗼𝗼𝘁𝗮𝗴𝗲 𝗱𝗮𝘁𝗮? 𝗬𝗼𝘂 𝘄𝗮𝗻𝘁 𝘁𝗼 𝗽𝗿𝗲𝗱𝗶𝗰𝘁 𝗽𝗿𝗶𝗰𝗲.
The right technique depends on the cause of the missing values: are they missing at random, or does a missing value mean the listing is 'pending' or 'not ready for sale'? Once we understand the reason, we can handle it in different ways:
(a) drop the feature if it's not predictive, or if other features are good proxies;
(b) impute - mean/median or a KNN imputer;
(c) use a model that handles missing values natively, such as XGBoost or scikit-learn's HistGradientBoosting (standard Random Forest implementations usually require imputation first).
Get feedback from the interviewer and be ready to dive into each approach! A sketch of options (b) and (c) follows.
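A short sketch assuming pandas and scikit-learn; the column names and toy listing data are made up for illustration:

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer, KNNImputer
from sklearn.ensemble import HistGradientBoostingRegressor

# Toy listings table; 'sqft' has missing values, 'price' is the target.
df = pd.DataFrame({
    "sqft":  [900, np.nan, 1500, np.nan, 2100],
    "beds":  [2, 3, 3, 1, 4],
    "price": [200_000, 310_000, 330_000, 150_000, 450_000],
})

# Option (b): impute with the median, or with KNN using the other features.
df["sqft_median"] = SimpleImputer(strategy="median").fit_transform(df[["sqft"]]).ravel()
df["sqft_knn"] = KNNImputer(n_neighbors=2).fit_transform(df[["sqft", "beds"]])[:, 0]

# Option (c): pass the NaNs straight to a model that handles them natively.
model = HistGradientBoostingRegressor().fit(df[["sqft", "beds"]], df["price"])
print(df)
print(model.predict(df[["sqft", "beds"]]))
```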
𝗪𝗵𝗮𝘁 𝗶𝘀 𝘁𝗵𝗲 𝗱𝗶𝗳𝗳𝗲𝗿𝗲𝗻𝗰𝗲 𝗯𝗲𝘁𝘄𝗲𝗲𝗻 𝗫𝗚𝗕𝗼𝗼𝘀𝘁 𝗮𝗻𝗱
𝗿𝗮𝗻𝗱𝗼𝗺 𝗳𝗼𝗿𝗲𝘀𝘁?
Some differences are:
XGBoost: an implementation of gradient boosting. XGBoost builds trees 𝘀𝗲𝗾𝘂𝗲𝗻𝘁𝗶𝗮𝗹𝗹𝘆, where each new tree tries to correct the errors made by the previous trees. It has lower bias due to the sequential learning process, but potentially higher variance if overfitting occurs; regularization techniques are applied to mitigate this.
Random Forest: an example of bagging (bootstrap aggregating), where multiple trees are built 𝗶𝗻𝗱𝗲𝗽𝗲𝗻𝗱𝗲𝗻𝘁𝗹𝘆 𝗮𝗻𝗱 𝗶𝗻 𝗽𝗮𝗿𝗮𝗹𝗹𝗲𝗹. It has higher bias because each tree is grown independently, but variance is reduced by averaging across many trees. It has no explicit regularization, yet it naturally limits overfitting by averaging predictions from multiple trees and using random feature selection.
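An illustrative side-by-side, assuming scikit-learn and the xgboost package are installed; the hyperparameters are arbitrary choices, not tuned recommendations:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

X, y = make_classification(n_samples=2000, random_state=0)

# Bagging: independent deep trees, variance reduced by averaging.
rf = RandomForestClassifier(n_estimators=300, random_state=0)

# Boosting: shallow trees fit sequentially to the previous trees' errors,
# with explicit regularization (learning_rate, reg_lambda) to curb overfitting.
xgb = XGBClassifier(n_estimators=300, max_depth=3, learning_rate=0.1,
                    reg_lambda=1.0, random_state=0)

for name, model in [("Random Forest", rf), ("XGBoost", xgb)]:
    print(name, cross_val_score(model, X, y, cv=5).mean())
```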
WAS THIS HELPFUL?
Be sure to save it so you
can come back to it later!
@karunt