BD Question Bank MCQ Answered

Uploaded by

kalyaniprabhudesai2211


Shivaji University, Kolhapur

Question Bank for Mar 2022 (Summer) Examination

Subject Code: 84719, Subject Name: Big Data Analytics

1. What are the main components of Hadoop?
   Options: MapReduce; HDFS; YARN; All of the above

2. Big data analytics works on unstructured data, where no specific pattern of the data is defined.
   Options: True; False; Can't Say; None of the above

3. Identify the incorrect big data technology.
   Options: Apache Kafka; Apache Spark; Apache Pytorch; Apache Hadoop

4. Identify which of the options below is a general-purpose computing model and runtime system for distributed data analytics.
   Options: HDFS; MapReduce; Oozie; All of the above

5. Big data analysis does the following except?
   Options: Spreads data; Analyzes data; Organizes data; Collects data

6. What is NOT a characteristic of big data?
   Options: Volume; Variety; Vision; Velocity

7. Pig is a Hadoop-based open-source platform for analyzing large-scale datasets via its own SQL-like language, _______
   Options: Pig Latin; Pig German; Pig Roman; Pig Italian

8. The key aspect of the MapReduce algorithm is that if every Map and Reduce is independent of all other ongoing Maps and Reduces in the network, the operation will run in ______ keys and lists of data.
   Options: series on same; series on different; parallel on different; parallel on same

9. In Hadoop MapReduce, _____ is a Java class that comes with several methods to retrieve keys and values by iterating over them across the data splits.
   Options: Mapper; RecordReader; Reporter; RecordCollector

10. Which of the following scenarios makes HDFS unavailable?
    Options: TaskTracker failure; JobTracker failure; NameNode failure; DataNode failure

11. Hadoop MapReduce is a popular _____ for easily writing applications. It processes vast amounts of data (multi-terabyte datasets) in parallel on large clusters (thousands of nodes).
    Options: Spring framework; Java framework; Django framework; Web framework

12. Which is not a way to link R and Hadoop?
    Options: Hadoop streaming; RHIPE; RHadoop; RHDFS

13. The RHIPE package uses the ________ technique to perform data analytics over Big Data.
    Options: Divide and recombine; Divide and conquer; Integrate and recombine; None of the above

14. The _____ phase of the data analytics lifecycle usually takes the longest time.
    Options: Phase 2: Data Preparation; Phase 3: Model Planning; Phase 4: Model Building; Phase 5: Communicate Results

15. The data analytics project life cycle stages, in correct sequence, are __________
    Options:
      Identifying the problem > designing the requirements > performing analytics over data > pre-processing data > visualizing data;
      Identifying the problem > designing the requirements > pre-processing data > performing analytics over data > visualizing data;
      Identifying the problem > performing analytics over data > designing the requirements > pre-processing data > visualizing data;
      Identifying the problem > visualizing data > designing the requirements > pre-processing data > performing analytics over data

16. Which of the following is/are true about the Random Forest and Gradient Boosting ensemble methods?
    1. Both methods can be used for classification tasks.
    2. Random Forest is used for classification whereas Gradient Boosting is used for regression tasks.
    3. Random Forest is used for regression whereas Gradient Boosting is used for classification tasks.
    4. Both methods can be used for regression tasks.
    Options: 1; 2; 2 and 3; 1 and 4

17. In Random Forest you can generate hundreds of trees (say T1, T2, ..., Tn) and then aggregate the results of these trees. Which of the following is true about an individual tree (Tk) in Random Forest?
    1. The individual tree is built on a subset of the features.
    2. The individual tree is built on all the features.
    3. The individual tree is built on a subset of the observations.
    4. The individual tree is built on the full set of observations.
    Options: 1 and 3; 1 and 4; 2 and 3; 2 and 4

18. The primary Machine Learning API for Spark is now the _____ based API.
    Options: DataFrame; Dataset; RDD; All of the above

19. Which of the following is a module for structured data processing?
    Options: GraphX; MLlib; SparkSQL; SparkR

20. SparkSQL translates commands into codes. These codes are processed by ________
    Options: Driver nodes; Executor nodes; Cluster Manager; None of the above

21. SparkSQL plays the main role in the optimization of queries.
    Options: True; False; Can't Say; None is correct

22. Which of the following is not a SparkSQL query execution phase?
    Options: Analysis; Logical Optimization; Physical Planning; Execution

23. What is an action in Spark RDD?
    Options:
      The ways to send results from executors to the driver;
      Takes an RDD as input and produces one or more RDDs as output;
      Creates one or many new RDDs;
      All of the above

24. Which of the following is true about a narrow transformation?
    Options:
      The data required to compute resides on multiple partitions;
      The data required to compute resides on a single partition;
      Both;
      None of the above

25. __________ is a distributed machine learning framework on top of Spark.
    Options: MLlib; GraphX; RDDs; Spark Streaming

26. Which of the following components of the Spark runtime architecture provides resources to execute a task?
    Options: Cluster manager; Driver program; Worker nodes; Spark context

27. Among the following options, identify the one which is not a type of learning.
    Options: Semi-unsupervised learning; Unsupervised learning; Reinforcement learning; Supervised learning

28. Identify the type of learning in which labeled training data is used.
    Options: Semi-unsupervised learning; Unsupervised learning; Reinforcement learning; Supervised learning

29. Machine learning is a subset of which of the following?
    Options: Artificial Intelligence; Deep Learning; Data Learning; None of the above

30. Which of the following machine learning techniques helps in detecting the outliers in data?
    Options: Classification; Clustering; Anomaly Detection; All of the above

31. Which of the following are common classes of problems in machine learning?
    Options: Regression; Classification; Clustering; All of the above

32. What is a content-based recommendation system?
    Options:
      Tries to recommend items based on a profile built from their preferences;
      Similarity among items;
      Similarity among users buying, watching, or enjoying something;
      All of the above

33. Machine Learning is a field of AI consisting of learning algorithms that __________
    Options: At executing some task; Over time with experience; Improve their performance; All of the above

34. Which of the following machine learning algorithms is based upon the idea of bagging?
    Options: Decision tree; Random forest; Classification; Regression

35. Among the following options, identify the one which is false regarding regression.
    Options: It relates inputs to outputs; It discovers causal relationships; It is used for prediction; It is used for interpretation

UNIT – I
1. Define Big Data. Explain the characteristics (V's) of Big Data.
2. Write a note on: Drivers for Big Data.
3. Explain different applications of Big Data.
4. Write a note on: Data Privacy Protection.
5. With a neat diagram, depict the Product Knowledge Hub in Big Data.
6. Write a short note on Location Based Services in Big Data.
7. Explain the architectural components of Big Data.
8. Explain the Real-time Adaptive Analytics and Decision Engine.
9. Explain Massively Parallel Processing (MPP) platforms.
10. Explain Unstructured Data Analytics and Reporting.

UNIT – II
1. Explain the features of the R language.
2. Explain the different phases of MapReduce with an example.
3. What is HDFS? Explain the features of HDFS.
4. Explain the HDFS and MapReduce architecture.
5. List and explain the different components of Hadoop.
6. Explain in detail the stages of Hadoop MapReduce data processing.
7. Explain in detail the dataflow of MapReduce with a diagram.
8. Explain the limitations of MapReduce.
9. Explain the data mining techniques which are used to perform data modeling in R.
10. Mention the different Hadoop installation modes. Explain each of them.
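
The MapReduce phases asked about in this unit can be sketched in plain Python as a study aid. This is a minimal single-machine sketch, not Hadoop code: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical, and a thread pool merely stands in for the cluster nodes that would run map and reduce tasks.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(split):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]


def shuffle_phase(mapped_outputs):
    """Shuffle: group all emitted values by key across every map output."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(item):
    """Reduce: combine the list of values collected for a single key."""
    key, values = item
    return key, sum(values)


def word_count(splits):
    # Assumption: a local thread pool imitates the worker nodes of a cluster.
    # Each map call reads only its own split and each reduce call only its
    # own key, so neither phase needs to coordinate with its siblings.
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_phase, splits))
        grouped = shuffle_phase(mapped)
        reduced = list(pool.map(reduce_phase, grouped.items()))
    return dict(reduced)


if __name__ == "__main__":
    splits = ["big data big clusters", "data nodes store data"]
    print(word_count(splits))
```

With the two sample splits, the sketch counts "big" twice and "data" three times, and every other word once. The shuffle step is the only point where results from different map calls meet.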

UNIT – III
1. Explain the architecture of RHIPE.
2. Explain RHadoop in detail.
3. Explain the architecture of RHadoop.
4. Explain the working of RHadoop with an example.
5. Explain the hsTableReader function for Hadoop streaming.
6. Explain the hsKeyValReader function for Hadoop streaming.
7. Explain the Hadoop streaming components.
8. Explain the format of Hadoop streaming commands, line by line.

UNIT – IV
1. Explain the Data Analytics project life cycle stages.
2. Explain how the data analytics problem of calculating the frequency of stock market
changes can be solved using MapReduce.
3. Write a case study for predicting the auction sales price of heavy equipment to
create a blue book for bulldozers.
4. Explain the Poisson-approximation resampling technique in the Map phase of a
MapReduce task.
5. How can data analytics help to identify the category of a web page of a website,
where pages are categorized by popularity as high, medium, or low (regular) based on
their visit counts?
6. Write the steps to build and run the MapReduce algorithm with R and Hadoop
integration for the web page categorization problem.
7. Explain pre-processing and performing analytics over any data.
8. Explain how the MapReduce problem is designed for computing the frequency of
stock market changes.

UNIT – V
1. What is a Resilient Distributed Dataset (RDD)? Explain transformations and actions
in RDD. Explain RDD operations in brief.
2. Why is Spark preferred over Hadoop? Explain the limitations of Hadoop.
3. Explain how Spark overcomes the limitations of Hadoop.
4. Briefly explain the core components in Spark.
5. Explain the architecture of Spark.
6. What is SparkContext in Apache Spark?
7. What is a directed acyclic graph (DAG) in Spark, and how does it work?
8. What are Spark DataFrames? Why do we use them in Spark?
9. Explain Apache Spark RDD operations in detail.
10. What are the different types of RDD transformations? Explain functions in RDD
transformation.
11. What are RDD actions? When are they used? Explain Spark actions.
12. What are the deployment modes in Spark? What is the difference between client and
cluster mode deployment?
13. What are the components of the Spark architecture?
14. What is Spark Core? What are the various functions of Spark Core? Which
components are built on top of Spark Core?
15. What are the components of Spark Streaming? What is Spark Streaming used for?
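
For the RDD questions in this unit, the difference between transformations (which only describe work) and actions (which trigger it) can be illustrated with a small pure-Python analogy. This is a sketch under stated assumptions: `LazyDataset` is a hypothetical class, not part of Spark's API, and it ignores partitioning and distribution entirely.

```python
class LazyDataset:
    """Hypothetical stand-in for an RDD; this is NOT Spark's API."""

    def __init__(self, source, steps=()):
        self._source = list(source)
        self._steps = tuple(steps)  # recorded lineage of transformations

    # Transformations: return a new dataset and do no work yet.
    def map(self, fn):
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, predicate):
        return LazyDataset(self._source, self._steps + (("filter", predicate),))

    def _run(self):
        # Replay the recorded steps, in order, over every source element.
        for item in self._source:
            keep = True
            for kind, fn in self._steps:
                if kind == "map":
                    item = fn(item)
                elif not fn(item):
                    keep = False
                    break
            if keep:
                yield item

    # Actions: trigger evaluation and return a plain value to the caller.
    def collect(self):
        return list(self._run())

    def count(self):
        return sum(1 for _ in self._run())


if __name__ == "__main__":
    numbers = LazyDataset(range(1, 6))
    even_squares = numbers.map(lambda x: x * x).filter(lambda x: x % 2 == 0)
    print(even_squares.collect())
    print(even_squares.count())
```

Calling `map` or `filter` only extends the recorded list of steps. Nothing is computed until `collect` or `count` is called, which mirrors the way actions return a result to the caller.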

UNIT – VI
1. What is machine learning? Explain the types of machine-learning algorithms.
2. Explain the Supervised Machine Learning Algorithm.
3. Explain how linear regression is performed with R and Hadoop.
4. Explain how logistic regression is performed with R and Hadoop.
5. Explain the Unsupervised Machine Learning Algorithm.
6. Explain the steps to perform clustering with R and Hadoop.
7. Explain the steps to generate recommendations in R.
8. What is a recommendation algorithm? Explain two different types of
recommendation algorithms.
9. How do you create a recommendation algorithm with R and Hadoop?
10. How can one use R and Hadoop together to generate recommendations from big
datasets?

