UNIVERSITÉ DE SOUSSE
École Nationale d’Ingénieurs de Sousse Academic Year 2019-2020
Big Data Exam
Program: GTE2
Last name: ………………………… First name: ……….………………
Group: …… Room: …… Seat No.: ……………
=================================================================================
For each question, circle the correct answer(s).
1. What are the 4Vs of Big Data? (Please select the FOUR that apply)
A) Veracity
B) Velocity
C) Variety
D) Value
E) Volume
F) Visualization
2. What is meant by Data at rest?
A) Data in a file that has expired
B) Encrypted data
C) Data that is not changing
D) A file that has been processed by Hadoop
3. What are the three types of Big Data? (Please select the THREE that apply)
A) Natural Language
B) Semi-structured
C) Graph-based
D) Structured
E) Machine-Generated
F) Unstructured
4. Which two factors in a Hadoop cluster increase performance most significantly? (Please select the
TWO that apply)
A) solid state disks
B) immediate failover of failed disks
C) data redundancy on management nodes
D) high speed networking between nodes
E) parallel reading of large data files
F) large number of small data files
5. What is the default number of replicas in a Hadoop system?
A) 5
B) 4
C) 3
6. True or False: At least 2 NameNodes are required for a standalone Hadoop cluster.
A) TRUE
B) FALSE
7. Which computing technology provides Hadoop's high performance? (Select one answer)
A) Online Analytical Processing
B) Parallel Processing
C) RAID-0
D) Online Transactional Processing
8. Centralized handling of job control flow is one of the limitations of MR v1.
A) TRUE
B) FALSE
9. The Job Tracker in MR1 is replaced by which component(s) in YARN?
A) ResourceMaster
B) ApplicationMaster
C) ApplicationManager
D) ResourceManager
10. What is an advantage of the ORC file format?
A. Efficient compression
B. Big SQL can exploit advanced features
C. Supported by multiple I/O engines
D. Data interchange outside Hadoop
11. What is the default directory in HDFS where tables are stored?
A. /apps/hive/warehouse/schema
B. /apps/hive/warehouse/
C. /apps/hive/warehouse/data
D. /apps/hive/warehouse/bigsql
12. Which three programming languages are directly supported by Apache Spark?
A) Scala
B) C++
C) C#
D) Java
E) Python
F) .NET
13. Under the YARN/MRv2 framework, which daemon arbitrates the execution of tasks among all the
applications in the system?
A. ScheduleManager
B. ApplicationMaster
C. JobMaster
D. ResourceManager
14. Which Apache Hadoop component can potentially replace an RDBMS as a large Hadoop datastore
and is particularly good for "sparse data"?
A. MapReduce
B. HBase
C. Spark
D. Ambari
15. Which statement is true about Hortonworks Data Platform (HDP)?
A. It is a Hadoop distribution based on a centralized architecture with YARN at its core.
B. It is a powerful platform for managing large volumes of structured data.
C. It is engineered and developed by IBM's BigInsights team.
D. It is designed specifically for IBM Big Data customers.
16. What are two primary limitations of MapReduce v1?
A) Workloads limited to MapReduce
B) Resource utilization
C) Scalability
D) TaskTrackers can be a bottleneck to MapReduce jobs
E) Number of TaskTrackers limited to 1,000
17. Which statement is true about MapReduce v1 APIs?
A) MapReduce v1 APIs define how MapReduce jobs are executed.
B) MapReduce v1 APIs are implemented by applications which are largely independent of the
execution environment.
C) MapReduce v1 APIs cannot be used with YARN.
D) MapReduce v1 APIs provide a flexible execution environment to run MapReduce.
18. Apache Spark provides a single, unifying platform for which three of the following types of
operations?
A. graph operations
B. record locking
C. batch processing
D. machine learning
E. ACID transactions
19. Which statement is true about the Combiner phase of the MapReduce architecture?
A. It determines the size and distribution of data split in the Map phase.
B. It reduces the amount of data that is sent to the Reducer task nodes.
C. It aggregates all input data before it goes through the Map phase.
D. It is performed after the Reducer phase to produce the final output.
20. Which component of the Spark Unified Stack allows developers to intermix structured database
queries with Spark's programming language?
A. Mesos
B. Spark SQL
C. Java
D. MLlib
21. Which Spark Core function provides the main element of Spark API?
A. RDD
B. MLlib
C. YARN
D. Mesos
22. Which two factors in a Hadoop cluster increase performance most significantly?
A. large number of small data files
B. data redundancy on management nodes
C. high-speed networking between nodes
D. solid state disks
E. immediate failover of failed disks
F. parallel reading of large data files
23. Hadoop uses which two Google technologies as its foundation?
A. YARN
B. Google File System
C. Ambari
D. HBase
E. MapReduce
24. Under the MapReduce v1 programming model, which shows the proper order of the full set of
MapReduce phases?
A. Map -> Split -> Reduce -> Combine
B. Map -> Combine -> Reduce -> Shuffle
C. Split -> Map -> Combine -> Reduce
D. Map -> Combine -> Shuffle -> Reduce
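The full MapReduce word-count pipeline can be simulated in a few lines of plain Python. This is an illustrative sketch only, not Hadoop code; the function name and structure are invented for the example:

```python
from collections import defaultdict

def mapreduce_word_count(text, n_splits=2):
    # Split: divide the input into independent chunks.
    lines = text.splitlines()
    size = max(1, len(lines) // n_splits)
    splits = [lines[i:i + size] for i in range(0, len(lines), size)]

    # Map + Combine: each mapper emits (word, 1) pairs, and a combiner
    # pre-aggregates the counts locally before any data transfer.
    combined = []
    for split in splits:
        local = defaultdict(int)
        for line in split:
            for word in line.split():
                local[word] += 1  # map output, combined locally
        combined.append(local)

    # Shuffle: group the partial counts by key across all mappers.
    shuffled = defaultdict(list)
    for local in combined:
        for word, count in local.items():
            shuffled[word].append(count)

    # Reduce: sum the partial counts for each key.
    return {word: sum(counts) for word, counts in shuffled.items()}
```

The combiner step also illustrates question 19: each mapper sends at most one pair per distinct word into the shuffle, shrinking the data sent to the reducers.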
25. Under the YARN/MRv2 framework, which daemon is tasked with negotiating with the
NodeManager(s) to execute and monitor tasks?
A. ApplicationMaster
B. TaskManager
C. ResourceManager
D. JobMaster
26. What is the default directory in HDFS where tables are stored?
A. /apps/hive/warehouse/bigsql
B. /apps/hive/warehouse/
C. /apps/hive/warehouse/data
D. /apps/hive/warehouse/schema
27. Which Apache Hadoop component can potentially replace an RDBMS as a large Hadoop
datastore and is particularly good for "sparse data"?
A. Ambari
B. HBase
C. MapReduce
D. Spark
28. Which component of a Hadoop system is the primary cause of poor performance?
A. network
B. disk latency
C. CPU
D. RAM
29. The number of map tasks is determined by:
A) The input data
B) The output data
C) The cluster
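As background to question 29: Hadoop creates roughly one map task per input split, and by default one split per HDFS block. A back-of-the-envelope sketch (the function name is illustrative; a 128 MB block size is assumed):

```python
import math

def estimated_map_tasks(file_size_bytes, block_size_bytes=128 * 1024 * 1024):
    """Rough estimate: one input split, hence one map task, per HDFS block."""
    if file_size_bytes == 0:
        return 0
    return math.ceil(file_size_bytes / block_size_bytes)
```

For example, a 1 GB file with 128 MB blocks yields 8 splits, hence about 8 map tasks.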
30. What are the benefits of using Spark? (Please select the THREE that apply)
A) Generality
B) Versatility
C) Speed
D) Ease of use
31. Resilient Distributed Dataset (RDD) is the primary abstraction of Spark.
A) TRUE
B) FALSE
32. Which Spark RDD operations create a directed acyclic graph through lazy execution?
A) Actions
B) Map-Reduce
C) Count
D) Transformations
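Spark evaluates lazily: operations are merely recorded until a result is actually demanded. Python generators give a rough analogy of that behavior (an analogy only, not Spark's RDD API; all names are invented for the example):

```python
log = []

def double_all(data):
    # Like a lazy pipeline stage: builds a recipe, executes nothing yet.
    for x in data:
        log.append(x)  # side effect, so we can observe *when* work happens
        yield x * 2

pipeline = double_all(double_all([1, 2, 3]))  # two chained stages, still lazy
assert log == []                              # no work has been done yet

result = list(pipeline)  # forcing a result triggers the whole pipeline
```

Only the final `list(...)` call, playing the role of an action, makes the chained stages run; `result` comes out as `[4, 8, 12]`.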
33. What would you need to do in a Spark application that you would not need to do in a Spark shell
to start using Spark?
A) Extract the necessary libraries to load the SparkContext
B) Export the necessary libraries to load the SparkContext
C) Delete the necessary libraries to load the SparkContext
D) Import the necessary libraries to load the SparkContext
34. True or False: A NoSQL database is designed for those who do not want to use SQL.
A) TRUE
B) FALSE
35. Which database is a columnar storage database?
A) SQL
B) Hive
C) HBase
36. Which file format has the highest performance?
A) Sequence
B) ORC
C) Parquet
D) Delimited
37. What is an advantage of the ORC file format?
A. Efficient compression
B. Data interchange outside Hadoop
C. Supported by multiple I/O engines
D. Big SQL can exploit advanced features
38. You are creating a new table and need to format it with parquet. Which partial SQL statement
would create the table in parquet format?
A. STORED AS parquet
B. CREATE AS parquetfile
C. STORED AS parquetfile
D. CREATE AS parquet
39. Which Hadoop ecosystem tool can import data into a Hadoop cluster from a DB2, MySQL, or
other databases?
A. Sqoop
B. HBase
C. Accumulo
D. Oozie
40. Which data encoding format supports exact storage of all data in binary representations?
A. Parquet
B. RCFile
C. SequenceFiles
D. Flat