MCQ – Hadoop – Javaguides
1. What does HDFS stand for?
a) High-Definition File System b) Hadoop Distributed File System
c) Hadoop Data Federation Service d) High-Dynamic File System
2. What is the default block size in HDFS (Hadoop 2.x and later)?
a) 32 MB b) 64 MB c) 128 MB d) 256 MB
3. Who is the primary developer of Hadoop?
a) Microsoft b) IBM c) Apache Software Foundation d) Google
4. Which of the following is not a core component of Hadoop?
a) HDFS b) MapReduce c) YARN d) Spark
5. What does YARN stand for?
a) Yet Another Resource Navigator b) Yet Another Resource Negotiator
c) You Are Really Near d) Yarn Aims to Reuse Nodes
6. What is the purpose of the JobTracker in Hadoop?
a) To store data b) To manage resources
c) To schedule and track MapReduce jobs d) To distribute data blocks
7. What is a DataNode in HDFS?
a) A node that stores actual data blocks b) A node that manages metadata
c) A node responsible for job tracking d) A node responsible for resource management
8. What is the NameNode responsible for in HDFS?
a) Storing actual data blocks b) Managing metadata and namespace
c) Job scheduling d) Resource management
9. What programming model does Hadoop use for processing large data sets?
a) Divide and Rule b) Master-Slave c) MapReduce d) None of the above
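The MapReduce model behind question 9 can be sketched in plain Java with no Hadoop dependencies — the class and method names below are illustrative, not Hadoop's actual `Mapper`/`Reducer` API. The map phase emits (word, 1) pairs, the framework groups pairs by key, and the reduce phase sums each key's values:

```java
import java.util.*;
import java.util.stream.*;

// Minimal word-count sketch of the MapReduce model (not Hadoop's API).
public class WordCountSketch {

    // Map phase: emit one (word, 1) pair per token in the input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        return Arrays.stream(line.toLowerCase().split("\\s+"))
                .filter(w -> !w.isEmpty())
                .map(w -> Map.entry(w, 1))
                .collect(Collectors.toList());
    }

    // Reduce phase: sum all values that share a key.
    static int reduce(List<Integer> counts) {
        return counts.stream().mapToInt(Integer::intValue).sum();
    }

    // Driver: run map over every line, group by key, then reduce each group.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> kv : map(line)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                       .add(kv.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        grouped.forEach((word, counts) -> result.put(word, reduce(counts)));
        return result;
    }

    public static void main(String[] args) {
        // prints {big=3, cluster=1, data=2}
        System.out.println(run(List.of("big data big cluster", "big data")));
    }
}
```

In a real Hadoop job the grouping step in the driver is performed by the framework's shuffle/sort phase across the cluster, not in local memory.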
10. What is the primary language for developing Hadoop?
a) Python b) Java c) C++ d) Ruby
11. Which of the following can be used for data serialization in Hadoop?
a) Hive b) Pig c) Avro d) YARN
12. Which Hadoop ecosystem component is used as a data warehousing tool?
a) Hive b) Flume c) ZooKeeper d) Sqoop
13. What is the role of ZooKeeper in the Hadoop ecosystem?
a) Data Serialization b) Stream Processing
c) Cluster Coordination d) Scripting Platform
14. Which tool can be used to import/export data from RDBMS to HDFS?
a) Hive b) Flume c) Oozie d) Sqoop
15. Which of the following is not a function of the NameNode?
a) Store data blocks b) Manage the file system namespace
c) Keep metadata information d) Handle client requests
16. What is the replication factor in HDFS?
a) The block size of the data
b) The number of copies of a data block stored in HDFS
c) The number of nodes in a cluster
d) The amount of data that can be stored in a DataNode
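The replication factor — the number of copies HDFS keeps of each block — defaults to 3 and is set cluster-wide in `hdfs-site.xml` via the standard `dfs.replication` property. A minimal illustrative snippet:

```xml
<!-- hdfs-site.xml: keep three copies of every data block (the default) -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
```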
17. Which of the following is a scheduler in Hadoop?
a) Sqoop b) Oozie c) Flume d) Hive
18. Which daemon is responsible for MapReduce job submission and distribution?
a) DataNode b) NameNode c) ResourceManager d) NodeManager
19. What is a Combiner in Hadoop?
a) A program that combines data from various sources
b) A mini-reducer that operates on the output of the mapper
c) A tool to combine several MapReduce jobs
d) A process to combine NameNode and DataNode functionalities
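The combiner idea — a mini-reducer applied to a single mapper's output — can be illustrated with plain Java collections. This is a local sketch of the concept, not Hadoop's `Reducer` interface: by pre-aggregating repeated keys on the mapper's node, far fewer records need to cross the network during the shuffle.

```java
import java.util.*;

// Sketch: a combiner acts as a mini-reducer on one mapper's output,
// collapsing repeated keys locally before the shuffle phase.
public class CombinerSketch {

    // Mapper output for one input split: raw (word, 1) pairs.
    static List<Map.Entry<String, Integer>> mapperOutput = List.of(
            Map.entry("big", 1), Map.entry("data", 1),
            Map.entry("big", 1), Map.entry("big", 1));

    // Combiner: sum values per key locally (same logic as the reducer).
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> local = new TreeMap<>();
        for (Map.Entry<String, Integer> kv : pairs) {
            local.merge(kv.getKey(), kv.getValue(), Integer::sum);
        }
        return local;
    }

    public static void main(String[] args) {
        // Four records shrink to two before being sent to reducers.
        System.out.println(combine(mapperOutput)); // prints {big=3, data=1}
    }
}
```

Because the combiner uses the same logic as the reducer here, applying it is safe: summation is associative and commutative, which is the usual requirement for a combiner.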
20. In which directory is Hadoop installed by default?
a) /usr/local/Hadoop b) /home/Hadoop c) /opt/hadoop d) /usr/hadoop
21. Which of the following is responsible for storing large datasets in a distributed environment?
a) MapReduce b) HBase c) Hive d) Pig
22. In a Hadoop cluster, if a DataNode fails:
a) Data will be lost
b) JobTracker will be notified
c) NameNode will re-replicate the data block to other nodes
d) ResourceManager will restart the DataNode
23. Which scripting language is used by Pig?
a) HiveQL b) Java c) Pig Latin d) Python
24. What does "speculative execution" in Hadoop mean?
a) Executing a backup plan if the main execution plan fails
b) Running a duplicate copy of a slow-running task on another node
c) Predicting the execution time for tasks
d) Running multiple different tasks on the same node
25. What is the role of the "shuffle" phase in a MapReduce job?
a) It connects mappers to the reducers
b) It sorts and groups the keys of the intermediate output from the mapper
c) It combines the output of multiple mappers
d) It distributes data blocks across the DataNodes
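The shuffle/sort step can be sketched in plain Java as well — again an illustration of the concept, not Hadoop's internals. Intermediate (key, value) pairs from all mappers are sorted by key and grouped, so each reducer receives one key together with the full list of its values:

```java
import java.util.*;

// Sketch of shuffle/sort: intermediate pairs from the mappers are
// sorted by key and grouped into (key, [values]) for the reducers.
public class ShuffleSketch {

    static Map<String, List<Integer>> shuffle(
            List<Map.Entry<String, Integer>> intermediate) {
        // TreeMap keeps keys in sorted order, mirroring the sort phase.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> kv : intermediate) {
            grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                   .add(kv.getValue());
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> intermediate = List.of(
                Map.entry("data", 1), Map.entry("big", 1), Map.entry("data", 1));
        // prints {big=[1], data=[1, 1]}
        System.out.println(shuffle(intermediate));
    }
}
```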