
B.Tech IT BDA Mid Exam Questions 2024

The document outlines the structure and content of mid-term examinations for the B.Tech in IT program at Dhanekula Institute of Engineering & Technology for the academic year 2024-2025. It lists the course outcomes covering Hadoop, data processing, and visualization techniques, along with the questions and evaluation schemes for two sets of exams. Each exam assesses students' understanding and application of big data concepts and tools.

Uploaded by

saijyothsna.diet

DHANEKULA INSTITUTE OF ENGINEERING & TECHNOLOGY

GANGURU::VIJAYAWADA – 521 139


Name of the Program: B.Tech in IT    Academic Year: 2024-2025
Name of the Course: BDA    SET-1
Year & Semester: III-II    Name of the Exam: MID-II
HT No of Candidate:    Max Marks: 15M    Duration: 90 Min    Date: 16-04-2025

Description of Course outcomes being assessed:

R2032121.3: Design and develop Hadoop.

R2032121.4: Identify the characteristics of datasets and compare the trivial data and big data for various applications

R2032121.5: Explore the various search methods and visualization techniques

CO No    R2032121.3    R2032121.4    R2032121.5    Total
Q.No     1             2             3
Marks    3             6             6             15

--------------------------------------------------------------------------------------------------------------------------------------

Answer the following Questions

1. Define & explain Scaling Out and its types in detail.
[R2032121.3, BTL3, Applying, PO1,5,12/PSO1,2 -- 3M]

2. a) Identify the Data Processing operators in Pig.
[R2032121.4, BTL3, Applying, PO1,5,12/PSO1,2 -- 3M]

b) Develop Hive Services.
[R2032121.4, BTL3, Applying, PO1,5,12/PSO1,2 -- 3M]

3. a) Develop the Simple Linear Regression with an example.
[R2032121.5, BTL3, Applying, PO1,5,12/PSO1,2 -- 3M]

b) Illustrate Visualizations.
[R2032121.5, BTL3, Applying, PO1,5,12/PSO1,2 -- 3M]
DIET/7.5.1/FT13

DHANEKULA INSTITUTE OF ENGINEERING & TECHNOLOGY


GANGURU::VIJAYAWADA – 521 139
Name of the Program: B.Tech in IT    Academic Year: 2024-2025    SET-1
Name of the Course: BDA    Year & Semester: III-II    Name of the Exam: MID-I
HT No of Candidate:    Max Marks: 15M    Duration: 90 Min    Date: 11-02-2025

Description of Course outcomes being assessed:

R2032121.3: Design and develop Hadoop.

R2032121.4: Identify the characteristics of datasets and compare the trivial data and big data for various applications

R2032121.5: Explore the various search methods and visualization techniques

CO No    R2032121.3    R2032121.4    R2032121.5    Total
Q.No     1             2             3
Marks    3             6             6             15

---------------------------------------------------------------------------------------------------------------------------------------

Scheme of Evaluation

1. Definition - 1M, and its Types, each 1M ------------------------------- [3M]

2. a) Explanation of each operator [1M] ----------------------------------- [3M]

b) Explanation of Hive Services [1M] ------------------------------------ [3M]

3. a) Definition [1M] and with example [2M] ----------------------------- [3M]



DHANEKULA INSTITUTE OF ENGINEERING & TECHNOLOGY


GANGURU::VIJAYAWADA – 521 139
Name of the Program: B.Tech in IT    Academic Year: 2024-2025
Name of the Course: BDA    SET-2
Year & Semester: III-II    Name of the Exam: MID-II
HT No of Candidate:    Max Marks: 15M    Duration: 90 Min    Date: 16-04-2025

Description of Course outcomes being assessed:

R2032121.3: Design and develop Hadoop.

R2032121.4: Identify the characteristics of datasets and compare the trivial data and big data for various applications

R2032121.5: Explore the various search methods and visualization techniques

CO No    R2032121.3    R2032121.4    R2032121.5    Total
Q.No     1             2             3
Marks    3             6             6             15

--------------------------------------------------------------------------------------------------------------------------------------

Answer the following Questions

1. Identify how MapReduce works.
[R2032121.3, BTL3, Applying, PO1,5,12/PSO1,2 -- 3M]

2. a) Develop the applications on Big Data using Pig and Hive.
[R2032121.4, BTL3, Applying, PO1,5,12/PSO1,2 -- 3M]

b) Construct about Querying Data in Hive.
[R2032121.4, BTL3, Applying, PO1,5,12/PSO1,2 -- 3M]

3. Explain Predictive Analytics in detail, with its types.
[R2032121.5, BTL3, Applying, PO1,5,12/PSO1,2 -- 6M]



DHANEKULA INSTITUTE OF ENGINEERING & TECHNOLOGY


GANGURU::VIJAYAWADA – 521 139
Name of the Program: B.Tech in IT    Academic Year: 2024-2025    SET-2
Name of the Course: BDA    Year & Semester: III-II    Name of the Exam: MID-II
HT No of Candidate:    Max Marks: 15M    Duration: 90 Min    Date: 16-04-2025

Description of Course outcomes being assessed:

R2032121.3: Design and develop Hadoop.

R2032121.4: Identify the characteristics of datasets and compare the trivial data and big data for various applications

R2032121.5: Explore the various search methods and visualization techniques

CO No    R2032121.3    R2032121.4    R2032121.5    Total
Q.No     1             2             3
Marks    3             6             6             15

---------------------------------------------------------------------------------------------------------------------------------------

Scheme of Evaluation

1. MapReduce Working Explanation - 3M

2. a) Applications of Pig and Hive - 3M

b) Explanation about Querying Data in Hive - 3M

3. Definition of Predictive Analytics - 1M

Types explanation - each 2M, and advantages - 1M

Common questions


Hadoop systems offer parallel processing of very large data volumes, cost-effectiveness, and the flexibility to handle diverse data types. The challenges include complex deployment, cluster resource management, data security, and the learning curve of the ecosystem. Addressing these takes solid skills and planning, but the gains in scalability and efficiency can be substantial for data-intensive industries.

Applications developed using Pig and Hive enable scalable data processing and querying. Pig provides a high-level scripting language (Pig Latin) for processing large data sets, while Hive offers an SQL-like interface to manage and analyze large volumes of data. Together, they let businesses and researchers derive insights from data that was previously out of reach because of its scale and processing complexity.

Predictive analytics uses statistical algorithms and machine learning techniques to forecast future outcomes based on historical data. It plays a crucial role in decision-making by anticipating trends and behaviors. Main types include regression techniques, decision trees, and neural networks, each offering distinct benefits for different prediction tasks.

Visualization techniques translate complex data sets into visual formats such as charts and graphs, making insights more accessible and understandable. They help identify patterns, trends, and outliers, which supports quicker decision-making and data-driven strategies. Effective visualization relies on human visual perception to simplify the complexity inherent in big data.

MapReduce breaks a data processing job into a map phase and a reduce phase. In the map phase, the input data is split and processed in parallel across different nodes, producing intermediate key-value pairs. These pairs are shuffled and sorted so that all values for a key arrive at the same reducer, then fed into the reduce phase, where they are aggregated to produce the final output. This distributed processing makes large data volumes manageable.
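The map, shuffle, and reduce steps can be sketched in plain Python with a word count. This is a single-process illustration only, not Hadoop's API; the function names map_phase, shuffle, reduce_phase, and word_count are hypothetical, and each input line stands in for an input split that a real cluster would hand to a separate mapper.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle and sort: collect all values per key, ordered by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(key, values):
    """Reduce: aggregate all values that share one key."""
    return key, sum(values)


def word_count(lines):
    # Every line is mapped independently, which is what allows parallelism.
    intermediate = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(intermediate))


counts = word_count(["big data big clusters", "data nodes"])
```

Because each call to map_phase depends only on its own line, the map work could run on different nodes; only the shuffle step needs to bring pairs with the same key together.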

Trivial data refers to small-scale data that can be processed using traditional data processing techniques, while big data is characterized by its volume, velocity, and variety, and requires distributed processing techniques. Applications of big data include large-scale analytics and real-time processing tasks that traditional solutions cannot handle because of limits on scalability and processing speed.

Simple linear regression models the relationship between one dependent variable and one independent variable as a straight line, y = a + bx, which makes it useful for analyzing trends within big data. For example, in sales forecasting, a regression of sales on marketing spend can predict the sales expected for a planned budget, which supports strategic planning.
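A worked example may help here. The sketch below fits a line by ordinary least squares, where the slope is the covariance of x and y divided by the variance of x, and the intercept is mean(y) minus slope times mean(x). The function name and the ad_spend and sales figures are made up for illustration.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: marketing spend (x) and resulting sales (y).
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_simple_linear_regression(ad_spend, sales)
predicted_sales = intercept + slope * 5.0  # forecast for a spend of 5.0
```

With this toy data the points lie exactly on a line, so the fit recovers it exactly; real data would leave residuals around the fitted line.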

Hive supports SQL-like querying (HiveQL) of large datasets stored in Hadoop's HDFS. Its primary functions are data summarization, querying, and analysis. Its services include the command-line interface, HiveServer2 (which serves JDBC/ODBC clients), and the metastore, which holds table schemas and their storage locations. Together these simplify big data analytics tasks.
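To give a feel for the kind of summarization query Hive is used for, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in. This is not Hive: SQLite runs in-process on a tiny in-memory table, whereas Hive would execute a similar statement as distributed jobs over data in HDFS. The sales table and its rows are invented for the example.

```python
import sqlite3

# In-memory database standing in for a Hive table (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 100), ("south", 250), ("north", 50)],
)

# A summarization query: total amount per region.
rows = conn.execute(
    "SELECT region, SUM(amount) AS total "
    "FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()
```

The SELECT statement uses only basic grouping and aggregation, which HiveQL also supports, so the query text itself carries over; table creation and data loading are where Hive differs most.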

Scaling out in Hadoop means adding more nodes to the cluster to handle growing data and processing loads. This improves performance by distributing data and computation across multiple servers, which allows parallel processing and avoids the hardware limits that come with scaling up (adding more resources to a single node). The two main types of scaling are horizontal scaling (scaling out), where nodes are added to a cluster, and vertical scaling (scaling up), where resources are added to existing nodes.
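The effect of adding nodes can be illustrated with a simple hash-partitioning sketch. It is a simplified model, not how HDFS actually places blocks; the function names and record keys are hypothetical. Each record key is hashed to a node, and the same records are spread over a 4-node and then an 8-node cluster.

```python
import zlib


def assign_node(key, num_nodes):
    """Pick a node for a record key using a stable hash (hash partitioning)."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def distribute(keys, num_nodes):
    """Return one list of keys per node."""
    nodes = [[] for _ in range(num_nodes)]
    for key in keys:
        nodes[assign_node(key, num_nodes)].append(key)
    return nodes


keys = [f"record-{i}" for i in range(1000)]

small_cluster = distribute(keys, 4)   # before scaling out
large_cluster = distribute(keys, 8)   # after adding four nodes

max_load_small = max(len(node) for node in small_cluster)
max_load_large = max(len(node) for node in large_cluster)
```

Comparing max_load_small with max_load_large shows that the busiest node holds fewer records once the cluster grows, which is the point of scaling out. Scaling up would instead keep one node and give it more CPU, memory, or disk.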

Data processing operators in Pig, such as FILTER, JOIN, and GROUP, manipulate, transform, and analyze data. FILTER keeps only the rows that satisfy a condition, JOIN combines relations on a common key to build richer views, and GROUP collects rows by key so that aggregates can be computed per group. Chained together, they express complex data workflows.
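The three operators can be mimicked in plain Python to show what each one does to the data. This is an analogy, not Pig Latin and not Pig's execution model; the orders and users relations are invented, and the comments name the Pig operator each step roughly corresponds to.

```python
from collections import defaultdict

# Hypothetical input relations.
orders = [
    {"user_id": 1, "amount": 40},
    {"user_id": 2, "amount": 5},
    {"user_id": 1, "amount": 25},
    {"user_id": 3, "amount": 60},
]
users = [
    {"user_id": 1, "name": "asha"},
    {"user_id": 3, "name": "ravi"},
]

# FILTER: keep only orders with amount >= 10.
large_orders = [order for order in orders if order["amount"] >= 10]

# GROUP by user_id, then aggregate each group with a sum.
totals = defaultdict(int)
for order in large_orders:
    totals[order["user_id"]] += order["amount"]

# JOIN (inner) the per-user totals with users on user_id.
names = {user["user_id"]: user["name"] for user in users}
report = sorted(
    (names[user_id], total)
    for user_id, total in totals.items()
    if user_id in names
)
```

In Pig, each of these steps would be one statement in a script, and the engine would run them as distributed jobs rather than in-memory loops.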
