Introduction to Big Data
Unit - 1
What is Data?
• The quantities, characters, or symbols on which operations are
performed by a computer, which may be stored and transmitted in
the form of electrical signals and recorded on magnetic, optical, or
mechanical recording media.
• Big Data is a collection of data that is huge in volume, yet growing
exponentially with time. It is data of such large size and complexity that
none of the traditional data management tools can store or process it
efficiently.
• Today we live in the digital world. With increased digitization the
amount of structured and unstructured data being created and stored
is exploding.
• The data is being generated from various sources - transactions, social
media, sensors, digital images, videos, audio and clickstreams - across
domains including healthcare, retail, energy and utilities.
• For instance, about 30 billion pieces of content are shared on Facebook every
month, and the photos viewed every 16 seconds in Picasa could cover a
football field.
What is Big Data?
• Big data is a collection of large, complex, and diverse data sets that
are difficult to manage and analyze using traditional data processing
tools. It can include structured, semi-structured, and unstructured
data.
• Big data refers to extremely large and diverse collections of
structured, unstructured, and semi-structured data that continues to
grow exponentially over time.
• These datasets are so huge and complex in volume, velocity, and
variety, that traditional data management systems cannot store,
process, and analyze them.
Why Big Data is Important?
Big data is important because it can help organizations make better
decisions, improve operations, and gain a competitive advantage.
The importance of big data doesn’t simply revolve around how much data
you have. The value lies in how you use it. By taking data from any source
and analyzing it, you can find answers that
1) Streamline resource management,
2) Improve operational efficiencies,
3) Optimize product development,
4) Drive new revenue and growth opportunities and
5) Enable smart decision making.
When you combine big data with high-performance analytics, you can
accomplish business-related tasks such as:
• Determining root causes of failures, issues and defects in near-real
time.
• Spotting anomalies faster and more accurately than the human eye.
• Improving patient outcomes by rapidly converting medical image data
into insights.
• Recalculating entire risk portfolios in minutes.
• Sharpening deep learning models' ability to accurately classify and
react to changing variables.
• Detecting fraudulent behavior before it affects your organization.
• Companies use Big data in their systems to improve operations,
provide better customer service, create personalized marketing
campaigns and take other actions that, ultimately, can increase
revenue and profits.
• Big data describes a massive volume of both structured and
unstructured data that is difficult to process using traditional database
and software techniques.
• The term big data is believed to have originated with web search
companies who needed to query very large distributed aggregations
of loosely-structured data.
• Big data has the potential to help companies improve operations and
make faster, more intelligent decisions.
Semi-structured Data:
Semi-structured data, or partially structured data, doesn’t follow the
tabular structure associated with relational databases or other forms of
data tables. However, it does contain tags and metadata to separate
semantic elements and establish hierarchies of records and fields.
What are Examples of Semi-Structured Data?
HTML code, graphs and tables, e-mails, XML documents are examples
of semi-structured data, which are often found in object-oriented
databases.
NoSQL databases, CSV, JSON documents, Electronic data interchange
(EDI),RDF
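The record below is a minimal, hypothetical example (the field names are invented); the tags and nesting show how semi-structured data carries its own structure without a fixed table layout:

    import json

    # A hypothetical customer record: tags (field names) and nesting describe the
    # data, but there is no fixed relational schema behind it.
    record = """
    {
      "customer_id": "C-1001",
      "name": "Asha",
      "orders": [
        {"order_id": 1, "items": ["phone", "case"], "total": 699.0},
        {"order_id": 2, "items": ["charger"], "total": 19.5}
      ],
      "preferences": {"newsletter": true, "language": "en"}
    }
    """

    data = json.loads(record)
    # Fields are addressed by name and hierarchy rather than by table and column.
    print(data["orders"][0]["total"])   # 699.0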
What is an Example of Big Data?
Following are some of the Big Data examples-
• Stock Exchange
The New York Stock Exchange is an example of Big Data: it generates about
one terabyte of new trade data per day.
• Social Media
Statistics show that 500+ terabytes of new data get ingested into the
databases of the social media site Facebook every day. This data is mainly
generated by photo and video uploads, message exchanges, comments, etc.
• A single jet engine can generate 10+ terabytes of data in 30 minutes of flight
time. With many thousands of flights per day, the data generated reaches
many petabytes.
• One bit is a 1 or a 0; 8 bits = 1 Byte
• 1024^1 Bytes = 1 KiloByte (KB)
• 1024^2 Bytes = 1 MegaByte (MB)
• 1024^3 Bytes = 1 GigaByte (GB)
• 1024^4 Bytes = 1 TeraByte (TB)
• 1024^5 Bytes = 1 PetaByte (PB)
• 1024^6 Bytes = 1 ExaByte (EB)
• 1024^7 Bytes = 1 ZettaByte (ZB)
• 1024^8 Bytes = 1 YottaByte (YB)
• 1024^9 Bytes = 1 BrontoByte (BB)
• 1024^10 Bytes = 1 GeopByte
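As a quick illustration of these powers of 1024 (a sketch added for illustration, not part of the original slides), the short Python snippet below converts a raw byte count into the nearest unit:

    # Convert a raw byte count into a human-readable unit using powers of 1024.
    UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

    def human_readable(num_bytes: float) -> str:
        for unit in UNITS:
            if num_bytes < 1024 or unit == UNITS[-1]:
                return f"{num_bytes:.2f} {unit}"
            num_bytes /= 1024

    # Roughly the 500 TB Facebook ingests per day.
    print(human_readable(500 * 1024**4))   # "500.00 TB"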
Evolution of Big Data
1. Early days of Computing:
• Data was stored on mainframe computers and was used for business and
scientific applications.
• The amount of data stored & analyzed was limited. For data processing, batch
processing techniques were used.

2. Data warehousing:
• It allowed organizations to store and analyze large amount of data from
multiple sources.
• The data was primarily structured. The amount of data stored & analyzed was
still limited.
3. The rise of the Internet:
• With the rise of the internet in the 1990s, the amount of
data being generated & collected began to grow rapidly.
• The data was more diverse and unstructured, and it was difficult
to process and analyze using traditional techniques.

4. The emergence of Big data:


• In the early 2000s, the term “Big Data” was coined to
describe the large volume of data that was being
generated & collected.
• New technologies such as Hadoop & NoSQL databases
were developed to handle the volume & variety of data
5. The growth of Big data:
• The amount of diverse & unstructured data being
generated & collected has continued to grow rapidly.
• New technologies such as cloud computing and streaming analytics
have been developed to handle the volume, variety & velocity of data.

6. Artificial Intelligence and Machine Learning:


• Big Data is also being used to train Machine Learning models and AI.
• This allows organizations to gain insights & predictions that were not
possible before.
7. Internet of Things and 5G:
• With the advent of the Internet of Things (IoT) and 5G, Big Data is also
becoming more distributed and mobile.
• This is creating new challenges and opportunities for Big Data
processing and analytics.
8. Blockchain and Big Data:
• With the advent of Blockchain technology, big data can be secured in
a way that was not possible before.
• This opens up new opportunities for decentralized data processing
and analytics.
Big Data continues to evolve due to advances in technology and the
proliferation of connected devices and the internet.
History of Big Data
• There were many advancements in technology during World War 2,
which were primarily made to serve military purposes. Those
advancements would become useful to the commercial sector and the
general public, with personal computing becoming a viable option for the
everyday consumer.
• John R. Mashey, Chief Scientist at Silicon Graphics, is considered the
father of the term 'Big Data'.
• Big Data is a term that describes large volumes of high velocity,
complex & variable data that require advanced techniques and
technologies to enable the capture, storage, distribution,
management, and analysis of the information.
• Big Data Analytics is the process of examining and interrogating big
data assets to derive insights of value for decision making.
1) 1940s to 1989 – Data Warehousing and Personal Desktop Computers
• The origins of electronic storage can be traced to the development of the world's
first programmable computer, the Electronic Numerical Integrator and
Computer (ENIAC). It was designed by the U.S. Army during World War 2 to
solve numerical problems, such as calculating the range of artillery fire.
• In the 1950s, the first transistorized computers appeared, such as the
TRansistorized Airborne DIgital Computer (TRADIC) built at Bell Labs; these
machines helped data centers branch out of the military and serve more general
commercial purposes.
• The first personal desktop computer to feature a Graphical User Interface
(GUI) was the Lisa, released by Apple Computer in 1983.
• Throughout the 1980s, companies like Apple, Microsoft, and IBM would
release a wide range of personal desktop computers. Thus, electronic storage
was finally available to the masses.
2) 1989 to 1999 – Emergence of the World Wide Web
• Between 1989 and 1993, British computer scientist Sir Tim Berners-
Lee would create the fundamental technologies required to power
the World Wide Web.
• These web technologies were HyperText Markup Language (HTML),
Uniform Resource Identifier (URI), and Hypertext Transfer Protocol
(HTTP).
• In April 1993, the decision was made to make the underlying code
for these web technologies free, forever.
• This made it possible for individuals, businesses, and organizations that
could afford to pay for an internet service to go online and share
data with other internet-enabled computers.
• As more devices gained access to the internet, the amount of
information that people could access and share at any one time
exploded.
3) 2000s to 2010s – Controlling Data Volume, Social Media and Cloud Computing:
• In the early 2000s, companies such as Amazon, eBay, and Google were generating
large amounts of web traffic, as well as a combination of structured and unstructured data.
• Amazon launched a beta version of AWS (Amazon Web Services) in 2002, which
opened the Amazon.com platform to all developers. By 2004, over 100 applications
were built for it.
• AWS then relaunched in 2006, offering a wide range of cloud infrastructure services,
including Simple Storage Service (S3) and Elastic Compute Cloud (EC2).
• The public launch of AWS attracted a wide range of customers, such as Dropbox,
Netflix, and Reddit, all of whom were cloud-enabled and partnered with AWS before 2010.
• Social media platforms (MySpace, Facebook, Twitter) led to a rise in the spread of
unstructured data. This included the sharing of images and audio files,
animated GIFs, videos, status posts, and direct messages.
• These platforms needed new ways to collect, organize, and make sense of the large
amounts of unstructured data being generated at an accelerated rate.
• This led to the creation of Hadoop, an open-source framework created specifically
to manage big data sets, and the adoption of NoSQL databases, which made
it possible to manage unstructured data, that is, data that does not comply with a
relational database model.
• With these new technologies, companies could now collect large amounts of disparate
data and then extract meaningful insights for more informed decision making.
4) 2010s to now – Optimization Techniques, Mobile Devices and IoT:
• In the 2010s, one of the biggest challenges facing big data was the advent of mobile
devices and the IoT (Internet of Things).
• Millions of people worldwide now had small, internet-enabled devices in their hands,
able to access the web, wirelessly communicate with other internet-enabled
devices, and upload data to the cloud.
• According to a 2017 Data Never Sleeps report by Domo, we were generating 2.5
quintillion bytes of data daily.
The rise of mobile devices and IoT devices also led to new types of data
being collected, organized, and analyzed.
Some examples include:
• Sensor Data (data collected by internet-enabled sensors to provide
valuable, real-time insight into the inner workings of a piece of
machinery)
• Social Data (publicly available social media data from platforms like
Facebook and Twitter)
• Transactional Data (data from online web stores including receipts,
storage records, and repeat purchases)
• Health-related data (heart rate monitors, patient records, medical
history)
Failure of Traditional Database in Handling Big Data
Traditional databases fail to handle big data because of the following
limitations:
• Scalability:- Traditional systems can't scale up to handle large amounts of
data. Scaling up involves adding resources like memory, CPU, or disk
space to a single server. This can be expensive, time-consuming, and
prone to failure.
• Inflexibility:- Traditional systems are not well-suited for handling
unstructured or semi-structured data.
• Latency:- Batch processing introduces latency, making it difficult to
analyze data in real-time.
• Cost:-Scaling traditional systems can be expensive due to the need for
high-end hardware and software licenses.
Cont..
• Traditional databases are optimized for structured data and smaller
datasets, whereas big data requires advanced tools due to its
complexity, volume, and variety.
• Big data is large, complex, and constantly changing, while traditional
data is typically small in size, structured, and static.
• Big data requires specialized tools and techniques to manage and
analyze effectively.
Big data has many qualities—it’s unstructured, dynamic, and complex. Humans
and IoT sensors are producing trillions of gigabytes of data each year.
It's modern data, in an increasingly diverse range of formats and from a variety of
sources. The data's size and scale, along with its speed and complexity, are
challenging for traditional data storage systems.
1. Big Data Is Too Big for Traditional Storage
Facebook stores and analyzes huge quantities of data. Facebook users upload at
least 14.58 million photos per hour. Each photo garners interactions stored along
with it, such as likes and comments. Users have “liked” at least a trillion posts,
comments, and other data points. The more data that is in a relational database,
the longer each operation takes.
2. Big Data Is Too Complex for Traditional Storage
Traditional data is “structured.” A relational database—the type of database
that stores traditional data—consists of records containing clearly defined fields.
You can access this type of database using a relational database management
system (RDBMS) such as MySQL, Oracle DB, or SQL Server.
Big data is largely unstructured, consisting of myriad file types, including images,
videos, audio & social media content. That's why traditional storage solutions are
unsuitable for working with big data: they can't properly categorize it.
Using a non-relational (NoSQL) database such as MongoDB, Cassandra, or Redis can
allow you to gain valuable insights into complex and varied sets of unstructured
data.
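As a rough illustration (not from the original slides), the snippet below stores and queries a schema-free document with pymongo; it assumes a MongoDB server running locally, and the database and collection names are invented:

    from pymongo import MongoClient

    # Assumes a local MongoDB instance; "social_demo" and "posts" are invented names.
    client = MongoClient("mongodb://localhost:27017")
    posts = client["social_demo"]["posts"]

    # Documents in the same collection can have different fields - no fixed schema.
    posts.insert_one({"user": "asha", "text": "launch day!", "likes": 120,
                      "media": {"type": "photo", "url": "http://example.com/p.jpg"}})
    posts.insert_one({"user": "ravi", "text": "great talk", "likes": 8})

    # Query by a field without any prior schema definition.
    for doc in posts.find({"likes": {"$gt": 100}}):
        print(doc["user"], doc["likes"])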
3. Big Data Is Too Fast for Traditional Storage
Big data grows almost instantaneously, and analysis often needs to occur in real
time. An RDBMS isn’t designed for rapid fluctuations.
For example, Internet of Things (IoT) devices need to process large amounts of sensor
data with minimal latency. Sensors transmit data from the “real world” at a near-
constant rate. Traditional storage systems struggle to store and analyze data arriving
at such a velocity.
Another example is cybersecurity. IT departments must inspect each packet of data arriving
through a company's firewall to check whether it contains suspicious code. Many
gigabytes might be passing through the network each day. To avoid falling victim to
an attack, this inspection has to happen in near real time, which is difficult for
traditional storage and processing systems.
Characteristics of Big data:
The characteristics of Big data, traditionally known as the "3 V's" and now often
extended to five, are:
• Volume: The large amount of data (terabytes, petabytes, or exabytes)
that is generated and collected from various sources.
• Variety: The different types of data (structured, semi-structured &
unstructured) that can be included in Big data. This data can come
in different formats (text, images, videos, audio, etc.).
• Velocity: The speed at which data is generated and must be processed in
order to extract value from it. This includes real-time data streams
from social media, IoT devices & sensors.
• Veracity: The uncertainty and diversity of data which makes it difficult to clean,
process and analyze.
• Value: The ability to extract insights and make better decisions by analyzing big
data.

Additional characteristics that are important to consider:


• Complexity: Big data is complex & difficult to understand, which makes it
challenging to extract insights from it.
• Scalability: Big data systems need to scale to handle the growing volume of
data, with the ability to quickly process & analyze it.
• Flexibility: Big data needs to be flexible to handle different types of data and
changing environments.
• Accessibility: Big data needs to be accessible to the right people, at the right time,
and in the right format to drive insights & decision-making.
• Security: Big data needs to be secured to protect sensitive information &
prevent unauthorized access.
Sources of Big Data
Big data originates from numerous sources, each contributing unique insights
that help industries make better decisions. Below are the key sources and their
specific big data applications in the real world.
1. Social Media Data:
Social media platforms like Facebook, Instagram, LinkedIn, and Twitter produce a
massive volume of data every second.
What's Captured: Posts, likes, shares, comments, video views, and hashtags.
Applications:
• Marketing and Advertising: Analyze trends, identify customer preferences, and
craft targeted campaigns.
• Sentiment Analysis: Understand public opinion on brands, products, or social
issues.
Example: Twitter trends provide real-time insights into customer sentiment
during product launches.
2. Machine Data:
Machine data comes from Internet of Things (IoT) devices, sensors, and system
logs, operating in industries like manufacturing, agriculture, and logistics.

What’s Captured:
Equipment performance, operational data, and environmental metrics.
Applications:
• Predictive Maintenance: Anticipate when machines might fail to reduce
downtime.
• Automation: Optimize workflows in smart factories or agricultural irrigation
systems.
Example: Smart home devices like thermostats adjust room temperatures based
on usage data.
3. Transaction Data:
Transaction data includes digital records from financial institutions, e-commerce
websites, and point-of-sale systems.

What’s Captured:
Purchase history, payment methods, inventory levels, and customer details.
Applications:
Fraud Detection: Monitor transactions for unusual activity.
Demand Forecasting: Predict product requirements based on buying patterns.
Example: E-commerce platforms like Amazon analyze purchase history to
recommend products.
4. Healthcare Data:
The healthcare industry collects and processes critical information from
hospitals, clinics, diagnostics labs, and wearable devices.

What’s Captured:
Patient records, genetic data, diagnostic images, and treatment
outcomes.
Applications:
• Personalized Medicine: Tailor treatments based on patient history.
• Epidemic Prediction: Use patient data to identify and contain
outbreaks.
Example: Fitness trackers provide real-time health metrics, which
doctors can use to monitor patients remotely.
5. Government and Public Data:
Government agencies and public organizations generate data from
weather monitoring, census collection, and transportation systems.

What’s Captured:
Population statistics, weather forecasts, traffic patterns, and public records.
Applications:
• Policy Making: Use demographic data to create impactful public policies.
• Urban Planning: Optimize infrastructure projects based on traffic and
population data.
Example: Smart traffic systems use data to reduce congestion in urban
areas.
6. Media and Entertainment Data:
Streaming services, gaming platforms, and digital publishers track user
activity and preferences.

What’s Captured:
Viewing habits, subscription details, social media engagement, and user
feedback.
Applications:
• Content Personalization: Recommend movies, songs, or games based on
user preferences.
• Engagement Analytics: Identify what content performs well to optimize
strategies.
Example: Netflix uses data analytics to recommend shows based on viewing
history.
7. Industrial Data:
Collected from robotics, manufacturing systems, and supply chains,
industrial data is critical for process optimization.

What’s Captured:
Production efficiency, inventory levels, shipment statuses, and machine
performance.
Applications:
• Supply Chain Optimization: Ensure timely delivery of goods by
monitoring logistics.
• Quality Assurance: Analyze production data to maintain high standards.
Example: Automotive companies monitor assembly line data to detect
defects early.
8. Scientific Research Data:
Fields like genomics, climate studies, and astronomy generate extensive
datasets from experiments and observations.

What’s Captured:
Satellite imagery, genome sequences, and experimental data.
Applications:
• Climate Models: Predict changes in weather patterns to combat global
warming.
• Medical Research: Develop new treatments or drugs using genomic
data.
Example: Space agencies use satellite data to monitor planetary
conditions.
What are the Main Components of Big Data?
Organizations that integrate the following components effectively can
unlock the potential of big data.
1. Data Sources
What It Includes:
Social media interactions, IoT devices, business transactions, and
customer feedback.
Purpose:
Provide the raw data required for analysis.
2. Data Storage
Key Systems:
• Hadoop Distributed File System (HDFS): For distributed and scalable storage.
• Data Lakes: Store large volumes of unstructured and semi-structured data.
• Cloud Storage: Solutions like Azure, AWS, and Google Cloud for flexible storage.
Purpose:
Organize and securely store data for easy access.
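As one hedged illustration of cloud storage (not prescribed by the slides), the sketch below uploads a local file to an S3 bucket with boto3; the bucket, prefix, and file names are invented, and valid AWS credentials are assumed to be configured:

    import boto3

    # Assumes AWS credentials are already configured (e.g. via environment variables).
    s3 = boto3.client("s3")

    # Bucket and object key are hypothetical; S3 stores unstructured blobs
    # (images, logs, CSVs) side by side, which suits a data-lake layout.
    s3.upload_file("sensor_readings.csv", "example-data-lake", "raw/sensor_readings.csv")

    # List what landed under the raw/ prefix to confirm the upload.
    response = s3.list_objects_v2(Bucket="example-data-lake", Prefix="raw/")
    for obj in response.get("Contents", []):
        print(obj["Key"], obj["Size"])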

3. Data Processing
Techniques:
• Batch Processing: Tools like MapReduce process large data sets in chunks.
• Real-Time Streaming: Platforms like Apache Spark handle live data streams.
Purpose:
Convert raw data into structured and actionable formats.
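For flavor, here is a minimal batch-style PySpark sketch (an illustration only; the file and column names are invented, and a local Spark installation is assumed) that aggregates raw transaction records into a summarized form:

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # transactions.csv and its columns (store_id, amount) are hypothetical.
    spark = SparkSession.builder.appName("BatchProcessingSketch").getOrCreate()

    raw = spark.read.csv("transactions.csv", header=True, inferSchema=True)

    # Batch-style aggregation: total sales per store, computed over the whole file.
    summary = (raw.groupBy("store_id")
                  .agg(F.sum("amount").alias("total_sales"),
                       F.count("*").alias("num_transactions")))

    summary.show()
    spark.stop()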
4. Data Analytics
Methods Used:
Statistical models, machine learning algorithms, and predictive analytics.
Tools:
Python libraries like Pandas and Scikit-learn, and platforms like SAS and Tableau.
Purpose:
Derive insights, identify trends, and make data-driven predictions.
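A toy predictive-analytics sketch with Pandas and scikit-learn (the numbers are made up purely for illustration) might look like this:

    import pandas as pd
    from sklearn.linear_model import LinearRegression

    # Made-up monthly data: advertising spend (in $1000s) versus units sold.
    df = pd.DataFrame({"ad_spend": [10, 15, 20, 25, 30, 35],
                       "units_sold": [120, 150, 205, 240, 310, 330]})

    # Fit a simple linear model to capture the trend.
    model = LinearRegression()
    model.fit(df[["ad_spend"]], df["units_sold"])

    # Predict units sold for a planned spend of $40k.
    print(model.predict(pd.DataFrame({"ad_spend": [40]})))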
5. Data Visualization
How It’s Done:
Dashboards, heatmaps, and interactive graphs using tools like Power BI and
Tableau.
Purpose:
Present findings in an understandable way to help decision-makers.
How Does Big Data Analytics Work?
Big data analytics involves transforming vast amounts of raw data into
actionable insights. Here's a clear and concise step-by-step explanation:
1. Data Collection
What Happens: Data is gathered from diverse sources like:
• Social media platforms.
• Internet of Things (IoT) devices.
• Business databases.
• Online transactions.
Goal: Compile data in all formats—structured, unstructured, and semi-
structured—for analysis.
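As a small, hypothetical illustration of collection (the endpoint URL is invented), data can be pulled from a web API and kept in its raw JSON form for the later steps:

    import json
    import requests

    # Hypothetical REST endpoint; in practice this could be a social media or IoT API.
    response = requests.get("https://api.example.com/v1/events", params={"limit": 100})
    response.raise_for_status()

    events = response.json()   # semi-structured JSON records

    # Land the raw data as-is; cleaning and structuring happen in later steps.
    with open("raw_events.json", "w") as f:
        json.dump(events, f)

    print(f"collected {len(events)} records")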
2. Data Cleaning
What Happens: Errors, duplicates, and irrelevant entries are removed. Common
tasks include:
• Fixing typos and standardizing formats.
• Filling missing values to avoid incomplete analysis.
Goal: Ensure the data is accurate and reliable for processing.
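A minimal cleaning sketch with Pandas (column names and values are invented) that removes duplicates, standardizes formats, and fills missing values:

    import pandas as pd

    # Invented raw records with typical quality problems.
    df = pd.DataFrame({
        "customer": ["Asha", "asha ", "Ravi", "Meena", "Meena"],
        "city":     ["Delhi", "delhi", None, "Chennai", "Chennai"],
        "amount":   [250.0, 250.0, 99.0, None, 310.0],
    })

    df["customer"] = df["customer"].str.strip().str.title()   # fix typos and formatting
    df["city"] = df["city"].str.title().fillna("Unknown")     # standardize and fill gaps
    df["amount"] = df["amount"].fillna(df["amount"].mean())   # fill missing numbers
    df = df.drop_duplicates()                                  # remove exact duplicates

    print(df)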
3. Data Processing
What Happens: Organize and structure data using powerful tools like:
• Apache Hadoop: For distributed storage and processing.
• Apache Spark: For faster, real-time data operations.
Goal: Convert raw data into manageable formats like tables or graphs for further
analysis.
4. Data Analysis
What Happens: Use statistical techniques and machine learning models to
extract insights. Popular methods include:
• Regression analysis for identifying trends.
• Clustering to group similar data points.
• Predictive modeling to forecast future trends.
Goal: Solve key business problems and predict outcomes.
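For example, a tiny clustering sketch with scikit-learn (synthetic numbers, purely illustrative) that groups customers by spend and visit frequency:

    import numpy as np
    from sklearn.cluster import KMeans

    # Synthetic customer features: [monthly_spend, visits_per_month].
    X = np.array([[20, 2], [25, 3], [22, 2],          # low-spend customers
                  [200, 15], [220, 18], [210, 16]])   # high-spend customers

    # Group similar customers into two clusters.
    kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

    print(kmeans.labels_)            # cluster assignment for each customer
    print(kmeans.cluster_centers_)   # average profile of each cluster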
5. Data Visualization
What Happens: Present the results in clear, intuitive visuals using tools like:
• Tableau and Power BI for creating interactive dashboards.
• Charts, heatmaps, and graphs to make data easy to understand.
Goal: Help stakeholders make informed decisions quickly.
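Dashboards in Power BI or Tableau are point-and-click tools, but the same idea can be sketched in code; the bar chart below uses matplotlib (not named in the slides) with made-up figures:

    import matplotlib.pyplot as plt

    # Made-up monthly revenue figures for illustration.
    months = ["Jan", "Feb", "Mar", "Apr"]
    revenue = [120, 150, 90, 180]   # in $1000s

    plt.bar(months, revenue, color="steelblue")
    plt.title("Monthly Revenue (illustrative data)")
    plt.ylabel("Revenue ($1000s)")
    plt.tight_layout()
    plt.show()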
