Big Data & Machine Learning Guide

Step 2 Big Data Analytics and Machine Learning

Presented by:

Duban Olmedo Palacio Osorio

Course:

BIG DATA INTEGRATION

Code: 203008077A_1702

Instructor:

Gean Alberto Diaz Sepúlveda

Escuela de Ciencias Básicas, Tecnología e Ingeniería

Universidad Nacional Abierta y a Distancia (UNAD)

Medellín, May 30, 2024

Introduction

In today’s world, the amount of data generated is growing exponentially. From
financial transactions to social media posts, IoT sensors, and medical records, we are
inundated with information. Big Data refers to the management and analysis of these
massive volumes of data, while Machine Learning allows us to extract knowledge
and make decisions based on that data.

Although they rely on different methods, Machine Learning and Big Data complement
each other. Machine Learning, a branch of Artificial Intelligence, enables computers
to learn from the data that Big Data systems collect and process. Together, they
automate tasks and enable more precise decision-making.

Activity 1. Conceptual Map
For the development of this exercise, it is necessary to review the references in the
Learning Environment (Unit 1 - Historical interpretation and review of Big
Data - Contents and bibliographic references).
After reviewing the suggested references, the student must make a conceptual map
with the following concepts:
• Business Analytics.
• Data and Statistical Methods.
The conceptual map can be built with tools such as CmapTools, GoConqr, or
PowerPoint, among others; it must then be shared in the discussion forum.

Activity 2. Description of Data domain
Using an illustrative scheme, portray a Venn diagram of the 5 Vs of Big Data
(Volume, Variety, Velocity, Veracity, and Value), covering the following points for
the statistics domain:
• Description of Data processing.
• Data analysis.
• Data visualization.

Activity 3. Description of Data training, validation, and test
Taking into account the bibliographic references and other sources, the student
must create a 3-slide presentation explaining the training, validation, and test
datasets.
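
As a minimal sketch of the idea behind the three datasets, the snippet below shuffles a dataset and splits it into training, validation, and test portions. The function name `train_val_test_split`, the 70/15/15 proportions, and the fixed seed are illustrative assumptions, not values taken from the course material.

```python
import random

def train_val_test_split(records, val_fraction=0.15, test_fraction=0.15, seed=42):
    """Shuffle records and split them into training, validation, and test lists.

    The fractions and the seed are assumptions chosen for illustration.
    """
    if not 0 <= val_fraction + test_fraction < 1:
        raise ValueError("validation and test fractions must sum to less than 1")
    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split reproducible
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]                # held back for the final evaluation
    val = shuffled[n_test:n_test + n_val]   # used to tune and compare models
    train = shuffled[n_test + n_val:]       # used to fit the model
    return train, val, test

train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))
```

The model is fitted on the training portion, tuned against the validation portion, and scored once on the test portion, which it has never seen during fitting or tuning.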

Activity 4. The distinction of Machine Learning for computer processing
Based on the references, make a comparison chart that explains the importance of
Machine Learning for computer processing, covering Unsupervised, Supervised, and
Reinforcement Learning.

| Learning Type | Description | Examples of Use |
|---|---|---|
| Supervised Learning | The algorithm is trained on a labeled dataset in which the target variable is known. The goal is to predict or classify new data based on observed features. | Predicting house prices from their features (regression). Classifying email as spam or not spam (classification). |
| Unsupervised Learning | The algorithm works with unlabeled data and looks for hidden patterns or structures within it. No correct answer is provided in advance. | Customer segmentation based on shopping habits (clustering). Dimensionality reduction for data visualization (PCA). |
| Reinforcement Learning | A learning agent interacts with an environment and takes actions to maximize cumulative reward. It is based on trial and error. | Training agents to play games (e.g., AlphaGo for the board game Go). Optimizing drone delivery routes. |
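
The three learning types in the chart can be contrasted with a small sketch. The code below uses toy data invented for illustration, and the function names (`fit_line`, `two_means`, `epsilon_greedy`) are hypothetical. Least squares stands in for supervised regression, a two-cluster k-means for unsupervised clustering, and an epsilon-greedy two-armed bandit for reinforcement learning, which is a deliberately simple stand-in for a full environment.

```python
import random

def fit_line(xs, ys):
    """Supervised: learn y = slope * x + intercept from labeled (x, y) pairs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def two_means(values, iterations=10):
    """Unsupervised: group unlabeled 1-D values into two clusters (k-means, k=2)."""
    centers = [min(values), max(values)]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

def epsilon_greedy(reward_probs, steps=2000, epsilon=0.1, seed=0):
    """Reinforcement: learn by trial and error which action pays off most often."""
    rng = random.Random(seed)
    counts = [0] * len(reward_probs)
    estimates = [0.0] * len(reward_probs)
    for _ in range(steps):
        if rng.random() < epsilon:   # explore: try a random action
            action = rng.randrange(len(reward_probs))
        else:                        # exploit: pick the best estimate so far
            action = max(range(len(reward_probs)), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Incremental average of the rewards observed for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates

# Toy data, invented for this sketch.
slope, intercept = fit_line([50, 80, 100, 120], [100, 160, 200, 240])
centers = two_means([10, 12, 11, 95, 100, 105])
estimates = epsilon_greedy([0.2, 0.8])
print(slope, intercept, centers, estimates)
```

Only `fit_line` receives the correct answers (the `ys`). `two_means` sees values with no labels, and `epsilon_greedy` sees only the rewards that follow its own actions.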

Activity 5. Pass Big Data 101 and obtain the IBM accreditation.
For the development of this exercise, it is necessary to review the references in the
Learning Environment (Unit 1 - Historical interpretation and review of Big
Data - Contents and bibliographic references).
After reviewing the suggested references, the student must enroll in the Big Data 101
course on the Cognitive Class platform as a continuation of their academic
progression. The course builds on the concepts and lessons covered in the previous
learning guide activities. Completing it successfully demonstrates a solid
understanding of the material and earns the corresponding IBM certification.

Activity 6. Socialization in the Forum
You must share the development of Activity 5 in the forum and provide feedback on
the exercises that some of your classmates have shared.

Conclusions

Interconnected Areas:
Machine learning and big data are closely interconnected and play a fundamental
role in data science and business decision-making.
Big data refers to the management and analysis of massive volumes of data from
various sources, while machine learning enables knowledge extraction and informed
decision-making based on that data.
Understanding the 5 V’s of Big Data:
Big data is characterized by the 5 V’s: Volume, Variety, Velocity, Veracity, and
Value.
Machine learning processes large volumes of data and identifies patterns and trends,
adding value to the information.
Complementary Methods:
Although they rely on different methods, machine learning and big data complement
each other.
Machine learning, a branch of artificial intelligence, enables computers to learn from
data that big data systems have collected and processed.
Together, they automate tasks and improve the precision of decisions.
Hadoop and MapReduce:
Hadoop is an open-source framework for storing and processing data distributed
across clusters of servers.
MapReduce, the programming model used in Hadoop, splits a job into map and reduce
steps that run in parallel across the cluster.
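
To make the map and reduce steps concrete, here is a minimal single-process sketch of the MapReduce pattern applied to word counting. It does not use the Hadoop API. The function names and the sample documents are assumptions for illustration, and in a real cluster the map and reduce calls would run on different nodes.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    # Each document is mapped independently, which is what allows parallelism.
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

# Sample input, invented for this sketch.
counts = word_count(["big data", "big clusters process data", "data"])
print(counts)
```

Because each `map_phase` call depends only on its own document and each `reduce_phase` call only on its own key, both phases can be spread across many machines without coordination beyond the shuffle.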
Cloud Computing and Big Data:
Cloud computing provides scalable and flexible resources for data storage and
processing.
Distributed file systems (e.g., Google File System, HDFS) provide the storage layer
for big data processing, including in cloud deployments.
In summary, machine learning and big data are essential for addressing information
challenges in today’s digitalized world. Analyzing data at scale and automating
tasks can support better business decisions.

References

• Roma, J. C., Guerrero, J. N., Julbe, F., & Carrera, D. (2019). Big data: análisis de datos en entornos masivos. Editorial UOC.
• Holmes, D. E. (2018). Big Data: una breve introducción. Antoni Bosch editor.
• Chen, Y., Argentinis, J. E., & Weber, G. (2016). IBM Watson: how cognitive computing can be applied to big data challenges in life sciences research. Clinical Therapeutics, 38(4), 688-701.
• López Murphy, J. J. (2017). La ingeniería del big data: cómo trabajar con datos.
• Ríos Insua, D. (2019). Big data: Conceptos, tecnologías y aplicaciones.
• Sedkaoui, S. (2018). Data analytics and big data. John Wiley & Sons.
