Machine Learning
Machine learning is the field of study that gives computers the ability to learn without being
explicitly programmed, i.e. it describes how computers perform tasks on their own by learning
from previous experience.
e.g. when we open Amazon/Flipkart for the first time, we search for some particular category of
product. The second time we log in, the home screen shows products based on our previous
search (i.e. the machine automatically sends us suggestions).
By observing our searches and our choices, the machine learns something. It finds a pattern,
and from the second visit onwards it automatically displays products according to that
pattern.
ML examples: email spam recognition, spell check, YouTube video recommendations
Traditional programs: the machine works only according to the instructions given by you.
Suppose a machine has to predict whether a customer will buy a particular product or not, say
antibiotics. If he buys antibiotics every year, then there is a high probability that the
customer is going to buy antibiotics this year as well. This is how machine learning works at
the basic conceptual level.
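As a rough illustration of the antibiotics example, the sketch below estimates the probability
of a purchase this year as the fraction of past years in which the customer bought the product.
The function name, the purchase history and the 0.5 threshold are all made up for illustration;
a real model would use many more features.

```python
def purchase_probability(history):
    """Estimate P(buy this year) as the share of past years with a purchase.

    `history` is a list of 0/1 flags, one per past year (hypothetical data).
    """
    if not history:
        return 0.0  # no past experience, so nothing to learn from
    return sum(history) / len(history)


# Hypothetical customer: bought antibiotics in 4 of the last 5 years.
past_years = [1, 1, 1, 0, 1]
probability = purchase_probability(past_years)
will_buy = probability >= 0.5  # assumed decision threshold
print(probability, will_buy)   # 0.8 True
```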
Some real-life examples where ML has been used:
UBER: different pricing in real time based on 1. demand, 2. number of cars available, 3. bad
weather, 4. rush hour. It uses a predictive model to predict where the demand will be high.
Applications of ML:
1. Virtual personal assistants: Google Assistant, Alexa, Cortana, Siri
They record what you say and send it to a server, which is usually in the cloud. The server
decodes it with the help of machine learning and neural networks and gives you the output.
This system cannot work without the internet, because the server cannot be contacted.
2. Traffic prediction: e.g. Dum Dum to Sealdah. Here the map shows the path to take to get to
Sealdah. The map is a combination of red, yellow and blue.
Blue regions signify a clear road, i.e. you won't encounter traffic there.
Yellow indicates slightly congested roads, and red means heavily congested.
So Google Maps predicts where the traffic is clear, slow-moving or highly congested, based
on two measures:
○ Average time taken on specific dates at specific times on that route
○ Real-time location data of vehicles from the Google Maps application and sensors
Some other popular map services: Bing Maps, MAPS.ME, HERE WeGo
3. Social media personalisation: say I want to buy a smart watch on Amazon that costs
something like 3k, but I don't want to buy it right now. The next time I go to Facebook, I see
an advertisement for that product; even on YouTube and Instagram I see advertisements for
the same kind of product. So here, with the help of machine learning, Google understands that
I am interested in that particular product, and hence it targets me with this advertisement.
4. Email spam filtering: Suppose there is spam in my inbox. How does Gmail know what is
spam and what is not? Gmail has an entire collection of emails which have already been
labelled as spam/not spam. After analysing this data, Gmail is able to find some
characteristics of spam, so that any new email that comes to your inbox goes through those
spam filters.
Gmail is one of the many popular email providers that have inbuilt spam filters.
Spam filters are of the following types:
1. Content filters
2. Header filters
3. General blacklist filters
4. Rules-based filters
5. Permission filters
6. Challenge-response filters
5. Online fraud detection:
Fraud risks:
1. Identity theft: where they steal your identity
2. Fake accounts: these accounts exist only for as long as the transaction takes place and
stop existing after that
3. Man-in-the-middle attacks: they steal your money while the transaction is taking place
How do we make machines intelligent? A machine is intelligent when it learns by itself. There
are three different types of learning: supervised, unsupervised and reinforcement.
Supervised Learning:
Supervised learning uses labelled data to train the model. Labelled data means the output is
already known to you. The model just needs to map the inputs to the outputs.
Ex: training a machine to identify the image of an animal, or coins of different currencies.
Training data means input and output. We have a supervisor/teacher here in the form of the
training data, and the teacher gives instruction. Because the training data has both input and
output, we build a model from these pairs and check whether it gives valid output or not.
Supervised learning is when the model is trained on a labelled dataset. A labelled
dataset is one that has both input and output parameters. The process involves both training
and testing (validation). While training the model, data is usually split in the ratio 80:20,
i.e. 80% as training data and the rest as testing data. For the training data, we feed the
input as well as the output. The model learns from the training data only. Once the model is
ready, it can be tested. At the time of testing, the input is fed from the remaining 20% of
data that the model has never seen before; the model predicts some value, and we compare it
with the actual output and calculate the accuracy.
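The 80:20 workflow above can be sketched in plain Python. The dataset (hours studied against
pass/fail), the helper names and the very simple "threshold" model are assumptions chosen to
keep the example short; they only illustrate the split, train, test and accuracy steps.

```python
import random


def train_test_split(rows, test_ratio=0.2, seed=0):
    """Shuffle the rows, then keep `test_ratio` of them aside for testing."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_ratio)
    return rows[n_test:], rows[:n_test]


def fit_threshold(train):
    """'Train' a one-number model: the midpoint between the two class means."""
    zeros = [x for x, y in train if y == 0]
    ones = [x for x, y in train if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


def accuracy(threshold, rows):
    """Share of rows whose predicted label matches the actual label."""
    correct = sum(1 for x, y in rows if int(x > threshold) == y)
    return correct / len(rows)


# Hypothetical labelled data: (hours studied, passed exam 0/1).
data = [(h, int(h > 10)) for h in range(1, 21)]

train, test = train_test_split(data)   # 80% training, 20% testing
model = fit_threshold(train)           # learns from the training data only
print(len(train), len(test), accuracy(model, test))
```

The accuracy is computed only on the 20% the model has never seen, which is the point of
keeping the test data aside.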
Unsupervised Learning:
Unsupervised learning uses unlabelled data to train machines. Unlabelled data means there
is no fixed output variable. The model learns from the data, discovers patterns and
features in it, and returns the output.
Ex: here the machine uses images of vehicles to separate buses from trucks. The model
learns by identifying features of the vehicle such as its length and width, the
front and rear end covers, roof hoods, types of wheels used, etc. Based on these
features the model groups the vehicles as buses or trucks, without ever being given the labels.
Example: the movie PK
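The bus/truck example can be sketched with k-means clustering, one common unsupervised
algorithm. The measurements below (length and width in metres) are invented, and the function
is a minimal two-cluster version written for this note, not a library implementation. Note that
the algorithm only returns group numbers 0 and 1; it never sees the words "bus" or "truck".

```python
import math


def kmeans_two_groups(points, iterations=10):
    """Minimal k-means for k = 2; starts from the first and last point."""
    centroids = [points[0], points[-1]]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(2), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(2):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids


# Hypothetical (length, width) measurements; no labels are provided.
vehicles = [(12.0, 2.5), (11.5, 2.5), (12.2, 2.6),
            (7.0, 2.4), (6.5, 2.3), (7.5, 2.4)]

labels, centroids = kmeans_two_groups(vehicles)
print(labels)  # [0, 0, 0, 1, 1, 1]
```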
Reinforcement Learning:
Reinforcement learning is a feedback-based learning method, in which a learning
agent gets a reward for each right action and gets a penalty for each wrong action.
The agent learns automatically from this feedback and improves its performance.
In reinforcement learning, the agent interacts with the environment and explores it.
The goal of an agent is to get the most reward points, and hence, it improves its
performance.
1. RL in Marketing
2. RL in Broadcast Journalism
3. RL in Healthcare
4. RL in Robotics
5. RL in Gaming
6. RL in Image Processing
7. RL in Manufacturing
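The reward-driven loop described above can be sketched with tabular Q-learning. The environment
here is an assumption made for illustration: a corridor of 5 cells where the agent starts on the
left and gets a reward of 1 for reaching the right end. The learning rate, discount factor and
exploration rate are arbitrary example values.

```python
import random


def train_q_table(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                  epsilon=0.2, seed=0):
    """Tabular Q-learning on a corridor; actions: 0 = left, 1 = right."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Explore at random sometimes (and on ties), otherwise exploit.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Move the estimate towards reward + discounted best future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train_q_table()
policy = [row.index(max(row)) for row in q_table[:-1]]  # best action per cell
print(policy)  # 1 means "move right"
```

A penalty for wrong actions would simply be a negative reward in the same update rule.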
Choose the right algorithm depending on the type of problem you are trying to solve.
Supervised learning algorithms:
1. Linear regression
2. Logistic regression
3. Support vector machines
4. K-nearest neighbours
5. Decision tree
6. Random forest
7. Naive Bayes
Unsupervised learning algorithms:
1. K-means clustering
2. Hierarchical clustering
3. DBSCAN
4. Principal component analysis
Reinforcement learning algorithms:
1. Q-learning
2. Monte Carlo
3. SARSA
4. Deep Q-network
Data preprocessing in ML:
○ Importing libraries
○ Importing datasets
○ Feature scaling
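The three steps above can be sketched with the Python standard library only. The CSV content
and column names are hypothetical and are inlined so the example is self-contained; in practice
the dataset would be read from a file. Two common scaling methods are shown: min-max scaling
and standardization.

```python
import csv          # importing libraries
import io
import statistics

# Importing the dataset: a hypothetical CSV, inlined for the example.
raw = """age,salary
25,30000
35,60000
45,90000
"""
rows = list(csv.DictReader(io.StringIO(raw)))
ages = [float(r["age"]) for r in rows]
salaries = [float(r["salary"]) for r in rows]


def min_max_scale(values):
    """Rescale values linearly into [0, 1] (assumes they are not all equal)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to mean 0 and scale to standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


print(min_max_scale(ages))      # [0.0, 0.5, 1.0]
print(min_max_scale(salaries))  # [0.0, 0.5, 1.0]
print(standardize(ages))
```

After scaling, age and salary are on the same scale, so the feature with larger raw numbers
does not dominate the other.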
Regression:
Regression is used when the output value is real/continuous. Regression algorithms are used to
predict continuous values such as price, salary, age, temperature, etc.
Example: Suppose we want to do weather forecasting; for this we will use a regression
algorithm. In weather prediction, the model is trained on past data, and once the
training is completed, it can predict the weather (e.g. the temperature) for future days.
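A minimal sketch of regression, assuming simple linear regression with one input and a
least-squares fit. The day numbers and temperatures are invented and lie exactly on a line so
the result is easy to check by hand; real weather data would be noisy and need more features.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical past data: day number -> temperature in degrees Celsius.
days = [1, 2, 3, 4, 5]
temps = [21.0, 23.0, 25.0, 27.0, 29.0]

slope, intercept = fit_line(days, temps)
tomorrow = slope * 6 + intercept   # continuous-valued prediction for day 6
print(slope, intercept, tomorrow)  # 2.0 19.0 31.0
```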
Classification:
Classification algorithms are used to predict/classify discrete values such as 0 or 1, Yes
or No, Male or Female, True or False, Spam or Not Spam.
Example: The best example to understand the classification problem is email spam
detection. The model is trained on millions of emails using different parameters,
and whenever it receives a new email, it identifies whether the email is spam or not. If the
email is spam, then it is moved to the Spam folder.
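A toy version of spam detection, assuming a Naive Bayes classifier over word counts with
Laplace smoothing. The four training emails are invented, and a real filter would be trained
on far more data and use many more signals than the words alone.

```python
import math
from collections import Counter

LABELS = ("spam", "not spam")


def train_naive_bayes(emails):
    """Count words per class from (text, label) pairs."""
    word_counts = {label: Counter() for label in LABELS}
    class_counts = Counter()
    for text, label in emails:
        class_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, class_counts


def classify(text, word_counts, class_counts):
    """Return the class with the higher log-probability for this text."""
    vocab = set().union(*word_counts.values())
    scores = {}
    for label in LABELS:
        total = sum(word_counts[label].values())
        score = math.log(class_counts[label] / sum(class_counts.values()))
        for word in text.lower().split():
            # Laplace smoothing: unseen words get a small non-zero probability.
            score += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Tiny hypothetical labelled training set.
training = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("lunch on monday", "not spam"),
]
model = train_naive_bayes(training)
print(classify("claim your free money", *model))  # spam
print(classify("monday meeting", *model))         # not spam
```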
Regression vs. Classification:
Regression | Classification
The task of the regression algorithm is to map the input value (x) to the continuous output variable (y). | The task of the classification algorithm is to map the input value (x) to the discrete output variable (y).
Regression algorithms are used with continuous data. | Classification algorithms are used with discrete data.
In regression, we try to find the best-fit line, which can predict the output more accurately. | In classification, we try to find the decision boundary, which can divide the dataset into different classes.