Easy ML Complete Notes 2

The document provides comprehensive notes on data representation and supervised machine learning in the context of machine learning. It covers the importance of data, types of data, data preparation processes, and various algorithms used for regression and classification tasks. Additionally, it includes examples and challenges associated with data handling and model training.

Uploaded by

umakanttc594

Machine Learning - Easy & Complete Notes

UNIT 2: DATA REPRESENTATION

1. Introduction to Data in ML

- Data = Raw facts (numbers, text, images, etc).

- Without data, machine learning models cannot be trained.

Example: Facebook's purchase of WhatsApp is often cited as a deal driven largely by the value of user data.

2. Why is Data Important?

- Good data = Good model.

- Data improves AI, Automation, and Analytics.

3. Properties of Data

- Volume, Variety, Velocity, Value, Veracity, Viability, Security, Accessibility, Integrity, Usability.

4. Types of Data (Based on Structure)

- Structured: Tables (Eg: Excel files, SQL database)

- Unstructured: No fixed format (Eg: Videos, Tweets)

- Semi-structured: Mix of both (Eg: JSON files, XML)

5. Types of Data (Based on Representation)

- Numerical Data: Numbers like Age, Income.

- Categorical (Nominal) Data: Unordered categories like Male/Female.

- Ordinal Data: Ordered categories like Size (S, M, L).
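
The difference between ordinal and unordered categories matters when data is turned into numbers for a model. The sketch below is one possible way to do it in plain Python; the names `SIZE_ORDER`, `encode_ordinal` and `one_hot` are made up for this illustration and are not from any library.

```python
# Ordinal data: the order matters, so each category is mapped to its rank.
SIZE_ORDER = {"S": 0, "M": 1, "L": 2}  # assumed ordering for the Size example


def encode_ordinal(values, order=SIZE_ORDER):
    """Replace each ordered category with its position in the ordering."""
    return [order[v] for v in values]


def one_hot(values):
    """Unordered categories: build one 0/1 column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


sizes = encode_ordinal(["M", "S", "L"])
rows, categories = one_hot(["red", "blue", "red"])
print(sizes)
print(categories, rows)
```

One-hot columns avoid implying an order that does not exist, which a single rank number would do.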

6. Types of Data (Based on Labelling)

- Labelled: Input + Correct Output (Eg: Dog Image + 'Dog' label)

- Unlabelled: Only Input (Eg: Dog Image without label)

7. Data to Information to Knowledge (Example)

- Data: Raw survey responses.


- Information: Summary reports.

- Knowledge: Use information to improve services.

8. How Data is Split

- Training Set: Model learns here.

- Validation Set: Model settings (hyperparameters) tuned here.

- Testing Set: Final model checked here on data it has never seen.
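
A three-way split can be sketched with the standard library alone. The 70/15/15 proportions and the function name `split_dataset` are assumptions chosen for illustration; the notes do not prescribe specific ratios.

```python
import random


def split_dataset(rows, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle rows and cut them into train / validation / test lists."""
    rows = list(rows)                      # copy so the caller's data is untouched
    random.Random(seed).shuffle(rows)      # fixed seed gives a reproducible split
    n_train = round(len(rows) * train_frac)
    n_val = round(len(rows) * val_frac)
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]          # whatever is left over
    return train, val, test


train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))
```

Shuffling before cutting keeps any ordering in the original file from leaking into one of the sets.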

9. Advantages of Using Data

- Better Accuracy, Automation, Personalization, Cost-saving.

10. Challenges with Data

- Poor quality data, Small data, Bias, Overfitting, Privacy issues.

11. Importance of Data Preparation

- Clean data improves model predictions.

12. Data Preparation Process

- Define problem -> Collect data -> Clean -> Analyze -> Feature engineer -> Train -> Evaluate -> Deploy -> Monitor.

13. Handling Missing Data

- Fill with Mean/Median/Mode or predict with KNN.

14. Example (Handling Missing Data)

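import pandas as pd

df = pd.read_csv("data.csv")  # load the dataset

print(df.isnull().sum())  # count missing values in each column

df.fillna(df.mean(numeric_only=True), inplace=True)  # fill numeric gaps with column means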
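
The median and mode fills named in item 13 follow the same pattern as the mean. A rough standard-library sketch, where missing values are represented as `None` and the helper name `fill_missing` is hypothetical:

```python
from statistics import mean, median, mode


def fill_missing(values, strategy="mean"):
    """Replace None entries using the mean, median, or mode of the known values."""
    known = [v for v in values if v is not None]
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](known)
    return [fill if v is None else v for v in values]


ages = [20, None, 30, 30, None, 100]
print(fill_missing(ages, "mean"))
print(fill_missing(ages, "median"))
print(fill_missing(ages, "mode"))
```

With an outlier such as 100 in the list, the median and mode fills stay closer to typical values than the mean does.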

15. Visualizing Data


- Bar Chart, Pie Chart, Line Plot, Scatter Plot, Heatmap (Use Seaborn).

UNIT 3: SUPERVISED MACHINE LEARNING

1. What is Supervised Learning?

- Learning from labelled data.

- Example: Spam email detection.

2. How it Works

- Train model on inputs and correct outputs.

Example: Input = Shape with 4 equal sides and 4 right angles -> Output = Square.

3. Types of Supervised Learning

- Regression (Predict numbers)

- Classification (Predict class/label)

4. Popular Regression Algorithms

- Linear, Polynomial, Ridge, Lasso, Decision Tree, Random Forest Regression.

5. Popular Classification Algorithms

- Logistic Regression, SVM, Decision Tree, Random Forest, KNN, Neural Networks, Naive Bayes.

6. Regression vs Classification

- Regression: Predict numbers (Eg: Salary prediction).

- Classification: Predict class (Eg: Spam detection).

7. Algorithms Quick Summary

- Linear Regression: Predict house price.

- Logistic Regression: Spam detection.

- Decision Tree: Loan approval.


- Random Forest: Disease detection.

- KNN: Movie recommendation.

- SVM: Digit recognition.

- Naive Bayes: Spam filter.

8. Example: KNN

- Find the k nearest neighbors (Eg: k = 5) -> Pick the most common class among them.
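
The KNN rule can be sketched in a few lines of plain Python. This is a minimal illustration using Euclidean distance; the function name `knn_predict` and the toy points are invented for the example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=5):
    """Return the most common label among the k training points nearest to query."""
    # Pair every training point's distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])          # nearest first
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


points = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(knn_predict(points, labels, (2, 3)))
print(knn_predict(points, labels, (8, 7)))
```

An odd k is a common choice for two-class problems because it avoids tied votes.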

9. Example: Decision Tree

- Ask questions -> Split data -> Predict outcome.
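
A full decision tree repeats the "ask a question, split the data" step many times. The sketch below shows a single step only (a one-question tree, often called a decision stump) on a made-up loan dataset; `fit_stump` and `predict_stump` are hypothetical names, and real trees usually pick splits with measures such as Gini impurity rather than the raw error count used here.

```python
from collections import Counter


def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(values, labels):
    """Find the question 'value <= t?' that misclassifies the fewest training rows.

    Assumes at least two distinct feature values.
    """
    best = None
    # Skip the largest value so that both sides of the split are non-empty.
    for t in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        left_label, right_label = majority(left), majority(right)
        errors = sum(lab != left_label for lab in left)
        errors += sum(lab != right_label for lab in right)
        if best is None or errors < best[0]:
            best = (errors, t, left_label, right_label)
    _, t, left_label, right_label = best
    return t, left_label, right_label


def predict_stump(stump, value):
    """Answer the learned question for a new value."""
    t, left_label, right_label = stump
    return left_label if value <= t else right_label


incomes = [20, 25, 30, 60, 70, 80]            # toy feature: income in thousands
approved = ["no", "no", "no", "yes", "yes", "yes"]
stump = fit_stump(incomes, approved)
print(stump, predict_stump(stump, 45))
```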

10. Example: Naive Bayes

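- Predict the class with the highest probability given the features, using Bayes' theorem and assuming the features are independent of each other.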
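
As a rough illustration of the probability idea, here is a tiny word-count Naive Bayes spam filter. The training messages are invented, `train_nb` and `predict_nb` are hypothetical names, and add-one (Laplace) smoothing is an assumption added so that unseen words do not produce zero probabilities.

```python
import math
from collections import Counter


def train_nb(docs, labels):
    """Count words per class; each doc is a list of words."""
    word_counts = {c: Counter() for c in set(labels)}
    class_counts = Counter(labels)
    for words, c in zip(docs, labels):
        word_counts[c].update(words)
    vocab = {w for words in docs for w in words}
    return word_counts, class_counts, vocab


def predict_nb(model, words):
    """Pick the class with the highest log-probability for the given words."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for c in class_counts:
        # log P(class) + sum of log P(word | class), with add-one smoothing
        score = math.log(class_counts[c] / total_docs)
        total_words = sum(word_counts[c].values())
        for w in words:
            score += math.log((word_counts[c][w] + 1) / (total_words + len(vocab)))
        scores[c] = score
    return max(scores, key=scores.get)


docs = [["win", "money", "now"], ["free", "money"],
        ["meeting", "at", "noon"], ["lunch", "at", "noon"]]
labels = ["spam", "spam", "ham", "ham"]
model = train_nb(docs, labels)
print(predict_nb(model, ["free", "money", "now"]))
print(predict_nb(model, ["meeting", "noon"]))
```

Logarithms are used so that multiplying many small probabilities does not underflow to zero.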

11. Logistic Regression

- Used mainly for binary classification (Eg: Spam / Not spam).

12. Random Forest

- Combines the predictions of many decision trees (majority vote or average) to improve accuracy.
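
The combining step can be sketched as a simple majority vote. The three "trees" below are hand-written stand-ins with made-up rules, not trained models; in an actual random forest each tree is typically trained on a random sample of the rows and features, which this sketch leaves out.

```python
from collections import Counter


def forest_predict(trees, x):
    """Ask every tree for a prediction and return the most common answer."""
    votes = [tree(x) for tree in trees]
    return Counter(votes).most_common(1)[0][0]


# Three hypothetical hand-written trees for a toy disease-detection task.
def tree_a(patient):
    return "sick" if patient["temp"] > 38.0 else "healthy"


def tree_b(patient):
    return "sick" if patient["cough"] and patient["temp"] > 37.5 else "healthy"


def tree_c(patient):
    return "sick" if patient["cough"] else "healthy"


trees = [tree_a, tree_b, tree_c]
print(forest_predict(trees, {"temp": 37.8, "cough": True}))
print(forest_predict(trees, {"temp": 38.5, "cough": False}))
```

A single tree's mistake is outvoted as long as most of the other trees get the case right.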

13. Simple Formulae

- Linear Regression: y = mx + b (m = slope, b = intercept)

- Logistic Regression: sigmoid(z) = 1 / (1 + e^(-z)), which turns any number z into a probability between 0 and 1.
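
Both formulae can be tried out directly. The sketch below fits m and b with the ordinary least-squares formulas for one input variable and evaluates the sigmoid; the function names and the sample points are chosen only for illustration.

```python
import math


def fit_line(xs, ys):
    """Least-squares slope m and intercept b for y = m*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    m = covariance / variance
    b = mean_y - m * mean_x
    return m, b


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


m, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])   # points that lie on y = 2x + 1
print(m, b)
print(sigmoid(0), sigmoid(4), sigmoid(-4))
```

In logistic regression the sigmoid is applied to a linear expression such as mx + b, and the result is commonly thresholded at 0.5 to choose a class.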
