AWS Certified AI Practitioner DAY-2

Uploaded by

adefault720
Session Topic: ML Problem Types & Use Cases

1. Introduction to Machine Learning Problem Types


Machine learning problems can be broadly categorized by the task to be solved and by
whether the training data carries labels. The four major problem types are Classification
and Regression (supervised, trained on labeled data), Clustering (unsupervised, no labels),
and Reinforcement Learning (RL, learning from reward signals). Understanding these
distinctions is critical for choosing the correct algorithm and approach in real-world scenarios.

2. Classification

• Definition: Classification is a supervised learning technique where the goal is to
predict categorical outcomes (discrete labels) based on input features.
• Examples:
  o Email spam detection (spam or not spam)
  o Sentiment analysis (positive, negative, neutral)
  o Fraud detection (fraudulent or legitimate transaction)
• Common Algorithms: Logistic Regression, Decision Trees, Random Forest, Support
Vector Machines, Neural Networks
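
To make the definition above concrete, here is a minimal sketch of a classifier that maps numeric features to discrete labels. The two features (link count and exclamation-mark count per email) and the six training rows are invented for illustration, and a decision tree is just one of the algorithms listed above.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy features per email: [number of links, number of "!" characters]
X = [[0, 0], [1, 0], [0, 1], [5, 3], [7, 4], [6, 2]]
# Discrete labels the model should learn to predict
y = ["not spam", "not spam", "not spam", "spam", "spam", "spam"]

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, y)

# Classify two unseen emails: one link-heavy, one plain
predictions = clf.predict([[8, 5], [0, 0]])
print(list(predictions))
```

The output is always one of the labels seen during training, which is what distinguishes classification from regression.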

3. Regression

• Definition: Regression is a supervised learning task used to predict a continuous
numeric value based on input data.
• Examples:
  o Predicting house prices based on size, location, etc.
  o Forecasting stock prices
  o Estimating sales revenue
• Common Algorithms: Linear Regression, Ridge/Lasso Regression, Gradient
Boosting, Neural Networks
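
As a rough illustration of two of the algorithms listed above, the sketch below fits plain Linear Regression and Ridge Regression to the same data. The house sizes and prices are made-up values, and `alpha=100.0` is an arbitrary choice that makes the effect of regularization visible.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Hypothetical toy data: house size in square metres -> price in thousands of dollars
size = np.array([[50], [70], [90], [110], [130]])
price = np.array([150, 200, 260, 310, 370])

ols = LinearRegression().fit(size, price)
ridge = Ridge(alpha=100.0).fit(size, price)  # alpha is an illustrative value

ols_prediction = ols.predict([[100]])[0]
print("Linear Regression slope:", ols.coef_[0])
print("Ridge slope:            ", ridge.coef_[0])
print("Linear Regression price estimate for 100 m^2:", ols_prediction)
```

Both models return a continuous number, not a label. Ridge shrinks the slope toward zero, which trades a little bias for stability when features are noisy or correlated.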

4. Clustering

• Definition: Clustering is an unsupervised learning method used to group similar
data points into clusters without predefined labels.
• Examples:
  o Customer segmentation for marketing
  o Grouping documents by topics
  o Image segmentation
• Common Algorithms: K-Means, DBSCAN, Hierarchical Clustering
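
The algorithms listed above behave differently on the same data. The sketch below, on nine invented points, contrasts K-Means (which assigns every point to one of k clusters) with DBSCAN (which can mark isolated points as noise, labeled -1). The `eps` and `min_samples` values are assumptions chosen to suit this toy data.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans

points = np.array([
    [0, 0], [0, 1], [1, 0], [1, 1],          # tight group A
    [10, 10], [10, 11], [11, 10], [11, 11],  # tight group B
    [5, 50],                                 # isolated outlier
], dtype=float)

# K-Means needs the number of clusters up front and labels every point
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(points)

# DBSCAN infers clusters from density; points with too few neighbours get label -1
dbscan_labels = DBSCAN(eps=1.5, min_samples=3).fit_predict(points)

print("K-Means labels:", kmeans_labels)
print("DBSCAN labels: ", dbscan_labels)
```

Neither algorithm is given any labels. The grouping comes only from distances between points.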

5. Reinforcement Learning (RL)

• Definition: RL is a learning paradigm where an agent learns to make decisions by
interacting with an environment to maximize cumulative rewards.
• Examples:
  o Game playing (e.g., AlphaGo)
  o Autonomous driving
  o Robotics control systems
• Key Components: Agent, Environment, States, Actions, Rewards, Policy
• Common Algorithms: Q-Learning, Deep Q-Network (DQN), Policy Gradient
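
The components above can be sketched with tabular Q-Learning on a tiny invented environment: a corridor of five cells where the agent starts in cell 0 and receives a reward of 1 only on reaching cell 4. The environment, the learning rate, and the discount factor are all illustrative assumptions. Because Q-Learning is off-policy, this sketch explores with uniformly random actions and reads the learned policy off the Q-table afterwards.

```python
import numpy as np

N_STATES = 5                  # corridor cells 0..4; cell 4 is the goal
ACTIONS = ["left", "right"]   # action index 0 = left, 1 = right
ALPHA, GAMMA = 0.5, 0.9       # learning rate and discount factor (illustrative)

def step(state, action):
    """Environment: move one cell; reward 1.0 only when the goal is reached."""
    nxt = max(state - 1, 0) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

rng = np.random.default_rng(0)
Q = np.zeros((N_STATES, len(ACTIONS)))   # one value per (state, action) pair

for episode in range(500):
    state = 0
    for _ in range(200):                  # cap on episode length
        action = int(rng.integers(len(ACTIONS)))   # random exploration
        nxt, reward, done = step(state, action)
        # Q-Learning update: move Q toward reward + discounted best next value
        target = reward + (0.0 if done else GAMMA * Q[nxt].max())
        Q[state, action] += ALPHA * (target - Q[state, action])
        state = nxt
        if done:
            break

# Greedy policy: best-valued action in each non-goal cell
policy = [ACTIONS[int(np.argmax(Q[s]))] for s in range(N_STATES - 1)]
print(policy)
```

Here the agent is the update loop, the environment is `step`, and the policy is the greedy choice over the learned Q-table.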

Business Use Cases

• Classification: Fraud detection in banking
• Regression: Sales forecasting for retail businesses
• Clustering: Market segmentation in e-commerce
• RL: Dynamic pricing strategies in airlines and hotels

Activity – Case Study Walkthrough

• Case Study:
  o Scenario: A retail company wants to understand customer behavior, forecast
    sales, and detect fraudulent transactions.
  o Step 1: Apply Clustering to segment customers based on purchase behavior.
  o Step 2: Use Regression for predicting next month’s sales.
  o Step 3: Use Classification for fraud detection in transactions.
• Discussion Questions:
  1. Which algorithm would you choose for each task?
  2. What data would you need?
  3. How do these approaches help in decision-making?

Practical:
STEP 1 – CLUSTERING
# Install dependencies (notebook cell; drop the leading "!" in a regular shell)
!pip install scikit-learn matplotlib pandas --quiet

import pandas as pd
import numpy as np
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

# Generate sample data: [Annual Spending, Purchase Frequency]
np.random.seed(42)
data = np.random.rand(100, 2) * 100
df = pd.DataFrame(data, columns=['Annual Spending', 'Purchase Frequency'])

# Apply K-Means clustering
# (n_init is set explicitly so results do not depend on the scikit-learn version)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
df['Cluster'] = kmeans.fit_predict(df[['Annual Spending', 'Purchase Frequency']])

# Visualize clusters
plt.figure(figsize=(6, 4))
plt.scatter(df['Annual Spending'], df['Purchase Frequency'],
            c=df['Cluster'], cmap='viridis')
plt.xlabel("Annual Spending ($)")
plt.ylabel("Purchase Frequency")
plt.title("Customer Segmentation")
plt.show()
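
The clustering step above fixes the number of clusters at 3. One common way to check that choice, sketched here as an optional extra, is to compare silhouette scores for several values of k on the same synthetic data. The range 2 to 6 is an arbitrary assumption.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Same synthetic data as the clustering step
np.random.seed(42)
data = np.random.rand(100, 2) * 100

# Silhouette score ranges from -1 to 1; higher means better-separated clusters
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(data)
    scores[k] = silhouette_score(data, labels)

for k, score in scores.items():
    print(f"k={k}: silhouette={score:.3f}")
```

Because the sample data is uniformly random, it has no real group structure, so the scores are likely to be modest for every k. Real customer data would normally show a clearer preference.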

STEP 2 – REGRESSION
# numpy and matplotlib are re-imported so this step also runs on its own
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Generate sample monthly sales data
months = np.arange(1, 13).reshape(-1, 1)
sales = np.array([200, 220, 250, 270, 300, 320, 350, 370, 400, 420,
                  450, 470])

# Train model
model = LinearRegression()
model.fit(months, sales)

# Predict for month 13
next_month_sales = model.predict([[13]])
print(f"Predicted sales for month 13: ${next_month_sales[0]:.2f}")

# Visualization
plt.figure(figsize=(6, 4))
plt.scatter(months, sales, color='blue', label="Actual Sales")
plt.plot(months, model.predict(months), color='red', label="Regression Line")
plt.scatter([13], next_month_sales, color='green', label="Predicted Sales")
plt.xlabel("Month")
plt.ylabel("Sales ($)")
plt.title("Sales Forecasting")
plt.legend()
plt.show()
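
The regression step above prints a forecast but does not measure how well the line fits. A minimal sketch of one way to do that, using the same twelve data points and two standard scikit-learn metrics, follows. Note that both scores are computed on the training data, so they are optimistic.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score

# Same monthly sales data as the regression step
months = np.arange(1, 13).reshape(-1, 1)
sales = np.array([200, 220, 250, 270, 300, 320, 350, 370, 400, 420, 450, 470])

model = LinearRegression().fit(months, sales)
fitted = model.predict(months)

r2 = r2_score(sales, fitted)              # share of variance explained by the line
mae = mean_absolute_error(sales, fitted)  # average absolute error in dollars
forecast = model.predict([[13]])[0]

print(f"R^2: {r2:.4f}")
print(f"MAE: ${mae:.2f}")
print(f"Forecast for month 13: ${forecast:.2f}")
```

For a real forecast, the more honest check is to hold out the most recent months and score the model on those.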

STEP 3 – CLASSIFICATION
# numpy is re-imported so this step also runs on its own
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

# Generate sample transaction data
np.random.seed(42)
amounts = np.random.rand(100, 1) * 500
# Fraud if amount > $400 (a synthetic rule, so the labels are unrealistically clean)
fraud_labels = (amounts > 400).astype(int).ravel()

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    amounts, fraud_labels, test_size=0.2, random_state=42)

# Train classifier
clf = LogisticRegression()
clf.fit(X_train, y_train)

# Predictions
y_pred = clf.predict(X_test)

print("Classification Report:")
print(classification_report(y_test, y_pred))
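
Fraud data is usually imbalanced, and in the sample above only about one transaction in five is labeled fraudulent. The following variant is a sketch of two optional refinements, not part of the original walkthrough: `stratify` keeps the fraud ratio the same in the train and test sets, and a confusion matrix shows the counts behind the report. The `max_iter=1000` setting is an assumption that gives the solver room to converge on unscaled amounts.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Same synthetic transactions as the classification step
np.random.seed(42)
amounts = np.random.rand(100, 1) * 500
fraud_labels = (amounts > 400).astype(int).ravel()

# stratify preserves the fraud / non-fraud ratio in both splits
X_train, X_test, y_train, y_test = train_test_split(
    amounts, fraud_labels, test_size=0.2, random_state=42, stratify=fraud_labels
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Rows are true classes, columns are predicted classes, in the order [0, 1]
cm = confusion_matrix(y_test, clf.predict(X_test), labels=[0, 1])
print(cm)
```

The bottom-left cell counts fraudulent transactions predicted as legitimate, which is usually the most costly error in a fraud use case.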
