Université Blida 1
Faculté des Sciences
Département d’Informatique
ISI (Ingénierie des Systèmes Intelligents)
Advanced Machine Learning
Remmide
2024/2025
Course Objectives
This subject covers advanced learning concepts to address
complex problems in data science using:
● data streams,
● incremental and constructive learning,
● reinforcement learning,
● complex neural networks,
● deep learning,
● multi-task learning, and
● transfer learning between domains.
Course content
01 Neural Network
02 CNN
03 RNN
04 Transfer Learning
05 GAN
06 Reinforcement Learning
Université Blida 1
Faculté des Sciences
Département d’Informatique
ISI (Ingénierie des Systèmes Intelligents)
Introduction to Advanced Machine Learning
Remmide
2024/2025
Artificial Intelligence
Artificial Intelligence (AI), noun: The field of computer science focused
on creating systems capable of performing tasks that typically require
human intelligence. This encompasses the theory, development, and
deployment of algorithms and models that can:
1. Process and analyze complex data
2. Learn from experience and adapt behavior
3. Recognize patterns and make predictions
4. Understand and generate natural language
5. Perceive and interpret visual information
6. Make decisions and solve problems
7. Engage in reasoning and planning
Machine Learning
Machine Learning (ML), noun: A subset of artificial intelligence that
develops algorithms and statistical models enabling computer systems
to improve their performance on a specific task through experience,
without explicit programming. ML systems learn patterns from data to
make predictions, decisions, or generate insights.
Key characteristics:
1. Data-driven approach to problem-solving
2. Ability to automatically adapt and improve with exposure to more
data
3. Focus on creating models that can generalize from examples
4. Emphasis on statistical and probabilistic methods
Machine Learning
Most ML methods follow the same general structure:
Learning from examples
3 main ingredients
1. Training set / examples: {x1, x2, ..., xN}
2. Machine or model: x → f(x; θ) → y, where f is the function / algorithm, y is the prediction, and θ are the parameters of the model
3. Loss, cost, objective function / energy: argmin_θ E(θ; x1, x2, ..., xN)
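As a minimal sketch (the 1-D linear model, the data, and the grid search below are illustrative assumptions, not course material), the three ingredients look like this in Python:

import numpy as np

# 1. Training set: N noisy samples of a 1-D linear relationship
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 2.0 * x + 0.5 + rng.normal(scale=0.1, size=100)

# 2. Machine / model: f(x; theta) with parameters theta = (a, b)
def f(x, theta):
    a, b = theta
    return a * x + b

# 3. Loss / objective: E(theta; x1, ..., xN), here the mean squared error
def E(theta):
    return np.mean((f(x, theta) - y) ** 2)

# argmin over theta, done here by a crude grid search for illustration
grid = [(a, b) for a in np.linspace(-3, 3, 61) for b in np.linspace(-1, 1, 21)]
theta_star = min(grid, key=E)
print(theta_star)  # close to the true parameters (2.0, 0.5)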
Learning from examples
Tools: Data ↔ Statistics, Loss ↔ Optimization
Goal: to extract information from the training set
● relevant for the given task,
● relevant for other data of the same kind.
Terminology
● Sample (Observation or Data): item to process (e.g., classify). Example: an individual, a document, a picture, a sound, a video...
● Features (Input): set of distinct traits that can be used to describe each sample in a quantitative manner. Represented as a multi-dimensional vector usually denoted by x. Example: size, weight, citizenship, ...
● Training set: Set of data used to discover potentially predictive
relationships.
● Validation set: Set used to adjust the model hyperparameters.
● Testing set: Set used to assess the performance of a model.
● Label (Output): The class or outcome assigned to a sample. The actual prediction is often denoted by y and the desired/targeted class by d or t. Example: man/woman, wealth, education level, ...
Learning approaches
● Unsupervised learning: Discovering patterns in unlabeled data.
Example: cluster similar documents based on the text content.
● Supervised learning: Learning with a labeled training set.
Example: email spam detector with training set of already labeled
emails.
● Semi-supervised learning: Learning with a small amount of
labeled data and a large amount of unlabeled data. Example: web
content and protein sequence classifications.
● Reinforcement learning: Learning based on feedback or reward.
Example: learn to play chess by winning or losing.
Machine learning workflow
Problem types
ML Models
Linear Regression:
A supervised learning model that assumes a linear relationship between input features and a
continuous output. It finds the best-fit line by minimizing prediction errors, serving as a simple
but interpretable model.
Logistic Regression:
A classification model that predicts the probability of class membership using a logistic function.
It's commonly used for binary classification tasks and produces easy-to-interpret, probabilistic
outputs.
Decision Trees:
A model that splits data into if-then rules based on feature values, forming a tree structure. It
works for both classification and regression, is easy to understand, but can overfit if too deep.
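As an illustration of the shared interface (assuming scikit-learn; the dataset is synthetic), the two classifiers above can be trained and evaluated identically:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification problem
X, y = make_classification(n_samples=300, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# max_depth limits tree depth, which mitigates the overfitting noted above
for model in (LogisticRegression(), DecisionTreeClassifier(max_depth=3)):
    model.fit(X_train, y_train)                               # learn from labeled data
    print(type(model).__name__, model.score(X_test, y_test))  # held-out accuracy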
ML Models
Random Forests:
An ensemble of decision trees built on random data subsets to prevent overfitting. Random
forests offer better accuracy and feature importance insights, handling high-dimensional data
well.
Support Vector Machines (SVM):
SVMs find the optimal hyperplane to separate classes in high-dimensional space. Using kernels,
they handle non-linear classification effectively, particularly when there are more features than
samples.
k-Nearest Neighbors (k-NN):
An instance-based algorithm that predicts based on the majority class (classification) or average
value (regression) of the k nearest data points. It’s intuitive but can be slow on large datasets.
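A from-scratch sketch of the k-NN prediction rule (function name and toy data are illustrative); note that every prediction scans the whole training set, which is why k-NN slows down on large datasets:

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    # Classify x by majority vote among its k nearest training points
    dists = np.linalg.norm(X_train - x, axis=1)  # distance to every training sample
    nearest = np.argsort(dists)[:k]              # indices of the k closest samples
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([0.2, 0.1])))  # -> 0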
ML Models
K-Means Clustering:
An unsupervised algorithm that groups data into k clusters based on similarity. It’s simple and
scalable but requires choosing the number of clusters in advance and assumes similar-sized
clusters.
Neural Networks:
Inspired by the brain, neural networks consist of layers of nodes that learn complex patterns.
They are powerful for tasks like image recognition and natural language processing.
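A short k-means sketch (assuming scikit-learn; the two-blob data is synthetic), showing that k must be chosen in advance:

import numpy as np
from sklearn.cluster import KMeans

# Two well-separated blobs of unlabeled points
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, size=(50, 2)), rng.normal(3, 0.3, size=(50, 2))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)  # k fixed up front
print(km.cluster_centers_)  # roughly (0, 0) and (3, 3)
print(km.labels_[:5])       # cluster assignments for the first 5 points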
What is deep learning?
● Part of the machine learning field of learning representations of data. Exceptionally
effective at learning patterns.
● Utilizes learning algorithms that derive meaning out of data by using a hierarchy of
multiple layers that mimic the neural networks of our brain.
● If you provide the system tons of information, it begins to understand it and respond in
useful ways.
● Rebirth of artificial neural networks
Actors and applications
● A very active technology adopted by major industry players
● Success story for many different academic problems
○ Image processing
○ Computer vision
○ Speech recognition
○ Natural language processing
○ Translation
○ etc.
● Today, every industry is asking whether DL can improve its processes
Timeline of (deep) learning
Limitations of Linear Classifiers
● Linear classifiers (e.g., logistic regression) classify inputs based on linear combinations of features xi
● Many decisions involve non-linear functions of the input
● Canonical example: do 2 input elements have the same value?
● The positive and negative cases cannot be separated by a plane
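A quick check of this (assuming scikit-learn; the four-point dataset encodes "do the two inputs have the same value?"):

import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([1, 0, 0, 1])  # 1 when the two inputs match (the XOR problem)

clf = LogisticRegression().fit(X, y)
print(clf.score(X, y))  # stuck near chance: no hyperplane separates the classes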
How to Construct Nonlinear Classifiers?
● We would like to construct non-linear discriminative classifiers that utilize functions of
input variables
● Use a large number of simpler functions
○ If these functions are fixed (Gaussian, sigmoid, polynomial basis functions), then
optimization still involves linear combinations of (fixed functions of) the inputs
○ Or we can make these functions depend on additional parameters → need an
efficient method of training extra parameters
A simple decision
Say you want to decide whether you are going to attend a cheese festival this upcoming
weekend. There are three variables that go into your decision:
1. Is the weather good?
2. Does your friend want to go with you?
3. Is it near public transportation?
We’ll assume that answers to these questions are the only factors that go into your decision.
A simple decision
I will write the answers to these questions as binary variables xi, with zero being the answer ‘no’
and one being the answer ‘yes’:
1. Is the weather good? x1
2. Does your friend want to go with you? x2
3. Is it near public transportation? x3
Now, what is an easy way to describe the decision statement resulting from these inputs?
A simple decision
We could determine weights wi indicating how important each feature is to whether you would
like to attend. We can then see if:
w1·x1 + w2·x2 + w3·x3 ≥ threshold
for some predetermined threshold. If this statement is true, we would attend the
festival, and otherwise we would not.
A simple decision
For example, if we really hate bad weather but care less about going with our friend or about public
transit, we could pick the weights 6, 2 and 2.
With a threshold of 5, this causes us to go if and only if the weather is good.
What happens if the threshold is decreased to 3? What about if it is decreased to 1?
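A tiny sketch (illustrative code) that answers these questions by enumerating all eight input combinations at each threshold:

from itertools import product

w = [6, 2, 2]  # weights for: good weather, friend going, near public transport

def attend(x, threshold):
    # Go iff the weighted sum of the binary inputs reaches the threshold
    return sum(wi * xi for wi, xi in zip(w, x)) >= threshold

for threshold in (5, 3, 1):
    going = [x for x in product([0, 1], repeat=3) if attend(x, threshold)]
    print(threshold, going)

With the threshold at 5, only combinations with x1 = 1 pass; lowering the threshold makes the decision progressively easier to trigger.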
A simple decision
If we define a new binary variable y that represents whether we go to the festival, we can write
this variable as:
y = 1 if w1·x1 + w2·x2 + w3·x3 ≥ threshold, and y = 0 otherwise.
A simple decision
Now, if I rewrite this in terms of a dot product between the vector of all binary inputs (x) and a
vector of weights (w), and change the threshold to the negative bias (b = −threshold), we have:
y = 1 if <w, x> + b > 0, and y = 0 otherwise.
So we are really just finding separating hyperplanes again, much as we did with logistic
regression and support vector machines!
A perceptron
We can graphically represent this decision algorithm as an object that takes 3 binary inputs and
produces a single binary output:
This object is called a perceptron when using the type of weighting scheme we just developed.
A network of perceptrons
A perceptron takes a number of binary inputs and emits a binary output. Therefore it is easy to
build a network of such perceptrons, where the outputs of some perceptrons are used as the
inputs of other perceptrons:
Notice that some perceptrons seem to have multiple output arrows, even though we have
defined them as having only one output. This is only meant to indicate that a single output is
being sent to multiple new perceptrons.
A network of perceptrons
The inputs and outputs are typically represented as their own neurons, with the intermediate
neurons grouped into hidden layers.
A network of perceptrons
The biological interpretation of a perceptron is this: emitting a 1 is equivalent to ‘firing’ an
electrical pulse, and emitting a 0 means not firing. The bias indicates how difficult it is
for this particular node to send out a signal.
Inspiration: The Brain
● Many machine learning methods are inspired by biology, e.g., the (human) brain
● Our brain has ∼10^11 neurons, each of which communicates with (is connected to) ∼10^4 other neurons
Principle
1. Data are represented as vectors:
2. Collect training data with positive and negative examples:
Principle
3. Training: find w and b so that:
<w, x> + b is positive for positive samples x,
<w, x> + b is negative for negative samples x.
The equation <w, x> + b = 0 defines a
hyperplane.
The hyperplane acts as a linear
separator.
w is a normal vector to the hyperplane.
Principle
4. Testing: the perceptron can now classify new examples.
● A new example x is classified positive if <w, x> + b is positive,
● and negative if <w, x> + b is negative.
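A minimal sketch of this train/test procedure (toy data and the classic perceptron update rule; labels are taken in {−1, +1}):

import numpy as np

def train_perceptron(X, y, epochs=20):
    # Find w and b so that <w, x> + b matches the sign of each label
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            if t * (np.dot(w, x) + b) <= 0:  # misclassified: nudge the hyperplane
                w += t * x
                b += t
    return w, b

# Linearly separable toy data
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.2, 0.3], [1.0, 1.0], [0.8, 0.9], [0.1, 1.2]])
y = np.array([-1, -1, -1, +1, +1, +1])

w, b = train_perceptron(X, y)
print(np.sign(X @ w + b))  # testing: classify by the sign of <w, x> + b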
Mathematical Model of a Neuron
● Neural networks define functions of the inputs (hidden features), computed by neurons
● Artificial neurons are called units
Activation Functions
Most commonly used activation functions:
Sigmoid: σ(z) = 1 / (1 + e^(−z))
Tanh: tanh(z) = (e^z − e^(−z)) / (e^z + e^(−z))
ReLU (Rectified Linear Unit): ReLU(z) = max(0, z)
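All three are one-liners in NumPy (illustrative sketch):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))  # squashes to (0, 1)

def tanh(z):
    return np.tanh(z)                # squashes to (-1, 1)

def relu(z):
    return np.maximum(0, z)          # zero for negative inputs, identity otherwise

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z), tanh(z), relu(z))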
Neuron in Python
Example in Python of a neuron with a sigmoid activation function
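The slide's code is shown as an image and is not reproduced in the text; a minimal reconstruction of such a neuron (the class name and values are illustrative):

import numpy as np

class Neuron:
    # A single unit: weighted sum of inputs plus bias, passed through a sigmoid
    def __init__(self, weights, bias):
        self.weights = np.asarray(weights)
        self.bias = bias

    def forward(self, x):
        z = np.dot(self.weights, x) + self.bias  # linear combination <w, x> + b
        return 1.0 / (1.0 + np.exp(-z))          # sigmoid activation

neuron = Neuron(weights=[0.5, -1.0, 2.0], bias=0.1)
print(neuron.forward(np.array([1.0, 2.0, 3.0])))  # a value in (0, 1)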
Neural Network Architecture (Multi-Layer Perceptron)
Network with one layer of four hidden units:
Each unit computes its value based on linear combination of values of units that point into it,
and an activation function
Neural Network Architecture (Multi-Layer Perceptron)
Naming conventions for a 2-layer neural network (a shallow network):
● One layer of hidden units
● One output layer (we do not count the inputs as a layer)
Neural Network Architecture (Multi-Layer Perceptron)
Going deeper: a 3-layer neural network with two layers of hidden units
Neural Network Architecture (Multi-Layer Perceptron)
Naming conventions for an N-layer neural network (a deep network):
● N − 1 layers of hidden units
● One output layer
Representational Power
A neural network with at least one hidden layer is a universal approximator: given enough hidden
units, it can approximate any continuous function to arbitrary accuracy.
The capacity of the network increases with more hidden units and more hidden layers.
Neural Networks
We only need to know two algorithms
● Forward pass: performs inference
● Backward pass: performs learning
Forward Pass: What does the Network Compute?
● Output of the network can be written as:
hj = f(bj + Σi=1..D xi wij),  ok = g(bk + Σj hj vjk)
(j indexing hidden units, k indexing the output units, D number of inputs; wij are the input-to-hidden weights, vjk the hidden-to-output weights)
● Activation functions f, g: sigmoid/logistic, tanh, or rectified linear (ReLU)
Forward Pass in Python
Example code for a forward pass for a 3-layer network in Python:
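The slide's code is likewise not reproduced in the text; a minimal sketch of a 3-layer forward pass (the layer sizes, weights, and activations below are illustrative):

import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative sizes: 4 inputs, hidden layers of 5 and 3 units, 1 output
W1, b1 = rng.normal(size=(5, 4)), np.zeros(5)
W2, b2 = rng.normal(size=(3, 5)), np.zeros(3)
W3, b3 = rng.normal(size=(1, 3)), np.zeros(1)

def forward(x):
    h1 = relu(W1 @ x + b1)        # first hidden layer
    h2 = relu(W2 @ h1 + b2)       # second hidden layer
    return sigmoid(W3 @ h2 + b3)  # output layer

print(forward(rng.normal(size=4)))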
Special Case
What is a single-layer network (no hidden units) with a sigmoid activation function?
Special Case
Network: y = σ(<w, x> + b)
Logistic regression!
Example Application
● Classify image of handwritten digit (32x32 pixels): 4 vs non-4
● How would you build your network?
● For example, use one hidden layer and the sigmoid activation function:
● How can we train the network, that is, adjust all the parameters w?
Training Neural Networks
● Find weights: w* = argmin_w Σn loss(o(n), t(n)),
where o = f(x; w) is the output of a neural network and t(n) the target for example n
● Define a loss function, e.g.:
○ Squared loss: E = ½ Σk (ok − tk)²
○ Cross-entropy loss: E = −Σn [ t(n) log o(n) + (1 − t(n)) log(1 − o(n)) ]
● Gradient descent: w ← w − η ∂E/∂w,
where η is the learning rate (and E is the error/loss)
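A minimal sketch of this loop for a single sigmoid unit with the squared loss (the toy data, learning rate, and iteration count are assumed):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy labeled data: the target is simply the second input
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
t = np.array([1.0, 0.0, 1.0, 0.0])

w, b, eta = np.zeros(2), 0.0, 0.5  # eta is the learning rate

for _ in range(1000):
    o = sigmoid(X @ w + b)          # forward pass
    grad_z = (o - t) * o * (1 - o)  # dE/dz for E = 1/2 * sum((o - t)^2)
    w -= eta * X.T @ grad_z         # w <- w - eta * dE/dw
    b -= eta * grad_z.sum()         # same update for the bias

print(sigmoid(X @ w + b).round(2))  # outputs move toward the targets [1, 0, 1, 0]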
Training Neural Networks: Back-propagation
Back-propagation: an efficient method for computing gradients needed to perform
gradient-based optimization of the weights in a multi-layer network
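A compact sketch of back-propagation for a network with one hidden layer (sizes, data, and hyperparameters are assumed), trained on XOR, which a linear model cannot fit:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
t = np.array([[0.], [1.], [1.], [0.]])  # XOR targets

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)  # input -> hidden (4 units)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)  # hidden -> output
eta = 0.5

for _ in range(5000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    o = sigmoid(h @ W2 + b2)
    # Backward pass: propagate dE/dz from the output back through the layers
    d_out = (o - t) * o * (1 - o)         # output delta for the squared loss
    d_hid = (d_out @ W2.T) * h * (1 - h)  # hidden delta via the chain rule
    # Gradient descent updates
    W2 -= eta * h.T @ d_out
    b2 -= eta * d_out.sum(axis=0)
    W1 -= eta * X.T @ d_hid
    b1 -= eta * d_hid.sum(axis=0)

print(o.round(2).ravel())  # should approach [0, 1, 1, 0] (may vary with the seed)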
Monitor Loss During Training
Check how your loss behaves during training, to spot wrong hyperparameters, bugs, etc.
Monitor Accuracy on Train/Validation During Training
Check how your desired performance metrics behave during training.
Why "Deep"?
There are structures resembling convolutional neural networks inside our brains:
Human vision ↔ many layers of abstraction ↔ Deep learning