
Linear Algebra: How is it used in AI?

Understand how Linear Algebra is applied in AI.

How Linear Algebra's mathematical objects are used in Artificial Intelligence.

Sub-fields in AI

Artificial Intelligence is not a single subject; it has sub-fields
such as Learning (Machine Learning and Deep Learning), Communication
using NLP, Knowledge Representation and Reasoning, Problem Solving,
and Uncertain Knowledge and Reasoning.

This article explains how these objects and their properties are used
in the algorithms of AI's sub-fields such as ML, DL, and NLP.

It describes the sub-field concepts where LA objects can be applied.

For each sub-field, we briefly explain the relevant topic and how
Linear Algebra is applied to it. The following diagram shows the areas
of AI where we apply Linear Algebra.

LA Objects applied in these areas of AI

Note: Data representation and data processing are not sub-areas of
AI; they are techniques used within the ML, DL, and NLP areas.
In the other sub-areas shown in the diagram, such as Problem Solving
and Knowledge Representation and Reasoning, LA objects are also used,
but not as heavily as in Learning (ML/DL) and NLP.

Describing LA objects and properties in these sub-fields

The Linear Algebra (mathematical) objects are vectors, matrices, and
tensors. Depending on the dimensions of your data, you have to choose
the right object to store and process it; the title diagram describes
this.

Before looking at how these mathematical objects are used in AI, it
is worth refreshing your Linear Algebra.

Data representation: Explained in terms of the mathematical objects
vector, matrix, and tensor.

Data set: A collection of examples (also called data points or
objects). Each example is a collection of features: each example is a
row, and each feature is a column.

Design Matrix: A data set can be described through a design matrix,
a matrix containing a different example in each row. For example:

Design Matrix representation

If the data is not in a uniform format, i.e., the columns are not the
same for each example/row, then we describe the data set as a set
containing m elements, each of which may be a vector of a different
size.

In supervised learning, the data set contains a label or target as
well as the collection of features.
Design Matrix for Supervised Learning
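As a minimal NumPy sketch (the numbers here are made up for
illustration), a design matrix and its label vector look like this:

import numpy as np

# Design matrix X: m = 4 examples (rows), n = 3 features (columns).
X = np.array([[5.1, 3.5, 1.4],
              [4.9, 3.0, 1.4],
              [6.2, 2.9, 4.3],
              [5.9, 3.0, 5.1]])

# In supervised learning, each row also gets a label/target.
y = np.array([0, 0, 1, 1])

print(X.shape)  # (4, 3)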

Data Processing: Before we use data sets in our ML algorithms, or in
any sub-field of AI, the data set must be made ready (cleansed and
filtered).

There are three common forms of data processing: mean subtraction,
normalization, and PCA & whitening. These forms are described briefly
in the diagram below.
Data processing operations explained through NumPy
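Since the diagram itself is not reproduced here, below is a minimal
NumPy sketch of the first two forms, mean subtraction and
normalization, on randomly generated toy data:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, scale=2.0, size=(100, 5))  # toy data: 100 examples, 5 features

# Mean subtraction: center every feature (column) at zero.
X_centered = X - X.mean(axis=0)

# Normalization: scale each feature to unit standard deviation.
X_normalized = X_centered / X_centered.std(axis=0)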

All three forms take the data matrix as input and produce the desired
output. The third form, PCA, is used for dimensionality reduction and
works entirely in pure linear algebra; the following algorithm
describes it.

PCA Algorithm for Training Data
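The algorithm in the diagram is not reproduced here; the sketch below
shows one common way to implement PCA and whitening with NumPy (the
smoothing constant eps is an assumption to avoid division by zero):

import numpy as np

def pca_whiten(X, k, eps=1e-5):
    # Center the data.
    X = X - X.mean(axis=0)
    # Covariance matrix of the features.
    cov = X.T @ X / X.shape[0]
    # SVD of the covariance; the columns of U are the principal directions.
    U, S, _ = np.linalg.svd(cov)
    # Project onto the top-k principal components (dimensionality reduction).
    X_reduced = X @ U[:, :k]
    # Whitening: rescale each component to unit variance.
    X_white = X_reduced / np.sqrt(S[:k] + eps)
    return X_reduced, X_white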

argmin and argmax are operations used in data selection, feature
engineering, data cleansing, and other data-processing steps. They
work on matrices and vectors, selecting the index of the minimum or
maximum value along an axis.
Here the axis can be column or row: axis 0 (zero) means column-wise,
axis 1 (one) means row-wise.

argmin: Returns the indices of the minimum values along an axis.

argmax: Returns the indices of the maximum values along an axis.
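For example, with NumPy:

import numpy as np

A = np.array([[3, 7, 1],
              [4, 2, 9]])

print(np.argmax(A, axis=0))  # [1 0 1]: row index of the max in each column
print(np.argmin(A, axis=1))  # [2 1]:   column index of the min in each row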

Machine Learning (ML): ML is an algorithmic approach that learns from
training data and makes decisions on unseen data. Many algorithms
exist in ML for supervised and unsupervised learning.

How LA concepts are applied in the ML regression algorithm: This
section describes how Linear Algebra applies to regression analysis,
explaining the concepts through the multiple linear regression
algorithm. The following diagram describes LA concepts in ML and DL.
LA Objects, properties and usages in ML and DL

Regression analysis explained in terms of vectors, matrices, and
their properties.

What is regression? It is a statistical technique for estimating the
relationship between a dependent variable and independent variables.

The most common form of regression analysis is linear regression.
The following equations describe simple and multiple linear
regression.

Simple & Multiple Regression with examples
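The equations from the diagram are not reproduced here; in standard
notation they are:

y = \beta_0 + \beta_1 x + \varepsilon    (simple linear regression)

y = \beta_0 + \beta_1 x_1 + \cdots + \beta_n x_n + \varepsilon    (multiple linear regression)

\mathbf{y} = X\boldsymbol{\beta} + \boldsymbol{\varepsilon}    (matrix form over all m examples)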

This technique predicts continuous responses, for example forecasting
stock prices, house rents, etc.

Residual: In machine learning/statistical terminology, the residual
is the difference between the observed value and the estimated value
of the target variable.

Notation is given below:

Notation for observed & estimated value of target variable


Residual in Multiple regression

Sum of Squares of Residuals: Let us denote the residual by r.
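In standard notation, with \hat{y}_i the estimated value of the
target for example i:

r_i = y_i - \hat{y}_i, \qquad S = \sum_{i=1}^{m} r_i^2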

Least squares method: The least squares method is the standard
approach; it minimizes the sum of squares of residuals S.

Ordinary Least Squares (OLS), or linear least squares, estimates the
parameters of a regression model by minimizing the sum of the squares
of the residuals. It draws the line through the data points that
minimizes the SSE between the observed and predicted (or fitted or
estimated) values.
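A minimal NumPy sketch of OLS via the normal equation on synthetic
data (the coefficients and noise level below are made up):

import numpy as np

rng = np.random.default_rng(1)
m = 50
X = np.hstack([np.ones((m, 1)), rng.normal(size=(m, 2))])  # column of 1s for the intercept
true_beta = np.array([2.0, -1.0, 0.5])
y = X @ true_beta + 0.1 * rng.normal(size=m)

# Normal equation: solve (X^T X) beta = X^T y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

residuals = y - X @ beta_hat
S = residuals @ residuals  # sum of squared residuals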

The most important application is data fitting.

Data fitting: The process of constructing a curve, or mathematical
function, that best fits a set of data points.

Curve fitting can be linear or non-linear. The following describes
the linear case.

Linear Curves:

Linear Curve

After this introduction to regression analysis, let us define its
loss and cost function.

Loss function: The loss function of linear regression is defined as
follows:

The loss function of Regression
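One common formulation (the 1/2m scaling factor is an assumption;
conventions vary):

L(\boldsymbol{\beta}) = \frac{1}{2m} \sum_{i=1}^{m} \left( y_i - \mathbf{x}_i^{\top}\boldsymbol{\beta} \right)^2 = \frac{1}{2m} \lVert \mathbf{y} - X\boldsymbol{\beta} \rVert_2^2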

The parameters are found by differentiating the loss with respect to
the parameters:

Finding the weights or parameters by applying the gradient of the
loss function
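Setting the gradient of this loss to zero yields the closed-form
(normal equation) solution:

\nabla_{\boldsymbol{\beta}} L = \frac{1}{m} X^{\top}\left( X\boldsymbol{\beta} - \mathbf{y} \right) = \mathbf{0} \;\Longrightarrow\; \boldsymbol{\beta} = \left( X^{\top} X \right)^{-1} X^{\top} \mathbf{y}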

What is regularization? To avoid the over-fitting problem, the
regularization technique is used to shrink the magnitude of the
parameters. This is achieved by adding a penalty (a function of the
parameters) to the cost function. L1, L2, dropout, and max-norm
constraints are used in DL, whereas L1, L2, and L1+L2 are used in ML.

If you are using neural networks for your ML algorithms, you can
apply all four of the above regularization techniques.

L2 regularization: The most common form of regularization. It is
implemented by penalizing the squared magnitude of all parameters
directly in the objective.
L1 regularization: For each weight w we add the term param * |w| to
the objective function. Both L1 and L2 are defined as follows:

Generalized Regularization
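In standard notation, with regularization strength \lambda:

L_{\text{reg}} = L(\boldsymbol{\beta}) + \lambda \sum_j \beta_j^2    (L2 / ridge)

L_{\text{reg}} = L(\boldsymbol{\beta}) + \lambda \sum_j \lvert \beta_j \rvert    (L1 / lasso)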

Use of vector norms in machine learning for regularization:

Vector norms in regularization to avoid the over-fitting problem

Deep Learning (DL): DL is a branch of ML that deeply learns from
text, images, or videos. Unstructured data such as images or videos
can be processed using DL. There are many applications of DL, such as
image processing (computer vision using CNNs), video processing
(computer vision using RNNs), and text processing (NLP using RNNs and
LSTMs); it can also be combined with Reinforcement Learning (Deep
RL).

DL is inspired by neurons: each neuron is connected to multiple
neurons, and an activation function is applied at each neuron.

Vectors, matrices, and tensors are the objects used in the DL area.
The following diagram shows a sample neural network and describes the
input, neurons, layers, feed-forward propagation, back-propagation,
etc.

Many mathematical subjects are involved in Deep Learning; in this
article only Linear Algebra is considered. We describe how the
mathematical objects are used at each stage.

Common Neural Network Architecture

Input: The input to the neural network takes the form of vectors,
matrices, or tensors; ultimately each data object/sample is a vector.
Here the input is a vector of n dimensions: one example, or data
point, from the data set.

Neurons or nodes: Here we apply an activation function to the input
from the previous layer combined with the weights of the connections.
A neural network is an interconnected group of natural or artificial
neurons that uses a mathematical or computational model for
information processing, based on a connectionist approach to
computation.

Connections: The connections of the biological neuron are modeled as
weights.

Each neuron is connected to the neurons in the next layer.

Layer: Each layer contains a set of neurons, as the following picture
depicts.

A layer contains neurons and is operated on at the vector level.


Feedforward propagation: These networks are called deep feedforward
networks, feedforward neural networks, or multilayer perceptrons
(MLPs). They are called feedforward because information flows through
the function being evaluated from x, through the intermediate
computations used to define f, and finally to the output y.

Feedforward neural networks are called networks because they are
typically represented by composing together many different functions.

Let us say, for example, our network has three functions connected in
a chain, to form:
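In the notation of the Deep Learning book:

f(\mathbf{x}) = f^{(3)}\left( f^{(2)}\left( f^{(1)}(\mathbf{x}) \right) \right)

where f^{(1)} is the first layer, f^{(2)} the second, and f^{(3)} the
third (output) layer.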

These chain structures are the most commonly used structures of
neural networks.

Let us see how vectors and matrices are applied in feedforward
networks.

1. Vectorizing inputs, weights, and bias: x is the input vector of n
dimensions; W is the weight matrix with n rows and m columns, one
column per neuron in the next layer; b is the bias vector for the m
neurons in the next layer.

Overall calculation of input, weights, and bias into an intermediate
variable Z

From this it is concluded that:

Generalized Approach
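With the shapes above, the generalized computation is (assuming the
column-vector convention):

\mathbf{z} = W^{\top}\mathbf{x} + \mathbf{b}, \qquad \mathbf{a} = g(\mathbf{z})

where g is the activation function applied element-wise and
\mathbf{a} is the output passed to the next layer.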

2. Apply the intermediate variable Z to the activation function.

Feedforward into the next layer

3. The above steps are repeated, and the results are fed to the next
layer in the forward direction.

At each neuron, the intermediate calculation and activation function
are as follows:

Consider an example of a neural network with a 2-feature input, four
hidden layers, and one output layer, with 3, 5, 4, 2, and 1 units
respectively.

Neural network with 1 input layer, 4 hidden layers, and 1 output
layer

Let us apply vector and matrix operations for forward propagation.

Forward propagation for 4 layers

Know your matrix dimensions:

Dimensions of the matrices and vectors in feedforward propagation
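A minimal NumPy sketch of forward propagation for this architecture
(random weights and a sigmoid activation are illustrative
assumptions):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Layer sizes: 2 input features; hidden layers of 3, 5, 4, 2 units; 1 output unit.
sizes = [2, 3, 5, 4, 2, 1]
rng = np.random.default_rng(0)

# W[l] has shape (sizes[l+1], sizes[l]); b[l] has shape (sizes[l+1], 1).
W = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
b = [np.zeros((m, 1)) for m in sizes[1:]]

def forward(x):
    a = x.reshape(-1, 1)      # input as a column vector
    for Wl, bl in zip(W, b):
        z = Wl @ a + bl       # matrix-vector product plus bias
        a = sigmoid(z)        # element-wise activation
    return a

print(forward(np.array([0.5, -1.2])))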

Feedforward propagation = matrix-vector products and matrix addition,
along with activation functions.

Backpropagation = matrix calculus + linear algebra product rules;
this will be covered in the next article.

Natural Language Processing (NLP): NLP is concerned with the
interactions between humans and computers, in particular how to
program computers to process and analyze large amounts of natural
language data.

Here we describe the Word2Vec (W2V) technique for NLP. Word2Vec
represents each distinct word with a particular list of numbers
called a vector. Based on W2V, we can apply vector properties to
check the similarity and semantic similarity between vectors.

In NLP we use vectors and matrices as follows:

Vectors and matrices used in W2V algorithms

W2V is used in many NLP tasks and is the basis of capturing a word as
a vector. Natural language text = a sequence of discrete symbols.

We produce a dense vector representation based on the context/use of
words.

What are target and context words? Consider a text instance with a
context window of size 2. The following describes the context and
target/current words.

How do we build a one-hot representation?

Vocabulary: The set of words encoded into the feature vector is
called the vocabulary, so the dimension of the vector equals the size
of the vocabulary. In short, |V| = size of the vocabulary.

Let us say our text data set contains the following lines:

1. "And the cute kitten purred and then …"

2. "Cute furry cat purred and miaowed…"

3. "That the small kitten miaowed and she …"

4. "The loud furry dog ran and bit…"

From these 4 sentences the basis vocabulary is {bit, cute, furry,
loud, miaowed, purred, ran, small}, so the vocabulary length is 8.
Let us define the target and context words.

Target word: kitten; context words: {cute, purred, small, miaowed}

Target word: cat; context words: {cute, furry, miaowed}

Target word: dog; context words: {loud, furry, ran, bit}

Now we represent each word as a vector of vocabulary length 8.

Words as Vectors

We define the vectors by putting a '1' at a dimension when the
corresponding context word appears, and a '0' otherwise.
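A minimal NumPy sketch of these vectors (the vocabulary order follows
the set above):

import numpy as np

vocab = ["bit", "cute", "furry", "loud", "miaowed", "purred", "ran", "small"]
index = {w: i for i, w in enumerate(vocab)}

def context_vector(context_words):
    # Put a 1 at each dimension whose context word appears, 0 elsewhere.
    v = np.zeros(len(vocab))
    for w in context_words:
        v[index[w]] = 1.0
    return v

kitten = context_vector(["cute", "purred", "small", "miaowed"])
cat    = context_vector(["cute", "furry", "miaowed"])
dog    = context_vector(["loud", "furry", "ran", "bit"])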

Checking the similarity between vectors: To check the similarity we
can use the inner product (or cosine) as the similarity kernel.

Sim(Kitten, Cat) = Cosine(Kitten, Cat) ~ 0.58; Sim(Kitten, Dog) =
Cosine(Kitten, Dog) ~ 0.00; Sim(Cat, Dog) = Cosine(Cat, Dog) ~ 0.29

Cosine, Dot and Cross product between vectors
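Continuing the sketch above, a small cosine function reproduces these
similarity values:

def cosine(u, v):
    # Cosine similarity: inner product divided by the product of the norms.
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

print(round(cosine(kitten, cat), 2))  # 0.58
print(round(cosine(kitten, dog), 2))  # 0.0
print(round(cosine(cat, dog), 2))     # 0.29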

Embedding Matrix: An embedding matrix can be defined with rows
corresponding to target words and columns corresponding to context
words.
Embedding Matrix Dimension

The rows are word vectors, so we can retrieve them with one-hot
vectors.

Word representation using one-hot vectors

Embedding matrix with each row as a target word and its context words
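Continuing the sketch, a one-hot vector retrieves a row of the matrix
via a vector-matrix product:

# Stack the word vectors into an embedding-style matrix E
# (rows = target words: kitten, cat, dog).
E = np.vstack([kitten, cat, dog])

# A one-hot vector selects a row of E via a vector-matrix product.
one_hot_cat = np.array([0.0, 1.0, 0.0])
print(one_hot_cat @ E)  # recovers the "cat" row of E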

Algorithm for constructing Embedding Matrix:

Steps to construct Embedding Matrix

A vector that captures the meaning of a word is also known as
Word2Vec or a word embedding. The following are the algorithms:

1. Skip-gram (SG): Predicts the context words given the target word.

2. Continuous bag of words (CBOW): Predicts the target word given the
context words.

3. GloVe: Makes use of global co-occurrence statistics. GloVe
consists of a weighted least squares model that trains on global
word-word co-occurrence counts.

The above three algorithms are explained below in terms of Linear
Algebra.

Step 1, Skip-gram (SG): The objective of the skip-gram (SG) model is
to maximize the average log probability.
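In the standard notation of the word2vec paper, for a training
sequence w_1, ..., w_T and context window size c:

\frac{1}{T} \sum_{t=1}^{T} \sum_{-c \le j \le c,\; j \neq 0} \log p\left( w_{t+j} \mid w_t \right), \qquad p(o \mid c) = \frac{\exp\left( \mathbf{u}_o^{\top} \mathbf{v}_c \right)}{\sum_{w \in V} \exp\left( \mathbf{u}_w^{\top} \mathbf{v}_c \right)}

where \mathbf{v} and \mathbf{u} are the input and output vectors of a
word, and the softmax projects scores onto the vocabulary.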

Describing context and target words

Step 2: Project into the vocabulary with softmax.

Step 3: Learn to estimate the likelihood of context words.

SKIP-GRAM

Continuous bag of words (CBOW): It predicts the target (current) word
based on its context words. Its probability distribution is:
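In the same notation, with \bar{\mathbf{v}} the sum (or average) of
the embedded context-word vectors:

p\left( w_t \mid w_{t-c}, \dots, w_{t+c} \right) = \frac{\exp\left( \mathbf{u}_{w_t}^{\top} \bar{\mathbf{v}} \right)}{\sum_{w \in V} \exp\left( \mathbf{u}_{w}^{\top} \bar{\mathbf{v}} \right)}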

- Embed the context words and add them.

- Project back to vocabulary size with softmax.

Expressing the current word as a softmax of vector-matrix products in
LA

CBOW
GloVe: Like word2vec, GloVe is a set of vectors that capture the
semantic information (i.e., meaning) of words. It consists of a
weighted least squares model that trains on global word-word
co-occurrence counts.

GloVe makes use of global co-occurrence statistics.

Co-occurrence matrix: We define this matrix using the following
corpus:

I like deep learning. I like NLP. I enjoy flying.

Co-occurrence Matrix
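A minimal NumPy sketch of building this co-occurrence matrix (a
symmetric window of size 1 is assumed):

import numpy as np

corpus = ["I like deep learning", "I like NLP", "I enjoy flying"]
window = 1

# Build the vocabulary and count co-occurrences within the window.
vocab = sorted({w for line in corpus for w in line.split()})
idx = {w: i for i, w in enumerate(vocab)}
X = np.zeros((len(vocab), len(vocab)), dtype=int)

for line in corpus:
    words = line.split()
    for i, w in enumerate(words):
        lo, hi = max(0, i - window), min(len(words), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                X[idx[w], idx[words[j]]] += 1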

Let X be the word-word co-occurrence count matrix.

As in word2vec, each word has two vectors: an input vector (v) and an
output vector (u).

Cost function of the GloVe model
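In the notation of the GloVe paper, with f a weighting function and
b_i, \tilde{b}_j bias terms:

J = \sum_{i,j=1}^{|V|} f\left( X_{ij} \right) \left( \mathbf{u}_j^{\top} \mathbf{v}_i + b_i + \tilde{b}_j - \log X_{ij} \right)^2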


Conclusion: We have described how Linear Algebra is applied in
various fields of AI; it is better to be well-grounded in Linear
Algebra before moving on to ML, DL, or NLP. I tried to cover how to
apply Linear Algebra from an algorithmic perspective, and I hope it
encourages you to get more involved with Linear Algebra.

Linear Algebra leads on to other subjects such as matrix calculus,
which is heavily used in backpropagation in DL.

Thanks for reading this article. Please drop a note if there are any
mistakes; your feedback is appreciated.

References:

1. Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern
Approach

2. Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning

3. https://en.wikipedia.org/wiki/Regression_analysis

4. http://web.stanford.edu/class/cs224n/

5. Efficient Estimation of Word Representations in Vector Space

6. https://nlp.stanford.edu/projects/glove/
