Unit 3: Statistics and Machine Learning
Linear Algebra: Matrix and vector algebra, systems of linear equations using matrices, linear independence,
Matrix factorization concept/LU decomposition, eigenvalues and eigenvectors. Understanding of calculus:
concept of function and derivative, Multivariate calculus: concept, Partial Derivatives, chain rule, the Jacobian
and the Hessian
Questions and Answers
1 Explain the concept of Partial Derivatives with an example. [9]
https://www.youtube.com/watch?v=RYZXXR6ztKk
https://www.youtube.com/watch?v=EzePYR390dw
A partial derivative is a fundamental concept in calculus used for functions with multiple variables. It measures
how a function changes with respect to one specific variable, while keeping all other variables constant.
Example:
Consider a function f(x,y) that represents the height of a hill at any point (x,y) on a map.
∂f/∂x measures how steep the hill is when moving east or west (changing x), while keeping the north-south
position (y) constant.
Similarly, ∂f/∂y measures the steepness when moving north or south, keeping x constant.
Notations:
Partial derivatives can be written in multiple ways:
∂f/∂x (most common notation)
fx (subscript notation)
Calculation:
Taking a partial derivative follows the same principles as regular derivatives, except that all other variables are
treated as constants while differentiating with respect to one variable.
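As an illustration, symbolic differentiation makes the "treat the other variables as constants" rule concrete (a minimal sketch using sympy; the function f below is a hypothetical example):

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 * y + sp.sin(y)        # a hypothetical function of two variables

df_dx = sp.diff(f, x)           # y treated as a constant -> 2*x*y
df_dy = sp.diff(f, y)           # x treated as a constant -> x**2 + cos(y)

print(df_dx)                    # 2*x*y
print(df_dy)                    # x**2 + cos(y)
```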
Applications:
Partial derivatives are widely used in various fields:
Physics: Examining how pressure changes with temperature, assuming volume remains constant.
Economics: Studying how demand for a product changes with price, assuming other factors are fixed.
Machine Learning & Data Science: Understanding how different features in a dataset impact predictions.
Heat Transfer (Conduction Equation)
The rate of heat flow in a solid object depends on how temperature varies with position and time.
The temperature gradient ∂T/∂x describes how temperature T changes with respect to position x,
assuming other variables (like time) are constant; the one-dimensional heat conduction equation itself, ∂T/∂t = α ∂²T/∂x², is built from such partial derivatives.
Used in designing heat exchangers, engine cooling systems and insulation materials.
Fluid Mechanics (Navier-Stokes Equations)
Fluid velocity (v) in a pipe or around an aircraft depends on position and time.
Partial derivatives like ∂v/∂x, ∂v/∂y, and ∂v/∂z describe how velocity changes in different directions.
Essential for aerodynamics, hydrodynamics, and CFD (Computational Fluid Dynamics).
Stress and Strain Analysis
In materials science, stress (σ) and strain (ε) depend on multiple directions.
Partial derivatives like ∂σ/∂x show how stress changes along a specific axis, crucial for predicting material
failure in bridges, aircraft, and machines.
Robotics & Kinematics
The motion of a robotic arm depends on multiple joint angles.
The velocity of the end effector is obtained using Jacobian matrices, which contain the partial derivatives of
the end-effector position with respect to each joint angle.
Helps in trajectory planning and movement optimization.
Thermodynamics (Gas Laws & Efficiency Calculations)
In an engine, pressure P depends on temperature T and volume V.
Partial derivatives like ∂P/∂T and ∂P/∂V help analyze how pressure changes with temperature (keeping
volume constant) or with volume (keeping temperature constant).
Used in combustion engine design and refrigeration systems.
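The thermodynamics case can be checked symbolically; a sketch with sympy, assuming the ideal gas law P = nRT/V:

```python
import sympy as sp

n, R, T, V = sp.symbols('n R T V', positive=True)
P = n * R * T / V               # ideal gas law: P = nRT/V

dP_dT = sp.diff(P, T)           # volume held constant -> n*R/V
dP_dV = sp.diff(P, V)           # temperature held constant -> -n*R*T/V**2
```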
In summary:
A function can have multiple partial derivatives (one for each variable).
Partial derivatives extend naturally to functions with three or more variables.
They are essential in optimization, gradient-based learning, and differential equations.
2 What is the significance of chain rule in calculus? Explain chain rule with suitable example. [9]
The chain rule is a key calculus technique for differentiating composite functions—functions where one function's
output serves as another function’s input. It helps us efficiently compute derivatives without breaking functions into
multiple steps. The chain rule is indispensable for differentiating complex functions in mathematics and applied
sciences. Its efficiency, broad applicability, and integration into higher-order calculus make it a fundamental tool for
solving real-world problems in physics, engineering, economics, and beyond.
Importance:
Simplifies Differentiation: Instead of manually breaking down complex functions, the chain rule provides a
structured approach.
Widely Used in Science & Engineering: It applies to physics, engineering, economics, and machine learning,
where relationships between multiple variables exist.
Foundation for Advanced Calculus: It’s essential for higher-order derivatives, multivariable calculus, and
optimization techniques.
Real-World Relevance: Many natural and engineered systems involve nested relationships, such as
temperature changes, population growth, and signal processing.
Mathematical Significance of the Chain Rule with Examples
Computational Efficiency
Without the chain rule, differentiating composite functions would require explicit function decomposition and
stepwise differentiation, increasing computational complexity.
Example: Consider y=sin(x²+3x). Without the chain rule, we would need to separately handle the inner function
g(x)=x²+3x and the outer function f(x)=sin(x). Instead, the chain rule simplifies the process: dy/dx =
cos(x²+3x)⋅(2x+3)
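This derivative can be verified symbolically (a minimal sketch using sympy):

```python
import sympy as sp

x = sp.symbols('x')
y = sp.sin(x**2 + 3*x)                      # composite: outer sin, inner x^2 + 3x

dy_dx = sp.diff(y, x)                       # sympy applies the chain rule
expected = sp.cos(x**2 + 3*x) * (2*x + 3)   # cos(inner) * d(inner)/dx

assert sp.simplify(dy_dx - expected) == 0   # they agree
```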
Broad Applicability
The chain rule is fundamental for differentiating composite functions in scientific modeling, control systems, and
engineering applications.
Example: In robotics, if a robotic arm's position is a function of time-dependent joint angles θ(t), and the end-effector
position depends on these angles as f(θ), then its velocity requires the chain rule: df/dt = (df/dθ)*(dθ/dt). This helps
in real-time motion planning.
Foundation for Advanced Calculus
The chain rule extends to higher-order differentiation, Jacobian matrices in multivariable calculus, and implicit
differentiation.
Example: For multivariate functions, if z=f(x,y) where x and y are functions of t, the total derivative is given by:
dz/dt = (∂f/∂x)*(dx/dt) + (∂f/∂y)*(dy/dt)
This is essential in fluid dynamics for tracking particle motion.
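The total-derivative formula can be verified on a small example (a sketch with hypothetical choices f(x, y) = x·y, x = cos t, y = sin t):

```python
import sympy as sp

t = sp.symbols('t')
x, y = sp.cos(t), sp.sin(t)                 # x(t) and y(t)
z = x * y                                   # z = f(x, y) = x*y composed with x(t), y(t)

dz_direct = sp.diff(z, t)                   # differentiate the composition directly
# total derivative: (∂f/∂x)*dx/dt + (∂f/∂y)*dy/dt, with ∂f/∂x = y and ∂f/∂y = x
dz_chain = y * sp.diff(sp.cos(t), t) + x * sp.diff(sp.sin(t), t)

assert sp.simplify(dz_direct - dz_chain) == 0
```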
Multivariable Dependency Analysis
https://www.youtube.com/watch?v=zomvvohLwr4
Many real-world systems involve interdependent variables. The chain rule enables precise computation of partial
derivatives in functions with implicit relationships.
Example: In economics, if consumer demand Q depends on price P, which in turn is influenced by production cost
C
i.e., Q=f(P), P=g(C),
Then the rate of demand change with respect to cost is found using:
dQ/dC = (dQ/dP)*(dP/dC). This helps in market analysis and pricing strategies.
Essential in Applied Fields
The chain rule is crucial in differential equations, optimization, physics, engineering, and finance. Examples:
Physics: In thermodynamics, entropy S depends on temperature T, and T depends on volume V, leading to:
dS/dV = (dS/dT)*(dT/dV)
Finance: In risk modeling, if an asset price S depends on time t, and volatility σ depends on S, the rate of
change of volatility w.r.t. time is:
dσ/dt = (dσ/dS)*(dS/dt). This is key in option pricing models like Black-Scholes (pricing financial
derivatives).
3 Explain Multivariate calculus with two suitable examples [9]
Multivariate calculus:
Branch of Calculus: Deals with functions that involve multiple variables (inputs) and their relationships.
Builds upon single-variable calculus concepts like derivatives and integrals but extends them to functions
with multiple inputs.
It unlocks the ability to analyze and understand how these functions change and behave in response to
adjustments in their various inputs.
It includes the study of limits, continuity, partial derivatives, multiple integrals, and vector calculus.
Main Tools:
Partial Derivatives: Finding rates of change with respect to one variable.
Gradient: A vector representing the direction of steepest ascent for a scalar field (function with one
output).
Jacobian: A matrix capturing how a function with multiple outputs changes with respect to its inputs.
Hessian: A matrix containing second-order partial derivatives, used to analyze critical points of scalar
fields.
Applications:
Computer Graphics: Representing and manipulating 3D objects, lighting effects.
Data Science: Analyzing relationships between multiple features in datasets.
Machine Learning: Training algorithms on multidimensional data sets.
Example:
Temperature Distribution on a Surface
T(x,y) = x² + 2xy + y²:
∂T/∂x = 2x + 2y (rate of temperature change in x-direction)
∂T/∂y = 2x + 2y (rate of temperature change in y-direction)
Fluid Flow in Three Dimensions
Consider the velocity field of a fluid, where each point in space has a velocity vector v(x,y,z) with components in
three directions:
v = (P(x,y,z), Q(x,y,z), R(x,y,z))
Divergence = ∂P/∂x + ∂Q/∂y + ∂R/∂z
Multivariate calculus concepts:
Divergence (∇ · v): fluid is flowing into or out of a point
Curl (∇ × v): Indicates rotation of the fluid
Line integrals: Calculate work done along a path
Surface integrals: Determine fluid flux through a surface
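A divergence computation can be sketched with sympy (the velocity components below are hypothetical examples):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
P, Q, R = x**2, x*y, y*z                    # hypothetical velocity components

# divergence: ∂P/∂x + ∂Q/∂y + ∂R/∂z
div = sp.diff(P, x) + sp.diff(Q, y) + sp.diff(R, z)
print(div)                                  # 3*x + y
```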
4 Explain what it means for a function to be continuous and differentiable? [9]
https://www.youtube.com/watch?v=nRv1lPqrx3k
Mathematical Representation of Functions, Derivatives, Continuity, and Differentiability
1. Function
A function is a rule that assigns each input x exactly one output f(x).
Mathematical Form:
y = f(x)
For example, if f(x) = x^2, then:
f(3) = 3^2 = 9
So, input x = 3 gives output f(3) = 9.
2. Derivative
The derivative of a function measures the rate at which f(x) changes with respect to x.
Mathematical Form:
f'(x) = d/dx f(x)
For f(x) = x^2, the derivative is:
f'(x) = d/dx (x^2) = 2x
At x = 3:
f'(3) = 2(3) = 6
This means at x = 3, increasing x by a tiny amount increases f(x) by approximately 6 times that amount.
3. Continuity
A function f(x) is continuous at x = a if:
1. Limit exists:
lim (x -> a) f(x) exists
2. Function value exists:
f(a) is defined
3. Limit equals function value:
lim (x -> a) f(x) = f(a)
For example, f(x) = x^2 is continuous for all x because there are no jumps or breaks in its graph.
4. Differentiability
A function f(x) is differentiable at x = a if:
1. It is continuous at x = a
2. The derivative exists at x = a:
lim (h -> 0) [f(a+h) - f(a)] / h exists
Since f(x) = x^2 has a smooth curve everywhere, it is differentiable for all x.
Example Summary
For f(x) = x^2:
- It is a function since each x has one output.
- Its derivative is f'(x) = 2x, showing the rate of change.
- It is continuous for all real numbers since there are no breaks.
- It is differentiable because it has a smooth curve everywhere.
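The derivative value f'(3) = 6 can also be checked numerically from the limit definition (a minimal sketch using a small finite step in place of h -> 0):

```python
def f(x):
    return x ** 2

h = 1e-6                              # a small step standing in for h -> 0
approx = (f(3 + h) - f(3)) / h        # difference quotient at x = 3

print(round(approx, 3))               # close to the exact derivative f'(3) = 6
```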
Engineering Applications of Functions, Derivatives, Continuity, and Differentiability
1. Function Applications in Engineering
A function maps an input to a unique output, which is fundamental in engineering modeling.
Applications:
Electrical Engineering: Voltage-current relationships in circuits follow functions, such as Ohm’s Law:
V = IR
where V is voltage, I is current, and R is resistance.
2. Derivative Applications in Engineering
The derivative measures the rate of change, which is crucial in dynamic and control systems.
Applications:
Electrical Engineering: Capacitor and inductor behavior is described by derivatives:
i_C = C dV/dt, V_L = L di/dt
where i_C is capacitor current, C is capacitance, V_L is inductor voltage, and L is inductance.
3. Continuity Applications in Engineering
A function is continuous if there are no sudden jumps or breaks.
Applications:
Thermal Engineering: Heat flow analysis in materials requires temperature functions to be continuous:
lim (x -> a) T(x) = T(a); ensuring no sudden temperature changes (which could cause thermal stress).
4. Differentiability Applications in Engineering
A function is differentiable if it has a smooth slope at all points, essential for motion and control systems.
Applications:
Signal Processing: Fourier transforms and wave equations require differentiability to analyze signals effectively:
X(f) = ∫ x(t) e^(-j2πft) dt
where X(f) is the frequency response of the signal x(t).
5 What is the difference between an eigenvalue and an eigenvector? How do you find the eigenvalues and
eigenvectors of a matrix? [9]
https://www.youtube.com/watch?v=h8sg_XBp6VA
https://www.youtube.com/watch?v=R13Cwgmpuxc
Eigenvalues and eigenvectors are key concepts in linear algebra, often used in real-world applications such as image
processing, machine learning, and physics.
Eigenvector: A non-zero vector that only changes its magnitude (not direction) when a linear transformation
(like scaling or rotation) is applied.
Eigenvalue: A scalar that represents how much the eigenvector is scaled during the transformation.
Eigenvalue (λ):
A special scalar value associated with a square matrix.
When you multiply a matrix by its eigenvector, the resulting vector gets stretched (or shrunk) in magnitude
by the eigenvalue, but its direction (relative to the origin) remains unchanged.
We find the eigenvalues by solving a characteristic equation:
det(A - λI) = 0, where:
det represents the determinant.
A is the square matrix.
λ (lambda) is the eigenvalue we're trying to find.
I is the identity matrix (same size as A) with 1s on the diagonal and 0s elsewhere.
Key properties of an eigenvalue:
A scalar (single value).
The solution to a specific equation involving a matrix.
Defined only for square matrices.
Found by solving the characteristic equation det(A - λI) = 0.
Tells you "how much" a transformation stretches or shrinks the eigenvector.
Applications: solving systems of linear equations.
Eigenvector (v):
A non-zero vector that gets scaled by the eigenvalue when multiplied by the corresponding matrix.
Geometrically, the eigenvector points in the direction that the transformation preserves.
For each eigenvalue (λ), we solve a system of equations:
(A - λI)v = 0, where v is the eigenvector corresponding to λ.
This essentially asks what non-zero vectors get mapped to the zero vector by the transformed matrix (A -
λI).
Key properties of an eigenvector:
A vector (with direction and magnitude).
Transformed by the matrix in a special way: it is only scaled, never rotated.
At least one non-zero eigenvector exists for every eigenvalue (the zero vector is excluded by definition).
Obtained by solving the system of equations (A - λI)v = 0.
Tells you "in which direction" the transformation acts by pure scaling.
Applications: diagonalization of matrices; defines the direction along which stretching occurs.
The relationship between eigenvalues and eigenvectors is captured by the following equation:
A*v=λ*v
where:
A is the square matrix
v is the eigenvector
λ is the eigenvalue
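The relation A·v = λ·v can be verified numerically (a sketch using numpy with a small symmetric example matrix):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)    # eigenvalues 3 and 1

# each column of `eigenvectors` satisfies A v = λ v for its eigenvalue
for lam, v in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(A @ v, lam * v)
```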
Applications:
Structural analysis: Analysing vibrations, stability of structures.
Control systems: Designing controllers for robots, aircraft, etc.
Machine learning: Principal component analysis (PCA), dimensionality reduction.
Electrical engineering: Circuit analysis.
6 What is linear equation? What are the different methods to solve system of linear equation? Explain with
suitable example. [9]
A linear equation is an algebraic equation where the highest power of any variable is 1.
In simpler terms, it's an equation involving variables but those variables are never squared, cubed, or raised
to any other power.
Linear equations can be written in various forms, but a common one is ax + by = c, where:
a, b, and c are constants.
x and y are variables (unknowns you're trying to solve for).
a and b cannot both be zero.
A linear equation on a graph represents a straight line. This straight line extends infinitely in both directions.
Systems of linear equations involve multiple linear equations working together to solve for multiple
variables.
A system of linear equations can be represented using a matrix called an augmented matrix.
The augmented matrix provides a compact way to represent and analyze systems of linear equations.
Each row of the matrix represents one equation.
In each row, place the coefficients of the variables from the corresponding equation in their respective
columns.
In the last column, place the constant term (C) from the corresponding equation.
Example:
2x + y = 5
3x - 2y = 1
Represented as an augmented matrix:
[ 2   1 | 5 ]
[ 3  -2 | 1 ]
Methods to solve systems of linear equations:
1. Graphical Method:
Concept: Applicable only for systems with two variables. You plot both equations on the same coordinate
plane. The solution is the point where the lines intersect.
Advantages: Easy to visualize for simple systems.
Disadvantages: Not suitable for systems with more than two variables. Can be imprecise for lines that are
very close or overlap.
Example: Solve the system:
x+y=4
2x - y = 2
Plot both equations and find the intersection point, which is (2, 2). This is the solution.
2. Substitution Method:
Concept: Solve one equation for one variable in terms of the other. Substitute this expression into the other
equation. Solve for the remaining variable. Then back-substitute to find the first variable.
Advantages: Straightforward for small systems.
Disadvantages: Can lead to messy calculations for large systems or equations with complex terms.
Example: Solve the system:
x + 2y = 5 (solve for x: x = 5 - 2y)
3x - y = 1
Substitute the first equation into the second:
3(5 - 2y) - y = 1
15 - 7y = 1, so y = 2
Back-substitute y = 2 into the first equation:
x + 2(2) = 5
x = 1
The solution is (x, y) = (1, 2).
3. Elimination Method:
Concept: Manipulate the equations algebraically to eliminate one variable. This often involves adding or
subtracting the equations in a way that cancels out one variable. Then, solve for the remaining variable and
back-substitute to find the other.
Advantages: More efficient than substitution for larger systems.
Disadvantages: May involve more calculations than substitution for simple systems.
Example: Solve the system (same system as substitution example):
x + 2y = 5
3x - y = 1
Multiply the second equation by 2 (giving 6x - 2y = 2), then add it to the first equation to eliminate y:
7x = 7
Solve for x: x = 1
Back-substitute x = 1 into the first equation:
1 + 2y = 5
y = 2
The solution is (x, y) = (1, 2), matching the substitution method.
4. Matrix Method (Gaussian Elimination):
Concept: Represent the system of equations in an augmented matrix. Apply a series of row operations
(adding/subtracting rows, multiplying rows by constants) to transform the matrix into upper triangular form.
Then, back-solve to find the variables.
Advantages: Efficient and systematic for any size system. Less prone to errors than elimination by hand.
Disadvantages: Can be more time-consuming to set up initially compared to other methods for simple
systems.
Example: This method is best suited for larger systems (> 3 variables).
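For completeness, the two-variable system used above can be solved programmatically (a sketch using numpy, which performs LU-based Gaussian elimination internally):

```python
import numpy as np

A = np.array([[1.0, 2.0],      # x + 2y = 5
              [3.0, -1.0]])    # 3x - y = 1
b = np.array([5.0, 1.0])

solution = np.linalg.solve(A, b)   # LU factorization under the hood
print(solution)                    # [1. 2.]  ->  x = 1, y = 2
```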
7 What is the difference between the Jacobian, Hessian and the gradient function? Explain with example the
applications of each function. [9]
https://www.youtube.com/watch?v=N2PpRnFqnqY
https://www.youtube.com/watch?v=FfJtVvQtqTM&list=PLU6SqdYcYsfIIEY1wEAsVWdW-R_A1-KBJ
https://www.youtube.com/watch?v=rB83DpBJQsE
https://www.youtube.com/watch?v=82rEr2UBdtM&list=PLYg8EGlXQ-h1YAnA6DWBPPh3GRRrTQIIV
https://www.youtube.com/watch?v=y8VkCnL9vs8
Gradient:
Represents the direction of steepest ascent for a scalar field (a function that takes multiple inputs but outputs
a single value).
It's a vector containing all the partial derivatives of the function with respect to each of its inputs.
Applications:
Used in optimization algorithms like gradient descent to find minimums of functions.
Helps understand how a function changes in response to changes in its inputs.
Examples: Consider a function f(x, y) that represents the height of a hill at point (x, y). The gradient of f
would tell you the direction of the steepest ascent from that point, which is the direction you'd have to walk
to climb the hill the fastest.
Order of Derivative: first order
input: scalar field
output: vector
The gradient points you in the direction of the steepest slope, which can be uphill or downhill depending on
whether you're minimizing or maximizing the function.
The gradient is a vector, with each dimension corresponding to a partial derivative of the function with
respect to an input variable.
In other words, it tells you how much and in which direction a function changes at a specific point.
Jacobian:
https://www.youtube.com/watch?v=mHJOkHYmzhE&list=PLNKD1qB9ppts7P3EN3YMzDvRTvNaqjpJo
Deals with functions that take multiple inputs and output multiple outputs (vector valued functions).
It's a matrix that captures how the function's outputs change with respect to each of its inputs.
Essentially, it's a collection of multiple gradients, one for each output of the function.
Applications:
Used in transformations between coordinate systems, like converting between cartesian and polar
coordinates.
Helps analyze how changes in inputs propagate through a system with multiple outputs.
Examples: A robotic arm with multiple joints can be described by a function that takes joint angles as inputs
and outputs the position of the gripper. The Jacobian would tell you how a small change in each joint angle
affects the gripper's position in all directions (x, y, z).
Order of Derivative: first order
input: vector valued function
output: matrix
It essentially represents the rate of change of a vector valued function.
https://www.youtube.com/watch?v=xv8WKIrv5WU
Hessian:
https://www.youtube.com/watch?v=-uEDRqzesbE
https://www.youtube.com/watch?v=z2yRNZtAvFc
https://www.youtube.com/watch?v=GCIU09KAOjg
Concerned with second-order derivatives of a scalar field.
It's a matrix that captures how quickly the gradient of the function (which represents the rate of change)
itself is changing.
By analyzing the Hessian, you can understand if a critical point (where the gradient is zero) is a minimum,
maximum, or saddle point.
Example Application: The Hessian is used in Newton’s optimization method, where its eigenvalues help
determine whether a critical point is a local minimum, maximum, or saddle point.
Applications:
Used in optimization algorithms like Newton-Raphson method for faster convergence towards
minimums.
Helps analyze the stability of systems and equilibrium points.
Examples: Consider a function f(x, y) that represents the height of a hill at point (x, y). The Hessian of f
indicates whether a critical point is a peak (maximum height), a valley (minimum height), or a saddle point.
Order of Derivative: second order
input: scalar field
output: matrix
In simpler terms, it captures the curvature of the function graph.
8 Differences and Applications of Jacobian, Hessian and Gradient:

| Feature             | Gradient                                               | Jacobian                                                               | Hessian                                                               |
|---------------------|--------------------------------------------------------|------------------------------------------------------------------------|-----------------------------------------------------------------------|
| Definition          | Vector of partial derivatives of a scalar function.    | Matrix of first-order partial derivatives for vector-valued functions. | Matrix of second-order partial derivatives for a scalar function.     |
| Order of Derivative | First-order                                            | First-order                                                            | Second-order                                                          |
| Input Type          | Scalar field f(x,y,z,…)                                | Vector-valued function F(x)                                            | Scalar field f(x,y,z,…)                                               |
| Output Type         | Vector                                                 | Matrix                                                                 | Matrix                                                                |
| Purpose             | Determines the direction of steepest ascent/descent.   | Measures how multiple outputs change with respect to multiple inputs.  | Captures curvature; helps determine minima, maxima, and saddle points. |
| Applications        | Optimization (gradient descent); sensitivity analysis. | Coordinate transformations; robotics and kinematics.                   | Convexity analysis; optimization (Newton’s method).                   |
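All three objects can be computed symbolically in one place (a sketch using sympy; the functions below are hypothetical examples):

```python
import sympy as sp

x, y = sp.symbols('x y')

# gradient and Hessian of a scalar field
f = x**2 + 2*x*y + y**2
gradient = [sp.diff(f, v) for v in (x, y)]     # [2*x + 2*y, 2*x + 2*y]
hessian = sp.hessian(f, (x, y))                # Matrix([[2, 2], [2, 2]])

# Jacobian of a vector-valued function F: R^2 -> R^2
F = sp.Matrix([x**2 + y, sp.sin(x * y)])
jacobian = F.jacobian([x, y])                  # 2x2 matrix of first partials
```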
9 Understanding Linear Independence with Examples
https://www.youtube.com/watch?v=8d9Fo8Hj50M
https://www.youtube.com/watch?v=gqq_aPn4NcI&t=41s
https://www.youtube.com/watch?v=n9Mmuoh7ZfE
1. Theoretical Foundation
Linear independence occurs when no vector in a set can be expressed as a linear combination of the others.
Mathematically, for vectors v₁, v₂, ..., vₙ, if c₁v₁ + c₂v₂ + ... + cₙvₙ = 0 has only the trivial solution (all cᵢ = 0), the
vectors are linearly independent.
2. Detailed Examples
Example 1: Linearly Independent Vectors
Consider two vectors in R²:
v₁ = [1]
[0]
v₂ = [0]
[1]
Testing linear independence:
1. Set up equation: c₁v₁ + c₂v₂ = [0, 0]ᵀ (the zero vector)
2. Expand system:
o c₁(1) + c₂(0) = 0
o c₁(0) + c₂(1) = 0
3. Solve system:
o First equation: c₁ = 0
o Second equation: c₂ = 0
4. Conclusion: Only trivial solution exists (c₁ = c₂ = 0), therefore vectors are linearly independent.
Example 2: Linearly Dependent Vectors
Consider vectors:
v₁ = [2]
[4]
v₂ = [1]
[2]
Testing linear independence:
1. Set up equation: c₁v₁ + c₂v₂ = [0, 0]ᵀ (the zero vector)
2. Expand system:
o 2c₁ + c₂ = 0
o 4c₁ + 2c₂ = 0
3. Solve system:
o Both equations reduce to: 2c₁ + c₂ = 0
o Non-trivial solution exists: e.g., c₁ = 1, c₂ = -2
4. Conclusion: Non-trivial solution exists, therefore vectors are linearly dependent.
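Both examples can be checked with a rank computation (a sketch using numpy; the columns of each matrix are the vectors under test):

```python
import numpy as np

# columns are v1 = [1, 0] and v2 = [0, 1]
independent = np.array([[1.0, 0.0],
                        [0.0, 1.0]])
# columns are v1 = [2, 4] and v2 = [1, 2]  (note v1 = 2 * v2)
dependent = np.array([[2.0, 1.0],
                      [4.0, 2.0]])

assert np.linalg.matrix_rank(independent) == 2   # full rank -> independent
assert np.linalg.matrix_rank(dependent) == 1     # rank-deficient -> dependent
```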
3. Geometric Interpretation
Linear independence manifests geometrically in different ways:
In R²: Independent vectors point in different directions
In R³: Two independent vectors span a plane; three independent vectors span all of 3D space
General: Independent vectors create maximum dimensional space possible for their quantity
4. Engineering Applications
Structural Mechanics
Ensures unique solutions for force distributions
Validates support reaction calculations
Vibration Analysis
Enables identification of distinct vibration modes
Supports modal analysis calculations
Robotics and Kinematics
Validates joint motion independence
Optimizes movement planning
Control Systems
Ensures state variable uniqueness
Supports stability analysis
Finite Element Analysis
Validates element shape functions
Ensures solution uniqueness
Summary
Linear independence serves as a fundamental concept in engineering mathematics, enabling:
Unique solution identification
System redundancy elimination
Computational efficiency improvement
Reliable analysis frameworks
This systematic understanding helps engineers to develop more robust and efficient solutions across various
domains.
10 Matrix factorization concept / LU decomposition
https://www.youtube.com/watch?v=Cq7CS2Lgsdg
Matrix factorization is the process of breaking down a given matrix into multiple simpler matrices that, when
multiplied, reproduce the original matrix. This simplifies complex operations such as solving linear systems,
computing determinants, and performing numerical optimizations. One of the most commonly used matrix
factorizations in engineering is LU Decomposition.
LU Decomposition
LU Decomposition factorizes a square matrix A into the product of two matrices:
A=LU
L is a lower triangular matrix (with 1s on its diagonal).
U is an upper triangular matrix.
If partial pivoting is needed,
PA=LU
where P is a permutation matrix that reorders the rows for numerical stability.
LU decomposition is particularly useful for efficiently solving systems of linear equations, inverting matrices, and
computing determinants.
Significance of LU Decomposition in Engineering
LU decomposition is essential in numerical simulations, finite element methods (FEM), control systems, and
robotics. It is particularly beneficial for solving large systems of linear equations that arise in engineering
applications.
Advantages:
Computational Efficiency: Once the LU decomposition is computed, solving Ax = b for multiple right-hand sides b requires only cheap forward and backward substitutions.
Numerical Stability: Pivoting improves stability in ill-conditioned problems.
Memory Efficiency: It avoids direct inversion of matrices, which is computationally expensive.
Applications of LU Decomposition in Engineering
(a) Structural Analysis (Finite Element Method)
Used to solve large systems of linear equations that arise from discretizing structures into finite elements.
Example: Solving displacement equations in a bridge structure under load.
(b) Circuit Analysis (Electrical Engineering)
Used in nodal and mesh analysis to solve large networks of resistors, capacitors, and inductors.
Example: Computing voltages and currents in complex electrical networks.
(c) Control Systems and Robotics
Solving matrix equations in dynamic systems, state-space models, and feedback control loops.
Example: Computing the inverse of a system dynamics matrix to find optimal control parameters.
(d) Fluid Mechanics and Heat Transfer
Solving partial differential equations (PDEs) that describe fluid flow and thermal conduction.
Example: Simulating airflow over an aircraft wing using computational fluid dynamics (CFD).
Example: Solving a System of Linear Equations Using LU Decomposition
Initial System
Consider the system of linear equations:
2x + 3y + z = 1
4x + 7y + 3z = 3
6x + 18y + 5z = 5
Matrix form: Ax = b, where:
A = [2 3 1] b = [1]
[4 7 3] [3]
[6 18 5] [5]
Solution Process
LU Decomposition
A = LU
Matrix A is decomposed into lower (L) and upper (U) triangular matrices:
L = [1 0 0]   U = [2 3 1]
    [2 1 0]       [0 1 1]
    [3 9 1]       [0 0 -7]
(Row 2 of A minus 2× row 1 gives [0 1 1]; row 3 minus 3× row 1 gives [0 9 2]; row 3 minus 9× the new row 2 gives [0 0 -7].)
Forward Substitution (Ly = b)
Solve the system Ly = b:
y₁ = 1
2y₁ + y₂ = 3 → y₂ = 1
3y₁ + 9y₂ + y₃ = 5 → y₃ = -7
y = [1]
    [1]
    [-7]
Backward Substitution (Ux = y)
Solve the system Ux = y:
-7x₃ = -7 → x₃ = 1
x₂ + x₃ = 1 → x₂ = 0
2x₁ + 3x₂ + x₃ = 1 → x₁ = 0
Result:
x = [0]
    [0]
    [1]
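The factorize-then-substitute process can be reproduced numerically; a minimal Doolittle LU sketch without pivoting (production code should use scipy.linalg.lu, which pivots for numerical stability):

```python
import numpy as np

def lu_doolittle(A):
    """LU factorization without pivoting; L has 1s on its diagonal."""
    n = A.shape[0]
    L, U = np.eye(n), A.astype(float).copy()
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]      # elimination multiplier
            U[i, :] -= L[i, k] * U[k, :]     # zero out entries below the pivot
    return L, U

A = np.array([[2.0, 3.0, 1.0],
              [4.0, 7.0, 3.0],
              [6.0, 18.0, 5.0]])
b = np.array([1.0, 3.0, 5.0])

L, U = lu_doolittle(A)
assert np.allclose(L @ U, A)                 # factorization reproduces A

y = np.linalg.solve(L, b)                    # forward substitution (Ly = b)
x = np.linalg.solve(U, y)                    # backward substitution (Ux = y)
assert np.allclose(A @ x, b)                 # x solves the original system
```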