Matrix Derivatives
Introduction
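Matrix derivatives and partial derivatives come up constantly in engineering, especially in control and optimization. Differentiating a matrix $A$ with respect to a matrix $B$ simply means differentiating every element of $A$ with respect to every element of $B$. The only ambiguity is how to arrange the results: arranging them by rows or by columns is equally valid, as long as the dimensions match wherever the result is used.
The numerator layout and the denominator layout were introduced to fix the dimensions of the result. In the numerator layout the shape of the result follows the numerator; in the denominator layout it follows the denominator. Take the following derivative as an example: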
$$
\frac{\partial A}{\partial B}
$$
In the numerator layout the result is arranged according to $A$; in the denominator layout it is arranged according to $B$.
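As a concrete shape check (a small worked example that is not in the original; the symbols $y$, $m$ and $p$ are chosen here only for illustration), take column vectors $y \in \mathbb{R}^m$ and $x \in \mathbb{R}^p$:

$$
\text{numerator layout: } \frac{\partial y}{\partial x} \in \mathbb{R}^{m \times p},
\qquad
\text{denominator layout: } \frac{\partial y}{\partial x} \in \mathbb{R}^{p \times m}
$$

Both results contain the same $m \cdot p$ partial derivatives, and one is the transpose of the other.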
Basic rules, using the denominator layout as the example
Rule 1
If $f$ is a scalar function and $x$ is a column vector:
$$
\begin{aligned}
&f(x_1, x_2, ..., x_p)\\
&x = [x_1, x_2, ..., x_p]^T
\end{aligned}
$$
then the derivative of $f$ with respect to $x$ is:
$$
\frac{\partial f}{\partial x} = [\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, ..., \frac{\partial f}{\partial x_p}]^T
$$
If $f$ is a scalar function and $x$ is a row vector:
$$
\begin{aligned}
&f(x_1, x_2, ..., x_p)\\
&x = [x_1, x_2, ..., x_p]
\end{aligned}
$$
then the derivative of $f$ with respect to $x$ is:
$$
\frac{\partial f}{\partial x} = [\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, ..., \frac{\partial f}{\partial x_p}]
$$
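Rule 1 can be sanity-checked numerically. The following is a minimal sketch, not part of the original post: the function `f`, the point `x0` and the helper `numerical_gradient` are hypothetical names chosen for illustration, and the partial derivatives are approximated with central differences.

```python
def f(x):
    # Hypothetical scalar function f(x1, x2) = x1^2 + 3*x1*x2
    return x[0] ** 2 + 3.0 * x[0] * x[1]


def numerical_gradient(func, x, h=1e-6):
    """Central-difference estimate of [df/dx_1, ..., df/dx_p]."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((func(forward) - func(backward)) / (2.0 * h))
    return grad


x0 = [1.0, 2.0]
grad = numerical_gradient(f, x0)

# Analytic partials: df/dx1 = 2*x1 + 3*x2, df/dx2 = 3*x1
analytic = [2.0 * x0[0] + 3.0 * x0[1], 3.0 * x0[0]]

# Denominator layout: the result takes the shape of x.
as_column = [[g] for g in grad]  # p x 1 when x is a column vector
as_row = [grad]                  # 1 x p when x is a row vector
```

The entries are the same in both cases; only the arrangement (column or row) follows the shape of `x`.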
Rule 2
If $F$ is a column-vector function and $x$ is a scalar:
$$
\begin{aligned}
&F = [f_1(x), f_2(x), ..., f_m(x)]^T\\
&x
\end{aligned}
$$
then the derivative of $F$ with respect to $x$ is:
$$
\frac{\partial F}{\partial x} = [\frac{\partial f_1}{\partial x}, \frac{\partial f_2}{\partial x}, ..., \frac{\partial f_m}{\partial x}] = \frac{\partial F^T}{\partial x}
$$
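As a small worked example (an illustration added here, with a function chosen arbitrarily), take $F(x) = [x, x^2, \sin x]^T$. In the denominator layout its derivative with respect to the scalar $x$ is a row vector:

$$
\frac{\partial F}{\partial x} = [1, 2x, \cos x]
$$

The column vector $F$ is $3 \times 1$, while its derivative is $1 \times 3$, which is why the result can also be written as $\frac{\partial F^T}{\partial x}$.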
Corollary
If $F$ is a column-vector function and $x$ is a column vector:
$$
\begin{aligned}
&F = [f_1(x), f_2(x), ..., f_m(x)]^T\\
&x = [x_1, x_2, ..., x_p]^T
\end{aligned}
$$
then, combining Rule 1 and Rule 2, the derivative of $F$ with respect to $x$ is the $p \times m$ matrix:
$$
\frac{\partial F}{\partial x} =
\begin{bmatrix}
\frac{\partial f_1}{\partial x_1} & \frac{\partial f_2}{\partial x_1} & ... & \frac{\partial f_m}{\partial x_1} \\
\frac{\partial f_1}{\partial x_2} & \frac{\partial f_2}{\partial x_2} & ... & \frac{\partial f_m}{\partial x_2} \\
... & ... & ... & ... \\
\frac{\partial f_1}{\partial x_p} & \frac{\partial f_2}{\partial x_p} & ... & \frac{\partial f_m}{\partial x_p}
\end{bmatrix}
$$
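The matrix above can be built numerically row by row: row $i$ holds the derivatives of every $f_j$ with respect to $x_i$. The sketch below is only an illustration under stated assumptions: the helper `denominator_layout_derivative`, the matrix `A` and the test function `linear_map` are hypothetical names, and central differences stand in for the exact derivatives.

```python
def denominator_layout_derivative(F, x, h=1e-6):
    """Return the p x m matrix whose (i, j) entry approximates d f_j / d x_i."""
    rows = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        F_forward = F(forward)
        F_backward = F(backward)
        rows.append([(a - b) / (2.0 * h) for a, b in zip(F_forward, F_backward)])
    return rows


# Hypothetical linear map F(x) = A x with m = 2 outputs and p = 3 inputs.
A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]


def linear_map(x):
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


D = denominator_layout_derivative(linear_map, [0.5, -1.0, 2.0])

# D is 3 x 2 and approximates A^T (denominator layout).
# Its transpose is 2 x 3 and approximates A (numerator layout).
numerator_layout = [list(col) for col in zip(*D)]
```

For a linear map the derivative does not depend on the evaluation point, so any `x` gives the same matrix up to rounding.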
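The numerator layout follows the same computation as the denominator layout; the final results differ only by a transpose.
In practice a mixed layout is common: the numerator layout for the derivative of a vector with respect to a scalar, and the denominator layout for the derivative of a scalar with respect to a vector. For the derivative of a vector with respect to a vector, conventions differ; the Jacobian matrix, for instance, is usually written in the numerator layout.
Common matrix derivatives in engineering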
- $\frac{\partial x^T P x}{\partial x} = 2Px$, where $P$ is a symmetric positive definite matrix (for a general square $P$ the result is $(P + P^T)x$);
- $\frac{\partial Ax}{\partial x} = A^T$;
- The Jacobian matrix, written here in the denominator layout like the rules above (the usual numerator-layout Jacobian is its transpose):
  $$
  \frac{\partial f}{\partial x} =
  \begin{bmatrix}
  \frac{\partial f_1}{\partial x_1} & \frac{\partial f_2}{\partial x_1} & ... & \frac{\partial f_m}{\partial x_1} \\
  \frac{\partial f_1}{\partial x_2} & \frac{\partial f_2}{\partial x_2} & ... & \frac{\partial f_m}{\partial x_2} \\
  ... & ... & ... & ... \\
  \frac{\partial f_1}{\partial x_p} & \frac{\partial f_2}{\partial x_p} & ... & \frac{\partial f_m}{\partial x_p}
  \end{bmatrix}
  $$
- (To be added later.)
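The gradient of the quadratic form $x^T P x$ can be checked numerically. This is a rough sketch rather than a definitive implementation: the matrix `P`, the point `x0` and the helpers `quadratic_form` and `gradient` are hypothetical, and for a symmetric `P` the finite-difference gradient should agree with $2Px$.

```python
# Hypothetical symmetric positive definite matrix
P = [[2.0, 1.0],
     [1.0, 3.0]]


def quadratic_form(x):
    # x^T P x written out as a double sum
    n = len(x)
    return sum(x[i] * P[i][j] * x[j] for i in range(n) for j in range(n))


def gradient(func, x, h=1e-6):
    """Central-difference estimate of the gradient of a scalar function."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        grad.append((func(forward) - func(backward)) / (2.0 * h))
    return grad


x0 = [1.0, -2.0]
numeric = gradient(quadratic_form, x0)

# Closed form for symmetric P: 2 * P * x
analytic = [2.0 * sum(P[i][j] * x0[j] for j in range(len(x0)))
            for i in range(len(x0))]
```

At this point $Px_0 = [0, -5]^T$, so both `numeric` and `analytic` should be close to $[0, -10]$.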