I. Basic Concepts
1.1 Covariance Matrix and Its Derivation
Variance: $\text{var}(X) = \frac{\sum_{i=1}^n (X_i-\bar{X})(X_i-\bar{X})}{n-1}$
Covariance, which measures the degree to which each dimension deviates from its mean: $\text{cov}(X,Y) = \frac{\sum_{i=1}^n (X_i-\bar{X})(Y_i-\bar{Y})}{n-1}$
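The two formulas above can be sketched in plain Python. This is a minimal illustration, assuming each dimension is given as a list of samples; the helper names (`mean`, `covariance`, `variance`, `covariance_matrix`) and the sample data `x` and `y` are made up for the example and do not come from the original text.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def covariance(xs, ys):
    """Sample covariance of two equally long sequences (n - 1 denominator)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two samples are required")
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys)) / (n - 1)


def variance(xs):
    """Variance is the covariance of a dimension with itself."""
    return covariance(xs, xs)


def covariance_matrix(dimensions):
    """Entry (i, j) is cov(dimensions[i], dimensions[j])."""
    return [[covariance(a, b) for b in dimensions] for a in dimensions]


# Hypothetical data: two dimensions, four samples each.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 9.0]

cov = covariance_matrix([x, y])
for row in cov:
    print(row)
```

The diagonal of the resulting matrix holds the variances of the individual dimensions, and the matrix is symmetric because $\text{cov}(X,Y) = \text{cov}(Y,X)$.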
1.2 Hessian Matrix
Here $\Delta x = x-x^{(0)}$ and $\Delta x^2 = (x-x^{(0)})^2$ denote the increment of a single variable $x$ about the expansion point $x^{(0)}$ and its square.
The Taylor expansion of the two-variable function $f(x_1,x_2)$ about the point $X^{(0)}(x^{(0)}_1,x^{(0)}_2)$ is:
$$
f(x_1,x_2) = f(x_1^{(0)},x_2^{(0)}) + \left.\frac{\partial f}{\partial x_1}\right|_{X^{(0)}}\Delta x_1 + \left.\frac{\partial f}{\partial x_2}\right|_{X^{(0)}}\Delta x_2 + \frac{1}{2}\left[ \left.\frac{\partial^2 f}{\partial x_1^2}\right|_{X^{(0)}} \Delta x_1^2 + 2\left.\frac{\partial^2 f}{\partial x_1\partial x_2}\right|_{X^{(0)}}\Delta x_1\Delta x_2 + \left.\frac{\partial^2 f}{\partial x_2^2}\right|_{X^{(0)}} \Delta x_2^2 \right] + \cdots \tag{2}
$$
where $\Delta x_1 = x_1-x^{(0)}_1$ and $\Delta x_2 = x_2-x_2^{(0)}$.
$G(X^{(0)})$ is the Hessian matrix of $f(x_1,x_2)$ at the point $X^{(0)}$. It is the square matrix formed from the second-order partial derivatives of $f(x_1,x_2)$ at $X^{(0)}$. It is generally written as:
$$
H(f) = \begin{bmatrix}
\frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\
\frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_n^2}
\end{bmatrix}
$$
In abbreviated form: $\mathbf{Q_{Hessian}} = \begin{bmatrix} I_{xx} & I_{xy}\\ I_{yx} & I_{yy} \end{bmatrix}$

1.3 Hessian Matrix Example
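As one possible illustration, the sketch below approximates the matrix of second-order partial derivatives with central finite differences and compares it against a hand-computed result. The test function `f`, the helper name `hessian`, and the step size `h` are assumptions chosen for this example, not part of the original text.

```python
def hessian(f, point, h=1e-4):
    """Approximate the Hessian of f at `point` with central differences.

    f takes a list of floats and returns a float. Entry (i, j) estimates
    the second partial derivative of f with respect to x_i and x_j.
    """
    n = len(point)
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):

            def shifted(di, dj):
                # Evaluate f with x_i moved by di*h and x_j moved by dj*h.
                p = list(point)
                p[i] += di * h
                p[j] += dj * h
                return f(p)

            result[i][j] = (
                shifted(1, 1) - shifted(1, -1) - shifted(-1, 1) + shifted(-1, -1)
            ) / (4.0 * h * h)
    return result


def f(p):
    """Hypothetical test function f(x1, x2) = x1^3 + 2*x1*x2 + x2^2."""
    x1, x2 = p
    return x1 ** 3 + 2.0 * x1 * x2 + x2 ** 2


# Analytically, the Hessian of f is [[6*x1, 2], [2, 2]],
# so at the point (1, 2) it equals [[6, 2], [2, 2]].
H = hessian(f, [1.0, 2.0])
for row in H:
    print([round(v, 4) for v in row])
```

Finite differences are only an approximation; for functions with a known closed form, differentiating by hand or symbolically gives exact entries.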

1.4 Definition and Properties of Positive Definite Matrices
1.5 Positive Definite Matrix Example
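A minimal sketch of one way to test positive definiteness, using the standard definition: a real symmetric matrix $A$ is positive definite when $x^T A x > 0$ for every nonzero vector $x$, which holds exactly when a Cholesky factorization $A = LL^T$ with strictly positive diagonal exists. The function name `is_positive_definite`, the tolerance `tol`, and the sample matrices are assumptions made for this illustration.

```python
import math


def is_positive_definite(A, tol=1e-12):
    """Return True if the square matrix A (list of lists) is symmetric
    positive definite, by attempting a Cholesky factorization."""
    n = len(A)
    # Positive definiteness is defined here for symmetric matrices only.
    for i in range(n):
        for j in range(i):
            if abs(A[i][j] - A[j][i]) > tol:
                return False
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = A[i][i] - s
                if d <= tol:
                    # A non-positive pivot means A is not positive definite.
                    return False
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return True


# Hypothetical sample matrices.
print(is_positive_definite([[6.0, 2.0], [2.0, 2.0]]))  # True
print(is_positive_definite([[1.0, 2.0], [2.0, 1.0]]))  # False
```

When the matrix being tested is the Hessian of a function at a stationary point, positive definiteness indicates a strict local minimum there.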

Published by 全栈程序员-站长. When reposting, please cite the source: https://javaforall.net/232340.html. Original link: https://javaforall.net
