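1. Taylor series expansion
In practical optimization problems the objective function is often complicated. To simplify the problem, the objective function is usually expanded near some point into a Taylor polynomial that approximates the original function.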
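1.1 The concept of (first-order) partial derivatives
[Note]: A partial derivative is in essence the derivative of a single-variable function, because all other variables are held fixed.
If $z=f(x,y)$ has partial derivatives (values) at every point $(x,y)$ of a region $D$, these values are in general still functions of $x$ and $y$. They are called the partial derivative functions of $f(x,y)$, or simply its partial derivatives,
written as:
$f'_x(x,y)$ or $\frac{\partial f}{\partial x}$ (and likewise $f'_y(x,y)$ or $\frac{\partial f}{\partial y}$ for $y$)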
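To make the note above concrete, here is a minimal sketch that estimates both partial derivatives with central differences. The function `f`, the helper names `partial_x` and `partial_y`, and the step size `h` are illustrative assumptions, not part of the original text.

```python
def partial_x(f, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative of f with respect to x."""
    # y is held fixed, so this is an ordinary one-variable derivative in x.
    return (f(x + h, y) - f(x - h, y)) / (2.0 * h)


def partial_y(f, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative of f with respect to y."""
    # x is held fixed, so this is an ordinary one-variable derivative in y.
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)


def f(x, y):
    # Hypothetical example function: f(x, y) = x^2 * y + y^3
    return x * x * y + y ** 3


# Analytic partial derivatives: f'_x = 2xy and f'_y = x^2 + 3y^2.
# At (1, 2) these equal 4 and 13, so the estimates should be close to those values.
print(partial_x(f, 1.0, 2.0))
print(partial_y(f, 1.0, 2.0))
```

Each helper varies only one argument, which is exactly the "one-variable derivative" reading of a partial derivative.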
1.2 Second-order and mixed partial derivatives
If the first-order partial derivative functions of $z=f(x,y)$, namely $\frac{\partial f}{\partial x}=f'_x$ and $\frac{\partial f}{\partial y}=f'_y$, themselves have partial derivatives with respect to $x$ and $y$,
then these partial derivatives of the first-order partial derivatives are called the second-order partial derivatives of $z=f(x,y)$.
A function of two variables $z=f(x,y)$ has four second-order partial derivatives:
$f''_{xx}(x,y)$, $f''_{xy}(x,y)$, $f''_{yx}(x,y)$, $f''_{yy}(x,y)$
Third-order, fourth-order, and $n$-th order partial derivatives are defined similarly.
Higher-order partial derivatives taken with respect to different variables are called mixed partial derivatives, for example $f''_{xy}(x,y)$ and $f''_{yx}(x,y)$.
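The four second-order partial derivatives can be checked numerically by differentiating twice. The sketch below is only an illustration: the example function `f`, the helpers `d_dx` and `d_dy`, and the step size are assumptions. It also shows that the two mixed partial derivatives agree for this function, which is the expected behaviour whenever they are continuous.

```python
def d_dx(g, h=1e-4):
    """Return a function that estimates the partial derivative of g with respect to x."""
    return lambda x, y: (g(x + h, y) - g(x - h, y)) / (2.0 * h)


def d_dy(g, h=1e-4):
    """Return a function that estimates the partial derivative of g with respect to y."""
    return lambda x, y: (g(x, y + h) - g(x, y - h)) / (2.0 * h)


def f(x, y):
    # Hypothetical example function: f(x, y) = x^2 * y^3
    return x ** 2 * y ** 3


# Second-order partial derivatives are partial derivatives of the first-order ones.
f_xx = d_dx(d_dx(f))
f_xy = d_dy(d_dx(f))  # differentiate with respect to x first, then y
f_yx = d_dx(d_dy(f))  # differentiate with respect to y first, then x
f_yy = d_dy(d_dy(f))

# Analytic values at (1, 2): f''_xx = 2y^3 = 16, f''_xy = f''_yx = 6xy^2 = 24,
# f''_yy = 6x^2 y = 12. The printed estimates should be close to these.
for name, g in (("f_xx", f_xx), ("f_xy", f_xy), ("f_yx", f_yx), ("f_yy", f_yy)):
    print(name, g(1.0, 2.0))
```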
1.3 Taylor series expansion of a function
- The Taylor expansion of a single-variable function $f(x)$ at the point $x_k$ is:
$$f(x)=f(x_k)+(x-x_k)f'(x_k)+\frac{1}{2!}(x-x_k)^2f''(x_k)+\cdots+\frac{1}{n!}(x-x_k)^n f^{(n)}(x_k)+o\big((x-x_k)^n\big)$$
- The Taylor expansion of a two-variable function $f(x,y)$ at the point $(x_k,y_k)$ is:
$$\begin{aligned} f(x,y)={}&f(x_k,y_k)+(x-x_k)f'_x(x_k,y_k)+(y-y_k)f'_y(x_k,y_k)\\ &+\frac{1}{2!}(x-x_k)^2 f''_{xx}(x_k,y_k)+\frac{1}{2!}(x-x_k)(y-y_k) f''_{xy}(x_k,y_k)\\ &+\frac{1}{2!}(y-y_k)(x-x_k) f''_{yx}(x_k,y_k)+\frac{1}{2!}(y-y_k)^2 f''_{yy}(x_k,y_k)\\ &+\cdots+o(\rho^n) \end{aligned}$$
where $\rho=\sqrt{(x-x_k)^2+(y-y_k)^2}$ and $n$ is the order at which the expansion is truncated.
- The Taylor expansion of an $n$-variable function $f(x^1,x^2,\cdots,x^n)$ at the point $(x^1_k,x^2_k,\cdots,x^n_k)$ is (superscripts are variable indices, not powers):
$$\begin{aligned} f(x^1,x^2,\cdots,x^n)={}&f(x^1_k,x^2_k,\cdots,x^n_k)\\ &+\sum^n_{i=1}(x^i-x^i_k)f'_{x^i}(x^1_k,x^2_k,\cdots,x^n_k)\\ &+\frac{1}{2!}\sum^n_{i,j=1}(x^i-x^i_k)(x^j-x^j_k)f''_{x^i x^j}(x^1_k,x^2_k,\cdots,x^n_k)\\ &+\cdots+o(\rho^m) \end{aligned}$$
where $\rho=\sqrt{\sum^n_{i=1}(x^i-x^i_k)^2}$ and $m$ is the order at which the expansion is truncated.
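As a small illustration of the single-variable expansion, the sketch below builds Taylor polynomials of the exponential function about a point `x_k`. The choice of `exp` (whose derivatives of every order equal `exp` itself) and the helper name `taylor_exp` are assumptions made for this example.

```python
import math


def taylor_exp(x, x_k, order):
    """Taylor polynomial of exp about x_k, truncated after the given order."""
    total = 0.0
    for i in range(order + 1):
        # Every derivative of exp at x_k equals exp(x_k),
        # so the i-th term is exp(x_k) * (x - x_k)^i / i!.
        total += math.exp(x_k) * (x - x_k) ** i / math.factorial(i)
    return total


# The approximation error at x = 0.5 (expanding about x_k = 0) shrinks
# as more terms are kept.
exact = math.exp(0.5)
for order in (1, 2, 4):
    approx = taylor_exp(0.5, 0.0, order)
    print(order, approx, abs(approx - exact))
```

Keeping only the first-order terms gives a linear model of the function, and keeping terms up to second order gives a quadratic model.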
This expression can be written in matrix form, as follows:
2. Taylor series expansion in matrix form
Let $X=[x^1,x^2,\cdots,x^n]^T$ and $X_k=[x^1_k,x^2_k,\cdots,x^n_k]^T$.
Then the Taylor expansion of the $n$-variable function $f(X)$ at the point $X_k$ is:
$$f(X)=f(X_k)+[\nabla f(X_k)]^T(X-X_k)+\frac{1}{2!}(X-X_k)^TH(X_k)(X-X_k)+o\big(\|X-X_k\|^2\big)$$
where
$$\nabla f(X_k)=\left[\frac{\partial f(X_k)}{\partial x^1},\frac{\partial f(X_k)}{\partial x^2},\cdots,\frac{\partial f(X_k)}{\partial x^n}\right]^T$$
is called the gradient (vector) of the $n$-variable function $f(X)$ at the point $X_k$, and
$$H(X_k)= \begin{bmatrix} \frac{\partial^2 f(X_k)}{\partial x^1 \partial x^1} & \frac{\partial^2 f(X_k)}{\partial x^1 \partial x^2} & \cdots & \frac{\partial^2 f(X_k)}{\partial x^1 \partial x^n} \\ \frac{\partial^2 f(X_k)}{\partial x^2 \partial x^1} & \frac{\partial^2 f(X_k)}{\partial x^2 \partial x^2} & \cdots & \frac{\partial^2 f(X_k)}{\partial x^2 \partial x^n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial^2 f(X_k)}{\partial x^n \partial x^1} & \frac{\partial^2 f(X_k)}{\partial x^n \partial x^2} & \cdots & \frac{\partial^2 f(X_k)}{\partial x^n \partial x^n} \end{bmatrix}$$
is the matrix of second-order partial derivatives of $f(X)$ at the point $X_k$ (the Hessian matrix).
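The matrix form can be evaluated directly once the gradient and the matrix of second-order partial derivatives are known. The following sketch assumes a hypothetical two-variable function with hand-derived `gradient` and `hessian` helpers. All of these names are illustrative, and plain Python lists stand in for vectors and matrices.

```python
import math


def f(X):
    # Hypothetical example function: f(x1, x2) = exp(x1) + x1 * x2^2
    x1, x2 = X
    return math.exp(x1) + x1 * x2 ** 2


def gradient(X):
    # Hand-derived first-order partial derivatives of the example function.
    x1, x2 = X
    return [math.exp(x1) + x2 ** 2, 2.0 * x1 * x2]


def hessian(X):
    # Hand-derived second-order partial derivatives of the example function.
    x1, x2 = X
    return [[math.exp(x1), 2.0 * x2],
            [2.0 * x2, 2.0 * x1]]


def quadratic_model(X, X_k):
    """f(X_k) + grad^T (X - X_k) + 1/2 (X - X_k)^T H (X - X_k)."""
    d = [a - b for a, b in zip(X, X_k)]
    g = gradient(X_k)
    H = hessian(X_k)
    linear = sum(gi * di for gi, di in zip(g, d))
    quad = sum(d[i] * H[i][j] * d[j]
               for i in range(len(d)) for j in range(len(d)))
    return f(X_k) + linear + 0.5 * quad


X_k = [0.0, 1.0]
X = [0.1, 1.1]
# The two printed values should agree to roughly two decimal places;
# the gap is the remainder term that the second-order model drops.
print(quadratic_model(X, X_k), f(X))
```

The model is exact at `X_k` itself and loses accuracy as `X` moves away from it, in line with the remainder term in the expansion.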
2.1 Jacobian matrix
2.2 Hessian matrix
2.3 Taylor series expansion with a vector variable
Published by: 全栈程序员-站长. Please credit the source when reposting: https://javaforall.net/203336.html. Original link: https://javaforall.net
