Cambridge Maths Academy
A classification of critical points with the Hessian matrix
2022. 4. 11. 00:25
For a function of two variables, f = f(x,y), we find the critical points by considering the two-dimensional gradient and classify them using the second-order derivatives. (In 1D, critical points are usually called stationary points.)
(i) Critical points: \begin{align} \nabla \textrm f = \left( \frac{ \partial \textrm f }{ \partial x }, \frac{ \partial \textrm f }{ \partial y } \right) = 0 \end{align}
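As a quick aside (not part of the original derivation), the condition \nabla \textrm f = 0 can be solved symbolically. The function f = x^3 - 3x + y^2 below is a hypothetical example chosen purely for illustration:

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**3 - 3*x + y**2  # hypothetical example function

# Critical points: solve grad f = 0 componentwise
grad = [sp.diff(f, v) for v in (x, y)]
crit = sp.solve(grad, (x, y), dict=True)
print(crit)  # the two critical points (x, y) = (-1, 0) and (1, 0)
```

Here 3x^2 - 3 = 0 and 2y = 0 give the critical points (\pm 1, 0), matching the symbolic solver.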
(ii) A classification: We consider a 2-dimensional Taylor expansion \begin{align} \textrm f( \textbf x + \Delta \textbf x ) &= \textrm f( \textbf x ) + \left( \frac{ \partial \textrm f }{ \partial x } \Delta x + \frac{ \partial \textrm f }{ \partial y } \Delta y \right) + \frac12 \left[ \frac{ \partial^2 \textrm f }{ \partial x^2 } ( \Delta x )^2 + 2 \frac{ \partial^2 \textrm f }{ \partial x \partial y } \Delta x \Delta y + \frac{ \partial^2 \textrm f }{ \partial y^2 } ( \Delta y )^2 \right] + \cdots \\ &= \textrm f( \textbf x ) + \underbrace{ \begin{pmatrix} \frac{ \partial \textrm f }{ \partial x } & \frac{ \partial \textrm f }{ \partial y } \end{pmatrix} }_{ \nabla \textrm f } \underbrace{ \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} }_{ \Delta \textbf x } + \frac12 \underbrace{ \begin{pmatrix} \Delta x & \Delta y \end{pmatrix} }_{ \Delta \textbf x^\intercal } H \underbrace{ \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} }_{ \Delta \textbf x } + \cdots \end{align} where the Hessian matrix H is defined by \begin{align} H = \begin{pmatrix} \frac{ \partial^2 \textrm f }{ \partial x^2 } & \frac{ \partial^2 \textrm f }{ \partial x \partial y } \\ \frac{ \partial^2 \textrm f }{ \partial y \partial x } & \frac{ \partial^2 \textrm f }{ \partial y^2 } \end{pmatrix} \end{align}
Aside. A multi-dimensional Taylor expansion reads \begin{align} \textrm f( \textbf x + \Delta \textbf x) &= \textrm f( \textbf x ) + \sum_{i=1}^n \frac{ \partial \textrm f }{ \partial x_i } \Delta x_i + \frac{1}{2!} \sum_{i,j} \frac{ \partial^2 \textrm f }{ \partial x_i \partial x_j } \Delta x_i \Delta x_j + \frac{1}{3!} \sum_{i,j,k} \frac{ \partial^3 \textrm f }{ \partial x_i \partial x_j \partial x_k } \Delta x_i \Delta x_j \Delta x_k + \cdots \end{align}
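A minimal numerical sanity check of the second-order truncation above (the test function \sin x \, e^y, the expansion point, and the step are arbitrary choices, not from the post):

```python
import numpy as np

def f(v):
    x, y = v
    return np.sin(x) * np.exp(y)  # hypothetical smooth test function

def grad(v):
    x, y = v
    return np.array([np.cos(x) * np.exp(y), np.sin(x) * np.exp(y)])

def hessian(v):
    # Hessian of sin(x) e^y: f_xx = -sin(x) e^y, f_xy = cos(x) e^y, f_yy = sin(x) e^y
    x, y = v
    return np.array([[-np.sin(x) * np.exp(y), np.cos(x) * np.exp(y)],
                     [ np.cos(x) * np.exp(y), np.sin(x) * np.exp(y)]])

x0 = np.array([0.4, -0.2])
dx = np.array([1e-3, -2e-3])

# f(x0 + dx) ~ f(x0) + grad.dx + (1/2) dx^T H dx, with an O(|dx|^3) remainder
taylor2 = f(x0) + grad(x0) @ dx + 0.5 * dx @ hessian(x0) @ dx
err = abs(f(x0 + dx) - taylor2)
print(err)  # error is O(|dx|^3), far smaller than |dx|^2
```

Halving the step dx should reduce `err` by roughly a factor of eight, confirming the cubic remainder.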
(iii) Diagonalisation: We diagonalise the Hessian matrix using the eigenvalue equation: \begin{align} && H \begin{pmatrix} \textbf e_1 & \textbf e_2 \end{pmatrix} = \underbrace{ \begin{pmatrix} \textbf e_1 & \textbf e_2 \end{pmatrix} }_{ P^{-1} } \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} \\ \\ &\Rightarrow& H = \begin{pmatrix} \textbf e_1 & \textbf e_2 \end{pmatrix} \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} \begin{pmatrix} \textbf e_1 & \textbf e_2 \end{pmatrix}^{-1} = P^{-1} \Lambda P \end{align} Since H is real symmetric, the eigenvectors form an orthonormal basis and P is thus an orthogonal matrix, i.e. P^\intercal P = \mathbb I \qquad \Leftrightarrow \qquad P^{-1} = P^\intercal which gives H = P^\intercal \Lambda P
The Taylor expansion may be re-written as \begin{align} \textrm f( \textbf x + \Delta \textbf x) &= \textrm f( \textbf x ) + \Delta \textbf x \cdot \nabla \textrm f + \frac12 ( P \Delta \textbf x)^\intercal \Lambda ( P \Delta \textbf x ) + \cdots \end{align} For critical points \textbf x_0, where \nabla \textrm f = 0, this gives \begin{align} \textrm f( \textbf x_0 + \Delta \textbf x) &= \textrm f( \textbf x_0 ) + \frac12 ( P \Delta \textbf x)^\intercal \Lambda ( P \Delta \textbf x ) + \cdots \\ &= \textrm f( \textbf x_0 ) + \frac12 ( P \Delta \textbf x)^\intercal \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} ( P \Delta \textbf x ) + \cdots \end{align} Thus, the eigenvalues of H tell us about the nature of the critical point(s), i.e.
- If \lambda_1 > 0 and \lambda_2 > 0 , \textrm f( \textbf x_0 + \Delta \textbf x) > \textrm f ( \textbf x_0 ) for all sufficiently small \Delta \textbf x , hence the critical point is a local minimum.
- If \lambda_1 < 0 and \lambda_2 < 0 , \textrm f( \textbf x_0 + \Delta \textbf x) < \textrm f ( \textbf x_0 ) for all sufficiently small \Delta \textbf x , hence the critical point is a local maximum.
- If \lambda_1 \lambda_2 < 0 , i.e. the two eigenvalues take opposite signs, \textrm f( \textbf x_0 + \Delta \textbf x) > \textrm f ( \textbf x_0 ) along one eigenvector direction while \textrm f( \textbf x_0 + \Delta \textbf x) < \textrm f ( \textbf x_0 ) along the orthogonal one, hence the critical point is a saddle point.
- If \lambda_1 \lambda_2 = 0 , i.e. at least one of them is zero, then the critical point is degenerate: to second order the surface is flat in the relevant direction, namely the direction of the eigenvector whose eigenvalue is zero.
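The four cases above can be sketched as a small numerical helper (a sketch assuming `numpy`; the function name `classify` and the tolerance are my own choices, not from the post):

```python
import numpy as np

def classify(H, tol=1e-12):
    """Classify a critical point from the eigenvalues of its symmetric Hessian."""
    lam = np.linalg.eigvalsh(H)  # real eigenvalues of a symmetric matrix, ascending
    if np.any(np.abs(lam) < tol):
        return 'degenerate'      # at least one eigenvalue is (numerically) zero
    if np.all(lam > 0):
        return 'local minimum'   # both eigenvalues positive
    if np.all(lam < 0):
        return 'local maximum'   # both eigenvalues negative
    return 'saddle point'        # eigenvalues of opposite signs

print(classify(np.array([[2.0, 0.0], [0.0, 2.0]])))   # local minimum
print(classify(np.array([[2.0, 0.0], [0.0, -2.0]])))  # saddle point
```

`eigvalsh` is used rather than `eigvals` because it exploits (and requires) the symmetry of H, returning guaranteed-real eigenvalues.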
(iv) Eigenvalues: The eigenvalues of the Hessian matrix are given by \begin{align} && \begin{vmatrix} \textrm f_{xx} - \lambda & \textrm f_{xy} \\ \textrm f_{xy} & \textrm f_{yy} - \lambda \end{vmatrix} = 0 \\ \\ & \Rightarrow & ( \textrm f_{xx} - \lambda )( \textrm f_{yy} - \lambda ) - \textrm f_{xy}^2 = 0 \\ & \Rightarrow & \lambda^2 - ( \textrm f_{xx} + \textrm f_{yy} ) \lambda + \left( \textrm f_{xx} \textrm f_{yy} - \textrm f_{xy}^2 \right) = 0 \\ & \Rightarrow & \lambda^2 - ( \textrm{tr} \, H ) \lambda + \det H = 0 \\ \\ & \Rightarrow & \lambda = \frac{ \textrm{tr} \, H \pm \sqrt{ ( \textrm{tr} \, H )^2 - 4 \det H } }{ 2 } \end{align} which gives \begin{align} & \Rightarrow & \lambda &= \frac{ ( \textrm f_{xx} + \textrm f_{yy} ) \pm \sqrt{ ( \textrm f_{xx} + \textrm f_{yy} )^2 - 4 \left( \textrm f_{xx} \textrm f_{yy} - \textrm f_{xy}^2 \right) } }{ 2 } \\ &&&= \frac{ ( \textrm f_{xx} + \textrm f_{yy} ) \pm \sqrt{ \left( \textrm f_{xx}^2 + 2 \textrm f_{xx} \textrm f_{yy} + \textrm f_{yy}^2 \right) - 4 \left( \textrm f_{xx} \textrm f_{yy} - \textrm f_{xy}^2 \right) } }{ 2 } \\ &&&= \frac{ ( \textrm f_{xx} + \textrm f_{yy} ) \pm \sqrt{ \left( \textrm f_{xx}^2 - 2 \textrm f_{xx} \textrm f_{yy} + \textrm f_{yy}^2 \right) + 4 \textrm f_{xy}^2 } }{ 2 } \\ &&&= \frac{ ( \textrm f_{xx} + \textrm f_{yy} ) \pm \sqrt{ ( \textrm f_{xx} - \textrm f_{yy} )^2 + 4 \textrm f_{xy}^2 } }{ 2 } \end{align} So we see that the discriminant is always non-negative, i.e. \Delta = ( \textrm{tr} \, H )^2 - 4 \det H = ( \textrm f_{xx} - \textrm f_{yy} )^2 + 4 \textrm f_{xy}^2 \ge 0
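The closed-form eigenvalues can be checked against a direct numerical diagonalisation (the values of f_xx, f_xy, f_yy below are arbitrary):

```python
import numpy as np

fxx, fxy, fyy = 3.0, 1.0, -2.0  # arbitrary second derivatives at a critical point
H = np.array([[fxx, fxy], [fxy, fyy]])

# Closed form: lambda = (tr H +- sqrt((f_xx - f_yy)^2 + 4 f_xy^2)) / 2
disc = np.sqrt((fxx - fyy)**2 + 4 * fxy**2)
lam_formula = np.array([(fxx + fyy - disc) / 2, (fxx + fyy + disc) / 2])

lam_numeric = np.linalg.eigvalsh(H)  # ascending order, matching the formula
print(np.allclose(lam_formula, lam_numeric))  # True
```

Since the square root is non-negative, the formula already lists the eigenvalues in ascending order, matching the convention of `eigvalsh`.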
(1) For \Delta = 0, i.e. ( \textrm{tr} \, H )^2 = 4 \det H \qquad \Leftrightarrow \qquad \textrm f_{xx} = \textrm f_{yy} \quad \textrm{and} \quad \textrm f_{xy} = 0 the eigenvalue is a repeated root, \lambda = \frac{ \textrm{tr} \, H }{ 2 } = \frac{ \textrm f_{xx} + \textrm f_{yy} }{ 2 } and \det H = \textrm f_{xx}^2 \ge 0.
- For \textrm{tr} \, H > 0, it gives a local minimum.
- For \textrm{tr} \, H < 0, it gives a local maximum.
- For \textrm{tr} \, H = 0, we also find \det H = 0 and the critical point is degenerate.
(2) For \Delta > 0, there are two distinct eigenvalues.
- If \det H = \textrm f_{xx} \textrm f_{yy} - \textrm f_{xy}^2 < 0, the two eigenvalues are of opposite signs. The critical point is a saddle point.
- If \det H = \textrm f_{xx} \textrm f_{yy} - \textrm f_{xy}^2 = 0, one of the eigenvalues is zero: \lambda_1 = 0 \quad \textrm{and} \quad \lambda_2 = \textrm{tr} \, H The critical point is degenerate as the surface is flat (to second order) in the relevant direction, namely the direction of the eigenvector whose eigenvalue is zero.
- If \det H = \textrm f_{xx} \textrm f_{yy} - \textrm f_{xy}^2 > 0, then:
(i) if \textrm{tr} \, H = \textrm f_{xx} + \textrm f_{yy} > 0 , the critical point is a local minimum;
(ii) if \textrm{tr} \, H = \textrm f_{xx} + \textrm f_{yy} < 0 , the critical point is a local maximum.
- If \textrm{tr} \, H = \textrm f_{xx} + \textrm f_{yy} = 0 , then \lambda = \pm \sqrt{ - \det H } and, using \textrm f_{yy} = - \textrm f_{xx}, \begin{align} \det H &= \textrm f_{xx} \textrm f_{yy} - \textrm f_{xy}^2 \\ &= - \textrm f_{xx}^2 - \textrm f_{xy}^2 \le 0 \end{align}
(i) if \det H = 0 , i.e. \textrm f_{xx} = \textrm f_{yy} = \textrm f_{xy} = 0 , the two eigenvalues are both zero so the critical point is degenerate;
(ii) if \det H < 0 , the two eigenvalues take the opposite signs so the critical point is a saddle point.
The results can be summarised: \begin{align} \begin{array}{|c|c|c|c|} \hline & \det H = \textrm f_{xx} \textrm f_{yy} - \textrm f_{xy}^2 > 0 & \det H = \textrm f_{xx} \textrm f_{yy} - \textrm f_{xy}^2 = 0 & \det H = \textrm f_{xx} \textrm f_{yy} - \textrm f_{xy}^2 < 0 \\\hline \textrm{tr} \, H = \textrm f_{xx} + \textrm f_{yy} > 0 & \textrm{A local minimum} & \textrm{Degenerate} & \textrm{A saddle point} \\\hline \textrm{tr} \, H = \textrm f_{xx} + \textrm f_{yy} < 0 & \textrm{A local maximum} & \textrm{Degenerate} & \textrm{A saddle point} \\\hline \textrm{tr} \, H = \textrm f_{xx} + \textrm f_{yy} = 0 & \textrm{Not possible} & \textrm{Degenerate} & \textrm{A saddle point} \\\hline \end{array} \end{align}
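The summary table amounts to a classifier driven only by \textrm{tr} \, H and \det H (a sketch; the function name `classify_2d` and the tolerance are my own choices):

```python
def classify_2d(fxx, fxy, fyy, tol=1e-12):
    """Classify a 2D critical point using only tr H and det H."""
    tr = fxx + fyy
    det = fxx * fyy - fxy**2
    if det > tol:
        # det H > 0 forces both eigenvalues to share the sign of tr H
        # (tr H = 0 with det H > 0 is not possible for a symmetric H)
        return 'local minimum' if tr > 0 else 'local maximum'
    if det < -tol:
        return 'saddle point'   # eigenvalues of opposite signs
    return 'degenerate'         # det H = 0: at least one zero eigenvalue

# f(x,y) = x^2 + y^2 at the origin: fxx = fyy = 2, fxy = 0
print(classify_2d(2.0, 0.0, 2.0))   # local minimum
# f(x,y) = x*y at the origin: fxx = fyy = 0, fxy = 1
print(classify_2d(0.0, 1.0, 0.0))   # saddle point
```

Note this avoids computing eigenvalues at all: for a 2x2 symmetric matrix, the signs of the trace and determinant already determine the signs of the eigenvalues.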