This is a learning note on the Jacobian matrix and the Hessian matrix.

  1. Jacobian matrix

    Assume $f: \mathbb{R}^n \rightarrow \mathbb{R}^m$ is a function that takes a point $\boldsymbol{x}\in\mathbb{R}^n$ as input and produces the vector $f(\boldsymbol{x})\in\mathbb{R}^m$. The function consists of $m$ real-valued component functions:

    \[\nonumber f = [f_1(x_1,\ldots,x_n),\ldots, f_m(x_1,\ldots,x_n)]\]

    The first-order partial derivatives of these functions (if they exist) form an $m \times n$ matrix called the Jacobian matrix, denoted $\boldsymbol{J}$.
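
    Written out explicitly, the Jacobian places the partial derivatives of the $i$-th component function in the $i$-th row:

    \[\nonumber \boldsymbol{J} = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_m}{\partial x_1} & \cdots & \frac{\partial f_m}{\partial x_n} \end{bmatrix}, \qquad J_{ij} = \frac{\partial f_i}{\partial x_j}\]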

  2. Hessian matrix

    The Hessian matrix consists of the second-order partial derivatives of a scalar-valued function.

    Assume $f: \mathbb{R}^n \rightarrow \mathbb{R}$ is a function that takes a point $\boldsymbol{x}\in\mathbb{R}^n$ as input and produces a scalar $f(\boldsymbol{x})\in\mathbb{R}$. The Hessian matrix $\boldsymbol{H}$ of $f$ is a square $n \times n$ matrix, usually defined as:
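
    \[\nonumber \boldsymbol{H} = \begin{bmatrix} \frac{\partial^2 f}{\partial x_1^2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \end{bmatrix}, \qquad H_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}\]

    If the second partial derivatives are continuous, the order of differentiation can be swapped (Schwarz's theorem), so the Hessian matrix is symmetric.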

  3. Summary

    • If $f(\boldsymbol{x})$ is a scalar-valued function, the Jacobian matrix reduces to a $1 \times n$ row vector, which is the gradient of $f(\boldsymbol{x})$ (written as a row), and the Hessian matrix is an ordinary two-dimensional matrix. If $f(\boldsymbol{x})$ is a vector-valued function, the Jacobian matrix is a two-dimensional matrix and the Hessian is a three-dimensional array (one Hessian matrix per component function).
    • The gradient is a special case of the Jacobian matrix, and the Jacobian matrix of the gradient is the Hessian matrix (the relationship between the first-order and second-order partial derivatives); see the sketch below.
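
    As a quick check of these relationships, here is a minimal sketch using JAX; the function f and the evaluation point are arbitrary choices for illustration. Applying jax.jacfwd to jax.grad(f) builds the Hessian as "the Jacobian of the gradient", and it agrees with JAX's built-in jax.hessian:

    import jax
    import jax.numpy as jnp

    # An arbitrary scalar function f: R^2 -> R, chosen only for illustration.
    def f(x):
        return x[0] ** 2 * x[1] + jnp.sin(x[1])

    x = jnp.array([1.0, 2.0])

    grad_f = jax.grad(f)         # gradient of a scalar function (its Jacobian, as a vector)
    hess_f = jax.jacfwd(grad_f)  # Jacobian of the gradient = the 2x2 Hessian

    print(grad_f(x))             # gradient vector at x
    print(hess_f(x))             # Hessian matrix at x
    print(jax.hessian(f)(x))     # built-in Hessian; matches jax.jacfwd(jax.grad(f))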
