This is a learning note related to linear algebra.

  1. Scalars

    Most everyday mathematics consists of manipulating numbers one at a time. Formally, we call these values scalars. Scalars are denoted by ordinary lower-cased letters (e.g., $x$, $y$, and $z$)

  2. Vectors

    Vectors are fixed-length arrays of scalars and denoted by bold lowercase letters, (e.g., $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$). To indicate that a vector contains $n$ elements, we write $\mathbf{x}\in \mathbb{R}^n$. Formally, we call $n$ the dimensionality of the vector.

\[\nonumber\begin{split}\mathbf{x} =\begin{bmatrix}x_{1} \\ \vdots \\x_{n}\end{bmatrix},\end{split}\]
  1. Matrices

    Metrices are usually denoted by bold capital letters (e.g., $\mathbf{X}$, $\mathbf{Y}$, and $\mathbf{Z}$). The expression $\mathbf{X}\in \mathbb{R}^{m\times n}$ indicates that a matrix $\mathbf{X}$ contains $m\times n$ real-valued scalars, arranged as $m$ rows and $n$ columns. We say the matrix is square when $m=n$. Symmetric matrices are the subset of square matrices that are equal to their own transposes: $\mathbf{X} = \mathbf{X}^\text{T}$

  2. Basic arithmetic operations

    • Hadamard product (denoted $\odot$): elementwise product of two matrices of the same size

      \[\nonumber\begin{split}\mathbf{A} \odot \mathbf{B} = \begin{bmatrix} a_{11} b_{11} & a_{12} b_{12} & \dots & a_{1n} b_{1n} \\ a_{21} b_{21} & a_{22} b_{22} & \dots & a_{2n} b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} b_{m1} & a_{m2} b_{m2} & \dots & a_{mn} b_{mn} \end{bmatrix}.\end{split}\]
    • Dot product: given two vectors $\mathbf{x}, \mathbf{y} \in \mathbb{R}^d$, their dot product $\mathbf{x}^\top \mathbf{y}$ is a sum over the products of the elements at the same position \(\nonumber\mathbf{x}^\top \mathbf{y} = \sum_{i=1}^{d} x_i y_i\)

    • matrix-matrix multiplication: \(\nonumber\begin{split}\mathbf{A}=\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nk} \\ \end{bmatrix},\quad \mathbf{B}=\begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1m} \\ b_{21} & b_{22} & \cdots & b_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ b_{k1} & b_{k2} & \cdots & b_{km} \\ \end{bmatrix}.\end{split}\\ \begin{split}\mathbf{C} = \mathbf{AB} = \begin{bmatrix} \mathbf{a}^\top_{1} \\ \mathbf{a}^\top_{2} \\ \vdots \\ \mathbf{a}^\top_n \\ \end{bmatrix} \begin{bmatrix} \mathbf{b}_{1} & \mathbf{b}_{2} & \cdots & \mathbf{b}_{m} \\ \end{bmatrix} = \begin{bmatrix} \mathbf{a}^\top_{1} \mathbf{b}_1 & \mathbf{a}^\top_{1}\mathbf{b}_2& \cdots & \mathbf{a}^\top_{1} \mathbf{b}_m \\ \mathbf{a}^\top_{2}\mathbf{b}_1 & \mathbf{a}^\top_{2} \mathbf{b}_2 & \cdots & \mathbf{a}^\top_{2} \mathbf{b}_m \\ \vdots & \vdots & \ddots &\vdots\\ \mathbf{a}^\top_{n} \mathbf{b}_1 & \mathbf{a}^\top_{n}\mathbf{b}_2& \cdots& \mathbf{a}^\top_{n} \mathbf{b}_m \end{bmatrix}.\end{split}\)

    • Norm:The norm is actually a function that converts vectors or matrices that cannot be compared into real numbers that can be compared. In general, the norm is used to measure distance. It includes vector norm and matrix norm.

      • vector norm
        • $\ell_1$ norm (Manhattan distance): $|\mathbf{x}|1 = \sum{i=1}^n \left x_i \right $
        • $\ell_2$ norm (Euclidean length): $|\mathbf{x}|2 = \sqrt{\sum{i=1}^n x_i^2}$
        • $\ell_p$ norm: $|\mathbf{x}|p = \left(\sum{i=1}^n \left x_i \right ^p \right)^{1/p}$
      • matrix norm
        • Frobenius norm: $|\mathbf{X}|F = \sqrt{\sum{i=1}^m \sum_{j=1}^n x_{ij}^2}$

Reference:

Dive Into Deep Learning

带你秒懂向量与矩阵的范数(Norm)