Determinants: The Volume Factor of Linear Maps

Linear Algebra Series 4 / 13

Introduction

The determinant is a single number that encodes key information about a square matrix: whether it is invertible, how it scales volumes, and whether it preserves or reverses orientation. While you rarely compute determinants by hand in ML practice, understanding their geometric meaning builds intuition for Jacobians, change of variables in probability, and the behavior of linear transformations.

This article builds on systems of linear equations and prepares the ground for linear transformations.

Definition for 2x2 and 3x3

For a $2 \times 2$ matrix:

\det\begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc

For a $3 \times 3$ matrix, expand along the first row (cofactor expansion):

\det\begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} = a(ei - fh) - b(di - fg) + c(dh - eg)

The pattern alternates signs: $+a \cdot M_{11} - b \cdot M_{12} + c \cdot M_{13}$, where $M_{ij}$ is the minor, that is, the determinant of the submatrix obtained by deleting row $i$ and column $j$.
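The two formulas above translate directly into code. The following is a minimal sketch; the helper names `det2` and `det3` and the sample matrices are illustrative choices, not part of any library.

```python
import numpy as np

def det2(m):
    """Determinant of a 2x2 matrix: ad - bc."""
    (a, b), (c, d) = m
    return a * d - b * c

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

M2 = [[3, 1],
      [2, 4]]
M3 = [[1, 2, 3],
      [0, 1, 4],
      [5, 6, 0]]

print(det2(M2))  # 3*4 - 1*2 = 10
print(det3(M3))  # 1*(0 - 24) - 2*(0 - 20) + 3*(0 - 5) = 1
# Cross-check against NumPy (floating point, so compare with a tolerance).
print(np.isclose(det3(M3), np.linalg.det(np.array(M3, dtype=float))))
```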

The Geometric Meaning

The determinant has a beautiful geometric interpretation:

Geometric interpretation: $|\det(\mathbf{A})|$ is the factor by which $\mathbf{A}$ scales volumes. The sign indicates whether $\mathbf{A}$ preserves ($+$) or reverses ($-$) orientation.

Consider the columns of $\mathbf{A}$ as vectors. In 2D, $|\det(\mathbf{A})|$ is the area of the parallelogram spanned by these vectors. In 3D, it is the volume of the parallelepiped.

$\det(\mathbf{A})$	Geometric Meaning
$\det(\mathbf{A}) > 0$	Preserves orientation, scales volume by $\det(\mathbf{A})$
$\det(\mathbf{A}) < 0$	Reverses orientation, scales volume by $|\det(\mathbf{A})|$
$\det(\mathbf{A}) = 0$	Collapses space to lower dimension (not invertible)
$\det(\mathbf{A}) = 1$	Preserves volume exactly (e.g., rotations)

Example: A $2 \times 2$ matrix with $\det = 0$ maps all of $\mathbb{R}^2$ onto a line (or a point). The parallelogram collapses to zero area. This is why $\det = 0$ means the matrix is singular — it loses information.
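To make the area claim concrete, one can map the corners of the unit square through a matrix and measure the area of the image. This is a small sketch: the helper `polygon_area` (shoelace formula) and the matrices `A` and `S` are made up for illustration.

```python
import numpy as np

def polygon_area(vertices):
    """Unsigned area of a simple polygon via the shoelace formula."""
    x, y = vertices[:, 0], vertices[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

# Corners of the unit square (area 1), in counterclockwise order.
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)

A = np.array([[2.0, 1.0],
              [0.5, 3.0]])   # invertible: det = 2*3 - 1*0.5 = 5.5
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])   # singular: the second row is twice the first

for name, M in [("A", A), ("S", S)]:
    image = square @ M.T     # apply M to each corner (rows are points)
    print(name, "image area:", polygon_area(image), "|det|:", abs(np.linalg.det(M)))
```

For `A` the image is a parallelogram of area 5.5; for `S` all four corners land on one line, so the area is zero.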

Properties of Determinants

These properties make determinants powerful tools for reasoning about matrices:

Multiplicativity

\det(\mathbf{A}\mathbf{B}) = \det(\mathbf{A}) \cdot \det(\mathbf{B})

This is the most important property. Composing two transformations multiplies their volume scaling factors.

Transpose

\det(\mathbf{A}^T) = \det(\mathbf{A})

Row operations and column operations affect the determinant identically.

Inverse

\det(\mathbf{A}^{-1}) = \frac{1}{\det(\mathbf{A})}

This follows from $\det(\mathbf{A}\mathbf{A}^{-1}) = \det(\mathbf{I}) = 1$.

Scalar Multiplication

\det(c\mathbf{A}) = c^n \det(\mathbf{A})

where $\mathbf{A}$ is $n \times n$. Each of the $n$ rows is scaled by $c$, and each row scaling multiplies the determinant by $c$.
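The four identities above (multiplicativity, transpose, inverse, scalar multiplication) can be spot-checked numerically. The sketch below uses random matrices with an arbitrarily chosen seed; a passing check is a sanity test, not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
c = 2.5
det = np.linalg.det

# Each entry compares the two sides of one identity up to floating-point tolerance.
checks = {
    "det(AB) = det(A) det(B)": np.isclose(det(A @ B), det(A) * det(B)),
    "det(A^T) = det(A)": np.isclose(det(A.T), det(A)),
    "det(A^-1) = 1 / det(A)": np.isclose(det(np.linalg.inv(A)), 1.0 / det(A)),
    "det(cA) = c^n det(A)": np.isclose(det(c * A), c**n * det(A)),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```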

Row Operations

Operation	Effect on $\det$
Swap two rows	Multiplies $\det$ by $-1$
Multiply a row by $c$	Multiplies $\det$ by $c$
Add a multiple of one row to another	Does not change $\det$

This is why Gaussian elimination is an efficient way to compute determinants: reduce to triangular form (tracking sign flips from row swaps), then multiply the diagonal entries.
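The row-operation rules can be turned into a short determinant routine. The following is a minimal sketch assuming partial pivoting (choosing the largest available pivot in each column); the function name `det_by_elimination` and the test matrix are illustrative.

```python
import numpy as np

def det_by_elimination(A):
    """Determinant via Gaussian elimination with partial pivoting."""
    U = np.array(A, dtype=float)   # work on a float copy
    n = U.shape[0]
    sign = 1.0
    for k in range(n):
        # Choose the largest remaining entry in column k as the pivot.
        p = k + int(np.argmax(np.abs(U[k:, k])))
        if U[p, k] == 0.0:
            return 0.0             # no pivot available: the matrix is singular
        if p != k:
            U[[k, p]] = U[[p, k]]  # a row swap multiplies the determinant by -1
            sign = -sign
        # Adding multiples of the pivot row leaves the determinant unchanged.
        factors = U[k + 1:, k] / U[k, k]
        U[k + 1:, k:] -= np.outer(factors, U[k, k:])
    # U is now upper triangular: multiply the diagonal and apply the sign.
    return sign * float(np.prod(np.diag(U)))

A = [[0, 2, 1],
     [1, 1, 0],
     [3, 0, 2]]
print(det_by_elimination(A))                      # close to -7
print(np.linalg.det(np.array(A, dtype=float)))    # NumPy's result for comparison
```

The zero in the top-left corner forces a row swap, which is why the sign bookkeeping matters here.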

Triangular Matrices

For upper or lower triangular matrices, the determinant is the product of diagonal entries:

\det(\mathbf{U}) = u_{11} \cdot u_{22} \cdot \ldots \cdot u_{nn}

This makes the identity matrix’s determinant obvious: $\det(\mathbf{I}) = 1$.

Computing Determinants

Cofactor Expansion

For an $n \times n$ matrix, expand along row $i$:

\det(\mathbf{A}) = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} M_{ij}

where $M_{ij}$ is the minor. The term $(-1)^{i+j} M_{ij}$ is called the cofactor $C_{ij}$.

Cofactor expansion has complexity $O(n!)$, which is impractical for large matrices. It is primarily a theoretical tool.
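For illustration, here is a recursive sketch of cofactor expansion along the first row. The function name `det_cofactor` is hypothetical, and the code is meant to mirror the formula rather than to be used on matrices of any real size. Note that with 0-based indexing the sign for row 0, column `j` is `(-1) ** j`.

```python
import numpy as np

def det_cofactor(A):
    """Determinant by recursive cofactor expansion along the first row."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    if n == 1:
        return float(A[0, 0])
    total = 0.0
    for j in range(n):
        # Minor: delete row 0 and column j, then recurse on the smaller matrix.
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += (-1) ** j * A[0, j] * det_cofactor(minor)
    return total

M = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 4.0],
              [5.0, 6.0, 0.0]])
print(det_cofactor(M))       # 1.0
print(np.linalg.det(M))      # agrees up to floating-point rounding
```

Each call on an $n \times n$ matrix spawns $n$ calls on $(n-1) \times (n-1)$ matrices, which is where the factorial growth comes from.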

LU Decomposition

The practical method: factor $\mathbf{A} = \mathbf{P}\mathbf{L}\mathbf{U}$ (with row permutation matrix $\mathbf{P}$), then:

\det(\mathbf{A}) = \det(\mathbf{P}) \cdot \det(\mathbf{L}) \cdot \det(\mathbf{U}) = (-1)^s \cdot 1 \cdot \prod_{i=1}^{n} u_{ii}

where $s$ is the number of row swaps and $\det(\mathbf{L}) = 1$ because $\mathbf{L}$ has ones on its diagonal. This costs $O(n^3)$.

import numpy as np

A = np.array([[2, 1, 3],
              [4, 5, 6],
              [7, 8, 9]])

det = np.linalg.det(A)
print(f"det(A) = {det:.4f}")  # det(A) = -9.0000

Cramer’s Rule

For a system $\mathbf{A}\mathbf{x} = \mathbf{b}$ with $\det(\mathbf{A}) \neq 0$, each component of the solution is:

x_j = \frac{\det(\mathbf{A}_j)}{\det(\mathbf{A})}

where $\mathbf{A}_j$ is $\mathbf{A}$ with column $j$ replaced by $\mathbf{b}$.

Warning: Cramer’s rule is elegant but computationally expensive: it needs $n + 1$ determinants, which costs $O(n \cdot n!)$ with cofactor expansion. It is useful for theoretical analysis and for hand calculation on very small systems, but not for practical computation at scale. Use LU decomposition or iterative methods instead.
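As a sketch of the rule on a tiny system, the code below replaces one column at a time and compares the result with `np.linalg.solve`. The function name `cramer_solve` and the example system are illustrative assumptions.

```python
import numpy as np

def cramer_solve(A, b):
    """Solve Ax = b with Cramer's rule (illustrative; prefer np.linalg.solve)."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    det_A = np.linalg.det(A)
    if np.isclose(det_A, 0.0):
        raise ValueError("A is singular (or nearly so); Cramer's rule does not apply")
    x = np.empty(len(b))
    for j in range(len(b)):
        A_j = A.copy()
        A_j[:, j] = b                      # replace column j with b
        x[j] = np.linalg.det(A_j) / det_A
    return x

# 2x + y = 3 and x + 3y = 5, whose exact solution is x = 0.8, y = 1.4.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
print(cramer_solve(A, b))
print(np.linalg.solve(A, b))               # same solution via LU-based solver
```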

Worked Example

Compute $\det(\mathbf{A})$ where:

\mathbf{A} = \begin{bmatrix} 1 & 3 & 2 \\ 0 & 4 & 1 \\ 2 & 6 & 5 \end{bmatrix}

Method 1: Cofactor expansion along column 1 (chosen because it has a zero):

\begin{aligned} \det(\mathbf{A}) &= 1 \cdot \det\begin{bmatrix} 4 & 1 \\ 6 & 5 \end{bmatrix} - 0 \cdot \det\begin{bmatrix} 3 & 2 \\ 6 & 5 \end{bmatrix} + 2 \cdot \det\begin{bmatrix} 3 & 2 \\ 4 & 1 \end{bmatrix} \\[6pt] &= 1 \cdot (20 - 6) - 0 + 2 \cdot (3 - 8) \\[6pt] &= 14 - 10 = 4 \end{aligned}

Method 2: Row reduction

\begin{bmatrix} 1 & 3 & 2 \\ 0 & 4 & 1 \\ 2 & 6 & 5 \end{bmatrix} \xrightarrow{R_3 - 2R_1} \begin{bmatrix} 1 & 3 & 2 \\ 0 & 4 & 1 \\ 0 & 0 & 1 \end{bmatrix}

No row swaps, so $\det(\mathbf{A}) = 1 \cdot 4 \cdot 1 = 4$. Same answer.
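The row-reduction step of the worked example can be replayed in NumPy as a quick check. This is only a verification sketch; the variable name `U` for the reduced matrix is an arbitrary choice.

```python
import numpy as np

A = np.array([[1.0, 3.0, 2.0],
              [0.0, 4.0, 1.0],
              [2.0, 6.0, 5.0]])

# Method 2 by hand: R3 <- R3 - 2*R1 gives an upper triangular matrix.
U = A.copy()
U[2] -= 2 * U[0]

print(U)
print(np.prod(np.diag(U)))   # product of the diagonal: 1 * 4 * 1
print(np.linalg.det(A))      # agrees up to floating-point rounding
```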

The Adjugate and Matrix Inverse

The adjugate (or classical adjoint) of $\mathbf{A}$ is the transpose of the cofactor matrix:

\text{adj}(\mathbf{A}) = \mathbf{C}^T

The inverse can be expressed as:

\mathbf{A}^{-1} = \frac{1}{\det(\mathbf{A})} \text{adj}(\mathbf{A})

This formula is mainly theoretical. For a $2 \times 2$ matrix, it gives a compact formula:

\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}
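The adjugate formula can be checked numerically on the matrix from the worked example. The sketch below builds the cofactor matrix entry by entry; the function name `adjugate` is hypothetical, and the approach is for illustration only since it is far slower than `np.linalg.inv`.

```python
import numpy as np

def adjugate(A):
    """Adjugate of A: the transpose of the cofactor matrix."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            # Minor M_ij: delete row i and column j, then take the determinant.
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T

A = np.array([[1.0, 3.0, 2.0],
              [0.0, 4.0, 1.0],
              [2.0, 6.0, 5.0]])
A_inv = adjugate(A) / np.linalg.det(A)

print(np.allclose(A_inv, np.linalg.inv(A)))
# The identity A adj(A) = det(A) I holds even before dividing by det(A).
print(np.allclose(A @ adjugate(A), np.linalg.det(A) * np.eye(3)))
```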

Determinants and Eigenvalues

A deep connection we will explore in the eigenvalues article:

\det(\mathbf{A}) = \prod_{i=1}^{n} \lambda_i

The determinant equals the product of all eigenvalues, counted with algebraic multiplicity and including complex ones. Together with the trace, which equals the sum of the eigenvalues, it gives two scalar summaries of a matrix's spectrum.
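A quick numerical check of both identities, using two example matrices chosen for illustration: one with real eigenvalues and one rotation whose eigenvalues are complex.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])          # eigenvalues are 2 and 5
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])         # 90-degree rotation: eigenvalues are +i and -i

for name, M in [("A", A), ("R", R)]:
    eigvals = np.linalg.eigvals(M)
    print(name, "product of eigenvalues:", np.prod(eigvals), "det:", np.linalg.det(M))
    print(name, "sum of eigenvalues:", np.sum(eigvals), "trace:", np.trace(M))
```

For the rotation, the eigenvalues are complex but their product and sum are still real (up to rounding), matching the real determinant and trace.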

Why This Matters for ML

While you rarely compute determinants directly in ML code, the concept appears in several important contexts:

Invertibility check: $\det(\mathbf{A}) = 0$ means the matrix is singular, so the system $\mathbf{A}\mathbf{x} = \mathbf{b}$ has either no solution or infinitely many. In regression, a singular (or nearly singular) $\mathbf{X}^T\mathbf{X}$ signals multicollinearity among the features.
Jacobian determinant: In normalizing flows (a class of generative models), the change-of-variables formula involves $|\det(\mathbf{J})|$, where $\mathbf{J}$ is the Jacobian of the transformation.
Gaussian distribution: The multivariate Gaussian PDF includes $\det(\boldsymbol{\Sigma})$ :

f(\mathbf{x}) = \frac{1}{(2\pi)^{n/2} |\det(\boldsymbol{\Sigma})|^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})^T \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu})\right)

Volume interpretation: The determinant of the covariance matrix measures the “spread” of the distribution — larger determinant means more spread.
Log-determinant: In optimization (e.g., maximum likelihood for Gaussians), we work with $\log\det(\boldsymbol{\Sigma})$ for numerical stability.
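The log-determinant point can be illustrated with `np.linalg.slogdet`, which returns the sign and the log of the absolute determinant without forming the determinant itself. The sketch below is one possible way to evaluate the Gaussian log-density; the function name `gaussian_logpdf` and the 500-dimensional example are assumptions made for illustration.

```python
import numpy as np

def gaussian_logpdf(x, mu, Sigma):
    """Log-density of a multivariate Gaussian, using slogdet for the normalizer."""
    n = len(mu)
    sign, logdet = np.linalg.slogdet(Sigma)
    if sign <= 0:
        # A necessary (not sufficient) check that Sigma is positive definite.
        raise ValueError("Sigma must have a positive determinant")
    diff = x - mu
    # Solve Sigma z = diff instead of forming the inverse explicitly.
    maha = diff @ np.linalg.solve(Sigma, diff)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + maha)

# A 500-dimensional covariance whose determinant (0.01^500) underflows in float64.
n = 500
Sigma = 0.01 * np.eye(n)
print(np.linalg.det(Sigma))          # underflows to 0.0
print(np.linalg.slogdet(Sigma))      # sign 1.0, log-determinant 500 * log(0.01)
print(gaussian_logpdf(np.zeros(n), np.zeros(n), Sigma))
```

Taking `np.log(np.linalg.det(Sigma))` here would give negative infinity, while the log-determinant itself is a perfectly ordinary number.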

Summary

The determinant is a scalar that encodes volume scaling, orientation, and invertibility.
$|\det(\mathbf{A})|$ gives the factor by which $\mathbf{A}$ scales volumes; the sign indicates orientation.
$\det(\mathbf{A}) = 0$ means $\mathbf{A}$ is singular — it collapses space and is not invertible.
$\det(\mathbf{A}\mathbf{B}) = \det(\mathbf{A}) \cdot \det(\mathbf{B})$ — composition multiplies volume factors.
The determinant equals the product of eigenvalues and can be computed efficiently via LU decomposition.
In ML, determinants appear in the multivariate Gaussian, normalizing flows, and invertibility checks.
Next, we explore how matrices act as functions in linear transformations.
