Introduction
The determinant is a single number that encodes key information about a square matrix: whether it is invertible, how it scales volumes, and whether it preserves or reverses orientation. While you rarely compute determinants by hand in ML practice, understanding what they mean geometrically gives you strong intuition for concepts like Jacobians, change of variables in probability, and the behavior of linear transformations.
This article builds on systems of linear equations and prepares the ground for linear transformations.
Definition for 2x2 and 3x3
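For a $2 \times 2$ matrix:
$$\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc$$
For a $3 \times 3$ matrix, expand along the first row (cofactor expansion):
$$\det(A) = a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}$$
The pattern alternates signs: $(-1)^{i+j} M_{ij}$, where $M_{ij}$ is the minor, that is, the determinant of the submatrix obtained by deleting row $i$ and column $j$.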
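As a minimal sketch, the $2 \times 2$ rule $ad - bc$ and the first-row expansion for a $3 \times 3$ matrix can be written directly in Python. The function names `det2` and `det3` are hypothetical helpers introduced here for illustration.

```python
def det2(m):
    """Determinant of a 2x2 matrix [[a, b], [c, d]] via ad - bc."""
    (a, b), (c, d) = m
    return a * d - b * c


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    # Signs alternate +, -, + across the first row.
    return (a * det2([[e, f], [h, i]])
            - b * det2([[d, f], [g, i]])
            + c * det2([[d, e], [g, h]]))


print(det2([[3, 1], [2, 4]]))                    # 3*4 - 1*2 = 10
print(det3([[2, 1, 3], [4, 5, 6], [7, 8, 9]]))   # -6 + 6 - 9 = -9
```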
The Geometric Meaning
The determinant has a beautiful geometric interpretation:
Geometric interpretation: $|\det(A)|$ is the factor by which $A$ scales volumes. The sign indicates whether $A$ preserves ($\det(A) > 0$) or reverses ($\det(A) < 0$) orientation.
Consider the columns of $A$ as vectors. In 2D, $|\det(A)|$ is the area of the parallelogram spanned by these vectors. In 3D, it is the volume of the parallelepiped.
| $\det(A)$ | Geometric Meaning |
|---|---|
| $\det(A) > 0$ | Preserves orientation, scales volume by $\det(A)$ |
| $\det(A) < 0$ | Reverses orientation, scales volume by $\lvert\det(A)\rvert$ |
| $\det(A) = 0$ | Collapses space to lower dimension (not invertible) |
| $\lvert\det(A)\rvert = 1$ | Preserves volume exactly (e.g., rotations) |
Example: A $2 \times 2$ matrix with $\det(A) = 0$ maps all of $\mathbb{R}^2$ onto a line (or a point). The parallelogram collapses to zero area. This is why $\det(A) = 0$ means the matrix is singular: it loses information.
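The area interpretation can be checked numerically. The sketch below maps the unit square through a matrix and measures the image with the shoelace formula; the matrices `A` and `S` and the helper `polygon_area` are illustrative choices, not part of any library.

```python
import numpy as np


def polygon_area(points):
    """Unsigned area of a polygon from its ordered vertices (shoelace formula)."""
    x, y = points[:, 0], points[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))


# Corners of the unit square, listed counter-clockwise (area 1).
unit_square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])

A = np.array([[2.0, 1.0],
              [0.5, 3.0]])      # invertible: det = 2*3 - 1*0.5 = 5.5
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])      # second column = 2 * first column, so singular

for name, M in [("A", A), ("S", S)]:
    image = unit_square @ M.T   # apply M to each corner (rows are points)
    print(name, "image area:", polygon_area(image),
          "|det|:", abs(np.linalg.det(M)))
```

For `A` the image area matches $|\det(A)| = 5.5$; for `S` every corner lands on the line $y = 2x$, so the area is zero.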
Properties of Determinants
These properties make determinants powerful tools for reasoning about matrices:
Multiplicativity
$$\det(AB) = \det(A)\det(B)$$
This is the most important property. Composing two transformations multiplies their volume scaling factors.
Transpose
$$\det(A^T) = \det(A)$$
As a consequence, row operations and column operations affect the determinant identically.
Inverse
$$\det(A^{-1}) = \frac{1}{\det(A)}$$
This follows from $\det(A)\det(A^{-1}) = \det(AA^{-1}) = \det(I) = 1$.
Scalar Multiplication
$$\det(cA) = c^n \det(A)$$
where $n$ is the size of the matrix. Each of the $n$ rows gets scaled by $c$.
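A quick numerical sanity check of the four identities so far (multiplicativity, transpose, inverse, scalar multiplication) might look like the sketch below. The random matrices, the seed, and the constant `c = 2.5` are arbitrary assumptions; a check like this illustrates the identities but does not prove them.

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed so the run is repeatable
n = 4
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
c = 2.5
det = np.linalg.det

checks = {
    "det(AB) = det(A) det(B)": np.isclose(det(A @ B), det(A) * det(B)),
    "det(A^T) = det(A)":       np.isclose(det(A.T), det(A)),
    "det(A^-1) = 1 / det(A)":  np.isclose(det(np.linalg.inv(A)), 1.0 / det(A)),
    "det(cA) = c^n det(A)":    np.isclose(det(c * A), c**n * det(A)),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```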
Row Operations
| Operation | Effect on $\det(A)$ |
|---|---|
| Swap two rows | Multiplies by $-1$ |
| Multiply a row by $c$ | Multiplies by $c$ |
| Add a multiple of one row to another | Does not change |
This is why Gaussian elimination is an efficient way to compute determinants: reduce to triangular form (tracking sign flips from row swaps), then multiply the diagonal entries.
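The three row-operation rules (a swap flips the sign, scaling a row by $c$ multiplies the determinant by $c$, adding a multiple of another row changes nothing) can be observed directly. The matrix and the constants 5 and -2 below are arbitrary choices for this sketch.

```python
import numpy as np

A = np.array([[2.0, 1.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])
d = np.linalg.det(A)

swapped = A[[1, 0, 2], :]           # swap rows 0 and 1
scaled = A.copy()
scaled[2] *= 5.0                    # multiply row 2 by c = 5
sheared = A.copy()
sheared[1] += -2.0 * sheared[0]     # add -2 * row 0 to row 1

print(np.linalg.det(swapped) / d)   # ratio close to -1
print(np.linalg.det(scaled) / d)    # ratio close to 5
print(np.linalg.det(sheared) / d)   # ratio close to 1
```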
Triangular Matrices
For upper or lower triangular matrices, the determinant is the product of diagonal entries:
$$\det(A) = \prod_{i=1}^{n} a_{ii}$$
This makes the identity matrix’s determinant obvious: $\det(I) = 1$.
Computing Determinants
Cofactor Expansion
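For an $n \times n$ matrix, expand along row $i$:
$$\det(A) = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} M_{ij}$$
where $M_{ij}$ is the minor. The term $(-1)^{i+j} M_{ij}$ is called the cofactor $C_{ij}$.
Cofactor expansion has complexity $O(n!)$, which is impractical for large matrices. It is primarily a theoretical tool.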
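For reference, a recursive cofactor expansion along the first row can be sketched as follows. The name `det_cofactor` is hypothetical, and the factorial running time means this is only sensible for very small matrices.

```python
def det_cofactor(m):
    """Determinant by recursive cofactor expansion along the first row.

    m is a square matrix given as a list of lists. Runs in O(n!) time.
    """
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        if m[0][j] == 0:
            continue  # zero entries contribute nothing; skip the recursion
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det_cofactor(minor)
    return total


print(det_cofactor([[2, 1, 3], [4, 5, 6], [7, 8, 9]]))   # -9
# Block-diagonal example: (1*4 - 2*3) * (5*8 - 6*7) = (-2) * (-2) = 4
print(det_cofactor([[1, 2, 0, 0], [3, 4, 0, 0], [0, 0, 5, 6], [0, 0, 7, 8]]))
```

Skipping zero entries is the reason hand computations expand along the row or column with the most zeros.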
LU Decomposition
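The practical method: factor $PA = LU$ (with row permutation matrix $P$ and $L$ having a unit diagonal), then:
$$\det(A) = (-1)^s \prod_{i=1}^{n} u_{ii}$$
where $s$ is the number of row swaps. This costs $O(n^3)$.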
import numpy as np

# Example 3x3 matrix
A = np.array([[2, 1, 3],
              [4, 5, 6],
              [7, 8, 9]])

# np.linalg.det computes the determinant via an LU factorization
det = np.linalg.det(A)
print(f"det(A) = {det:.4f}")  # det(A) = -9.0000
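The same idea can be written out by hand. The following is a minimal sketch of elimination with partial pivoting that counts row swaps and multiplies the pivots; the function name `det_by_elimination` is hypothetical, and this is not NumPy's actual implementation, which delegates to LAPACK.

```python
import numpy as np


def det_by_elimination(A):
    """Determinant via Gaussian elimination with partial pivoting.

    Reduces a copy of A to upper triangular form, counting row swaps, and
    returns (-1)^swaps times the product of the pivots. Costs O(n^3).
    """
    U = np.array(A, dtype=float)    # work on a float copy
    n = U.shape[0]
    swaps = 0
    for k in range(n):
        # Pick the largest remaining entry in column k as the pivot.
        p = k + int(np.argmax(np.abs(U[k:, k])))
        if U[p, k] == 0.0:
            return 0.0              # no usable pivot: the matrix is singular
        if p != k:
            U[[k, p]] = U[[p, k]]   # a row swap flips the sign
            swaps += 1
        # Eliminate below the pivot; this leaves the determinant unchanged.
        factors = U[k + 1:, k] / U[k, k]
        U[k + 1:] -= np.outer(factors, U[k])
    return (-1) ** swaps * float(np.prod(np.diag(U)))


M = [[2, 1, 3], [4, 5, 6], [7, 8, 9]]
print(det_by_elimination(M), np.linalg.det(np.array(M)))  # both close to -9
```

Partial pivoting is used here for numerical stability; without it, a small pivot can amplify rounding error.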
Cramer’s Rule
For a system $A\mathbf{x} = \mathbf{b}$ with $\det(A) \neq 0$, each component of the solution is:
$$x_i = \frac{\det(A_i)}{\det(A)}$$
where $A_i$ is $A$ with column $i$ replaced by $\mathbf{b}$.
Warning: Cramer’s rule is elegant but computationally expensive ($O(n \cdot n!)$ with cofactor expansion, since it requires $n + 1$ determinants). It is useful for theoretical analysis and for very small systems, but not for practical computation at scale. Use LU decomposition or iterative methods instead.
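For a tiny system, Cramer's rule (each solution component is the determinant of $A$ with one column replaced by $\mathbf{b}$, divided by $\det(A)$) can be sketched as below. The function name `cramer_solve` and the example system are illustrative assumptions; `np.linalg.solve` remains the practical choice.

```python
import numpy as np


def cramer_solve(A, b):
    """Solve A x = b with Cramer's rule; intended for tiny systems only."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    det_A = np.linalg.det(A)
    if np.isclose(det_A, 0.0):
        raise ValueError("det(A) is numerically zero; Cramer's rule does not apply")
    x = np.empty(len(b))
    for i in range(len(b)):
        A_i = A.copy()
        A_i[:, i] = b               # replace column i with b
        x[i] = np.linalg.det(A_i) / det_A
    return x


# 2x + y = 3, x + 3y = 5  has solution x = 0.8, y = 1.4
A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 5.0]
print(cramer_solve(A, b))
print(np.linalg.solve(A, b))        # same solution via LU
```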
Worked Example
Compute $\det(A)$ where:
Method 1: Cofactor expansion along column 1 (chosen because it has a zero):
Method 2: Row reduction
No row swaps, so $\det(A)$ is the product of the diagonal entries of the resulting triangular matrix. Same answer.
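To see both methods side by side, here is a worked computation on an illustrative matrix $B$. The matrix is an assumption chosen for this sketch because it has a zero in its first column.

$$B = \begin{pmatrix} 2 & 1 & 3 \\ 0 & 4 & 1 \\ 1 & 2 & 5 \end{pmatrix}$$

Cofactor expansion along column 1:

$$\det(B) = 2\begin{vmatrix} 4 & 1 \\ 2 & 5 \end{vmatrix} - 0\begin{vmatrix} 1 & 3 \\ 2 & 5 \end{vmatrix} + 1\begin{vmatrix} 1 & 3 \\ 4 & 1 \end{vmatrix} = 2(18) - 0 + (-11) = 25$$

Row reduction with $R_3 \leftarrow R_3 - \tfrac{1}{2}R_1$ followed by $R_3 \leftarrow R_3 - \tfrac{3}{8}R_2$:

$$\begin{pmatrix} 2 & 1 & 3 \\ 0 & 4 & 1 \\ 1 & 2 & 5 \end{pmatrix} \to \begin{pmatrix} 2 & 1 & 3 \\ 0 & 4 & 1 \\ 0 & \tfrac{3}{2} & \tfrac{7}{2} \end{pmatrix} \to \begin{pmatrix} 2 & 1 & 3 \\ 0 & 4 & 1 \\ 0 & 0 & \tfrac{25}{8} \end{pmatrix}$$

Neither step is a row swap or a row scaling, so $\det(B) = 2 \cdot 4 \cdot \tfrac{25}{8} = 25$, matching the cofactor result.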
The Adjugate and Matrix Inverse
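The adjugate (or classical adjoint) of $A$ is the transpose of the cofactor matrix:
$$\operatorname{adj}(A) = C^T, \quad C_{ij} = (-1)^{i+j} M_{ij}$$
The inverse can be expressed as:
$$A^{-1} = \frac{1}{\det(A)} \operatorname{adj}(A)$$
This formula is mainly theoretical. For a $2 \times 2$ matrix, it gives a compact formula:
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$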
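A direct, deliberately naive construction of the adjugate from cofactors is sketched below and compared against `np.linalg.inv`. The helper name `adjugate` is hypothetical, and the approach is for illustration only since it computes $n^2$ determinants.

```python
import numpy as np


def adjugate(A):
    """Adjugate of a square matrix (n >= 2): transpose of the cofactor matrix."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    cof = np.empty_like(A)
    for i in range(n):
        for j in range(n):
            # Minor M_ij: delete row i and column j, then take the determinant.
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            cof[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return cof.T


A = np.array([[2.0, 1.0],
              [5.0, 3.0]])          # det = 2*3 - 1*5 = 1
inv_via_adj = adjugate(A) / np.linalg.det(A)
print(inv_via_adj)                  # close to [[3, -1], [-5, 2]]
print(np.allclose(inv_via_adj, np.linalg.inv(A)))
```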
Determinants and Eigenvalues
A deep connection we will explore in the eigenvalues article:
$$\det(A) = \prod_{i=1}^{n} \lambda_i$$
The determinant equals the product of all eigenvalues. Combined with the trace being the sum of eigenvalues, these two scalar quantities capture the most essential spectral information about a matrix.
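Both facts (determinant as the product of eigenvalues, trace as their sum) are easy to confirm numerically. The two matrices below are illustrative choices; the rotation shows that complex eigenvalues still multiply to a real determinant.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
eigvals = np.linalg.eigvals(A)      # eigenvalues are 2 and 5 for this matrix

print(np.prod(eigvals), np.linalg.det(A))   # both close to 10
print(np.sum(eigvals), np.trace(A))         # both close to 7

# 90-degree rotation: eigenvalues are +i and -i, whose product is 1.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])
print(np.prod(np.linalg.eigvals(R)).real, np.linalg.det(R))
```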
Why This Matters for ML
While you rarely compute determinants directly in ML code, the concept appears in several important contexts:
- Invertibility check: $\det(A) = 0$ means the matrix is singular, so the system $A\mathbf{x} = \mathbf{b}$ has no unique solution. In regression, a singular $X^T X$ signals perfectly collinear features.
- Jacobian determinant: In normalizing flows (a class of generative models), the change-of-variables formula involves $|\det(J)|$, where $J$ is the Jacobian of the transformation.
- Gaussian distribution: The multivariate Gaussian PDF includes $\det(\Sigma)$:
  $$p(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^n \det(\Sigma)}} \exp\left(-\frac{1}{2}(\mathbf{x} - \boldsymbol{\mu})^T \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})\right)$$
- Volume interpretation: The determinant of the covariance matrix measures the “spread” of the distribution: a larger determinant means more spread.
- Log-determinant: In optimization (e.g., maximum likelihood for Gaussians), we work with $\log\det(\Sigma)$ for numerical stability.
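As a sketch of the last three points, the Gaussian log-density can be computed from the log-determinant rather than the determinant itself. The function name `gaussian_logpdf` is hypothetical; `np.linalg.slogdet` returns the sign and the log of the absolute determinant. The sign test below is only a necessary sanity check, not a full positive-definiteness test.

```python
import numpy as np


def gaussian_logpdf(x, mu, Sigma):
    """Log-density of a multivariate Gaussian, using the log-determinant."""
    x, mu, Sigma = (np.asarray(v, dtype=float) for v in (x, mu, Sigma))
    n = mu.shape[0]
    sign, logdet = np.linalg.slogdet(Sigma)
    if sign <= 0:
        raise ValueError("det(Sigma) must be positive for a valid covariance")
    diff = x - mu
    # Solve Sigma z = diff instead of forming the inverse explicitly.
    maha = diff @ np.linalg.solve(Sigma, diff)
    return -0.5 * (n * np.log(2.0 * np.pi) + logdet + maha)


# Standard normal in 2D at the origin: log p = -log(2*pi).
print(gaussian_logpdf([0.0, 0.0], [0.0, 0.0], np.eye(2)))

# Why the log matters: det underflows to 0 here, the log-determinant does not.
Sigma_big = 0.1 * np.eye(400)
print(np.linalg.det(Sigma_big))            # 0.0 (true value 1e-400 underflows)
print(np.linalg.slogdet(Sigma_big)[1])     # 400 * log(0.1), about -921.03
```

In high dimensions the determinant of a covariance matrix routinely underflows or overflows double precision, which is why likelihood code tends to stay in log space throughout.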
Summary
- The determinant is a scalar that encodes volume scaling, orientation, and invertibility.
- $|\det(A)|$ gives the factor by which $A$ scales volumes; the sign indicates orientation.
- $\det(A) = 0$ means $A$ is singular: it collapses space and is not invertible.
- $\det(AB) = \det(A)\det(B)$: composition multiplies volume factors.
- The determinant equals the product of eigenvalues and can be computed efficiently via LU decomposition.
- In ML, determinants appear in the multivariate Gaussian, normalizing flows, and invertibility checks.
- Next, we explore how matrices act as functions in linear transformations.