Calculus & Optimization
Derivatives, gradients, and optimization methods used to train models.
- 01 Limits and Continuity: The Foundation of Calculus
  Understand limits, continuity, and the epsilon-delta definition: the bedrock on which derivatives, integrals, and optimization are built.
- 02 Derivatives and Differentiation: Measuring Rates of Change
  Master the derivative from first principles: definition, rules, common functions, and how differentiation drives machine learning optimization.
- 03 Partial Derivatives and Gradients: Calculus in Multiple Dimensions
  Learn partial derivatives, the gradient vector, directional derivatives, the Jacobian, and the Hessian: the multivariable toolkit for ML optimization.
- 04 The Chain Rule and Computational Graphs: The Engine Behind Backpropagation
  How the chain rule powers backpropagation, from single-variable compositions to computational graphs and automatic differentiation.
- 05 Taylor Series and Approximation: Local Models of Complex Functions
  Understand Taylor expansions, linearization, quadratic approximation, and Newton's method: the math connecting derivatives to optimization.
- 06 Gradient Descent: The Workhorse of Machine Learning Optimization
  Master gradient descent from first principles: the algorithm, learning rate selection, convergence analysis, and local minima in loss landscapes.
- 07 Stochastic Gradient Descent: Trading Precision for Speed
  Learn SGD, mini-batch methods, momentum, Nesterov acceleration, and learning rate schedules: the practical optimizers that train modern ML models.
- 08 Adaptive Learning Rate Methods: From AdaGrad to Adam
  Understand AdaGrad, RMSProp, Adam, and AdamW: adaptive optimizers that tune per-parameter learning rates for faster, more robust training.
- 09 Constrained Optimization: Lagrange Multipliers and KKT Conditions
  Master Lagrange multipliers, KKT conditions, and duality: the tools for optimization with equality and inequality constraints in ML.
- 10 Convexity and Convergence Theory: When Optimization Succeeds
  Understand convex functions, global vs local optima, convergence rates, and the theoretical guarantees that underpin ML optimization algorithms.
- 11 Integration and Expectation: The Continuous Side of Probability
  From Riemann integrals to Monte Carlo estimation: how integration underpins probability densities, expectations, and marginalizations in ML.
- 12 Calculus of Variations: Optimizing Over Functions
  Learn the Euler-Lagrange equation, variational inference, and the ELBO: how optimizing over functions powers VAEs and Bayesian deep learning.
- 13 Second-Order and Natural Gradient Methods
  Go beyond first-order optimization with Newton's method, Fisher information, natural gradient descent, and K-FAC for deep learning.
- 14 Numerical Stability in Optimization: Making Training Work in Practice
  Master the log-sum-exp trick, gradient clipping, mixed precision, and other techniques that prevent numerical disasters during model training.
- 15 Non-Smooth Optimization and Proximal Methods
  Handle non-differentiable objectives with subgradients, proximal operators, and ADMM: the tools behind L1 sparsity, pruning, and robust losses.
- 16 Optimization Landscape of Neural Networks: Why Deep Learning Works
  Explore loss surface geometry, sharp vs flat minima, mode connectivity, the lottery ticket hypothesis, and why SGD finds generalizable solutions.
- 17 Implicit Differentiation and Differentiable Programming
  Backpropagate through optimization, fixed points, and ODEs. Learn implicit differentiation for meta-learning, hyperparameter tuning, and Neural ODEs.
- 18 Min-Max Optimization: Games, GANs, and Adversarial Training
  Master min-max optimization for GANs, adversarial robustness, and RLHF: two-player games where one player minimizes while the other maximizes.