Calculus, AI, and linear algebra: a compact field guide
You write or review ML code and want a fast, code-first refresher on the calculus and linear algebra behind gradients, Jacobians, and SVD.
Most ML code is just calculus and linear algebra in disguise. Here are the essentials, with runnable snippets.
Gradients in plain sight
A gradient is the vector of partial derivatives. For a scalar function $f: \mathbb{R}^n \to \mathbb{R}$:

$$\nabla f(x) = \left(\frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n}\right)$$

Example: $f(x, y) = x^2 + xy + 3y^2$ yields $\nabla f = (2x + y,\ x + 6y)$.
```python
import numpy as np

def f(xy):
    x, y = xy
    return x**2 + x*y + 3*y**2

# analytic gradient
def grad(xy):
    x, y = xy
    return np.array([2*x + y, x + 6*y])

pt = np.array([2.0, -1.0])
print("f:", f(pt))
print("grad:", grad(pt))Finite differences are a quick sanity check:
```python
def finite_diff(fn, pt, eps=1e-5):
    # central differences: (f(x + eps*e_i) - f(x - eps*e_i)) / (2*eps)
    g = np.zeros_like(pt)
    for i in range(len(pt)):
        step = np.zeros_like(pt)
        step[i] = eps
        g[i] = (fn(pt + step) - fn(pt - step)) / (2 * eps)
    return g
print("finite diff:", finite_diff(f, pt))Jacobians: vector outputs
Jacobians: vector outputs
For $g: \mathbb{R}^n \to \mathbb{R}^m$, the Jacobian stacks the gradients of each output component as rows. A simple two-output function:

$$g(x, y) = \begin{pmatrix} x^2 + y \\ xy \end{pmatrix}$$

Its Jacobian is:

$$J(x, y) = \begin{pmatrix} 2x & 1 \\ y & x \end{pmatrix}$$
```python
def g(xy):
    x, y = xy
    return np.array([x**2 + y, x*y])

# analytic Jacobian: row i is the gradient of output i
def jacobian(xy):
    x, y = xy
    return np.array([[2*x, 1], [y, x]])

pt = np.array([1.5, 0.5])
print("g(pt):", g(pt))
print("J(pt):\n", jacobian(pt))Linear algebra fuel: projections and SVD
Linear algebra fuel: projections and SVD
Principal component analysis (PCA) is just the singular value decomposition (SVD) of the centered data matrix: $X = U \Sigma V^\top$. The top right singular vectors in $V$ (the first rows of $V^\top$) are the principal directions.
```python
rng = np.random.default_rng(7)
X = rng.normal(size=(6, 3))  # 6 samples, 3 features

# center each feature (column)
Xc = X - X.mean(axis=0, keepdims=True)

# thin SVD of the centered data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
print("singular values:", S)
print("first principal direction:", Vt[0])

# project to 2D (rows of Vt are the principal directions)
X2 = Xc @ Vt[:2].T
print("projected shape:", X2.shape)Projection of a vector onto a direction is:
```python
v = np.array([2.0, 1.0, -1.0])
u = Vt[0]  # principal direction
proj = (v @ u) / (u @ u) * u
print("projection:", proj)
```
The whole PCA pipeline at a glance:

```mermaid
graph LR;
    Data["High-dimensional data X"] --> Center["Center columns"];
    Center --> SVD["SVD: X = U Σ Vᵀ"];
    SVD --> PCs["Take top k rows of Vᵀ (principal directions)"];
    PCs --> Project["Project: X · V_k (top k directions as columns)"];
    Project --> Embeddings["Lower-dimensional embeddings"];
```
Why this matters for AI
- Gradients drive optimizers (SGD, Adam); Jacobians underpin backprop (see the descent sketch after this list).
- SVD/PCA reduces dimensionality and denoises embeddings.
- Projections help in retrieval and similarity search by isolating informative axes.
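To make the first bullet concrete, here is a minimal gradient-descent loop on the quadratic `f` from the first snippet; the 0.1 learning rate and 50 steps are illustrative, not tuned values:

```python
# plain gradient descent on f, reusing grad() from the first snippet
x = np.array([2.0, -1.0])
lr = 0.1  # illustrative learning rate
for _ in range(50):
    x = x - lr * grad(x)  # step against the gradient
print("argmin estimate:", x, "f at estimate:", f(x))
```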
If you keep these primitives sharp, most model code becomes easier to reason about and debug.