Linear Algebra

A complete interactive guide — from vectors to eigenvalues, with worked examples at every step.
Estimated reading time: 2-3 hours.

1. What Does "Linear" Mean?

"Algebra" means relationships between quantities. "Linear Algebra" means line-like relationships — predictable, proportional, no surprises.

Think of a rooftop: move 3 feet forward, rise 1 foot. Move 30 feet forward — you expect a 10-foot rise. That's linear. Climbing a dome? Each foot forward raises you a different amount. Not linear.

A function $F$ is linear if it obeys two rules:

Rule 1 — Scaling (homogeneity): $$F(a \cdot x) = a \cdot F(x)$$

If I double the input, the output doubles. If I triple it, the output triples.


Rule 2 — Addition (superposition): $$F(x + y) = F(x) + F(y)$$

The effect of combined inputs = the combination of individual effects.

Key insight: These two rules together mean we can break any complex problem into small pieces, solve each piece separately, and combine the answers. This is why linear algebra is so powerful — it makes hard problems tractable.

You can combine both rules into one: $F(ax + by) = aF(x) + bF(y)$. This is called superposition.

Quick check — is it linear?

Function | Linear? | Why?
$F(x) = 3x$ | Yes | $F(2x) = 6x = 2 \cdot 3x = 2F(x)$ ✓
$F(x) = x^2$ | No | $F(2x) = 4x^2 \neq 2x^2 = 2F(x)$ ✗
$F(x) = x + 3$ | No | $F(0) = 3 \neq 0$. Linear functions must pass through the origin ✗
$F(x) = \sin(x)$ | No | $\sin(2x) \neq 2\sin(x)$ in general ✗
$F(x) = 0$ | Yes | $F(ax) = 0 = a \cdot 0 = aF(x)$ ✓ (the "boring" linear function)
Common confusion: In everyday language, "$y = 2x + 1$" is called "linear" (it's a line!). But in linear algebra, it's not linear — it's affine. The "$+1$" violates $F(0) = 0$. We'll see later how homogeneous coordinates let us sneak addition back in (Section 19).

Interactive: Linear vs Non-Linear

Compare $F(x) = ax$ (linear) with $F(x) = x^n$ (non-linear). Watch how doubling the input affects the output.

Worked Example: Testing Linearity of $F(x) = 5x$
Step 1: Check scaling. $F(3 \cdot 2) = F(6) = 30$. Also $3 \cdot F(2) = 3 \cdot 10 = 30$. ✓
Step 2: Check addition. $F(2 + 7) = F(9) = 45$. Also $F(2) + F(7) = 10 + 35 = 45$. ✓
Conclusion: $F(x) = 5x$ is linear.
Worked Example: Testing Linearity of $F(x) = x^2 + 1$
Step 1: Check $F(0) = 0^2 + 1 = 1 \neq 0$. Already fails! A linear function must give $F(0) = 0$.
Step 2: Even ignoring that: $F(2 + 3) = F(5) = 26$. But $F(2) + F(3) = 5 + 10 = 15 \neq 26$. ✗
Conclusion: Not linear. The $x^2$ term makes it grow too fast, and the $+1$ shifts the origin.
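The two worked examples above can be automated. Here is a minimal numeric sketch (the helper name `looks_linear` is my own) that probes both rules at a few sample points. A single failure proves non-linearity; passing is only evidence, not proof:

```python
# Numerically probe the two linearity rules at a few sample points.
def looks_linear(F, samples=(-2.0, 0.5, 3.0), scalars=(2.0, -1.5)):
    for x in samples:
        for a in scalars:
            if abs(F(a * x) - a * F(x)) > 1e-9:       # Rule 1: scaling
                return False
        for y in samples:
            if abs(F(x + y) - (F(x) + F(y))) > 1e-9:  # Rule 2: addition
                return False
    return True

print(looks_linear(lambda x: 5 * x))        # True: linear
print(looks_linear(lambda x: x**2 + 1))     # False: fails both rules
print(looks_linear(lambda x: x + 3))        # False: affine, not linear
```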
Real-World Example: Hooke's Law — Springs in Engineering

The force in a spring is $F = kx$, where $k$ is the spring constant and $x$ is the displacement:

$$F(2x) = k(2x) = 2 \cdot kx = 2 \cdot F(x)$$

Double the stretch, double the force. Automotive suspension engineers, bridge designers, and seismologists rely on this linearity. When deformation exceeds the elastic limit the relationship becomes non-linear — exactly when the math gets hard.

Why linearity matters: Superposition in Physics

Maxwell's equations for electromagnetism are linear. This means if $E_1$ is the electric field from charge A and $E_2$ is the field from charge B, then the total field is simply $E_1 + E_2$. You can analyse each charge separately.

If the equations were non-linear (like in general relativity), you couldn't just add fields — the presence of one charge would change how the other charge's field works. This is why gravity is so much harder to calculate than electromagnetism.

2. Vectors — The Building Blocks

Before we can do linear algebra, we need something to do it to. That something is the vector.

What IS a vector?

A vector is an ordered list of numbers. That's it. But depending on context, it can represent many things: a point in space, a displacement or velocity, a pixel's colour, or a row of data in a spreadsheet.

We write vectors as columns (by convention) and use bold or arrows:

$$\mathbf{v} = \vec{v} = \begin{bmatrix} 3 \\ 4 \end{bmatrix} \qquad \text{a 2D vector}$$ $$\mathbf{w} = \begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix} \qquad \text{a 3D vector}$$

Vector Addition

Add vectors component by component. Geometrically: put them tip-to-tail.

$$\begin{bmatrix} 3 \\ 1 \end{bmatrix} + \begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 3+1 \\ 1+3 \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \end{bmatrix}$$

Scalar Multiplication

Multiply every component by the same number (the "scalar"). This stretches or shrinks the vector.

$$2 \cdot \begin{bmatrix} 3 \\ 1 \end{bmatrix} = \begin{bmatrix} 6 \\ 2 \end{bmatrix} \qquad -1 \cdot \begin{bmatrix} 3 \\ 1 \end{bmatrix} = \begin{bmatrix} -3 \\ -1 \end{bmatrix}$$

Multiplying by 2 doubles the length. Multiplying by $-1$ flips the direction.

Interactive: Vector Playground

Drag the sliders to change vectors $\mathbf{u}$ and $\mathbf{v}$. See their sum and scalar multiples.

Worked Example: Vector Arithmetic

Given $\mathbf{a} = \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix}$ and $\mathbf{b} = \begin{bmatrix} -1 \\ 4 \\ 2 \end{bmatrix}$, compute $3\mathbf{a} - 2\mathbf{b}$.

Step 1: $3\mathbf{a} = 3 \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix} = \begin{bmatrix} 6 \\ -3 \\ 9 \end{bmatrix}$
Step 2: $2\mathbf{b} = 2 \begin{bmatrix} -1 \\ 4 \\ 2 \end{bmatrix} = \begin{bmatrix} -2 \\ 8 \\ 4 \end{bmatrix}$
Step 3: $3\mathbf{a} - 2\mathbf{b} = \begin{bmatrix} 6 \\ -3 \\ 9 \end{bmatrix} - \begin{bmatrix} -2 \\ 8 \\ 4 \end{bmatrix} = \begin{bmatrix} 6-(-2) \\ -3-8 \\ 9-4 \end{bmatrix} = \begin{bmatrix} 8 \\ -11 \\ 5 \end{bmatrix}$
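The same computation in NumPy, as a sketch: arithmetic on arrays is component-wise, so scalar multiples and differences follow exactly the rules above.

```python
import numpy as np

a = np.array([2, -1, 3])
b = np.array([-1, 4, 2])

# 3a and 2b are computed component-wise, then subtracted component-wise.
result = 3 * a - 2 * b
print(result)   # [  8 -11   5]
```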

Vector Length (Magnitude / Norm)

The length of a vector comes from the Pythagorean theorem:

$$\|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}$$

Example: $\left\|\begin{bmatrix} 3 \\ 4 \end{bmatrix}\right\| = \sqrt{9 + 16} = \sqrt{25} = 5$

A unit vector has length 1. To "normalise" any vector, divide by its length: $\hat{\mathbf{v}} = \frac{\mathbf{v}}{\|\mathbf{v}\|}$. This keeps the direction but makes the length exactly 1.

Worked Example: Normalising a Vector

Normalise $\mathbf{v} = \begin{bmatrix} 3 \\ 4 \end{bmatrix}$.

Step 1: Find length: $\|\mathbf{v}\| = \sqrt{3^2 + 4^2} = \sqrt{25} = 5$
Step 2: Divide: $\hat{\mathbf{v}} = \frac{1}{5}\begin{bmatrix} 3 \\ 4 \end{bmatrix} = \begin{bmatrix} 0.6 \\ 0.8 \end{bmatrix}$
Check: $\sqrt{0.6^2 + 0.8^2} = \sqrt{0.36 + 0.64} = \sqrt{1} = 1$ ✓
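In NumPy, normalising is a single division; `np.linalg.norm` computes the Pythagorean length:

```python
import numpy as np

v = np.array([3.0, 4.0])
length = np.linalg.norm(v)    # sqrt(3^2 + 4^2) = 5.0
v_hat = v / length            # unit vector [0.6, 0.8]

print(length, v_hat)
print(np.linalg.norm(v_hat))  # ≈ 1.0: direction kept, length normalised
```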
Real-World Example: GPS — "Where am I?"

Your phone's position is a vector $\mathbf{p} = (x, y, z)$ in 3D space relative to the Earth's centre. The satellite's position is another vector $\mathbf{s}$. The distance between you and the satellite is $\|\mathbf{p} - \mathbf{s}\|$ — the length of the difference vector. Your phone receives timing signals from 4+ satellites and solves for the position vector $\mathbf{p}$ where all the distances are consistent. Pure vector arithmetic.

3. Linear Operations & Combinations

Now the payoff: combining linear operations.

The crucial insight: we can combine multiple linear functions and the result is still linear:

$$G(x, y, z) = ax + by + cz$$

This is called a linear combination. Each input is scaled by a constant (a "weight"), then the results are added.

Key insight: Because it's linear, we can split inputs apart, analyse them individually, and combine the results: $$G(x,y,z) = G(x,0,0) + G(0,y,0) + G(0,0,z) = ax + by + cz$$ This "decompose → solve → recombine" pattern is the heart of linear algebra.

Let's verify the linearity of $G$ explicitly:

Scaling check: $G(2x, 2y, 2z) = a(2x) + b(2y) + c(2z) = 2(ax + by + cz) = 2 \cdot G(x,y,z)$ ✓

Addition check: $G(x_1+x_2, y_1+y_2, z_1+z_2) = a(x_1+x_2) + b(y_1+y_2) + c(z_1+z_2)$ $= (ax_1+by_1+cz_1) + (ax_2+by_2+cz_2) = G(x_1,y_1,z_1) + G(x_2,y_2,z_2)$ ✓

Interactive: Linearity Tester

Enter coefficients $a, b, c$ and two input vectors. Verify that $G(\mathbf{u} + \mathbf{v}) = G(\mathbf{u}) + G(\mathbf{v})$.

Worked Example: Linear Combination

Express $\begin{bmatrix} 7 \\ 1 \end{bmatrix}$ as a linear combination of $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ and $\begin{bmatrix} 3 \\ -1 \end{bmatrix}$.

We need $a$ and $b$ such that $a\begin{bmatrix} 1 \\ 2 \end{bmatrix} + b\begin{bmatrix} 3 \\ -1 \end{bmatrix} = \begin{bmatrix} 7 \\ 1 \end{bmatrix}$.

Step 1: Write the system: $a + 3b = 7$ and $2a - b = 1$.
Step 2: From equation 2: $b = 2a - 1$. Substitute into equation 1: $a + 3(2a - 1) = 7 \Rightarrow 7a - 3 = 7 \Rightarrow a = \frac{10}{7}$.
Step 3: $b = 2(\frac{10}{7}) - 1 = \frac{20}{7} - \frac{7}{7} = \frac{13}{7}$.
Check: $\frac{10}{7}\begin{bmatrix} 1 \\ 2 \end{bmatrix} + \frac{13}{7}\begin{bmatrix} 3 \\ -1 \end{bmatrix} = \frac{1}{7}\begin{bmatrix} 10+39 \\ 20-13 \end{bmatrix} = \frac{1}{7}\begin{bmatrix} 49 \\ 7 \end{bmatrix} = \begin{bmatrix} 7 \\ 1 \end{bmatrix}$ ✓
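Finding the weights is just solving a small linear system. As a sketch in NumPy: stack the given vectors as columns and let `np.linalg.solve` do the elimination (systems and elimination are covered properly in Section 13).

```python
import numpy as np

# Columns of V are the vectors being combined; target is the vector to express.
V = np.array([[1.0,  3.0],
              [2.0, -1.0]])
target = np.array([7.0, 1.0])

coeffs = np.linalg.solve(V, target)   # the weights a and b
print(coeffs)                         # [10/7, 13/7], i.e. about [1.4286 1.8571]
print(V @ coeffs)                     # recovers the target (up to rounding)
```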
Real-World Example: Audio Mixing

A sound engineer mixes three microphone tracks — each channel is scaled by a gain coefficient and summed:

$$\text{mix}(t) = 0.6 \cdot \text{vocal}(t) + 0.3 \cdot \text{guitar}(t) + 0.1 \cdot \text{drums}(t)$$

This is a linear combination. Doubling all input volumes doubles the mix — no distortion (in the linear regime). Every DAW, mixing console, and hearing aid processes audio this way.

4. The Dot Product — Measuring Alignment

The dot product (or "inner product") takes two vectors and produces a single number. It measures how much two vectors point in the same direction.

$$\mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$$

Multiply corresponding components, then add them all up.

Worked Example: Computing a Dot Product

$\begin{bmatrix} 2 \\ 3 \\ -1 \end{bmatrix} \cdot \begin{bmatrix} 4 \\ -2 \\ 5 \end{bmatrix} = (2)(4) + (3)(-2) + (-1)(5) = 8 - 6 - 5 = -3$

The Geometric Meaning

The dot product has an elegant geometric interpretation:

$$\mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\| \cdot \|\mathbf{b}\| \cdot \cos\theta$$

where $\theta$ is the angle between the two vectors.

This tells us everything about how two vectors relate:

$\mathbf{a} \cdot \mathbf{b}$ | $\cos\theta$ | Meaning
Positive | $> 0$ | Vectors point roughly the same way ($\theta < 90°$)
Zero | $= 0$ | Vectors are perpendicular / orthogonal ($\theta = 90°$)
Negative | $< 0$ | Vectors point roughly opposite ways ($\theta > 90°$)
$= \|\mathbf{a}\|\|\mathbf{b}\|$ | $= 1$ | Vectors point in exactly the same direction ($\theta = 0°$)
Key insight: The dot product lets you decompose a vector into "how much is along this direction" and "how much is perpendicular." This is called projection.

Projection

The projection of $\mathbf{a}$ onto $\mathbf{b}$ tells you "how far does $\mathbf{a}$ go in the direction of $\mathbf{b}$?"

$$\text{proj}_{\mathbf{b}} \mathbf{a} = \frac{\mathbf{a} \cdot \mathbf{b}}{\mathbf{b} \cdot \mathbf{b}} \mathbf{b}$$

The scalar part $\frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{b}\|}$ is called the scalar projection (a signed length).

Interactive: Dot Product & Projection

Adjust vectors $\mathbf{a}$ and $\mathbf{b}$. Watch the dot product value and the projection change.

Worked Example: Finding the Angle Between Vectors

Find the angle between $\mathbf{a} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$ and $\mathbf{b} = \begin{bmatrix} 4 \\ -5 \\ 6 \end{bmatrix}$.

Step 1: $\mathbf{a} \cdot \mathbf{b} = (1)(4) + (2)(-5) + (3)(6) = 4 - 10 + 18 = 12$
Step 2: $\|\mathbf{a}\| = \sqrt{1+4+9} = \sqrt{14}$, $\|\mathbf{b}\| = \sqrt{16+25+36} = \sqrt{77}$
Step 3: $\cos\theta = \frac{12}{\sqrt{14}\sqrt{77}} = \frac{12}{\sqrt{1078}} \approx \frac{12}{32.83} \approx 0.365$
Step 4: $\theta = \arccos(0.365) \approx 68.6°$
Worked Example: Projecting One Vector onto Another

Project $\mathbf{a} = \begin{bmatrix} 3 \\ 4 \end{bmatrix}$ onto $\mathbf{b} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ (the x-axis).

Step 1: $\mathbf{a} \cdot \mathbf{b} = (3)(1) + (4)(0) = 3$
Step 2: $\mathbf{b} \cdot \mathbf{b} = 1 + 0 = 1$
Step 3: $\text{proj}_{\mathbf{b}} \mathbf{a} = \frac{3}{1}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 3 \\ 0 \end{bmatrix}$

This makes intuitive sense: the "shadow" of $(3,4)$ onto the x-axis is just $(3,0)$.
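The angle and projection calculations above, sketched in NumPy (`@` between 1-D arrays is the dot product):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, -5.0, 6.0])

dot = a @ b                                  # 4 - 10 + 18 = 12
cos_theta = dot / (np.linalg.norm(a) * np.linalg.norm(b))
theta = np.degrees(np.arccos(cos_theta))     # ≈ 68.6 degrees

# Projection of (3, 4) onto the x-axis direction (1, 0).
p = np.array([3.0, 4.0])
d = np.array([1.0, 0.0])
proj = (p @ d) / (d @ d) * d                 # the "shadow" [3., 0.]

print(dot, round(theta, 1), proj)
```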

Real-World Example: Recommendation Engines

Netflix represents each user as a vector of movie ratings: User A = (5, 3, 0, 4, ...) and User B = (4, 3, 0, 5, ...). The dot product (or cosine similarity $\frac{\mathbf{a}\cdot\mathbf{b}}{\|\mathbf{a}\|\|\mathbf{b}\|}$) measures how similar their tastes are. A high value means "these users like the same things" → recommend User B's top-rated movies to User A. This is called collaborative filtering.

Real-World Example: Work in Physics

In physics, work = force $\cdot$ displacement = $\mathbf{F} \cdot \mathbf{d} = \|\mathbf{F}\|\|\mathbf{d}\|\cos\theta$. Push a box at an angle? Only the component of force along the direction of motion does work. That's the dot product in action — it extracts "the useful part" of the force vector.

5. Span, Basis & Linear Independence

With vectors and linear combinations under our belt, three big questions arise:

  1. Given some vectors, what can I reach by combining them? (Span)
  2. Are any of my vectors redundant? (Linear independence)
  3. What's the minimal set of vectors I need to reach everything? (Basis)

Span

The span of a set of vectors = all possible linear combinations of those vectors. It's the set of every point you can reach by scaling and adding them.

$$\text{span}\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\} = \{c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_k\mathbf{v}_k \;\;|\;\; c_i \in \mathbb{R}\}$$

Think of colour mixing: the span of red and blue paint is every shade you can mix from the two, all the purples, but no combination of them ever reaches green.

Linear Independence

Vectors are linearly independent if none of them can be written as a combination of the others. Equivalently:

$$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_k\mathbf{v}_k = \mathbf{0} \;\;\Longrightarrow\;\; c_1 = c_2 = \cdots = c_k = 0$$

The only way to combine them to get zero is if all coefficients are zero.

Intuition: each independent vector adds a genuinely new direction. If a vector is dependent, it's "wasted" — it points somewhere the others already covered.

Worked Example: Testing Linear Independence

Are $\mathbf{v}_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$ and $\mathbf{v}_2 = \begin{bmatrix} 3 \\ 6 \end{bmatrix}$ linearly independent?

Test: Notice that $\mathbf{v}_2 = 3\mathbf{v}_1$. So $3\mathbf{v}_1 - 1\mathbf{v}_2 = \mathbf{0}$ with non-zero coefficients.
Conclusion: Dependent. $\mathbf{v}_2$ is just $\mathbf{v}_1$ scaled by 3. They point in the same direction and only span a line, not a plane.
Worked Example: Independent Vectors

Are $\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\mathbf{v}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ independent?

Suppose $c_1\begin{bmatrix} 1 \\ 0 \end{bmatrix} + c_2\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$. Then $c_1 = 0$ and $c_2 = 0$.
Conclusion: Independent. They point in completely different directions (perpendicular, even).
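Both tests can be run mechanically: stack the vectors as columns and check the matrix rank (full column rank means no vector is redundant). A sketch, with a helper name of my own choosing:

```python
import numpy as np

def independent(*vectors):
    # Stack the vectors as columns; they are independent iff the rank
    # equals the number of vectors (no column is a combination of the others).
    M = np.column_stack(vectors)
    return np.linalg.matrix_rank(M) == M.shape[1]

print(independent([1, 2], [3, 6]))   # False: the second vector is 3x the first
print(independent([1, 0], [0, 1]))   # True: the standard basis of R^2
```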

Basis and Dimension

A basis for a space is a set of vectors that is:

  1. Linearly independent (no redundancy), AND
  2. Spans the entire space (reaches everything).

The standard basis for $\mathbb{R}^3$ is:

$$\mathbf{e}_1 = \begin{bmatrix} 1\\0\\0 \end{bmatrix}, \quad \mathbf{e}_2 = \begin{bmatrix} 0\\1\\0 \end{bmatrix}, \quad \mathbf{e}_3 = \begin{bmatrix} 0\\0\\1 \end{bmatrix}$$

Any 3D vector is a combination: $\begin{bmatrix} 3\\-2\\5 \end{bmatrix} = 3\mathbf{e}_1 - 2\mathbf{e}_2 + 5\mathbf{e}_3$.

Key insight: The dimension of a space = the number of vectors in any basis. $\mathbb{R}^2$ has dimension 2 (you need exactly 2 independent vectors). $\mathbb{R}^n$ has dimension $n$. There's no basis with fewer or more vectors — any other basis for the same space has the same count.

Interactive: Span Visualiser (2D)

Two vectors in 2D. When they're independent, they span the whole plane (shown in blue). When dependent, they only span a line.

Real-World Example: RGB Colour Space

Your screen creates colours using three basis vectors: Red, Green, Blue. The "span" of {R, G, B} is every colour your screen can display. The dimension is 3 (you need exactly 3 primaries). If you only had R and G (dimension 2), you'd be stuck in a plane of the colour cube — no blues at all.

Any colour = $r \cdot \text{Red} + g \cdot \text{Green} + b \cdot \text{Blue}$ with $r,g,b \in [0,1]$. This is literally a linear combination with the colour primaries as basis vectors.

6. Organizing Inputs & Operations

We have a bunch of inputs to track, and predictable linear operations to perform. How do we organise?

Inputs go in vertical columns (vectors):

$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$

Operations go in horizontal rows. If $F(x,y,z) = 3x + 4y + 5z$, we abbreviate the entire function as the row $[3\;\;4\;\;5]$. Each row is a recipe: "take this much of input 1, this much of input 2, etc."

Multiple operations stack into rows; multiple inputs sit side-by-side as columns:

$$\underbrace{\begin{bmatrix} 3 & 4 & 5 \\ 3 & 0 & 0 \end{bmatrix}}_{\text{Operations } M} \underbrace{\begin{bmatrix} a & x \\ b & y \\ c & z \end{bmatrix}}_{\text{Inputs } A} = \underbrace{\begin{bmatrix} 3a+4b+5c & 3x+4y+5z \\ 3a & 3x \end{bmatrix}}_{\text{Outputs } B}$$

Read it as: "Row 1 says: take 3 of the first input + 4 of the second + 5 of the third. Row 2 says: take 3 of the first and ignore the rest."

The Mechanics: How to Multiply

To find entry $(i, j)$ of the output: take row $i$ of the operations matrix and column $j$ of the input matrix. Multiply corresponding entries and sum.

$$(\text{output})_{ij} = \sum_{k} M_{ik} \cdot A_{kj}$$

It's a dot product of row $i$ with column $j$.

Size convention: $m \times n$ means $m$ rows, $n$ columns. Multiply $[m \times \mathbf{n}] \cdot [\mathbf{n} \times p] = [m \times p]$. The inner dimensions must match — the number of columns in the first must equal the number of rows in the second.

Worked Example: Matrix-Vector Multiplication

Compute $\begin{bmatrix} 2 & -1 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 5 \\ 2 \end{bmatrix}$.

Row 1 · column: $(2)(5) + (-1)(2) = 10 - 2 = 8$
Row 2 · column: $(3)(5) + (4)(2) = 15 + 8 = 23$
Result: $\begin{bmatrix} 8 \\ 23 \end{bmatrix}$
Worked Example: Matrix-Matrix Multiplication

Compute $\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix}$.

(1,1): $(1)(5) + (2)(7) = 5 + 14 = 19$
(1,2): $(1)(6) + (2)(8) = 6 + 16 = 22$
(2,1): $(3)(5) + (4)(7) = 15 + 28 = 43$
(2,2): $(3)(6) + (4)(8) = 18 + 32 = 50$
Result: $\begin{bmatrix} 19 & 22 \\ 43 & 50 \end{bmatrix}$
Matrix multiplication is NOT commutative! In general, $AB \neq BA$. Think of it as "rotate then scale" vs "scale then rotate" — the order matters. Always read right to left: in $AB\mathbf{x}$, $B$ acts first, then $A$.
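Both worked examples, plus the order warning, checked in NumPy (`@` is matrix multiplication):

```python
import numpy as np

M = np.array([[2, -1], [3, 4]])
v = np.array([5, 2])
print(M @ v)    # [ 8 23], matching the matrix-vector example

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print(A @ B)    # [[19 22]
                #  [43 50]]
print(B @ A)    # [[23 34]
                #  [31 46]], a different matrix: AB != BA
```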
Real-World Example: RGB Colour Transformation

Digital cameras apply a $3 \times 3$ colour correction matrix to each pixel's RGB values:

$$\begin{bmatrix} R' \\ G' \\ B' \end{bmatrix} = \begin{bmatrix} 1.2 & -0.1 & 0.0 \\ -0.05 & 1.1 & -0.05 \\ 0.0 & -0.1 & 1.2 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$$

Each row is an operation: "How much of each input colour contributes to this output channel." Every phone camera and Instagram filter is a matrix applied to millions of pixel vectors.

7. Visualizing the Matrix

Imagine "pouring" each input column through each operation row. As an input passes an operation, it creates one output entry.

There's an even more powerful way to think about it: each column of the output is a linear combination of the columns of the operations matrix, with the input vector providing the coefficients.

$$M\mathbf{x} = x_1 \begin{bmatrix} | \\ \mathbf{m}_1 \\ | \end{bmatrix} + x_2 \begin{bmatrix} | \\ \mathbf{m}_2 \\ | \end{bmatrix} + x_3 \begin{bmatrix} | \\ \mathbf{m}_3 \\ | \end{bmatrix}$$

This is the "column picture" — the output is $x_1$ of column 1 $+ x_2$ of column 2 $+ x_3$ of column 3.

Key insight — Two ways to read $M\mathbf{x}$:
Row picture: each output entry is a dot product (row $\cdot$ input).
Column picture: the output is a blend of the columns of $M$, mixed according to $\mathbf{x}$.
Both give the same answer, but the column picture is often more intuitive for understanding what a matrix "does."
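The two pictures can be checked against each other directly. A sketch, using an arbitrary input vector:

```python
import numpy as np

M = np.array([[3, 4, 5],
              [3, 0, 0]])
x = np.array([2, -1, 4])

row_picture = M @ x   # each entry: a row of M dotted with x
col_picture = x[0] * M[:, 0] + x[1] * M[:, 1] + x[2] * M[:, 2]  # blend of columns

print(row_picture)    # [22  6]
print(col_picture)    # [22  6], same answer, different reading
```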

Interactive: Matrix Multiplier

Enter two matrices and watch the multiplication step by step. (2×3) × (3×2) = (2×2).

Real-World Example: Neural Network Forward Pass

In a neural network, each layer computes $\mathbf{y} = W\mathbf{x} + \mathbf{b}$, where $W$ is a weight matrix. For a layer with 3 inputs and 2 outputs:

$$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} w_{11} & w_{12} & w_{13} \\ w_{21} & w_{22} & w_{23} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$$

GPT, image classifiers, self-driving car vision — they're all stacks of matrix multiplications. Training adjusts the weights; inference is just pouring data through matrices.

8. Special Matrices & Their Superpowers

Some matrices appear everywhere because they do useful things:

Identity

$$I = \begin{bmatrix} 1&0&0 \\ 0&1&0 \\ 0&0&1 \end{bmatrix}$$

Copies input to output unchanged. $IA = A$ and $AI = A$. It's the "do nothing" matrix — the multiplicative equivalent of 1.

Zero Matrix

$$O = \begin{bmatrix} 0&0&0 \\ 0&0&0 \\ 0&0&0 \end{bmatrix}$$

Destroys everything. $OA = O$. The additive identity — $A + O = A$.

Diagonal

$$D = \begin{bmatrix} 2&0&0 \\ 0&-3&0 \\ 0&0&0.5 \end{bmatrix}$$

Scales each input independently. No mixing between components. Easiest matrix to understand.

Adder / Averager

$$\text{Add} = [1\;1\;1]$$ $$\text{Avg} = [\tfrac{1}{3}\;\tfrac{1}{3}\;\tfrac{1}{3}]$$

Sum all inputs, or average them. Single-row matrices that reduce a vector to a number.

Selector / Permutation

$$[1\;0\;0] \;\text{picks 1st input}$$ $$\begin{bmatrix} 1&0&0 \\ 0&0&1 \\ 0&1&0 \end{bmatrix} \text{swaps 2nd↔3rd}$$

Upper Triangular

$$\begin{bmatrix} 2&3&1 \\ 0&4&-1 \\ 0&0&5 \end{bmatrix}$$

Zeros below the diagonal. Appears in Gaussian elimination. Easy to solve by back-substitution.

Symmetric Matrices

A matrix is symmetric if $A = A^T$ (it equals its transpose — same when you flip rows↔columns). Symmetric matrices have special properties: their eigenvalues are always real, and their eigenvectors are orthogonal. They appear in physics, statistics, and optimisation.

$$A = \begin{bmatrix} 2 & 3 & 1 \\ 3 & 5 & -2 \\ 1 & -2 & 4 \end{bmatrix} \qquad A = A^T \;\text{(symmetric)}$$

Interactive: Operation Explorer

Select a preset operation matrix or type your own. See how it transforms the input vector.

Real-World Example: GPS Coordinate System Conversion

GPS satellites broadcast positions in the Earth-Centred Earth-Fixed (ECEF) system. Your phone converts to local East-North-Up (ENU) coordinates using a rotation matrix derived from your latitude $\phi$ and longitude $\lambda$:

$$R = \begin{bmatrix} -\sin\lambda & \cos\lambda & 0 \\ -\sin\phi\cos\lambda & -\sin\phi\sin\lambda & \cos\phi \\ \cos\phi\cos\lambda & \cos\phi\sin\lambda & \sin\phi \end{bmatrix}$$

Every time your phone shows a blue dot on a map, it has multiplied satellite vectors through this matrix.

9. The Transpose

The transpose of a matrix flips it across its main diagonal — rows become columns and columns become rows.

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \qquad A^T = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}$$

If $A$ is $m \times n$, then $A^T$ is $n \times m$. Entry $(A^T)_{ij} = A_{ji}$.

Useful Properties

$$(A^T)^T = A$$ $$(AB)^T = B^T A^T \qquad \text{(note the order reversal!)}$$ $$(A + B)^T = A^T + B^T$$

The transpose connects to the dot product: $\mathbf{a} \cdot \mathbf{b} = \mathbf{a}^T \mathbf{b}$. Writing the dot product as matrix multiplication makes many proofs cleaner.

Worked Example: Transpose and Dot Product

Compute $\mathbf{a}^T\mathbf{b}$ where $\mathbf{a} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix}$.

$\mathbf{a}^T\mathbf{b} = \begin{bmatrix} 1 & 2 & 3 \end{bmatrix}\begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix} = 1(4) + 2(5) + 3(6) = 4 + 10 + 18 = 32$

This is identical to the dot product $\mathbf{a} \cdot \mathbf{b} = 32$. The transpose turns a column into a row, enabling the "row × column" multiplication.
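These properties are easy to verify numerically. A sketch (the seeded random matrix is just a stand-in for any conformable $B$):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
print(A.T)            # 3x2: rows and columns swapped

rng = np.random.default_rng(0)
B = rng.random((3, 2))
print(np.allclose((A @ B).T, B.T @ A.T))   # True: note the order reversal

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(a @ b)          # 32, the dot product written as a^T b
```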

10. The Inverse Matrix

If a matrix $A$ transforms input into output ($\mathbf{y} = A\mathbf{x}$), the inverse $A^{-1}$ goes backwards: $\mathbf{x} = A^{-1}\mathbf{y}$. It "undoes" the transformation.

$$A A^{-1} = A^{-1} A = I$$

Applying $A$ then $A^{-1}$ (or vice versa) gets you back to where you started.

When Does the Inverse Exist?

Not all matrices have inverses. An inverse exists if and only if:

  1. The determinant is non-zero
  2. The columns are linearly independent (no column is a combination of the others)
  3. The matrix doesn't collapse space into a lower dimension: distinct inputs map to distinct outputs

These are all equivalent statements — if any one is true, they're all true.

The 2×2 Inverse Formula

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{ad - bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$

Swap $a$ and $d$, negate $b$ and $c$, divide by the determinant.

Worked Example: Finding a 2×2 Inverse

Find the inverse of $A = \begin{bmatrix} 3 & 1 \\ 5 & 2 \end{bmatrix}$.

Step 1: $\det(A) = 3(2) - 1(5) = 6 - 5 = 1$. Non-zero, so the inverse exists.
Step 2: $A^{-1} = \frac{1}{1}\begin{bmatrix} 2 & -1 \\ -5 & 3 \end{bmatrix} = \begin{bmatrix} 2 & -1 \\ -5 & 3 \end{bmatrix}$
Check: $AA^{-1} = \begin{bmatrix} 3 & 1 \\ 5 & 2 \end{bmatrix}\begin{bmatrix} 2 & -1 \\ -5 & 3 \end{bmatrix} = \begin{bmatrix} 6-5 & -3+3 \\ 10-10 & -5+6 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ ✓
Worked Example: A Singular Matrix (No Inverse)

Find the inverse of $A = \begin{bmatrix} 2 & 4 \\ 1 & 2 \end{bmatrix}$.

$\det(A) = 2(2) - 4(1) = 4 - 4 = 0$. The determinant is zero!
No inverse exists. Notice column 2 = $2 \times$ column 1. The columns are dependent — the matrix collapses 2D space onto a line. You can't undo that (infinitely many points get mapped to each point on the line).
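Both cases in NumPy, as a sketch: the invertible matrix round-trips to the identity, while the singular one has determinant zero (and `np.linalg.inv` would raise `LinAlgError` on it):

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [5.0, 2.0]])
A_inv = np.linalg.inv(A)
print(A_inv)              # ≈ [[ 2. -1.]
                          #    [-5.  3.]]
print(A @ A_inv)          # the 2x2 identity (up to rounding)

S = np.array([[2.0, 4.0],
              [1.0, 2.0]])
print(np.linalg.det(S))   # ≈ 0: singular, no inverse
```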

Interactive: 2×2 Inverse Calculator

Real-World Example: Decrypting Messages (Hill Cipher)

The Hill cipher encrypts a message by multiplying letter-pair vectors by a key matrix $K$. To decrypt, multiply by $K^{-1}$. If $K = \begin{bmatrix} 3 & 3 \\ 2 & 5 \end{bmatrix}$, then $K^{-1} = \frac{1}{9}\begin{bmatrix} 5 & -3 \\ -2 & 3 \end{bmatrix}$ (mod 26). The inverse literally reverses the encryption — no inverse means no decryption, which is why singular matrices make bad encryption keys.

11. Real Example: Stock Portfolios

Let's see linear algebra as a "mini-spreadsheet" in action.

Suppose a new product launches: Apple stock jumps 20%, Google drops 5%, Microsoft stays flat. We want to (1) update each stock value and (2) compute total profit.

$$\underbrace{\begin{bmatrix} 1.2 & 0 & 0 \\ 0 & 0.95 & 0 \\ 0 & 0 & 1 \\ 0.2 & -0.05 & 0 \end{bmatrix}}_{\text{Operations}} \begin{bmatrix} \text{Apple} \\ \text{Google} \\ \text{Microsoft} \end{bmatrix} = \begin{bmatrix} \text{New Apple} \\ \text{New Google} \\ \text{New Microsoft} \\ \text{Profit} \end{bmatrix}$$

Three inputs enter, four outputs leave. The first three rows are a "modified identity" (update each value); the fourth row computes the change. Read each row as a recipe!

Worked Example: Portfolio with Numbers

Holdings: AAPL = $1000, GOOG = $2000, MSFT = $500.

Row 1 (new AAPL): $1.2 \times 1000 + 0 \times 2000 + 0 \times 500 = \$1200$
Row 2 (new GOOG): $0 \times 1000 + 0.95 \times 2000 + 0 \times 500 = \$1900$
Row 3 (new MSFT): $0 \times 1000 + 0 \times 2000 + 1 \times 500 = \$500$
Row 4 (profit): $0.2 \times 1000 + (-0.05) \times 2000 + 0 \times 500 = 200 - 100 + 0 = \$100$
Total portfolio: was $3500, now $3600. Net gain: $100.
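The whole worked example is one matrix-vector product. A sketch:

```python
import numpy as np

ops = np.array([
    [1.2,   0.0,  0.0],   # row 1: new AAPL
    [0.0,   0.95, 0.0],   # row 2: new GOOG
    [0.0,   0.0,  1.0],   # row 3: new MSFT (unchanged)
    [0.2,  -0.05, 0.0],   # row 4: profit
])
holdings = np.array([1000.0, 2000.0, 500.0])

print(ops @ holdings)   # [1200. 1900.  500.  100.]
```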

Interactive: Portfolio Calculator

Enter stock holdings and market changes. The matrix does the rest.

Real-World Example: Leontief Input-Output Economic Model

Nobel laureate Wassily Leontief modelled entire economies with matrices. If three industries each consume outputs of the others, the total production $\mathbf{x}$ needed to meet external demand $\mathbf{d}$ is:

$$\mathbf{x} = (I - A)^{-1}\mathbf{d}$$

Governments and the World Bank still use this matrix model to forecast economic ripple effects.

12. Geometric Transformations

When we treat inputs as 2D coordinates, a $2 \times 2$ matrix becomes a geometric transformation. Here are the big four:

Scale: $$\begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix}$$

Stretch horizontally by $s_x$, vertically by $s_y$.

Rotate by $\theta$: $$\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

Rotates every point by angle $\theta$ counter-clockwise around the origin.

Reflect (x-axis): $$\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$$

Flips the y-coordinate. Like looking in a mirror along the x-axis.

Shear: $$\begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}$$

Tilts shapes sideways. $k > 0$ leans right; $k < 0$ leans left.

Key insight — What the columns tell you: The first column of a transformation matrix is where $(1,0)$ lands. The second column is where $(0,1)$ lands. So a $2 \times 2$ matrix is completely determined by what it does to the two basis vectors. Everything else follows by linearity!
Worked Example: Where Does a Point Land After Rotation?

Rotate $(3, 1)$ by $90°$ counter-clockwise.

Step 1: The rotation matrix for $90°$: $\begin{bmatrix} \cos 90° & -\sin 90° \\ \sin 90° & \cos 90° \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$
Step 2: $\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 3 \\ 1 \end{bmatrix} = \begin{bmatrix} 0(3)+(-1)(1) \\ 1(3)+0(1) \end{bmatrix} = \begin{bmatrix} -1 \\ 3 \end{bmatrix}$
Check: $(3,1)$ rotated $90°$ CCW should become $(-1,3)$. The x becomes $-y$ and the y becomes $x$. ✓
Worked Example: Building a Transformation from Scratch

Find the matrix that reflects across the line $y = x$.

Step 1: Where does $(1,0)$ go? Reflecting $(1,0)$ across $y=x$ gives $(0,1)$. That's column 1.
Step 2: Where does $(0,1)$ go? Reflecting $(0,1)$ across $y=x$ gives $(1,0)$. That's column 2.
Result: $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$. This swaps x and y coordinates — exactly what "reflect across $y=x$" means!
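Both worked examples as code — a sketch where `rotation` is a helper of my own naming, built straight from the formula above:

```python
import numpy as np

def rotation(theta_deg):
    # Counter-clockwise rotation matrix for an angle given in degrees.
    t = np.radians(theta_deg)
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

p = np.array([3.0, 1.0])
print(rotation(90) @ p)          # ≈ [-1.  3.], as in the worked example

reflect_yx = np.array([[0.0, 1.0],
                       [1.0, 0.0]])   # columns: where (1,0) and (0,1) land
print(reflect_yx @ p)            # [1. 3.]: x and y swapped
```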

Interactive: 2D Transformation Playground

Pick a transform and watch a unit square warp in real time. The grid shows how the entire space is affected.

Real-World Example: Computer Graphics & Game Engines

Every frame in a 3D game, each vertex is transformed by a series of matrices:

$$\mathbf{v}_{\text{screen}} = P \cdot V \cdot M \cdot \mathbf{v}_{\text{model}}$$

M (Model) positions the object. V (View) moves relative to the camera. P (Projection) flattens 3D to 2D. At 60 FPS with millions of vertices, GPUs perform billions of matrix multiplications per second.

13. Solving Systems of Equations

A system of linear equations can be written as a single matrix equation $M\mathbf{x} = \mathbf{b}$:

$$\begin{cases} x + 2y + 3z = 3 \\ 2x + 3y + z = -10 \\ 5x - y + 2z = 14 \end{cases} \;\;\Longleftrightarrow\;\; \begin{bmatrix} 1&2&3 \\ 2&3&1 \\ 5&-1&2 \end{bmatrix} \begin{bmatrix} x\\y\\z \end{bmatrix} = \begin{bmatrix} 3\\-10\\14 \end{bmatrix}$$

If $M$ is invertible, the solution is simply $\mathbf{x} = M^{-1}\mathbf{b}$. But we usually don't compute the inverse directly — instead we use Gaussian elimination.

Gaussian Elimination — Step by Step

The idea: use legal row operations (add/subtract multiples of rows) to turn the matrix into an upper triangle, then solve from the bottom up.

Legal operations:

  1. Swap two rows
  2. Multiply a row by a non-zero constant
  3. Add a multiple of one row to another
Worked Example: Gaussian Elimination (2×2)

Solve: $2x + y = 5$ and $x - y = 1$.

Augmented matrix: $\left[\begin{array}{cc|c} 2 & 1 & 5 \\ 1 & -1 & 1 \end{array}\right]$
Step 1: Swap rows so the pivot is 1: $\left[\begin{array}{cc|c} 1 & -1 & 1 \\ 2 & 1 & 5 \end{array}\right]$
Step 2: $R_2 \leftarrow R_2 - 2R_1$: $\left[\begin{array}{cc|c} 1 & -1 & 1 \\ 0 & 3 & 3 \end{array}\right]$
Step 3: Back-substitute. From row 2: $3y = 3 \Rightarrow y = 1$. From row 1: $x - 1 = 1 \Rightarrow x = 2$.
Solution: $(x, y) = (2, 1)$. Check: $2(2)+1=5$ ✓ and $2-1=1$ ✓.
Worked Example: Gaussian Elimination (3×3)

Solve: $x + 2y + z = 9$, $2x - y + 3z = 8$, $3x + y - z = 3$.

Augmented matrix: $\left[\begin{array}{ccc|c} 1 & 2 & 1 & 9 \\ 2 & -1 & 3 & 8 \\ 3 & 1 & -1 & 3 \end{array}\right]$
Step 1: $R_2 \leftarrow R_2 - 2R_1$: $\left[\begin{array}{ccc|c} 1 & 2 & 1 & 9 \\ 0 & -5 & 1 & -10 \\ 3 & 1 & -1 & 3 \end{array}\right]$
Step 2: $R_3 \leftarrow R_3 - 3R_1$: $\left[\begin{array}{ccc|c} 1 & 2 & 1 & 9 \\ 0 & -5 & 1 & -10 \\ 0 & -5 & -4 & -24 \end{array}\right]$
Step 3: $R_3 \leftarrow R_3 - R_2$: $\left[\begin{array}{ccc|c} 1 & 2 & 1 & 9 \\ 0 & -5 & 1 & -10 \\ 0 & 0 & -5 & -14 \end{array}\right]$
Back-substitution:
Row 3: $-5z = -14 \Rightarrow z = 14/5 = 2.8$
Row 2: $-5y + z = -10 \Rightarrow -5y + 2.8 = -10 \Rightarrow y = 2.56$
Row 1: $x + 2(2.56) + 2.8 = 9 \Rightarrow x = 9 - 5.12 - 2.8 = 1.08$
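The elimination recipe above can be sketched in plain Python. This is a minimal illustration, not a library routine: `gaussian_solve` is a name I made up, and it adds partial pivoting (swapping in the row with the largest pivot) for numerical safety.

```python
def gaussian_solve(A, b):
    """Solve A x = b by Gaussian elimination with back-substitution."""
    n = len(A)
    # Augmented matrix [A | b], copied so the inputs stay untouched.
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        # Partial pivoting: move the largest entry in this column up.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        # Legal operation 3: subtract multiples of the pivot row below it.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back-substitution from the bottom row up.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - known) / M[r][r]
    return x

sol = gaussian_solve([[1, 2, 1], [2, -1, 3], [3, 1, -1]], [9, 8, 3])
print(sol)   # → approximately [1.08, 2.56, 2.8]
```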

Three Possible Outcomes

| Case | What happens | Geometrically (2D) |
|---|---|---|
| Unique solution | $\det \neq 0$, lines/planes meet at one point | Two lines cross at one point |
| No solution | Contradictory equations (e.g. $0 = 5$) | Parallel lines, never meet |
| Infinite solutions | Redundant equations, free variables | Same line, every point is a solution |

Interactive: 2×2 System Solver

Enter coefficients for two equations. See the solution graphically (intersection of two lines).

Real-World Example: Circuit Analysis (Kirchhoff's Laws)

An electrical circuit with 3 loops and 3 unknown currents yields:

$$\begin{bmatrix} 10 & -4 & 0 \\ -4 & 12 & -6 \\ 0 & -6 & 8 \end{bmatrix} \begin{bmatrix} I_1\\I_2\\I_3 \end{bmatrix} = \begin{bmatrix} 12\\0\\-5 \end{bmatrix}$$

SPICE simulators solve systems of thousands of linear equations per simulation step.

14. Determinants

The determinant is a single number that captures the "essence" of a square matrix. It measures how the matrix scales area (2D) or volume (3D).

2×2: $$\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc$$

3×3 (cofactor expansion along the first row): $$\det \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} = a(ei - fh) - b(di - fg) + c(dh - eg)$$

What the Determinant Tells You

| $\det(A)$ | Meaning |
|---|---|
| $\lvert\det\rvert > 1$ | The transformation expands area/volume |
| $\lvert\det\rvert < 1$ | The transformation shrinks area/volume |
| $\lvert\det\rvert = 1$ | Area/volume is preserved (e.g., rotations) |
| $\det = 0$ | Singular! Output collapses to a lower dimension. No inverse. |
| $\det < 0$ | Orientation is flipped (mirror reflection) |
Key insight: Feed in a unit square; the determinant tells you the signed area of the output parallelogram. This makes the determinant a "compression/expansion detector" — if it's zero, information is being destroyed.
Worked Example: 2×2 Determinant

$\det\begin{bmatrix} 3 & 1 \\ 2 & 4 \end{bmatrix} = (3)(4) - (1)(2) = 12 - 2 = 10$

The matrix scales area by a factor of 10. Since it's positive, orientation is preserved.

Worked Example: 3×3 Determinant

$\det\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$

$= 1(5 \cdot 9 - 6 \cdot 8) - 2(4 \cdot 9 - 6 \cdot 7) + 3(4 \cdot 8 - 5 \cdot 7)$
$= 1(45 - 48) - 2(36 - 42) + 3(32 - 35)$
$= 1(-3) - 2(-6) + 3(-3) = -3 + 12 - 9 = 0$

Determinant is 0! This matrix is singular. Why? Row 3 = Row 1 + Row 2. The three row vectors are linearly dependent — they all lie in a plane, so they can't fill 3D space.

Useful Properties of Determinants

$$\det(AB) = \det(A) \cdot \det(B) \qquad \text{(composition multiplies scaling factors)}$$

$$\det(A^{-1}) = \frac{1}{\det(A)} \qquad \text{(inverting inverts the scaling)}$$

$$\det(A^T) = \det(A) \qquad \text{(transposing doesn't change the determinant)}$$

$$\det(cA) = c^n \det(A) \qquad \text{(for an } n \times n \text{ matrix)}$$
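As a quick sanity check on the formulas above, here is a tiny recursive determinant via cofactor expansion along the first row. The `det` helper is my own; it is fine for small matrices but exponential in $n$, so real code uses elimination instead.

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

print(det([[3, 1], [2, 4]]))                   # → 10
print(det([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # → 0 (singular)
```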

Interactive: Determinant Visualiser

Adjust matrix entries. The shaded region shows the transformed unit square and its area = |det|.

Real-World Example: Structural Engineering — Stability Check

In finite element analysis, the stiffness matrix $K$ relates forces to displacements: $K\mathbf{u} = \mathbf{f}$. If $\det(K) = 0$, the structure has a mechanism — it can move freely without resistance. Engineers check the determinant to verify structural stability before construction.

15. Null Space & Column Space

Every matrix defines two important subspaces that tell you fundamental things about what the matrix can and cannot do.

Column Space (Range)

The column space of $A$ = the set of all possible outputs $A\mathbf{x}$. It's the span of the columns of $A$.

$$\text{Col}(A) = \{A\mathbf{x} \;|\; \mathbf{x} \in \mathbb{R}^n\} = \text{span}\{\text{columns of } A\}$$

If $\mathbf{b}$ is in the column space, then $A\mathbf{x} = \mathbf{b}$ has a solution. If it's not, no solution exists.

The rank of a matrix = dimension of the column space = number of linearly independent columns.

Null Space (Kernel)

The null space of $A$ = the set of all inputs that get mapped to zero.

$$\text{Null}(A) = \{\mathbf{x} \;|\; A\mathbf{x} = \mathbf{0}\}$$

If the null space is just $\{\mathbf{0}\}$, the matrix is injective (one-to-one) — different inputs always produce different outputs.

If the null space contains non-zero vectors, some information is being destroyed.

The Rank-Nullity Theorem: $$\text{rank}(A) + \text{nullity}(A) = n \quad \text{(number of columns)}$$

The number of "useful" dimensions (rank) plus the number of "destroyed" dimensions (nullity) always equals the total number of input dimensions. What the matrix doesn't use goes to zero.

Worked Example: Finding the Null Space

Find the null space of $A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \end{bmatrix}$.

Step 1: Solve $A\mathbf{x} = \mathbf{0}$. Row 2 = $2 \times$ Row 1, so after elimination: $\begin{bmatrix} 1 & 2 & 3 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} x_1\\x_2\\x_3 \end{bmatrix} = \begin{bmatrix} 0\\0 \end{bmatrix}$
Step 2: Only one equation: $x_1 + 2x_2 + 3x_3 = 0$, so $x_1 = -2x_2 - 3x_3$.
Step 3: $x_2$ and $x_3$ are free. Setting $(x_2, x_3) = (1, 0)$ and $(0, 1)$: $$\text{Null}(A) = \text{span}\left\{\begin{bmatrix} -2\\1\\0 \end{bmatrix}, \begin{bmatrix} -3\\0\\1 \end{bmatrix}\right\}$$
Check rank-nullity: Rank = 1 (one independent row), Nullity = 2. Sum = 3 = number of columns. ✓
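It is worth confirming numerically that both basis vectors really map to zero. A minimal sketch (the `matvec` helper is my own):

```python
def matvec(A, x):
    """Multiply matrix A by vector x: each row of A dotted with x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 2, 3], [2, 4, 6]]
for v in ([-2, 1, 0], [-3, 0, 1]):
    print(matvec(A, v))   # → [0, 0] both times
```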
Worked Example: Column Space

Find the column space of $A = \begin{bmatrix} 1 & 3 \\ 2 & 6 \end{bmatrix}$.

Column 2 = $3 \times$ Column 1. So the columns are dependent.
$\text{Col}(A) = \text{span}\left\{\begin{bmatrix} 1\\2 \end{bmatrix}\right\}$ — a line through the origin in the direction $(1,2)$.
Rank = 1. Despite being a $2\times 2$ matrix, it can only produce outputs along one line!
For example, $A\mathbf{x} = \begin{bmatrix} 1\\1 \end{bmatrix}$ has no solution because $(1,1)$ is not on the line $y = 2x$.
Real-World Example: Underdetermined Systems & Compressed Sensing

MRI machines acquire far fewer measurements than pixels in the final image. The measurement matrix $A$ maps the unknown image $\mathbf{x}$ to the measurements $\mathbf{b}$: $A\mathbf{x} = \mathbf{b}$. Since $A$ has more columns than rows, the null space is non-trivial — infinitely many images could produce the same measurements. Compressed sensing adds the constraint "find the sparsest $\mathbf{x}$" to pick the right one. Understanding null space = understanding which information is lost.

16. Eigenvectors & Eigenvalues

This is one of the most important ideas in all of linear algebra.

Consider spinning a globe: every point moves to a new position — except the points on the axis (the poles). In matrix terms, most vectors change direction when you apply a matrix. But some special vectors only get scaled:

$$M\mathbf{v} = \lambda \mathbf{v}$$

An eigenvector $\mathbf{v}$ is an input whose direction is unchanged by the matrix: it only gets scaled by the factor $\lambda$ (the eigenvalue).

"Eigen" is German for "own" or "characteristic" — these are the matrix's own vectors, the directions it naturally acts along.

What Eigenvalues Tell You

| $\lambda$ | What happens to the eigenvector |
|---|---|
| $\lambda > 1$ | Stretches (gets longer) |
| $0 < \lambda < 1$ | Shrinks (gets shorter) |
| $\lambda = 1$ | Unchanged (fixed direction AND length) |
| $\lambda = 0$ | Collapsed to zero (this direction is destroyed) |
| $\lambda < 0$ | Flipped direction and scaled |
| Complex $\lambda$ | Rotation (no real eigenvector stays on its line) |

How to Find Eigenvalues

Start from $M\mathbf{v} = \lambda\mathbf{v}$, rewrite as $(M - \lambda I)\mathbf{v} = \mathbf{0}$. For a non-zero $\mathbf{v}$ to exist, the matrix $(M - \lambda I)$ must be singular:

$$\det(M - \lambda I) = 0 \qquad \text{(the "characteristic equation")}$$

Solve this polynomial for $\lambda$. Then find eigenvectors by solving $(M - \lambda I)\mathbf{v} = \mathbf{0}$.

Worked Example: Finding Eigenvalues and Eigenvectors

Find the eigenvalues and eigenvectors of $A = \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix}$.

Step 1 — Characteristic equation: $\det(A - \lambda I) = \det\begin{bmatrix} 4-\lambda & 1 \\ 2 & 3-\lambda \end{bmatrix} = (4-\lambda)(3-\lambda) - 2 = 0$
$\lambda^2 - 7\lambda + 12 - 2 = 0 \Rightarrow \lambda^2 - 7\lambda + 10 = 0 \Rightarrow (\lambda - 5)(\lambda - 2) = 0$
$\lambda_1 = 5, \quad \lambda_2 = 2$
Step 2 — Eigenvector for $\lambda_1 = 5$: $(A - 5I)\mathbf{v} = \begin{bmatrix} -1 & 1 \\ 2 & -2 \end{bmatrix}\mathbf{v} = \mathbf{0}$
Row 1: $-v_1 + v_2 = 0 \Rightarrow v_2 = v_1$. So $\mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ (or any scalar multiple).
Step 3 — Eigenvector for $\lambda_2 = 2$: $(A - 2I)\mathbf{v} = \begin{bmatrix} 2 & 1 \\ 2 & 1 \end{bmatrix}\mathbf{v} = \mathbf{0}$
$2v_1 + v_2 = 0 \Rightarrow v_2 = -2v_1$. So $\mathbf{v}_2 = \begin{bmatrix} 1 \\ -2 \end{bmatrix}$.
Check: $A\mathbf{v}_1 = \begin{bmatrix} 4+1 \\ 2+3 \end{bmatrix} = \begin{bmatrix} 5 \\ 5 \end{bmatrix} = 5\begin{bmatrix} 1\\1 \end{bmatrix}$ ✓
$A\mathbf{v}_2 = \begin{bmatrix} 4-2 \\ 2-6 \end{bmatrix} = \begin{bmatrix} 2 \\ -4 \end{bmatrix} = 2\begin{bmatrix} 1\\-2 \end{bmatrix}$ ✓
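For any 2×2 matrix the characteristic polynomial is $\lambda^2 - \mathrm{tr}(A)\,\lambda + \det(A) = 0$, so the quadratic formula gives both eigenvalues directly. A sketch that assumes real eigenvalues (`eig2x2` is a made-up name):

```python
import math

def eig2x2(A):
    """Real eigenvalues of a 2x2 matrix via the characteristic quadratic."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = math.sqrt(tr * tr - 4 * det)   # assumes a non-negative discriminant
    return (tr + disc) / 2, (tr - disc) / 2

print(eig2x2([[4, 1], [2, 3]]))   # → (5.0, 2.0)
```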

Why Eigenvectors Matter

Eigenvectors reveal the "natural axes" of a transformation. Along these axes, the matrix acts simply (just scaling). In any other direction, it does complicated stretching + rotating. This is why eigenvectors are central to diagonalisation, PCA, PageRank, and vibration analysis, as the sections below show.

Interactive: Eigenvector Explorer

Set a 2×2 matrix. The eigenvectors (red/blue lines) stay on their line when transformed.

Real-World Example: Google PageRank

The web is a giant matrix $M$ where $M_{ij}$ = probability of clicking from page $j$ to page $i$. The principal eigenvector (with $\lambda = 1$) gives the steady-state probability of a random surfer being on each page — this is the PageRank score:

$$M\mathbf{r} = 1 \cdot \mathbf{r}$$

Pages with high eigenvector components rank higher. The same eigenvector technique powers social-network influence scores, recommendation engines, and epidemiological models.
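In practice nobody solves the characteristic polynomial of a billion-page matrix; the principal eigenvector is found by power iteration, which simply applies $M$ over and over to any starting distribution. A toy sketch on a hypothetical 3-page web (the link matrix below is invented for illustration):

```python
def power_iteration(M, steps=100):
    """Repeatedly apply column-stochastic M to a uniform start vector."""
    n = len(M)
    r = [1.0 / n] * n
    for _ in range(steps):
        r = [sum(M[i][j] * r[j] for j in range(n)) for i in range(n)]
    return r

# Column j says where page j's links send a random surfer (columns sum to 1):
# page 1 links to pages 2 and 3, page 2 links to page 3, page 3 links to page 1.
M = [[0.0, 0.0, 1.0],
     [0.5, 0.0, 0.0],
     [0.5, 1.0, 0.0]]
print(power_iteration(M))   # → close to [0.4, 0.2, 0.4]
```

Page 3 ends up with twice page 2's score because it collects links from both other pages, while page 2 is linked only from page 1.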

Real-World Example: Vibrational Modes of a Bridge

A bridge's stiffness matrix $K$ and mass matrix $M$ define the eigenvalue problem $K\mathbf{v} = \omega^2 M\mathbf{v}$. Each eigenvector $\mathbf{v}$ is a natural vibration mode (the shape the bridge oscillates in), and $\omega$ is the frequency. The Tacoma Narrows Bridge collapsed in 1940 because wind excited an eigenmode — the bridge oscillated along an eigenvector of its structural matrix until it broke apart. Engineers now compute these eigenvalues to ensure no natural frequency matches expected wind or traffic patterns.

17. Change of Basis

We've been using the standard basis $\{(1,0), (0,1)\}$, but any set of independent vectors can serve as a basis. Change of basis translates coordinates between different "reference frames."

Key insight: A matrix that looks complicated in one basis might look simple (even diagonal!) in another. Finding the right basis is often the whole point of an analysis.

How It Works

Suppose a new basis $B = \{\mathbf{b}_1, \mathbf{b}_2\}$. The change-of-basis matrix $P$ has the new basis vectors as columns:

$$P = \begin{bmatrix} | & | \\ \mathbf{b}_1 & \mathbf{b}_2 \\ | & | \end{bmatrix}$$

From new to standard: $\mathbf{v}_{\text{standard}} = P \cdot \mathbf{v}_{\text{new basis}}$

From standard to new: $\mathbf{v}_{\text{new basis}} = P^{-1} \cdot \mathbf{v}_{\text{standard}}$

Changing the Basis of a Matrix

If $A$ is a transformation expressed in the standard basis, the same transformation expressed in basis $B$ is:

$$A_B = P^{-1} A P$$

Read right-to-left: convert from $B$-coords to standard ($P$), apply the transformation ($A$), convert back to $B$-coords ($P^{-1}$).

Worked Example: Diagonalising a Matrix

The matrix $A = \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix}$ has eigenvectors $\mathbf{v}_1 = \begin{bmatrix} 1\\1 \end{bmatrix}$ ($\lambda_1=5$) and $\mathbf{v}_2 = \begin{bmatrix} 1\\-2 \end{bmatrix}$ ($\lambda_2=2$).

Step 1: Form $P$ from eigenvectors as columns: $P = \begin{bmatrix} 1 & 1 \\ 1 & -2 \end{bmatrix}$
Step 2: In the eigenvector basis: $P^{-1}AP = \begin{bmatrix} 5 & 0 \\ 0 & 2 \end{bmatrix} = D$ (diagonal!)
The transformation is just "stretch by 5 in the $\mathbf{v}_1$ direction and by 2 in the $\mathbf{v}_2$ direction."
Bonus: $A^{100} = PD^{100}P^{-1} = P\begin{bmatrix} 5^{100} & 0 \\ 0 & 2^{100} \end{bmatrix}P^{-1}$ — trivially easy to compute high powers!
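The bonus claim is easy to verify with exact rational arithmetic, here with $k = 10$ to keep the numbers printable. Helper names are mine, and $P^{-1}$ is entered by hand from $\det P = -3$:

```python
from fractions import Fraction as F

def matmul(A, B):
    """2x2 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

P     = [[F(1), F(1)], [F(1), F(-2)]]
P_inv = [[F(2, 3), F(1, 3)], [F(1, 3), F(-1, 3)]]  # inverse of P (det = -3)
D10   = [[F(5) ** 10, F(0)], [F(0), F(2) ** 10]]   # D^10: just two powers

A10_fast = matmul(matmul(P, D10), P_inv)

# Compare against ten direct multiplications of A by itself.
A = [[F(4), F(1)], [F(2), F(3)]]
A10_slow = A
for _ in range(9):
    A10_slow = matmul(A10_slow, A)

print(A10_fast == A10_slow)   # → True
```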
Real-World Example: PCA — Principal Component Analysis

In data science, you compute the covariance matrix of your data, find its eigenvectors (the "principal components"), and change basis to these eigenvectors. In the new basis, the data is uncorrelated and sorted by variance. This reveals which features actually matter and lets you compress high-dimensional data (e.g., 1000 gene expressions → 5 principal components) without losing much information.

18. Matrix Composition

A powerful idea: a matrix can itself be the input to another matrix. Multiplying one transformation matrix by another gives a single new matrix that performs both transformations in order:

$$T \cdot N = X$$

$X$ first applies $N$, then $T$. We combined the operations themselves — no data needed.

Want to apply the same transform $k$ times? Use $M^k$.

Order matters! Rotate then scale ≠ Scale then rotate (usually). $AB \neq BA$.
Worked Example: Composing Rotation + Scale

First rotate by $90°$, then scale by 2.

Rotation: $R = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$ (90° CCW)
Scale: $S = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}$
Composed: $SR = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & -2 \\ 2 & 0 \end{bmatrix}$
Test: $(1,0) \xrightarrow{R} (0,1) \xrightarrow{S} (0,2)$. Also: $\begin{bmatrix} 0&-2\\2&0 \end{bmatrix}\begin{bmatrix} 1\\0 \end{bmatrix} = \begin{bmatrix} 0\\2 \end{bmatrix}$ ✓
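The same composition in code: multiply the two matrices once, and the product transforms any point in a single step (a minimal sketch; `matmul` is my own helper):

```python
def matmul(A, B):
    """Matrix product: row i of A dotted with column j of B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

R = [[0, -1], [1, 0]]   # rotate 90 degrees CCW
S = [[2, 0], [0, 2]]    # scale by 2

SR = matmul(S, R)       # one matrix that applies R first, then S
print(SR)                        # → [[0, -2], [2, 0]]
print(matmul(SR, [[1], [0]]))    # → [[0], [2]]
```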

Interactive: Compose Two Transforms

Pick two 2D transformations. See the individual and combined effects on the unit square.

Real-World Example: Robot Arm Kinematics

A robot arm computes the gripper position by composing rotation matrices for each joint:

$$T_{\text{gripper}} = T_{\text{base}} \cdot R_1(\theta_1) \cdot R_2(\theta_2) \cdot R_3(\theta_3) \cdot T_{\text{tool}}$$

Factory robots, surgical arms, and Mars rovers chain dozens of matrices for sub-millimetre precision.

19. Homogeneous Coordinates & Translation

Linear transformations always keep the origin fixed. But what about translation (sliding everything over)? Translation is NOT linear — it violates $F(\mathbf{0}) = \mathbf{0}$.

The trick: add a dummy "1" entry to the input. Now the matrix has an extra column to add constants:

$$\begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} x + t_x \\ y + t_y \\ 1 \end{bmatrix}$$

We pretend the input lives in one higher dimension and place a "1" there. A shear in the higher dimension looks like a slide (translation) in the original dimensions. The dummy entry stays 1, ready for more slides.

Key insight: Now we can combine rotation, scale, AND translation in a single matrix multiply. This is how every game engine, graphics API, and robotics system works — all transformations are $4 \times 4$ (or $3 \times 3$ for 2D) homogeneous matrices.
Worked Example: Rotate then Translate

Rotate $(2, 0)$ by $90°$, then translate by $(3, 1)$.

Build the combined matrix: $T = \begin{bmatrix} \cos90° & -\sin90° & 3 \\ \sin90° & \cos90° & 1 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & -1 & 3 \\ 1 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix}$
Apply: $\begin{bmatrix} 0 & -1 & 3 \\ 1 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 2 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0+0+3 \\ 2+0+1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \\ 1 \end{bmatrix}$
Result: $(2, 0) \xrightarrow{\text{rotate 90°}} (0, 2) \xrightarrow{\text{translate (3,1)}} (3, 3)$ ✓
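The worked example translates directly into a sketch (function names are mine; angles in degrees):

```python
import math

def hom_transform(angle_deg, tx, ty):
    """3x3 homogeneous matrix: rotate by angle, then translate by (tx, ty)."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [[c, -s, tx], [s, c, ty], [0, 0, 1]]

def apply(T, x, y):
    """Apply a homogeneous transform to point (x, y); dummy entry is 1."""
    p = [x, y, 1]
    out = [sum(T[i][k] * p[k] for k in range(3)) for i in range(3)]
    return out[0], out[1]

T = hom_transform(90, 3, 1)
print(apply(T, 2, 0))   # → approximately (3.0, 3.0)
```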

Interactive: Translation + Rotation

Combine rotation and translation — something a plain 2×2 matrix can't do alone.

Real-World Example: Autonomous Vehicle Localisation

A self-driving car uses $4 \times 4$ homogeneous matrices to track its position. Each LiDAR point cloud is transformed from sensor frame to world frame:

$$\mathbf{p}_{\text{world}} = T_{\text{world←car}} \cdot T_{\text{car←sensor}} \cdot \mathbf{p}_{\text{sensor}}$$

Without homogeneous coordinates, you'd need separate rotation and addition steps. Every self-driving car, drone, and warehouse robot composes these matrices thousands of times per second.

20. The Cross Product (3D)

The cross product takes two 3D vectors and returns a third vector that is perpendicular to both. It only works in 3D.

$$\mathbf{a} \times \mathbf{b} = \begin{bmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{bmatrix}$$

Magnitude: $\|\mathbf{a} \times \mathbf{b}\| = \|\mathbf{a}\|\|\mathbf{b}\|\sin\theta$ = area of the parallelogram formed by $\mathbf{a}$ and $\mathbf{b}$.

Direction follows the right-hand rule: point your fingers from $\mathbf{a}$ toward $\mathbf{b}$; your thumb points in the direction of $\mathbf{a} \times \mathbf{b}$.

Key properties:
$\mathbf{a} \times \mathbf{b} = -\mathbf{b} \times \mathbf{a}$ (anti-commutative — order matters!)
$\mathbf{a} \times \mathbf{a} = \mathbf{0}$ (a vector crossed with itself is zero)
$(\mathbf{a} \times \mathbf{b}) \cdot \mathbf{a} = 0$ and $(\mathbf{a} \times \mathbf{b}) \cdot \mathbf{b} = 0$ (result is perpendicular to both inputs)
Worked Example: Computing a Cross Product

$\begin{bmatrix} 1\\2\\3 \end{bmatrix} \times \begin{bmatrix} 4\\5\\6 \end{bmatrix}$

$x$-component: $(2)(6) - (3)(5) = 12 - 15 = -3$
$y$-component: $(3)(4) - (1)(6) = 12 - 6 = 6$
$z$-component: $(1)(5) - (2)(4) = 5 - 8 = -3$
Result: $\begin{bmatrix} -3\\6\\-3 \end{bmatrix}$
Check perpendicularity: $\begin{bmatrix} -3\\6\\-3 \end{bmatrix} \cdot \begin{bmatrix} 1\\2\\3 \end{bmatrix} = -3+12-9 = 0$ ✓
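The component formula maps straight into code, along with the perpendicularity check (a minimal sketch; `cross` and `dot` are my own helpers):

```python
def cross(a, b):
    """Cross product of two 3D vectors, component by component."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

c = cross([1, 2, 3], [4, 5, 6])
print(c)                                      # → [-3, 6, -3]
print(dot(c, [1, 2, 3]), dot(c, [4, 5, 6]))   # → 0 0 (perpendicular to both)
```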

Connection to Determinants

The cross product can be written as a (symbolic) determinant:

$$\mathbf{a} \times \mathbf{b} = \det\begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix}$$

This is a mnemonic — expand along the first row using cofactors to get the formula above.

Real-World Example: Torque and Angular Momentum

Torque $\boldsymbol{\tau} = \mathbf{r} \times \mathbf{F}$ (position × force). The cross product gives a vector perpendicular to both — pointing along the axis of rotation. Its magnitude is $\|\mathbf{r}\|\|\mathbf{F}\|\sin\theta$, which is exactly the "turning effectiveness" of the force. Every physics simulation of spinning objects, gyroscopes, and orbital mechanics uses cross products extensively.


Summary — The Big Picture

Here's a map of everything we covered and how the pieces connect:

Vectors are ordered lists of numbers — the objects we work with.
Linear combinations ($c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \ldots$) are the fundamental operation.
Matrices are machines that perform linear combinations — each row is a recipe.
Matrix multiplication = applying one machine's output as another's input.
The determinant measures how much a matrix scales space (0 = collapse).
The inverse undoes a transformation (exists only when det ≠ 0).
Eigenvectors are the "natural axes" — directions the matrix just scales.
Change of basis = looking at the same transformation through different "glasses."
Null space = what's destroyed. Column space = what's reachable.

The recurring theme: linear algebra lets you decompose complex problems into simple pieces, solve each piece, and recombine. Whether you're building a search engine, training an AI, designing a bridge, or rendering a video game — this decompose-solve-recombine pattern is why linear algebra is the most widely used branch of mathematics.


Inspired by: BetterExplained by Kalid Azad, Khan Academy Linear Algebra, and 3Blue1Brown — Essence of Linear Algebra.