How Cayley Turned A Notational Trick Into A Mathematical Object
In 1858, Arthur Cayley published A Memoir on the Theory of Matrices — the paper that named what had been, until then, just a shorthand for the coefficients of a linear system. Cayley defined matrix multiplication, the inverse, and the algebra that would let mathematicians treat a whole system of equations as a single object $A\mathbf{x} = \mathbf{b}$. Without Cayley's paper, the methods in this article — Gaussian elimination, the inverse method, Cramer's rule — would still be three unrelated tricks. With it, they are three views of the same single equation.
To solve matrices means to find the values of the unknown variables $x_1, x_2, \ldots, x_n$ that satisfy a system of linear equations written in matrix form. For a system $A\mathbf{x} = \mathbf{b}$, the solution is the column vector $\mathbf{x}$. The three standard methods all return the same vector when the system has a unique solution.
The Two Matrix Forms Every Solution Method Uses
Two ways of writing the same system. Choose based on the method.
Form 1 — Coefficient matrix and constant vector. Used for the inverse method and Cramer's rule.
$$A = \begin{pmatrix} 2 & 3 \\ 1 & -2 \end{pmatrix}, \quad \mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix}, \quad \mathbf{b} = \begin{pmatrix} 5 \\ -3 \end{pmatrix}$$
The full equation: $A \mathbf{x} = \mathbf{b}$.
Form 2 — Augmented matrix. Used for Gaussian elimination. The constant vector is appended as the last column, separated by a vertical bar.
$$\left[\begin{array}{cc|c} 2 & 3 & 5 \\ 1 & -2 & -3 \end{array}\right]$$
Same system, two layouts. The choice depends on the method.
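To make the two layouts concrete, here is a minimal sketch that builds Form 2 from Form 1 for the system above. The helper name `augment` is hypothetical, and exact fractions are used only so that later row operations would not introduce rounding error.

```python
from fractions import Fraction


def augment(A, b):
    """Append the constant vector b to the coefficient matrix A as a last column."""
    if len(A) != len(b):
        raise ValueError("A and b must have the same number of rows")
    return [[Fraction(v) for v in row] + [Fraction(c)] for row, c in zip(A, b)]


# Form 1: coefficient matrix and constant vector from the example above.
A = [[2, 3], [1, -2]]
b = [5, -3]

# Form 2: the augmented matrix [A | b].
aug = augment(A, b)
for row in aug:
    print(*row[:-1], "|", row[-1])
# prints:
# 2 3 | 5
# 1 -2 | -3
```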
Method 1 — Gaussian Elimination (The Workhorse)
The most reliable method. Three row operations are allowed:
Swap two rows.
Multiply a row by a non-zero scalar.
Add a multiple of one row to another row.
Apply them in sequence to reduce the augmented matrix to row-echelon form — leading 1s on the diagonal, zeros below each leading 1. Then back-substitute to find the variables.
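For very large systems, professional software uses matrix factorisations. LU decomposition is Gaussian elimination with the row operations recorded as a matrix product; QR factorisation is a related method built on orthogonal transformations rather than row operations. The 19th-century algorithm, in its LU form with row swaps (partial pivoting), is what your laptop runs when you call NumPy's linalg.solve.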
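The three row operations and the back-substitution step can be sketched in a few lines. This is an illustrative version for small square systems with a unique solution, not what a numerical library does internally; the function name `gaussian_solve` is hypothetical, and exact fractions stand in for the floating-point pivoting strategies that production code would use.

```python
from fractions import Fraction


def gaussian_solve(A, b):
    """Solve A x = b for a square system that has a unique solution."""
    n = len(A)
    # Build the augmented matrix [A | b] with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(c)] for row, c in zip(A, b)]
    for col in range(n):
        # Row operation 1: swap in a row whose entry in this column is non-zero.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular: no unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        # Row operation 2: scale the pivot row so its leading entry is 1.
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        # Row operation 3: subtract multiples of the pivot row from the rows below.
        for r in range(col + 1, n):
            f = M[r][col]
            M[r] = [rv - f * pv for rv, pv in zip(M[r], M[col])]
    # Back-substitute from the last row upwards.
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        x[i] = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
    return x


# The system from "The Two Matrix Forms": 2x + 3y = 5, x - 2y = -3.
solution = gaussian_solve([[2, 3], [1, -2]], [5, -3])
print(solution)  # [Fraction(1, 7), Fraction(11, 7)]
```

Every operation is applied to the whole augmented row, constants included, which is the habit the elimination method depends on.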
Method 2 — The Matrix-Inverse Method
If $A$ is invertible (i.e., $\det(A) \neq 0$), the solution is:
$$\mathbf{x} = A^{-1} \mathbf{b}$$
Compute the inverse $A^{-1}$, then multiply by $\mathbf{b}$. For a $2 \times 2$ matrix:
$$A^{-1} = \frac{1}{\det(A)} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \quad \text{where} \quad A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
For a $3 \times 3$ matrix, use the adjugate (the classical adjoint, the transpose of the cofactor matrix) divided by the determinant; the algebra is heavier but the principle is the same.
Cost: computing an inverse is roughly the same work as Gaussian elimination. The inverse method is only faster when you need to solve $A\mathbf{x} = \mathbf{b}$ for many different $\mathbf{b}$ vectors with the same $A$. Otherwise, prefer Gaussian elimination.
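As a rough illustration of the repeated right-hand-side case, the sketch below inverts a $2 \times 2$ matrix once with the formula above and reuses it for two different $\mathbf{b}$ vectors. The names `inverse_2x2` and `mat_vec` are hypothetical helpers, and the second right-hand side is made up for the example.

```python
from fractions import Fraction


def inverse_2x2(A):
    """Invert a 2x2 matrix with the formula (1/det) * [[d, -b], [-c, a]]."""
    (a, b), (c, d) = A
    det = Fraction(a * d - b * c)
    # Check the determinant before dividing by it.
    if det == 0:
        raise ValueError("det(A) = 0: A is singular and has no inverse")
    return [[d / det, -b / det], [-c / det, a / det]]


def mat_vec(M, v):
    """Multiply matrix M by column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


A = [[2, 3], [1, -2]]
A_inv = inverse_2x2(A)  # computed once

# Reuse the same inverse for several right-hand sides.
for rhs in ([5, -3], [1, 0]):
    print(rhs, "->", mat_vec(A_inv, rhs))
```

For the first right-hand side this gives $x = 1/7$, $y = 11/7$, the same vector that elimination produces for this system.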
Method 3 — Cramer's Rule
Each unknown is a ratio of determinants. For a system $A\mathbf{x} = \mathbf{b}$ with $\det(A) \neq 0$:
$$x_i = \frac{\det(A_i)}{\det(A)}$$
where $A_i$ is the matrix obtained by replacing the $i$-th column of $A$ with $\mathbf{b}$.
Cramer's rule is elegant for $2 \times 2$ and $3 \times 3$ systems, and impractical for anything larger: computing a determinant by cofactor expansion scales as $n!$, while Gaussian elimination scales as $n^3$. For a $5 \times 5$ system, Cramer's rule needs 6 determinants, each of which is a 120-term sum when fully expanded, while Gaussian elimination needs on the order of 100 arithmetic operations.
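The rule translates almost directly into code. The sketch below assumes integer entries and uses cofactor expansion for the determinant, which is exactly the factorial-cost approach described above, so it is only sensible for small systems. The names `det` and `cramer` are hypothetical.

```python
from fractions import Fraction


def det(M):
    """Determinant by cofactor expansion along the first row (factorial cost)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total


def cramer(A, b):
    """Solve A x = b by Cramer's rule; assumes integer entries and det(A) != 0."""
    d = det(A)
    if d == 0:
        raise ValueError("det(A) = 0: Cramer's rule does not apply")
    xs = []
    for i in range(len(A)):
        # A_i: replace the i-th column of A with b.
        A_i = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(A)]
        xs.append(Fraction(det(A_i), d))  # det(A) is always the denominator
    return xs


print(cramer([[2, 3], [1, -2]], [5, -3]))  # [Fraction(1, 7), Fraction(11, 7)]
```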
Quick — Standard — Stretch: Three Worked Examples
Quick — solve a 2×2 system by Gaussian elimination
System: $x + 2y = 5$ and $3x - y = 1$. Augmented matrix:
$$\left[\begin{array}{cc|c} 1 & 2 & 5 \\ 3 & -1 & 1 \end{array}\right]$$
$R_2 \leftarrow R_2 - 3R_1$:
$$\left[\begin{array}{cc|c} 1 & 2 & 5 \\ 0 & -7 & -14 \end{array}\right]$$
From row 2: $-7y = -14 \Rightarrow y = 2$. Back-substitute: $x + 2(2) = 5 \Rightarrow x = 1$.
Final answer: $x = 1$, $y = 2$.
Standard (Wrong-Path-First) — solve a 2×2 system by the inverse method
System: $2x + 4y = 8$, $x + 2y = 4$.
Wrong path. Plug straight into the inverse formula. $A = \begin{pmatrix} 2 & 4 \\ 1 & 2 \end{pmatrix}$, so $\det(A) = 2 \cdot 2 - 4 \cdot 1 = 0$. The student who jumped ahead writes $A^{-1} = \tfrac{1}{0}\begin{pmatrix} 2 & -4 \\ -1 & 2 \end{pmatrix}$, and either freezes at the division by zero or invents a non-existent "0-determinant inverse."
The diagnosis. A zero determinant means the matrix is singular — there is no inverse, and the inverse method cannot be used. The system either has no solutions or infinitely many. Check by inspection: the second equation $x + 2y = 4$ is literally half of the first equation $2x + 4y = 8$. They represent the same line — every point on that line is a solution.
Correct method. Recognise the singular case before computing the inverse. The two equations are linearly dependent. The solution set is the entire line $x + 2y = 4$ — infinitely many solutions, parametrised as $x = 4 - 2t, y = t$ for any real $t$.
Final answer: Infinitely many solutions; the system reduces to the single equation $x + 2y = 4$.
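A parametrised answer is easy to sanity-check by substituting several values of $t$ back into both equations. The sketch below does that for $x = 4 - 2t$, $y = t$; the helper name `satisfies` and the sample values of $t$ are arbitrary choices for illustration.

```python
from fractions import Fraction


def satisfies(x, y):
    """True if (x, y) satisfies both equations of the singular system."""
    return 2 * x + 4 * y == 8 and x + 2 * y == 4


# Sample t = -2, -1.5, ..., 2 and build the points (4 - 2t, t).
samples = [Fraction(t, 2) for t in range(-4, 5)]
points = [(4 - 2 * t, t) for t in samples]

print(all(satisfies(x, y) for x, y in points))  # True: every sampled point is a solution
print(satisfies(0, 0))                          # False: a point off the line is not
```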
The check-the-determinant-first habit is what Bhanzu Grade 11 trainers drill in the first 12 minutes of every linear-algebra session at our McKinney TX center — roughly four out of every ten first attempts on inverse-method problems skip the determinant check and crash on a singular matrix.
Stretch — solve a 3×3 system using Cramer's rule
System: $$x + 2y + 3z = 9$$ $$2x - y + z = 8$$ $$3x + y - z = 2$$
$A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & -1 & 1 \\ 3 & 1 & -1 \end{pmatrix}$, $\mathbf{b} = \begin{pmatrix} 9 \\ 8 \\ 2 \end{pmatrix}$.
Compute $\det(A)$ by cofactor expansion along row 1: $\det(A) = 1[(-1)(-1) - (1)(1)] - 2[(2)(-1) - (1)(3)] + 3[(2)(1) - (-1)(3)]$ $\det(A) = 1(0) - 2(-5) + 3(5) = 0 + 10 + 15 = 25$.
$A_x$ replaces column 1 with $\mathbf{b}$: $A_x = \begin{pmatrix} 9 & 2 & 3 \\ 8 & -1 & 1 \\ 2 & 1 & -1 \end{pmatrix}$. Expanding: $\det(A_x) = 9(0) - 2(-10) + 3(10) = 0 + 20 + 30 = 50$.
$A_y$ replaces column 2 with $\mathbf{b}$: $A_y = \begin{pmatrix} 1 & 9 & 3 \\ 2 & 8 & 1 \\ 3 & 2 & -1 \end{pmatrix}$. Expanding along row 1: $\det(A_y) = 1[(8)(-1) - (1)(2)] - 9[(2)(-1) - (1)(3)] + 3[(2)(2) - (8)(3)] = 1(-10) - 9(-5) + 3(-20) = -10 + 45 - 60 = -25$.
$A_z$ replaces column 3 with $\mathbf{b}$: $A_z = \begin{pmatrix} 1 & 2 & 9 \\ 2 & -1 & 8 \\ 3 & 1 & 2 \end{pmatrix}$. Expanding along row 1: $\det(A_z) = 1[(-1)(2) - (8)(1)] - 2[(2)(2) - (8)(3)] + 9[(2)(1) - (-1)(3)] = 1(-10) - 2(-20) + 9(5) = -10 + 40 + 45 = 75$.
$x = 50/25 = 2$, $y = -25/25 = -1$, $z = 75/25 = 3$.
Final answer: $x = 2$, $y = -1$, $z = 3$. Verify by substitution: $2 + 2(-1) + 3(3) = 9$, $2(2) - (-1) + 3 = 8$, and $3(2) + (-1) - 3 = 2$, so all three equations hold. The point of the Stretch example is the structure of Cramer's rule: one shared denominator $\det(A)$ and one substituted numerator per unknown.
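Hand cofactor expansions are easy to get wrong, so it can help to recompute the four determinants mechanically and substitute the result back into the system. The following is a small sketch for this specific example; `det3` and `replace_column` are hypothetical helper names.

```python
from fractions import Fraction


def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along row 1."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def replace_column(M, col, v):
    """Return a copy of M with column `col` replaced by the vector v."""
    return [row[:col] + [v[r]] + row[col + 1:] for r, row in enumerate(M)]


A = [[1, 2, 3], [2, -1, 1], [3, 1, -1]]
rhs = [9, 8, 2]

d = det3(A)
dets = [det3(replace_column(A, k, rhs)) for k in range(3)]
x, y, z = (Fraction(n, d) for n in dets)

print(d, dets)  # 25 [50, -25, 75]
print(x, y, z)  # 2 -1 3

# Substitute back into every equation of the system.
ok = all(sum(a * v for a, v in zip(row, (x, y, z))) == r for row, r in zip(A, rhs))
print(ok)  # True
```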
When Does A System Have No Solution, One Solution, or Infinitely Many?
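Three cases. For a square system the determinant separates the unique case from the other two, and row reduction of the augmented matrix (equivalently, comparing the rank of the coefficient matrix with the rank of the augmented matrix) separates those two from each other.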
Unique solution. $\det(A) \neq 0$. The lines (in 2D) or planes (in 3D) intersect at exactly one point.
No solution. $\det(A) = 0$ and the augmented matrix has a row of the form $[0, 0, \ldots, 0 \mid c]$ with $c \neq 0$ after row reduction. The lines are parallel; the planes don't all meet.
Infinitely many solutions. $\det(A) = 0$ and the augmented matrix's reduced form has no contradictory row — at least one variable is free. The lines coincide; the planes share a line or more.
The determinant is the diagnostic. Always compute it first.
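The three cases can also be decided by comparing ranks, which is what the contradictory-row test amounts to. The sketch below is one possible way to do it with exact arithmetic; `rank` and `classify` are hypothetical names, and the third example (right-hand side 5 instead of 4) is invented to show the no-solution case.

```python
from fractions import Fraction


def rank(M):
    """Rank of M via row reduction with exact rational arithmetic."""
    M = [[Fraction(v) for v in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0  # index of the next pivot row
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * p for a, p in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r


def classify(A, b):
    """Classify A x = b as having no, one, or infinitely many solutions."""
    aug = [row + [c] for row, c in zip(A, b)]
    r_a, r_aug = rank(A), rank(aug)
    if r_a < r_aug:
        return "no solution"  # a row [0 ... 0 | c] with c != 0 appears
    if r_a == len(A[0]):
        return "unique solution"  # every variable has a pivot
    return "infinitely many solutions"  # at least one free variable


print(classify([[1, 2], [3, -1]], [5, 1]))  # unique solution
print(classify([[2, 4], [1, 2]], [8, 4]))   # infinitely many solutions
print(classify([[2, 4], [1, 2]], [8, 5]))   # no solution
```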
Why matrix methods matter — from GPS to image compression
Gaussian elimination and its factorisation-based descendants scale to systems with millions of variables, and modern engineering runs on them.
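GPS positioning. Your phone computes a position fix roughly once per second from at least four satellite signals. The underlying range equations are nonlinear, so the receiver linearises them and solves a small linear system for three position coordinates and a clock offset; with exactly four satellites, each such step is a $4 \times 4$ matrix solve. The US Department of Defense GPS Operations Center coordinates the satellite constellation that makes this possible.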
Computer graphics. Every 3D rendering involves multiplying vectors by transformation matrices — translation, rotation, scaling, projection. A single frame of a modern video game does roughly $10^7$ matrix operations.
Image and audio compression. JPEG, MP3, and H.264 all use linear-algebra-heavy transforms (DCT, FFT). Decoding a JPEG applies an inverse DCT to each block of the image, which amounts to multiplying by a fixed, highly structured matrix. The JPEG specification (ISO/IEC 10918) is open-access reading.
Machine learning. Training a neural network is dominated by multiplying very large matrices, and many regression and optimisation steps solve (or approximate the solution of) very large linear systems $A\mathbf{x} = \mathbf{b}$. The breakthroughs of the last decade in computer vision sit on top of efficient linear-algebra routines.
Without matrices, none of these systems would work at the speed they do.
Where Students Lose Marks On Solving Matrices
Mistake 1: Skipping the determinant check before inverting
Where it slips in: Inverse-method problems where the matrix happens to be singular.
Don't do this: Plug into the inverse formula and discover the zero in the denominator mid-computation.
The correct way: Compute $\det(A)$ first. If zero, switch to Gaussian elimination on the augmented matrix to determine whether the system has no solution or infinitely many. The rusher who skips the determinant check loses 3–5 minutes on every singular case.
Mistake 2: Mixing row operations across the augmented column
Where it slips in: Gaussian elimination — multiplying or adding rows.
Don't do this: Apply a row operation to the coefficient part of the matrix but forget to apply it to the augmented column. Or apply it twice.
The correct way: Treat the augmented matrix as a single object. Every row operation acts on the entire row — coefficients and constants together. The memorizer who learned the elimination steps from the coefficient-only form sometimes forgets the column extends.
Mistake 3: Confusing Cramer's rule's denominator and numerator
Where it slips in: Cramer's rule on $3 \times 3$ and larger systems.
Don't do this: Use $\det(A_i)$ in the denominator and $\det(A)$ in the numerator. The formula is $x_i = \det(A_i) / \det(A)$ — numerator is the matrix with the $i$-th column replaced.
The correct way: The original $\det(A)$ goes in the denominator for every variable. Each variable's numerator uses a different substituted matrix $A_i$. The second-guesser who keeps recomputing the determinant from scratch each time wastes time but avoids the swap.
The real-world version of Mistake 1, assuming a system has a unique solution without checking, matters in structural analysis. A column-load distribution that appears solvable from a hand calculation may, on closer inspection, correspond to a singular stiffness matrix, which indicates that the structure is not adequately restrained and can move as a mechanism without additional supports or bracing.
Cayley and Gauss — A Short History
Carl Friedrich Gauss (1777–1855, Germany). The elimination method bears his name, though Chinese mathematicians used essentially the same algorithm 1,800 years earlier in The Nine Chapters on the Mathematical Art. Gauss formalised it for the 19th-century European audience — making it the standard procedure every linear-algebra student now learns first.
Arthur Cayley (1821–1895, UK). A Memoir on the Theory of Matrices (1858) — the founding paper of matrix algebra. Cayley defined matrix multiplication, the identity matrix, and the matrix inverse. Without Cayley's 1858 paper, the three methods in this article would still be three unrelated arithmetic tricks.
Why it matters. Gauss gave us the algorithm (elimination); Cayley gave us the object (the matrix as a thing in its own right). Every method in this article — Gaussian elimination, matrix-inverse method, even Cramer's rule re-expressed in modern notation — sits on those two contributions.
What to Remember About Solving Matrices
A system $A\mathbf{x} = \mathbf{b}$ has three solution methods: Gaussian elimination, the inverse method, and Cramer's rule.
The determinant of $A$ is the diagnostic — non-zero means a unique solution, zero means no solution or infinitely many.
Gaussian elimination is the workhorse; the inverse method is for repeated right-hand sides; Cramer's rule is elegant on $2\times 2$ and $3\times 3$.
Modern engineering — GPS, JPEG, machine learning — runs on these methods at industrial scale.
The naming arc: Gauss gave us the algorithm, Cramer gave us the formula, Cayley gave us the language of matrices.
Where To Go From Here — Three Problems
If you get stuck on Problem 2's singular case, return to the Standard worked example.
Solve $3x + y = 10$ and $x - 2y = -1$ using Gaussian elimination.
Show that the system $4x + 2y = 6$ and $2x + y = 3$ has infinitely many solutions. Parametrise the solution set.
Solve the system $2x + y + z = 4$, $x - y + z = 0$, $x + y - z = 2$ using Cramer's rule.
