CLRS, Studied Rigorously
A self-contained, interactive companion covering foundations through the master method — with formal definitions, worked proofs, live algorithm visualizations, and checked exercises.
This guide is designed for a single purpose: to help you learn the material as a mathematician would. Definitions come first. Theorems follow. Every claim that matters is proved, and the most important proofs appear in full. Running alongside the formal text are interactive elements — a step-through of INSERTION-SORT, a live merge-sort recursion, a recursion-tree builder, and a master-theorem dispatcher — intended not as decoration, but as a second channel for intuition. Work through the quizzes; attempt the exercises before revealing solutions. The ornament ❧ marks the end of a section.
A brief notational remark. Arrays are indexed from $1$ throughout (as CLRS does), so an array $A$ of length $n$ has entries $A[1], A[2], \ldots, A[n]$. The notation $A[i\mathinner{.\,.}j]$ denotes the subarray from index $i$ to $j$ inclusive.
Getting Started
We begin with the most concrete object in the book: a small, specific algorithm, and the tools to say exactly what "efficient" means for it. The two algorithms in this chapter — insertion sort and merge sort — serve as the bedrock upon which all subsequent analysis rests. Master these and you have the vocabulary to attack everything that follows.
2.1 Insertion Sort
Insertion sort is the algorithm most people would invent if handed a pile of index cards to sort. You pick up one card at a time, and slide each new card into its correct position among the cards you have already sorted. Formally:
Input. A sequence of $n$ numbers $\langle a_1, a_2, \ldots, a_n \rangle$.
Output. A permutation $\langle a'_1, a'_2, \ldots, a'_n \rangle$ of the input such that $a'_1 \leq a'_2 \leq \cdots \leq a'_n$.
The Algorithm
We present insertion sort in the CLRS pseudocode style. The variable $j$ marks the "current" card being inserted; indices $1$ through $j-1$ are already in sorted order; $i$ scans backward through that sorted prefix looking for the correct insertion point.
```
Insertion-Sort(A)
1  for j = 2 to A.length
2      key = A[j]
3      // Insert A[j] into the sorted sequence A[1..j-1].
4      i = j - 1
5      while i > 0 and A[i] > key
6          A[i + 1] = A[i]
7          i = i - 1
8      A[i + 1] = key
```
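The pseudocode translates nearly line for line into Python; a minimal sketch (0-indexed, so the outer loop starts at index 1):

```python
def insertion_sort(a):
    """Sort list a in place, mirroring the CLRS pseudocode (0-indexed)."""
    for j in range(1, len(a)):          # pseudocode: for j = 2 to A.length
        key = a[j]
        i = j - 1
        # Shift elements of the sorted prefix a[0..j-1] that exceed key.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a
```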
Interactive Trace
Step through the algorithm on a concrete input. The blue cell holds the current key; amber cells are being compared; green cells are in their final sorted positions relative to the prefix seen so far.
Best and Worst Case Analysis
Let $t_j$ denote the number of times the while-loop test on line 5 is executed for a given value of $j$. We count the cost of each line as a constant $c_k$ multiplied by the number of times that line executes:
| Line | Cost | Times |
|---|---|---|
| 1 for j = 2 to n | $c_1$ | $n$ |
| 2 key = A[j] | $c_2$ | $n-1$ |
| 4 i = j - 1 | $c_4$ | $n-1$ |
| 5 while ... | $c_5$ | $\sum_{j=2}^{n} t_j$ |
| 6 A[i+1] = A[i] | $c_6$ | $\sum_{j=2}^{n}(t_j - 1)$ |
| 7 i = i - 1 | $c_7$ | $\sum_{j=2}^{n}(t_j - 1)$ |
| 8 A[i+1] = key | $c_8$ | $n-1$ |
The total running time $T(n)$ is the sum of these products. We analyze two extremes.
If $A$ is already in ascending order, then for each $j$ the test A[i] > key on line 5 is false on the first evaluation. Hence $t_j = 1$ for all $j = 2, \ldots, n$.
$$T(n) = c_1 n + (c_2 + c_4 + c_5 + c_8)(n-1) = an + b$$
for constants $a, b$. This is a linear function of $n$, so the best-case running time is $\Theta(n)$.
If $A$ is in strictly decreasing order, each key $A[j]$ must be compared with every element $A[1], \ldots, A[j-1]$; the inner loop runs until $i = 0$. Hence $t_j = j$, giving
$$\sum_{j=2}^{n} t_j = \sum_{j=2}^{n} j = \frac{n(n+1)}{2} - 1, \qquad \sum_{j=2}^{n}(t_j - 1) = \frac{n(n-1)}{2}.$$
Substituting and collecting constants gives
$$T(n) = \left(\tfrac{c_5}{2} + \tfrac{c_6}{2} + \tfrac{c_7}{2}\right)n^2 + \left(c_1 + c_2 + c_4 + \tfrac{c_5}{2} - \tfrac{c_6}{2} - \tfrac{c_7}{2} + c_8\right)n - (c_2 + c_4 + c_5 + c_8).$$
This is a quadratic function of $n$, so the worst-case running time is $\Theta(n^2)$.
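Both extremes are easy to confirm empirically. A small sketch that instruments the sort to record $t_j$, the number of while-test evaluations for each value of $j$, on a sorted and on a reversed input:

```python
def insertion_sort_counts(a):
    """Sort a in place and return [t_2, ..., t_n]: the number of
    while-test evaluations in each outer iteration."""
    t = []
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        tests = 1                       # the test that finally fails also counts
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
            tests += 1
        a[i + 1] = key
        t.append(tests)
    return t
```

On $\langle 1,2,3,4,5 \rangle$ every $t_j = 1$ (best case); on $\langle 5,4,3,2,1 \rangle$ the counts are $2, 3, 4, 5$, i.e. $t_j = j$ in the 1-based indexing of the analysis (worst case).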
Loop Invariant — Proof of Correctness
A loop invariant is an assertion about the state of a computation that holds at designated points during the execution of a loop. We use it to prove correctness the same way mathematical induction proves statements about the natural numbers: we must establish three properties.
Initialization. The invariant is true prior to the first iteration of the loop.
Maintenance. If the invariant is true before an iteration, it remains true before the next iteration.
Termination. When the loop terminates, the invariant (combined with the reason the loop terminated) yields a useful property that helps show the algorithm is correct.
Invariant (insertion sort). At the start of each iteration of the for-loop of lines 1–8, the subarray $A[1\mathinner{.\,.}j-1]$ consists of the elements originally in $A[1\mathinner{.\,.}j-1]$, but in sorted order.
We verify the three conditions.
Initialization. Before the first iteration, $j = 2$, so $A[1\mathinner{.\,.}j-1] = A[1\mathinner{.\,.}1]$, which is a single-element subarray. It is trivially sorted and trivially a permutation of itself. The invariant holds. (That the base case is $j=2$ and not $j=1$ is a common confusion — the invariant is stated at the start of an iteration, and the first iteration starts with $j=2$.)
Maintenance. Assume the invariant holds at the start of iteration $j$: that is, $A[1\mathinner{.\,.}j-1]$ is a sorted permutation of the original $A[1\mathinner{.\,.}j-1]$. We must show it holds at the start of iteration $j+1$, i.e. that $A[1\mathinner{.\,.}j]$ is a sorted permutation of the original $A[1\mathinner{.\,.}j]$.
The body of the iteration saves $\textit{key} = A[j]$ and then repeatedly shifts elements of $A[1\mathinner{.\,.}j-1]$ one position to the right — replacing $A[i+1]$ with $A[i]$ — for as long as $A[i] > \textit{key}$ and $i > 0$. (A formal proof at this level requires its own inner invariant for the while-loop; informally: after each inner iteration, $A[i+2\mathinner{.\,.}j]$ is a rightward shift of the original $A[i+1\mathinner{.\,.}j-1]$, and every element in that range exceeds $\textit{key}$.) The loop terminates when either $i = 0$ or $A[i] \leq \textit{key}$; in either case, the assignment $A[i+1] = \textit{key}$ places $\textit{key}$ immediately to the right of an element that does not exceed it (or at position $1$). Thus $A[1\mathinner{.\,.}j]$ is sorted and is a permutation of the original $A[1\mathinner{.\,.}j]$.
Termination. The loop terminates when $j$ exceeds $n$, i.e. when $j = n+1$. Substituting into the invariant, the subarray $A[1\mathinner{.\,.}n]$ consists of the elements originally in $A[1\mathinner{.\,.}n]$, in sorted order. But $A[1\mathinner{.\,.}n]$ is the entire array, so the algorithm has correctly sorted $A$.■
Note that insertion sort sorts in place: at any moment it stores at most a constant number of values, namely key (plus loop indices), beyond the array itself.
Selected Exercises
Exercise 2.1-3 — Linear Search, with loop invariant
Problem. Consider the searching problem: given a sequence $A = \langle a_1, \ldots, a_n \rangle$ and a value $v$, return the index $i$ such that $v = A[i]$, or NIL if $v$ does not appear. Write pseudocode for linear search and prove it correct using a loop invariant.
```
Linear-Search(A, v)
  for i = 1 to A.length
      if A[i] == v
          return i
  return NIL
```
Invariant. At the start of each iteration of the for-loop, the value $v$ does not appear in the subarray $A[1\mathinner{.\,.}i-1]$.
Initialization. Before iteration $i=1$, the subarray $A[1\mathinner{.\,.}0]$ is empty, so $v$ vacuously does not appear in it.
Maintenance. If we reach the start of iteration $i+1$, we did not return in iteration $i$, so $A[i] \neq v$. Combined with the invariant, $v$ does not appear in $A[1\mathinner{.\,.}i]$.
Termination. The loop exits either by returning an index $i$ (in which case $A[i] = v$, correct) or by completing all iterations. In the latter case $i = n+1$ and the invariant says $v \notin A[1\mathinner{.\,.}n]$, so returning NIL is correct.■
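A direct Python rendering of the pseudocode, with the invariant recorded as a comment (using `None` for NIL):

```python
def linear_search(a, v):
    """Return the first index holding v, or None (NIL) if v is absent."""
    for i, x in enumerate(a):
        # Invariant: v does not appear in a[0..i-1].
        if x == v:
            return i
    return None
```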
Exercise 2.1-4 — Binary Addition
Problem. Given two $n$-bit integers $A$ and $B$ stored in arrays of length $n$, design an algorithm that produces their $(n+1)$-bit binary sum $C$.
```
Add-Binary(A, B, n)
  carry = 0
  for i = 1 to n
      C[i] = (A[i] + B[i] + carry) mod 2
      carry = (A[i] + B[i] + carry) / 2   // integer division
  C[n + 1] = carry
  return C
```
The running time is $\Theta(n)$ — each bit is touched a constant number of times. A loop invariant (left to you) would state that after iteration $i$, the low $i$ bits of $C$ together with carry correctly represent $A[1\mathinner{.\,.}i] + B[1\mathinner{.\,.}i]$.
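A sketch of the addition in Python. Bits are stored least-significant first, an indexing convention assumed here for illustration; the pseudocode's 1-based index $i$ becomes 0-based:

```python
def add_binary(A, B):
    """Add two n-bit numbers given as lists of bits, least-significant first.

    Returns an (n+1)-bit list C, mirroring the pseudocode (0-indexed).
    """
    n = len(A)
    C = [0] * (n + 1)
    carry = 0
    for i in range(n):
        s = A[i] + B[i] + carry
        C[i] = s % 2
        carry = s // 2      # integer division, as in the pseudocode
    C[n] = carry
    return C
```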
2.2 Analyzing Algorithms
To analyze an algorithm is to predict the resources it will require. Usually we care about time, but memory, bandwidth, or energy may be equally relevant. Before any analysis is possible we must commit to a model of computation; CLRS fixes the Random-Access Machine (RAM).
Instructions execute one after another, with no concurrency. Each instruction takes a constant amount of time. The primitive instructions include arithmetic (add, subtract, multiply, divide, remainder, floor, ceiling), data movement (load, store, copy), and control (conditional and unconditional branch, subroutine call and return). Data types are integer and floating-point, with a word size that grows only modestly — specifically, with $c \log n$ bits for some constant $c \geq 1$ on inputs of size $n$.
Input Size and Running Time
Two quantities need precise definition.
Input size. The natural measure depends on the problem. For sorting, it is the number of items $n$. For integer multiplication, it is the total number of bits in the input. For a graph algorithm, it is often a pair $(|V|, |E|)$.
Running time. On a given input, the running time is the number of primitive operations (or "steps") executed. We follow the convention that each line of pseudocode takes constant time, though the constants may differ from line to line. This makes our analysis machine-independent up to a constant factor — exactly the granularity $\Theta$-notation (developed in Chapter 3) is designed for.
Worst Case, Best Case, Average Case
For a fixed algorithm, the running time depends on the input. We distinguish:
- Worst-case running time $T(n)$: the maximum running time over all inputs of size $n$. This is the default. It gives an upper bound for any input, it occurs often in practice, and the "average case" is frequently as bad as the worst.
- Best-case running time: the minimum over inputs of size $n$. Typically too optimistic to be useful on its own.
- Average-case running time: the expectation over inputs of size $n$, under some distribution — commonly the uniform distribution. Requires probabilistic analysis and a justified distribution.
Order of Growth
Even the exact expression for $T(n)$ depends on machine-specific constants. For comparing algorithms, only the rate of growth matters as $n \to \infty$. For insertion sort we summarize:
$$T_{\text{best}}(n) = \Theta(n), \qquad T_{\text{worst}}(n) = \Theta(n^2), \qquad T_{\text{avg}}(n) = \Theta(n^2).$$
2.3 Designing Algorithms
Insertion sort uses an incremental design: we build up the sorted prefix one element at a time. In this section we introduce a different paradigm that will reappear throughout the book.
2.3.1 The Divide-and-Conquer Approach
Divide the problem into a number of subproblems that are smaller instances of the same problem.
Conquer the subproblems by solving them recursively. If the subproblem size is small enough (the base case), solve it directly.
Combine the solutions to the subproblems into the solution for the original problem.
Merge Sort
Merge sort applies this paradigm to sorting. To sort $A[p\mathinner{.\,.}r]$:
- Divide: split the $n$-element subarray into two subarrays of $n/2$ elements each, namely $A[p\mathinner{.\,.}q]$ and $A[q+1\mathinner{.\,.}r]$ where $q = \lfloor (p+r)/2 \rfloor$.
- Conquer: sort each half recursively using merge sort.
- Combine: merge the two sorted halves into a single sorted array.
The recursion bottoms out when a subarray has just one element (which is trivially sorted). The key subroutine is MERGE, which assumes $A[p\mathinner{.\,.}q]$ and $A[q+1\mathinner{.\,.}r]$ are each already sorted.
```
Merge(A, p, q, r)
  n_1 = q - p + 1
  n_2 = r - q
  let L[1..n_1 + 1] and R[1..n_2 + 1] be new arrays
  for i = 1 to n_1
      L[i] = A[p + i - 1]
  for j = 1 to n_2
      R[j] = A[q + j]
  L[n_1 + 1] = ∞
  R[n_2 + 1] = ∞
  i = 1
  j = 1
  for k = p to r
      if L[i] <= R[j]
          A[k] = L[i]
          i = i + 1
      else A[k] = R[j]
          j = j + 1
```
The use of $\infty$ as a sentinel eliminates a tedious boundary check: once all the real elements of $L$ (or of $R$) have been copied back, its sentinel $\infty$ loses every subsequent comparison, so the remaining iterations draw from the other array, and the loop simply runs exactly $r - p + 1$ times.
```
Merge-Sort(A, p, r)
  if p < r
      q = ⌊(p + r) / 2⌋
      Merge-Sort(A, p, q)
      Merge-Sort(A, q + 1, r)
      Merge(A, p, q, r)
```
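The pair of procedures carries over to Python directly, with `math.inf` playing the role of the $\infty$ sentinel; a sketch:

```python
import math

def merge(A, p, q, r):
    """Merge sorted A[p..q] and A[q+1..r] in place (0-indexed), using sentinels."""
    L = A[p:q + 1] + [math.inf]
    R = A[q + 1:r + 1] + [math.inf]
    i = j = 0
    for k in range(p, r + 1):
        if L[i] <= R[j]:
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1

def merge_sort(A, p=0, r=None):
    """Sort A[p..r] in place; call merge_sort(A) to sort the whole list."""
    if r is None:
        r = len(A) - 1
    if p < r:
        q = (p + r) // 2
        merge_sort(A, p, q)
        merge_sort(A, q + 1, r)
        merge(A, p, q, r)
    return A
```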
Interactive Merge-Sort Trace
Watch the recursion unfold. The tree shows each subproblem; highlighted nodes are being processed; the array below shows the current state of $A$ after each MERGE completes.
Correctness of MERGE
Invariant. At the start of each iteration of the final for-loop, the subarray $A[p\mathinner{.\,.}k-1]$ contains the $k - p$ smallest elements of $L[1\mathinner{.\,.}n_1+1]$ and $R[1\mathinner{.\,.}n_2+1]$, in sorted order; moreover, $L[i]$ and $R[j]$ are the smallest elements of their arrays not yet copied back into $A$.
Initialization. Before the first iteration, $k = p$, so $A[p\mathinner{.\,.}k-1] = A[p\mathinner{.\,.}p-1]$ is empty; it contains the $0$ smallest elements trivially. Since $i = j = 1$, $L[i]$ and $R[j]$ are the first (smallest) elements of their respective sorted arrays.
Maintenance. Suppose $L[i] \leq R[j]$; then $L[i]$ is the smallest element not yet copied back. The body copies $L[i]$ to $A[k]$ and increments both $i$ and $k$; the invariant holds for the new values. The case $L[i] > R[j]$ is symmetric.
Termination. The loop terminates when $k = r+1$. The invariant then says $A[p\mathinner{.\,.}r]$ contains the $r - p + 1 = n_1 + n_2$ smallest elements of $L$ and $R$ in sorted order. The only elements not among these are the two sentinels $\infty$. Hence $A[p\mathinner{.\,.}r]$ is exactly the sorted union of the original $L[1\mathinner{.\,.}n_1]$ and $R[1\mathinner{.\,.}n_2]$.■
2.3.2 Analyzing Divide-and-Conquer Algorithms
The running time of a divide-and-conquer algorithm is naturally described by a recurrence. Let $T(n)$ be the running time on a problem of size $n$. If the problem is small enough (size $\leq c$ for some constant $c$), the direct solution takes constant time $\Theta(1)$. Otherwise:
- The divide step takes $D(n)$ time.
- We solve $a$ subproblems, each of size $n/b$, in total time $a \cdot T(n/b)$.
- The combine step takes $C(n)$ time.
This gives the general recurrence
$$T(n) = \begin{cases} \Theta(1) & \text{if } n \leq c, \\ aT(n/b) + D(n) + C(n) & \text{otherwise.} \end{cases}$$
The Merge-Sort Recurrence
For merge sort on $n$ elements with $n > 1$:
- Divide: compute the midpoint $q$. This is $D(n) = \Theta(1)$.
- Conquer: recursively sort two subproblems of size $n/2$, totalling $2T(n/2)$.
- Combine: the MERGE procedure runs in $\Theta(n)$ on an $n$-element subarray (it is a single for-loop from $p$ to $r$ with constant-time body).
Therefore
$$T(n) = \begin{cases} \Theta(1) & \text{if } n = 1, \\ 2T(n/2) + \Theta(n) & \text{if } n > 1. \end{cases}$$
A rigorous derivation appears in §4.4 via the recursion-tree method and in §4.5 as Case 2 of the master theorem. The intuition, however, is easy to state now: the recursion tree for merge sort has depth $\log_2 n$, and each level performs $\Theta(n)$ work in total (the top level does $n$ work, the next level does $n/2 + n/2 = n$ work, and so on). Total work: $\Theta(n) \cdot \Theta(\log n) = \Theta(n \log n)$.
Selected Exercise
Exercise 2.3-5 — Binary search and its recurrence
Problem. Referring back to the searching problem, observe that if $A$ is sorted we can check the midpoint against $v$ and eliminate half of the array. This is binary search. Write iterative and recursive pseudocode and argue that the worst-case running time is $\Theta(\log n)$.
```
Binary-Search(A, v)
  low = 1
  high = A.length
  while low <= high
      mid = ⌊(low + high) / 2⌋
      if A[mid] == v
          return mid
      elseif A[mid] < v
          low = mid + 1
      else high = mid - 1
  return NIL
```
Each iteration either returns or halves the search range (from size $n$ to $\lfloor n/2 \rfloor$ or smaller). The number of halvings required to shrink $n$ to $0$ is $\lfloor \lg n \rfloor + 1$, and each iteration does constant work. Hence worst case $\Theta(\log n)$. The recurrence is $T(n) = T(n/2) + \Theta(1)$, which we will solve formally by the master theorem.
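Both the iterative and the recursive variants are short in Python (0-indexed, `None` for NIL); a sketch:

```python
def binary_search(a, v):
    """Iterative binary search on a sorted list; returns an index or None."""
    low, high = 0, len(a) - 1
    while low <= high:
        mid = (low + high) // 2
        if a[mid] == v:
            return mid
        elif a[mid] < v:
            low = mid + 1
        else:
            high = mid - 1
    return None

def binary_search_rec(a, v, low=0, high=None):
    """Recursive variant, realizing T(n) = T(n/2) + Θ(1)."""
    if high is None:
        high = len(a) - 1
    if low > high:
        return None
    mid = (low + high) // 2
    if a[mid] == v:
        return mid
    if a[mid] < v:
        return binary_search_rec(a, v, mid + 1, high)
    return binary_search_rec(a, v, low, mid - 1)
```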
Growth of Functions
Chapter 2 justified the convention that lines of pseudocode run in constant time — but only up to a constant. To compare algorithms meaningfully we need a vocabulary that ignores constant factors and low-order terms. That vocabulary is asymptotic notation.
3.1 Asymptotic Notation
The symbols $O$, $\Theta$, $\Omega$, $o$, and $\omega$ describe the asymptotic running time of an algorithm — its behavior as $n \to \infty$. Each describes a set of functions; equations like "$T(n) = O(n^2)$" are abuses of notation that should be read as "$T(n)$ is an element of the set $O(n^2)$".
$\Theta$-Notation — Asymptotic Tight Bound
For a given function $g(n)$, we denote by $\Theta(g(n))$ the set of functions $$\Theta(g(n)) = \{\, f(n) : \exists\, c_1, c_2 > 0,\ n_0 > 0\ \text{such that}\ 0 \leq c_1 g(n) \leq f(n) \leq c_2 g(n)\ \text{for all } n \geq n_0 \,\}.$$
Informally: $f(n)$ is squeezed between $c_1 g(n)$ and $c_2 g(n)$ for all sufficiently large $n$. Note the requirement $f(n) \geq 0$ — all functions we use with asymptotic notation are asymptotically nonnegative.
Example. We show that $\tfrac{1}{2}n^2 - 3n = \Theta(n^2)$. We must find constants $c_1, c_2, n_0 > 0$ such that $$c_1 n^2 \leq \tfrac{1}{2} n^2 - 3n \leq c_2 n^2 \quad \text{for all } n \geq n_0.$$
Dividing by $n^2$ (valid for $n > 0$): $c_1 \leq \tfrac{1}{2} - \tfrac{3}{n} \leq c_2$. The right inequality holds for all $n \geq 1$ with $c_2 = \tfrac{1}{2}$. The left inequality holds for all $n \geq 7$ with $c_1 = \tfrac{1}{14}$ (since then $\tfrac{1}{2} - \tfrac{3}{7} = \tfrac{1}{14}$). Choosing $n_0 = 7$, $c_1 = \tfrac{1}{14}$, $c_2 = \tfrac{1}{2}$ completes the proof.■
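The chosen witnesses can be checked mechanically; a small sketch using exact rational arithmetic to verify the sandwich for a range of $n \geq n_0$:

```python
from fractions import Fraction

c1, c2, n0 = Fraction(1, 14), Fraction(1, 2), 7
for n in range(n0, 500):
    f = Fraction(1, 2) * n * n - 3 * n
    # The Θ(n²) sandwich: c1·n² ≤ f(n) ≤ c2·n² for all n ≥ n0.
    assert c1 * n * n <= f <= c2 * n * n
```

At $n = 7$ the lower bound holds with equality, confirming that $n_0$ cannot be lowered for this choice of $c_1$.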
$O$-Notation — Asymptotic Upper Bound
For a given function $g(n)$, $$O(g(n)) = \{\, f(n) : \exists\, c > 0,\ n_0 > 0\ \text{such that}\ 0 \leq f(n) \leq c\, g(n)\ \text{for all } n \geq n_0 \,\}.$$ Big-$O$ gives an upper bound only, not necessarily tight. Note $\Theta(g(n)) \subseteq O(g(n))$.
The idiomatic use: "the running time of insertion sort is $O(n^2)$" — meaning no input causes more than a constant times $n^2$ steps. Because $O$ bounds the worst case from above, it is the workhorse of the field.
$\Omega$-Notation — Asymptotic Lower Bound
For a given function $g(n)$, $$\Omega(g(n)) = \{\, f(n) : \exists\, c > 0,\ n_0 > 0\ \text{such that}\ 0 \leq c\, g(n) \leq f(n)\ \text{for all } n \geq n_0 \,\}.$$
Theorem. For any two functions $f(n)$ and $g(n)$, $f(n) = \Theta(g(n))$ if and only if $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$.
Proof. ($\Rightarrow$) If $f = \Theta(g)$, there exist $c_1, c_2, n_0$ with $c_1 g \leq f \leq c_2 g$ for $n \geq n_0$. The upper inequality with constant $c = c_2$ gives $f = O(g)$; the lower with $c = c_1$ gives $f = \Omega(g)$.
($\Leftarrow$) If $f = O(g)$ with constant $c_2$ and $n_0' $, and $f = \Omega(g)$ with constant $c_1$ and $n_0''$, then for all $n \geq \max(n_0', n_0'')$ we have $c_1 g \leq f \leq c_2 g$, establishing $f = \Theta(g)$.■
$o$ and $\omega$ — Non-tight Bounds
$f(n) = o(g(n))$ if for every positive constant $c$, there exists $n_0 > 0$ with $0 \leq f(n) < c g(n)$ for all $n \geq n_0$. Equivalently: $\displaystyle\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$.
$f(n) = \omega(g(n))$ if for every positive constant $c$, there exists $n_0$ with $0 \leq c g(n) < f(n)$ for all $n \geq n_0$. Equivalently: $\displaystyle\lim_{n \to \infty} \frac{f(n)}{g(n)} = \infty$.
A useful analogy: $O$ is "$\leq$", $\Omega$ is "$\geq$", $\Theta$ is "$=$", $o$ is "$<$", and $\omega$ is "$>$". Examples: $2n = o(n^2)$, $n^2/2 \neq o(n^2)$; $n^2 = \omega(n)$, $n^2 \neq \omega(n^2)$.
Properties — Transitivity, Reflexivity, Symmetry
Transitivity. For any notation in $\{O, \Theta, \Omega, o, \omega\}$: $$f = \Theta(g)\ \text{and}\ g = \Theta(h) \Rightarrow f = \Theta(h),$$ and analogously for the others.
Reflexivity. $f = \Theta(f)$, $f = O(f)$, $f = \Omega(f)$. (Not $o$ or $\omega$!)
Symmetry. $f = \Theta(g) \iff g = \Theta(f)$.
Transpose symmetry. $f = O(g) \iff g = \Omega(f)$; $f = o(g) \iff g = \omega(f)$.
Proof Techniques — Worked Problems
Claim. $6n^3 \neq \Theta(n^2)$.
Proof. Suppose for contradiction that $6n^3 = \Theta(n^2)$. Then there exist $c_2, n_0 > 0$ such that $6n^3 \leq c_2 n^2$ for all $n \geq n_0$, i.e. $6n \leq c_2$ for all $n \geq n_0$. But $6n \to \infty$, so this is false. Contradiction.■
Claim. For asymptotically nonnegative $f(n)$ and $g(n)$, $\max(f(n), g(n)) = \Theta(f(n) + g(n))$.
Proof. We need constants $c_1, c_2, n_0$ such that $c_1(f+g) \leq \max(f,g) \leq c_2(f+g)$ for $n \geq n_0$.
Upper bound. Clearly $\max(f,g) \leq f + g$, so $c_2 = 1$ works.
Lower bound. We have $f + g \leq 2 \max(f, g)$, hence $\max(f,g) \geq \tfrac{1}{2}(f + g)$. So $c_1 = \tfrac{1}{2}$ works.
Both bounds hold as soon as both $f, g \geq 0$, which is true for all $n \geq n_0$ by hypothesis.■
3.2 Standard Notations and Common Functions
Before leaving growth of functions, we record a catalogue of standard objects — monotonicity, floors and ceilings, modular arithmetic, polynomials, exponentials, logarithms, factorials, and iterated logarithms — together with the comparisons between them that recur constantly in analysis.
Monotonicity
A function $f$ is monotonically increasing if $m \leq n \Rightarrow f(m) \leq f(n)$; strictly increasing if the conclusion is $f(m) < f(n)$; and analogously for decreasing.
Floors and Ceilings
$\lfloor x \rfloor$ is the largest integer $\leq x$, $\lceil x \rceil$ the smallest integer $\geq x$. For all real $x$: $x - 1 < \lfloor x \rfloor \leq x \leq \lceil x \rceil < x + 1$. Both are monotonically increasing. A fact we will use repeatedly: for any integer $n$ and positive integers $a$ and $b$,
$$\left\lceil \frac{\lceil n/a \rceil}{b} \right\rceil = \left\lceil \frac{n}{ab} \right\rceil, \qquad \left\lfloor \frac{\lfloor n/a \rfloor}{b} \right\rfloor = \left\lfloor \frac{n}{ab} \right\rfloor.$$
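The nested-division identities are easy to spot-check with exact integer arithmetic. A sketch, where `ceil_div` is a small helper (introduced here for illustration) computing $\lceil n/d \rceil$ without floating point:

```python
def ceil_div(n, d):
    """⌈n/d⌉ for integers, d > 0, using only integer arithmetic."""
    return -(-n // d)

# Spot-check both identities over many (n, a, b) triples, including negative n.
for n in range(-200, 200):
    for a in range(1, 8):
        for b in range(1, 8):
            assert ceil_div(ceil_div(n, a), b) == ceil_div(n, a * b)
            assert (n // a) // b == n // (a * b)   # Python // is floor division
```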
Polynomials, Exponentials, Logarithms
A polynomial in $n$ of degree $d$ is a function $p(n) = \sum_{i=0}^{d} a_i n^i$ with $a_d \neq 0$. If $a_d > 0$, then $p(n) = \Theta(n^d)$. We say $f(n)$ is polynomially bounded if $f(n) = O(n^k)$ for some constant $k$.
The following limit captures the key comparison between exponentials and polynomials: for all real constants $a > 1$ and $b$, $$\lim_{n \to \infty} \frac{n^b}{a^n} = 0,$$ that is, $n^b = o(a^n)$: every exponential with base greater than $1$ outgrows every polynomial.
A symmetric statement holds for logarithms versus polynomials: for all constants $a > 0$ and $b$, $$\lim_{n \to \infty} \frac{\lg^b n}{n^a} = 0,$$ that is, $\lg^b n = o(n^a)$: every positive power of $n$ outgrows every power of the logarithm.
Taken together, these are the backbone of the algorithm designer's intuition:
constants $\prec$ logs $\prec$ polynomials $\prec$ exponentials $\prec$ factorials.
Growth Rate Reference Table
| Notation | Name | Example | Comment |
|---|---|---|---|
| $\Theta(1)$ | constant | array indexing | independent of $n$ |
| $\Theta(\log n)$ | logarithmic | binary search | base is irrelevant (constant factor) |
| $\Theta(n)$ | linear | linear search | touch each input once |
| $\Theta(n \log n)$ | linearithmic | merge sort | optimal comparison sort |
| $\Theta(n^2)$ | quadratic | insertion sort (worst) | nested loops over input |
| $\Theta(n^3)$ | cubic | naïve matrix multiply | triple-nested loops |
| $\Theta(2^n)$ | exponential | subset enumeration | intractable for large $n$ |
| $\Theta(n!)$ | factorial | brute-force TSP | grows faster than any exponential |
Logarithm Identities
The following identities are used without comment throughout CLRS. For $a > 0$, bases $b, c > 0$ with $b, c \neq 1$, and $n$ real:
$a = b^{\log_b a}$ • $\log_c(ab) = \log_c a + \log_c b$ • $\log_b a^n = n \log_b a$
$\log_b a = \dfrac{\log_c a}{\log_c b}$ • $\log_b(1/a) = -\log_b a$ • $\log_b a = \dfrac{1}{\log_a b}$
$a^{\log_b c} = c^{\log_b a}$
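All of these can be spot-checked numerically; a sketch using `math.log(x, base)` and a floating-point tolerance:

```python
import math

a, b, c, n = 3.0, 2.0, 5.0, 4.0
assert math.isclose(a, b ** math.log(a, b))                              # a = b^{log_b a}
assert math.isclose(math.log(a * b, c), math.log(a, c) + math.log(b, c)) # product rule
assert math.isclose(math.log(a ** n, b), n * math.log(a, b))             # power rule
assert math.isclose(math.log(a, b), math.log(a, c) / math.log(b, c))     # change of base
assert math.isclose(math.log(1 / a, b), -math.log(a, b))                 # reciprocal
assert math.isclose(math.log(a, b), 1 / math.log(b, a))                  # base swap
assert math.isclose(a ** math.log(c, b), c ** math.log(a, b))            # a^{log_b c} = c^{log_b a}
```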
Factorials — Stirling's Approximation
Stirling's approximation states $$n! = \sqrt{2\pi n}\left(\frac{n}{e}\right)^n \left(1 + \Theta\!\left(\tfrac{1}{n}\right)\right),$$ from which follow the comparisons $n! = o(n^n)$, $n! = \omega(2^n)$, and $\lg(n!) = \Theta(n \lg n)$.
Functional Iteration and $\log^*$
The notation $f^{(i)}(n)$ means $f$ applied $i$ times to $n$: $f^{(0)}(n) = n$, $f^{(i)}(n) = f(f^{(i-1)}(n))$ for $i \geq 1$. The iterated logarithm is
$$\lg^*(n) = \min\{\, i \geq 0 : \lg^{(i)}(n) \leq 1 \,\}.$$
That is, $\lg^* n$ counts how many times we must apply $\lg$ to $n$ before the result drops to at most $1$. This function grows unimaginably slowly:
| $n$ | $\lg^* n$ |
|---|---|
| $n \leq 1$ | $0$ |
| $n = 2$ | $1$ |
| $n \in \{3, 4\}$ | $2$ |
| $n \in [5, 16]$ | $3$ |
| $n \in [17, 65536]$ | $4$ |
| $n \in [65537, 2^{65536}]$ | $5$ |
For any conceivable physical input, $\lg^* n \leq 5$. This is why algorithms with running times such as $O(m \cdot \alpha(m, n))$ (where $\alpha$ is the inverse Ackermann function — even slower than $\lg^*$) are said to run in "effectively linear" time.
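A direct implementation of $\lg^*$ reproduces the table above; a sketch:

```python
import math

def lg_star(n):
    """Iterated logarithm: how many times lg must be applied before the result ≤ 1."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count
```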
Divide & Conquer
When an algorithm is recursive, its running time is naturally expressed as a recurrence — an equation defining $T(n)$ in terms of $T$ at smaller values. This chapter develops three techniques for solving such recurrences: the substitution method (guess and prove by induction), the recursion-tree method (sum work level-by-level to generate a guess), and the master method (apply a theorem). In practice one often uses the recursion tree to guess, then the substitution method to prove, or — when applicable — skips directly to the master theorem.
4.3 The Substitution Method
The substitution method has two steps:
- Guess the form of the solution.
- Use mathematical induction to find the constants and show that the solution works.
The name comes from substituting the guessed solution for the function when the inductive hypothesis is applied to smaller values.
Worked Example — Solving $T(n) = 2T(\lfloor n/2 \rfloor) + n$
We will show that $T(n) = O(n \lg n)$. (Observe that this is the merge-sort recurrence up to notation.)
Guess. $T(n) \leq c n \lg n$ for some constant $c > 0$ and all $n \geq n_0$.
Inductive step. Assume the bound holds for all values $m < n$ (with $m \geq n_0$). We show it holds for $n$. By the recurrence, $$T(n) = 2T(\lfloor n/2 \rfloor) + n.$$ Applying the inductive hypothesis to $\lfloor n/2 \rfloor$: $$T(n) \leq 2 \cdot c \lfloor n/2 \rfloor \lg(\lfloor n/2 \rfloor) + n \leq 2 \cdot c \cdot (n/2) \cdot \lg(n/2) + n = c n \lg(n/2) + n.$$ Since $\lg(n/2) = \lg n - 1$, $$T(n) \leq c n \lg n - c n + n = c n \lg n - (c - 1) n.$$ For the desired bound $T(n) \leq c n \lg n$, we need $(c - 1) n \geq 0$, i.e. $c \geq 1$.
Base case. We need base cases from which the induction can start. We cannot use $n = 1$: $\lg 1 = 0$, so the bound $c \cdot 1 \cdot \lg 1 = 0$ would force $T(1) = 0$. Instead take $n_0 = 2$ with both $n = 2$ and $n = 3$ as base cases: the recurrence sends $n = 2$ and $n = 3$ down to $\lfloor n/2 \rfloor = 1$, below $n_0$, so both must be verified directly, while every $n \geq 4$ stays at or above $2$. (This is why substitution proofs typically take the smallest two values as base cases.) Choosing $c = \max\!\left(1,\ \tfrac{T(2)}{2},\ \tfrac{T(3)}{3 \lg 3}\right)$ ensures $T(2) \leq c \cdot 2 \lg 2$ and $T(3) \leq c \cdot 3 \lg 3$, completing the argument.■
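The completed proof can be sanity-checked numerically. A sketch that evaluates the recurrence exactly, with an arbitrary base cost $T(1) = 1$ (an assumption for illustration), and tests the bound with the constant the proof produces:

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    """The recurrence T(n) = 2·T(⌊n/2⌋) + n with base cost T(1) = 1."""
    return 1 if n == 1 else 2 * T(n // 2) + n

# With T(1) = 1: T(2) = 4 and T(3) = 5, so c = max(1, T(2)/2, T(3)/(3·lg 3)) = 2.
c = 2
for n in range(2, 5000):
    assert T(n) <= c * n * math.log2(n)   # the proved bound T(n) ≤ c·n·lg n
```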
The Art of Making a Good Guess
There is no recipe — experience and the recursion-tree method (§4.4) are the two chief sources of guesses. When one recurrence resembles another whose solution is known, the known solution is a natural first guess. For example:
- $T(n) = 2T(\lfloor n/2 \rfloor + 17) + n$ looks like merge sort — guess $O(n \lg n)$; the $+17$ perturbation is asymptotically irrelevant.
- If loose bounds are known ($T = \Omega(n)$ and $T = O(n^2)$), work inward: first refine the lower bound to $\Omega(n \lg n)$, then the upper to $O(n \lg n)$.
Subtleties — The Need for Stronger Induction Hypotheses
Sometimes the obvious guess cannot be proved even though it is correct. Consider
$$T(n) = T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + 1.$$
We might guess $T(n) = O(n)$, and attempt to prove $T(n) \leq c n$. Then $$T(n) \leq c \lfloor n/2 \rfloor + c \lceil n/2 \rceil + 1 = c n + 1,$$ which is not $\leq c n$. The induction fails — but the guess is right! The fix is to strengthen the hypothesis by subtracting a lower-order term: guess $T(n) \leq c n - d$ for constants $c, d > 0$. Then
$$T(n) \leq (c \lfloor n/2 \rfloor - d) + (c \lceil n/2 \rceil - d) + 1 = c n - 2d + 1 \leq c n - d$$
as long as $d \geq 1$. The stronger hypothesis is easier to prove because it leaves more "slack" to absorb the additive $+1$.
Pitfalls
Never conclude $T(n) = O(n)$ from an argument of the form
$T(n) \leq 2(c \lfloor n/2 \rfloor) + n = c n + n = O(n)$.
This is wrong. The inductive step must produce exactly the form of the hypothesis $T(m) \leq c m$ — with the same constant $c$ — not merely some $O(n)$ expression. Constants must be pinned down explicitly.
4.4 The Recursion-Tree Method
In a recursion tree, each node represents the cost of a single subproblem in the set of recursive function invocations. We sum the costs within each level to obtain a per-level total, then sum those over all levels. The recursion tree is especially good at generating a guess that we can then verify by the substitution method.
Case Study — Merge Sort, Revisited
Recall $T(n) = 2T(n/2) + cn$ for $n > 1$ (with $c$ some constant), $T(1) = c$. Assume for exposition that $n$ is a power of $2$.
Level 0 (the root) does $cn$ work and has $1$ node.
Level 1 has $2$ nodes, each doing $c(n/2)$ work — total: $cn$.
Level 2 has $4$ nodes, each doing $c(n/4)$ work — total: $cn$.
Level $i$ has $2^i$ nodes, each doing $c(n/2^i)$ work — total: $cn$.
The recursion bottoms out at subproblem size $1$, which happens at level $\lg n$.
There are thus $\lg n + 1$ levels, each contributing $cn$ work: $$T(n) = cn (\lg n + 1) = cn \lg n + cn = \Theta(n \lg n).$$
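For powers of $2$ the level-by-level sum is exact, not merely asymptotic; a sketch checking $T(n) = cn(\lg n + 1)$ with $c = 1$ (an illustrative choice of constant):

```python
import math

def T(n, c=1):
    """Merge-sort recurrence T(n) = 2T(n/2) + c·n, T(1) = c, for n a power of 2."""
    return c if n == 1 else 2 * T(n // 2, c) + c * n

# The recursion tree predicts exactly c·n·(lg n + 1) at every power of 2.
for k in range(11):
    n = 2 ** k
    assert T(n) == n * (math.log2(n) + 1)
```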
Interactive Recursion Tree — Merge Sort
A Harder Example — $T(n) = 3T(n/4) + cn^2$
Let's use the recursion tree to guess a solution, then verify by substitution.
The root has cost $cn^2$ and $3$ children of size $n/4$; those children have $9$ grandchildren of size $n/16$; and so on. At level $i$ there are $3^i$ nodes, each with cost $c(n/4^i)^2 = c n^2 / 16^i$. The per-level cost is $$3^i \cdot \frac{c n^2}{16^i} = \left(\frac{3}{16}\right)^i c n^2.$$
The tree has depth $\log_4 n$ (since subproblem size $n/4^i = 1$ when $i = \log_4 n$). Summing the per-level costs: $$T(n) = \sum_{i=0}^{\log_4 n - 1} \left(\frac{3}{16}\right)^i c n^2 + \Theta(n^{\log_4 3}).$$ The final additive term is the total cost of the leaves: there are $3^{\log_4 n} = n^{\log_4 3}$ leaves, each costing $\Theta(1)$.
Since $3/16 < 1$, the geometric sum is bounded above by $\sum_{i=0}^{\infty} (3/16)^i c n^2 = \frac{c n^2}{1 - 3/16} = \frac{16}{13} c n^2$. Thus $T(n) = O(n^2)$. The leaf contribution $n^{\log_4 3} \approx n^{0.79}$ is dominated by $n^2$.
We verify by substitution: guess $T(n) \leq d n^2$. The recurrence gives $T(n) \leq 3 d (n/4)^2 + c n^2 = \tfrac{3}{16} d n^2 + c n^2 = (\tfrac{3}{16} d + c) n^2.$ We want this $\leq d n^2$, i.e. $\tfrac{3}{16} d + c \leq d$, i.e. $d \geq \tfrac{16}{13} c$. So choose $d = \tfrac{16}{13} c$ and the bound holds. Conclusion: $T(n) = O(n^2)$ — and a matching lower bound from the root term gives $T(n) = \Theta(n^2)$.
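The $\tfrac{16}{13}$ constant can be checked numerically. A sketch that evaluates the recurrence exactly, taking $c = 1$ and base cost $T(1) = 1$ (assumptions for illustration), and compares against $\tfrac{16}{13} n^2$ using exact rationals:

```python
from fractions import Fraction

def T(n):
    """T(n) = 3·T(n/4) + n² with T(1) = 1, for n a power of 4."""
    return 1 if n == 1 else 3 * T(n // 4) + n * n

d = Fraction(16, 13)
for k in range(9):
    n = 4 ** k
    assert T(n) <= d * n * n      # the substitution bound T(n) ≤ (16/13)·n²
```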
Unbalanced Splits — $T(n) = T(n/3) + T(2n/3) + cn$
Here subproblem sizes shrink at different rates. The longest root-to-leaf path goes through the $2/3$-children at each step; it reaches size $1$ when $(2/3)^k n = 1$, i.e. $k = \log_{3/2} n$. The shortest path goes through the $1/3$-children and has length $\log_3 n$. For a full tree (before any leaf is reached), each level sums to $cn$: a subproblem of size $m$ contributes $cm$, and the subproblems at each level partition the original $n$. So for at least the first $\log_3 n$ levels the per-level total is exactly $cn$. The upper bound is $T(n) = O(cn \log_{3/2} n) = O(n \lg n)$; a matching lower bound from counting the "full" part gives $T(n) = \Omega(n \lg n)$, so $T(n) = \Theta(n \lg n)$.
4.5 The Master Method
The master method is a cookbook for recurrences of the form $$T(n) = a T(n/b) + f(n), \qquad a \geq 1, \quad b > 1, \quad f(n) \text{ asymptotically positive}.$$ Here the problem of size $n$ is divided into $a$ subproblems of size $n/b$, and $f(n)$ is the cost of the divide and combine steps. (We interpret $n/b$ as either $\lfloor n/b \rfloor$ or $\lceil n/b \rceil$ throughout — the asymptotic answer is the same.)
- Case 1. If $f(n) = O(n^{\log_b a - \varepsilon})$ for some constant $\varepsilon > 0$, then $T(n) = \Theta(n^{\log_b a})$.
- Case 2. If $f(n) = \Theta(n^{\log_b a})$, then $T(n) = \Theta(n^{\log_b a} \lg n)$.
- Case 3. If $f(n) = \Omega(n^{\log_b a + \varepsilon})$ for some constant $\varepsilon > 0$, and if $a f(n/b) \leq c f(n)$ for some constant $c < 1$ and all sufficiently large $n$ (the regularity condition), then $T(n) = \Theta(f(n))$.
Interpretation
The watershed $n^{\log_b a}$ is the cost contributed by the leaves of the recursion tree: there are $a^{\log_b n} = n^{\log_b a}$ of them, each of $\Theta(1)$ cost.
- Case 1. $f$ is polynomially smaller than the watershed — leaves dominate, $T(n) = \Theta(n^{\log_b a})$.
- Case 2. $f$ matches the watershed up to constants — all levels are equal, add an extra $\lg n$ factor, $T(n) = \Theta(n^{\log_b a} \lg n)$.
- Case 3. $f$ is polynomially larger than the watershed and regular — the root dominates, $T(n) = \Theta(f(n))$.
Interactive Master Theorem Calculator
Enter $a$, $b$, and $f(n)$ (as $n^k$ or $n^k \lg n$) to have the cases mechanically applied.
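A non-interactive sketch of such a dispatcher, restricted to polynomial driving functions $f(n) = n^k$; the `master` helper below is hypothetical, written here only to make the case analysis concrete:

```python
import math

def master(a, b, k):
    """Dispatch T(n) = a·T(n/b) + Θ(n^k) by the master theorem.

    Handles only f(n) = n^k; the n^k·lg n forms and the gaps between
    cases need separate treatment."""
    w = math.log(a, b)                  # the watershed exponent log_b a
    if math.isclose(k, w):
        return f"Case 2: Θ(n^{w:g} · lg n)"
    if k < w:
        return f"Case 1: Θ(n^{w:g})"
    # k > w: regularity a·(n/b)^k = (a/b^k)·n^k holds automatically,
    # since k > log_b a means a/b^k < 1.
    return f"Case 3: Θ(n^{k:g})"
```

For instance, `master(2, 2, 1)` reports Case 2 (merge sort) and `master(7, 2, 2)` reports Case 1 (Strassen).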
Canonical Applications
Merge sort: $T(n) = 2T(n/2) + \Theta(n)$. Here $a = 2, b = 2$, so $\log_b a = 1$, watershed $n$. $f(n) = n = \Theta(n) = \Theta(n^{\log_b a})$. This is Case 2, giving $T(n) = \Theta(n \lg n)$. ✓
Binary search: $T(n) = T(n/2) + \Theta(1)$. Here $a = 1, b = 2$, so $\log_b a = 0$, watershed $n^0 = 1$. $f(n) = \Theta(1) = \Theta(n^0)$. Case 2: $T(n) = \Theta(\lg n)$. ✓
Strassen's matrix multiplication: $T(n) = 7T(n/2) + \Theta(n^2)$. Here $a = 7, b = 2$, $\log_b a = \lg 7 \approx 2.807$. $f(n) = n^2 = O(n^{\lg 7 - \varepsilon})$ for, say, $\varepsilon = 0.8$. Case 1: $T(n) = \Theta(n^{\lg 7})$. This is the celebrated $O(n^{2.81})$ matrix-multiplication bound.
$\log_b a = \log_4 3 \approx 0.793$, watershed $n^{0.793}$. $f(n) = n \lg n = \Omega(n^{0.793 + \varepsilon})$ for, say, $\varepsilon = 0.1$. We verify regularity: $a f(n/b) = 3 \cdot (n/4)\lg(n/4) = (3/4) n \lg n - (3/4) n \lg 4 \leq (3/4) n \lg n = (3/4) f(n)$ for large $n$, with $c = 3/4 < 1$. Case 3: $T(n) = \Theta(n \lg n)$.
$T(n) = 2T(n/2) + n \lg n$: here $\log_b a = 1$, watershed $n$. $f(n) = n \lg n$. Is $n \lg n = \Omega(n^{1+\varepsilon})$ for some $\varepsilon > 0$? No: $\lg n = o(n^\varepsilon)$ for every $\varepsilon > 0$. So we are in the gap between Cases 2 and 3, and the master theorem does not apply. A different technique — recursion tree or substitution — gives $T(n) = \Theta(n \lg^2 n)$.
Summary Table — When to Use Which Method
| Technique | Best for | Proves? |
|---|---|---|
| Substitution | Verifying a specific guess | Yes, with induction |
| Recursion tree | Generating a guess | Not rigorously; use substitution to prove |
| Master method | Recurrences of form $aT(n/b) + f(n)$ where $f$ has a clean form | Yes (when a case applies) |
Selected Exercises
Exercise 4.5-1 — Apply the master method
(a) $T(n) = 2T(n/4) + 1$. $\log_4 2 = 1/2$, watershed $\sqrt{n}$. $f(n) = 1 = O(n^{1/2 - \varepsilon})$ for $\varepsilon = 1/2$. Case 1: $T(n) = \Theta(\sqrt{n})$.
(b) $T(n) = 2T(n/4) + \sqrt{n}$. Watershed $\sqrt{n}$; $f = \sqrt{n} = \Theta(\sqrt{n})$. Case 2: $T(n) = \Theta(\sqrt{n} \lg n)$.
(c) $T(n) = 2T(n/4) + n$. Watershed $\sqrt{n}$; $f = n = \Omega(n^{1/2 + 1/2})$. Regularity: $2 \cdot (n/4) = n/2 = (1/2) f(n)$, with $c = 1/2 < 1$. Case 3: $T(n) = \Theta(n)$.
(d) $T(n) = 2T(n/4) + n^2$. Watershed $\sqrt{n}$; $f = n^2 = \Omega(n^{1/2 + 3/2})$. Regularity: $2(n/4)^2 = n^2/8 = (1/8) f(n)$. Case 3: $T(n) = \Theta(n^2)$.
Exercise 4.5-2 — Beat Strassen
Problem. Your algorithm multiplies two $3 \times 3$ matrices using $k$ scalar multiplications (and $O(n^2)$ work on submatrices). For which values of $k$ is your algorithm asymptotically faster than Strassen's?
The recurrence becomes $T(n) = k T(n/3) + \Theta(n^2)$. Strassen is $\Theta(n^{\lg 7}) \approx \Theta(n^{2.807})$. By the master theorem, if $\log_3 k > 2$ (i.e. $k > 9$), Case 1 gives $T(n) = \Theta(n^{\log_3 k})$, and we beat Strassen iff $\log_3 k < \lg 7$, i.e. $k < 3^{\lg 7} \approx 21.85$. So the interesting range is $k \in \{10, 11, \ldots, 21\}$. (Standard matrix multiplication uses $k = 27$. The best known $3 \times 3$ scheme, Laderman's, uses $k = 23$ scalar multiplications; since $\log_3 23 \approx 2.854 > \lg 7$, it does not beat Strassen, and whether any scheme with $k \leq 21$ exists remains open.)