
Big-O Notation

Big-O notation describes how an algorithm's running time or space scales as input size grows. It is an upper bound on the growth rate, ignoring constants and lower-order terms. This page gives the formal definition, six worked examples, the relationship between Big-O / Big-Omega / Big-Theta, and the common mistakes that turn O-notation into a shibboleth instead of a tool.

Difficulty: Core
Tier: Tier 1
Module: Complexity Foundations
Languages: Python

Why This Matters

Big-O is the language algorithm engineers use to compare designs without running them. Two implementations of the same problem can differ by a constant factor on a benchmark and still differ by an order of magnitude on inputs the benchmark never reached. Big-O is the tool that catches that.

The notation is widely abused. Every interview asks for it; few candidates can state the formal definition. The mistakes compound: people quote $O(n \log n)$ for an algorithm that is in fact $\Theta(n^2)$ in the worst case, then act surprised when production traffic finds the worst case.

Formal Definition

A function $f(n)$ is in $O(g(n))$ if there exist positive constants $c$ and $n_0$ such that for all $n \ge n_0$,

$$0 \le f(n) \le c \cdot g(n).$$

In words: past some input size $n_0$, $f$ is at most a constant multiple of $g$. The notation is an upper bound on growth rate, not an exact rate. An algorithm that runs in $\Theta(n)$ time is also in $O(n)$, in $O(n^2)$, and in $O(2^n)$: Big-O does not promise tightness.
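To see the definition in action, here is a short witness argument (the specific function $f(n) = 3n + 10$ is an illustrative choice, not from the page):

```latex
f(n) = 3n + 10 \le 3n + n = 4n \quad \text{for all } n \ge 10,
```

so the constants $c = 4$ and $n_0 = 10$ satisfy the definition, and $3n + 10 \in O(n)$.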

Worked Examples

Constant time, $O(1)$. Accessing an array element by index. The cost does not depend on the array's length.

Logarithmic, $O(\log n)$. Binary search on a sorted array. Each comparison halves the search space, so the depth of the recursion is $\log_2 n$.
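A minimal iterative sketch of binary search in Python (function name and 0-indexed list convention are mine):

```python
def binary_search(a, target):
    """Return the index of target in sorted list a, or -1 if absent.
    Each iteration halves the interval [lo, hi], so the loop runs
    O(log n) times."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if a[mid] == target:
            return mid
        elif a[mid] < target:
            lo = mid + 1   # target is in the right half
        else:
            hi = mid - 1   # target is in the left half
    return -1
```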

Linear, O(n)O(n). Computing the sum of an array. One pass, one operation per element.

Linearithmic, $O(n \log n)$. Merge sort. The recursion tree has $\log n$ levels and each level does $n$ work to merge.
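A sketch of merge sort in Python illustrating the two pieces of the bound: halving (the $\log n$ levels) and the linear merge at each level.

```python
def merge_sort(a):
    """Sort a list in O(n log n) time: log n levels of halving,
    O(n) merge work per level."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    # Merge two sorted halves in O(n).
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out
```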

Quadratic, $O(n^2)$. A naive nested loop comparing all pairs. The number of pairs is $\binom{n}{2} = n(n-1)/2$, which is $\Theta(n^2)$.
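The nested-loop shape, sketched as a pair counter so the $n(n-1)/2$ count is visible directly:

```python
def count_pairs(items):
    """Visit every unordered pair with a nested loop.
    The inner body runs n(n-1)/2 times, i.e. Theta(n^2)."""
    count = 0
    n = len(items)
    for i in range(n):
        for j in range(i + 1, n):
            count += 1   # one comparison per pair (i, j)
    return count
```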

Exponential, $O(2^n)$. Enumerating all subsets of an $n$-element set. The set has $2^n$ subsets, so any algorithm that materializes them all is at least $\Omega(2^n)$.
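One way to materialize all $2^n$ subsets in Python, using each bitmask in $[0, 2^n)$ as a membership vector (the bitmask encoding is one common technique, not the only one):

```python
def all_subsets(items):
    """Enumerate all 2^n subsets of items. Bit i of the mask says
    whether items[i] is in the subset, so the output has exactly
    2^n entries -- any such enumeration is Omega(2^n)."""
    n = len(items)
    return [[items[i] for i in range(n) if (mask >> i) & 1]
            for mask in range(1 << n)]
```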

Big-O vs Big-Omega vs Big-Theta

Three related notations describe asymptotic growth at different bounds:

  • $f(n) \in O(g(n))$ means $f$ is at most a constant times $g$ asymptotically. Upper bound.
  • $f(n) \in \Omega(g(n))$ means $f$ is at least a constant times $g$ asymptotically. Lower bound.
  • $f(n) \in \Theta(g(n))$ means $f$ is both $O(g)$ and $\Omega(g)$. Exact rate up to constants.

Saying merge sort is "$O(n \log n)$" is correct but loose. Saying merge sort is "$\Theta(n \log n)$" is the tighter and more useful claim: it both upper- and lower-bounds the cost. Use $\Theta$ when you can prove it; use $O$ when you only have the upper bound.

Common Mistakes

Confusing average case with worst case. Quicksort with a random pivot is $\Theta(n \log n)$ in expectation. Its worst case is $\Theta(n^2)$. "Quicksort is $O(n \log n)$" is true on average and false in the worst case. Both facts matter; know which you are quoting.
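The worst case is easy to exhibit. A sketch that counts the comparisons made by a quicksort with a deliberately bad (first-element) pivot; on already-sorted input every partition is maximally unbalanced, so the count hits $n(n-1)/2$. The function and pivot choice are illustrative, not from the page.

```python
def quicksort_comparisons(a):
    """Count comparisons made by quicksort with a first-element pivot.
    On sorted input each partition peels off one element, so the
    total is (n-1) + (n-2) + ... + 1 = n(n-1)/2, i.e. Theta(n^2)."""
    if len(a) <= 1:
        return 0
    pivot = a[0]
    less = [x for x in a[1:] if x < pivot]
    rest = [x for x in a[1:] if x >= pivot]
    # len(a) - 1 comparisons against the pivot at this level.
    return (len(a) - 1) + quicksort_comparisons(less) + quicksort_comparisons(rest)
```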

Dropping constants when they matter. Big-O hides constants by definition. For two algorithms in $O(n)$, one with constant 3 and one with constant 30, the second is 10× slower at every input size. On a hot path, that constant decides whether your service meets its SLO. Big-O is for asymptotic comparison, not for picking between constant-factor variants.

Conflating time and space. Many algorithms have different time and space complexity. Hash-table lookup is $O(1)$ expected time but $O(n)$ space; merge sort is $O(n \log n)$ time and $O(n)$ space. State which dimension you are bounding.

Ignoring the input distribution. "$O(n^2)$ in the worst case" is a statement about adversarial inputs. If your traffic is structured (sorted arrays, sparse graphs, low entropy), the worst case may never occur. The right answer depends on what you actually run, not what an adversary could pick.

References

  • Cormen, Leiserson, Rivest, Stein. Introduction to Algorithms (4th ed., 2022). Ch. 3 — Growth of Functions.
  • Knuth. The Art of Computer Programming, Volume 1: Fundamental Algorithms (3rd ed.). §1.2.11.
