Why This Matters
The array is the simplest data structure and the foundation of almost every more complex one. Hash tables are arrays of buckets. Heaps are arrays interpreted as binary trees. Strings in C are arrays of bytes. Stacks and queues backed by ring buffers are arrays. Understanding why arrays work — at the hardware level, not just the API level — is the prerequisite for understanding every data structure that follows.
The pointer is the simplest abstraction over memory. Once a beginner internalizes that "this variable holds an address, not a value," half of systems programming becomes accessible.
For the lower-level computation lane, use ComputationPath's pointers and addresses, cache hierarchy, and memory allocation. DSAPath uses those facts to reason about data-structure costs.
Memory Layout
An array of n elements, each of size s bytes, starting at address b, occupies the contiguous bytes from b to b + n*s - 1. The address of element i is b + i*s.
This is one multiplication and one addition. On a modern CPU it is a single addressing-mode operand to a load instruction. The cost is constant in i; that is the entire reason for the O(1) random-access claim.
Compare this to a linked list, where reaching element i requires i pointer dereferences, and each dereference may stall the CPU waiting for memory. Linked lists are O(i) to access element i, and the constant is large.
Cache Locality
Modern CPUs read memory in cache-line-sized chunks (typically 64 bytes). When you read array element i, the CPU also reads its neighbors up to the line boundary for free. Sequential iteration over an array is therefore much faster per element than the same number of accesses to scattered addresses.
This is why a sequential O(n) scan over an array often beats an O(log n) search in a binary search tree on small inputs: the scan's asymptotic complexity is worse, but the constant factor (cache misses) wins.
Pointers as Addresses
A pointer is a fixed-width integer (typically 64 bits on a modern OS) holding a memory address. The expressions *p (dereference) and &x (address-of) are inverses: *&x is x, and &*p is p whenever p is a valid pointer.
The arithmetic p + i advances the pointer by elements (not bytes). The compiler scales the offset by the size of the type. This is why arr[i] and *(arr + i) are equivalent in C: the array name decays to a pointer, and the index does the multiplication implicitly.
Common Mistakes
Off-by-one at the array boundary. Indexing arr[n] on an array of length n reads one element past the end. In C this is undefined behavior; in JavaScript it returns undefined; in Python it raises IndexError. Most security CVEs in systems software are some flavor of this: buffer overruns, off-by-one in length checks, signed/unsigned comparison bugs.
Confusing pointer-to-array with pointer-to-element. In C, int *p and int (*p)[10] are different types. The first points at one int; the second points at an array of ten ints. Pointer arithmetic on them advances by different strides.
Treating linked lists as drop-in replacements. Linked lists trade O(1) insertion and deletion at a known position for O(n) access by index and worse cache behavior. They are the right structure when you insert and delete in the middle and never index. They are the wrong structure for almost everything else.
References
- Cormen, Leiserson, Rivest, Stein. Introduction to Algorithms (4th ed., 2022). §10.1 — Arrays.
- Hennessy & Patterson. Computer Architecture: A Quantitative Approach (6th ed., 2017). Ch. 2 — Memory hierarchy and cache locality.
- Bryant & O'Hallaron. Computer Systems: A Programmer's Perspective (3rd ed., 2015). Ch. 6 — The Memory Hierarchy.
Next Topics
- Binary search — the cleanest application of O(1) random access on a sorted array.
- Big-O notation — back to the asymptotic-analysis foundation.
- Linked lists — the contrasting pointer-heavy linear structure.
- Hash tables — arrays plus hashing and collision handling.