Dynamic Arrays
An array is an ordered list of elements stored sequentially in memory, without gaps. Arrays are convenient because you can easily access any element by index in time.
Arrays come in two flavors: fixed-size arrays and dynamic arrays.
With fixed-size arrays, we set the size upfront, and when they are full, they cannot accommodate more data. In contrast, dynamic arrays can expand or contract to accommodate a changing number of elements. High-level languages like Python and JavaScript typically come with dynamic arrays so that you do not need to worry about resizing.
Why Learn This?
Understanding how dynamic arrays work under the hood has multiple benefits:
- Interview Readiness: Directly relevant to data structure design questions.
- Algorithmic Techniques: Teaches reusable ideas like amortization.
- Intuitive Big O: The Big O analysis for operations becomes intuitive rather than something tedious to memorize.
Implementation Details
Let's look at how we might implement a dynamic array using only fixed-size arrays.
The Challenge: Resizing
The challenging case is when the underlying fixed-size array is full. We call this process resizing.
If we simply grew the array by 1 slot every time it filled up, every append operation would take time to copy elements. This is too slow.
A better approach is to double the capacity, ending up with a half-empty fixed-size array after resizing.
This hits a sweet spot where we rarely need to do expensive resizings while also not wasting too much space.
Code Structure
We track two key properties:
size: The number of actual elements.capacity: The number of "slots" in the underlying fixed-size array.
class DynamicArray:
def __init__(self):
self.capacity = 10
self.size = 0
self.fixed_array = [None] * self.capacity
def append(self, x):
if self.size == self.capacity:
self._resize(self.capacity * 2)
self.fixed_array[self.size] = x
self.size += 1
def _resize(self, new_capacity):
new_fixed_array = [None] * new_capacity
for i in range(self.size):
new_fixed_array[i] = self.fixed_array[i]
self.fixed_array = new_fixed_array
self.capacity = new_capacity
Shrinking Strategy
When removing elements (pop_back), when should we shrink the array to save memory?
- Shrink at 50%? Bad idea. If we hover around 50%, we might constantly resize and shrink (thrashing).
- Shrink at 25%? Good idea. If we halve the capacity when the array is 25% full, the underlying array will be 50% full after shrinking. This avoids excessive shrinking while not wasting too much space.
Amortized Time Analysis
What does "amortized" mean?
Amortized runtime is the average runtime per operation over the course of the entire program, not just a single operation.
Appending to a dynamic array has poor worst-case performance (when resizing happens, it takes ), but excellent amortized performance ().
Why? Imagine starting with an empty array and appending elements.
- The final resize copied at most elements.
- The one before that copied .
- The one before that , and so on.
- Summing all copies: .
- Total work for operations is .
- Therefore, average work per operation is .
Extra Operations
It's important to distinguish between operations at the end of the array vs. arbitrary indices.
-
pop(i): Removes element at indexi.- Time: . We must shift all elements after
ione slot to the left to close the gap. - Note: If you need to pop from both ends often, consider a Deque (Double-Ended Queue).
- Time: . We must shift all elements after
-
insert(i, x): Insertsxat indexi.- Time: . We must shift all elements from
ionwards one slot to the right.
- Time: . We must shift all elements from
-
contains(x): Checks ifxexists.- Time: . In an unordered array, we must scan linearly.
Key Takeaways
| Operation | Time Complexity | Notes |
|---|---|---|
get(i) | Constant access time | |
set(i, x) | Constant update time | |
append(x) | (amortized) | worst case (resize) |
pop_back() | (amortized) | worst case (shrink) |
pop(i) | Requires shifting elements | |
insert(i, x) | Requires shifting elements | |
contains(x) | Linear scan required |
Covering the applications of dynamic arrays in a single chapter would be impossible. Instead, we'll use them throughout the rest of the journey.