Skip to content

Data Structures & Algorithms

DSA II as the structures that keep data ordered and findable.

This page picks up where DSA I left off and only covers what is in my DSA II notes: binary search trees and their traversals, BST removal and SkipLists, binary heaps behind the priority queue, and HashMaps with collisions and load factor.

A Java and Big-O review carried over from DSA I: Iterable/Iterator, Comparable vs Comparator, and worst-case growth
Binary search trees: the left < node < right invariant, and how insertion order decides O(log n) vs a degenerate O(n) chain
BST removal across the leaf, one-child, and two-child cases, plus SkipLists reaching O(log n) by coin flips
Binary heaps as an array-backed priority queue: O(1) peek, O(log n) add and remove
HashMaps: hashing a key to a bucket, then resolving collisions by chaining or probing under a load factor

learning map

  1. Module 4 · BST

    Binary Search Trees

    A BST is a binary tree with one ordering rule: for every node, everything in the left subtree is smaller and everything in the right is larger. That invariant is what turns a tree into something you can search in O(log n).

  2. Module 5 · BST ops + SkipLists

    Removing from a BST, and SkipLists

    I worked through the three BST removal cases: leaf, one-child, and the tricky two-child case that swaps in an in-order successor (the GT course convention). Then SkipLists, which reach the same O(log n) by stacking sorted linked lists and promoting nodes with coin flips.

  3. Module 6 · Heaps

    Heaps and the Priority Queue

    Heaps are the BST cousin in the binary-tree family: a complete tree that pins the min (or max) at the root for O(1) peek and O(log n) add/remove. The order property is parent-vs-child only, so siblings stay unordered, and a plain 1-indexed array with i/2, 2i, 2i+1 arithmetic backs the whole thing without pointers.

  4. Module 7 · HashMaps

    HashMaps: O(1) average by hashing

    This is where I stopped climbing a comparison tree and started computing the index straight from the key. The whole module is really about one unavoidable consequence, collisions, and the strategies for handling them.

visual lab

BST traversals

A fixed BST, built by inserting 13, 7, 29, 5, 11, 19. Pick a traversal and step through the visit order.

1/6

Left, node, right. On a BST this always comes out sorted: 5, 7, 11, 13, 19, 29.

1375112919
Visit order
5

Built by inserting: 13, 7, 5, 11, 29, 19

BST removal (two-child)

Removing 4 (the root) from a 7-node BST using the in-order successor, the GT course convention.

1/6
4target261357

1. Start at the root: 4 is the target, so the node to remove is the root itself.

Min-heap sift

A 1-indexed min-heap. Add swims a value up; remove-min sinks the last leaf down. The array and tree stay in sync.

Add values to watch them swim up; remove the min to watch the last leaf sink.

Backing array (1-indexed)
0 empty
empty
Complete tree

empty

HashMap collisions

index = key mod capacity. Keys 5, 12, 19 all hash to bucket 5. Watch chaining vs linear probing resolve it, and a resize when the load factor crosses 0.75.

Insert keys to watch collisions resolve. Each key is its own hash; index = key mod capacity.

Load factor: 0 / 7 = 0.00

0 empty
1 empty
2 empty
3 empty
4 empty
5 empty
6 empty

concept notes

The BST ordering invariant

For every node, all data in the left subtree is less than the node and all data in the right is greater. At each step you can discard half the candidates, which is what makes O(log n) search possible.

Module 4 · BST

SkipList express lanes

A SkipList stacks sorted linked lists; level 0 has every element and each insert flips a coin to promote a node up a level, P(level i) = (1/2)^i. Search moves right while the next node is smaller than the target and drops down when it would overshoot. The upper levels skip over data for O(log n) on average.

Module 5 · SkipLists

The heap is just an array

Because a heap is a complete binary tree, it fits a 1-indexed array with no gaps: the node at i has parent i/2 and children 2i and 2i+1. Completeness is exactly what makes that index arithmetic always land on a real node.

Module 6 · Heaps

Load factor and resize

Load factor = size / capacity, and crossing the threshold (typically 0.67 to 0.75) triggers a resize: a fresh, larger table where every entry is rehashed with the new capacity, an O(n) operation. Lower thresholds mean fewer collisions but more memory.

Module 7 · HashMaps

test yourself

Module 4 · BST

Binary Search Trees

What is the BST ordering invariant?

For every node, all data in its left subtree is less than the node and all data in its right subtree is greater. It holds recursively at every node, not just the root.

Why are BST operations O(log n) on average but O(n) in the worst case?

On a balanced tree each comparison drops half the remaining nodes, so you walk a path of height about log n. But shape depends on insertion order. Inserting already-sorted values builds a one-sided chain that behaves like a linked list, so it degrades to O(n).

Module 5 · BST ops + SkipLists

Removing from a BST, and SkipLists

Removing a BST node with two children: why not just detach it, and what does the course do instead?

Detaching would drop a whole subtree. Instead you keep the node position and overwrite only its data with a replacement, then remove that replacement (which has 0 or 1 child). The course uses the in-order successor: step right once, then go all the way left.

How does a SkipList get O(log n) expected search from plain linked lists?

It stacks sorted lists; level 0 holds everything and each higher level keeps a coin-flip-promoted subset, P(level i) = (1/2)^i. Upper levels are express lanes: walk right until the next node would overshoot, then drop down, which halves the search space like binary search.

Module 6 · Heaps

Heaps and the Priority Queue

In a 1-indexed array heap, how do you reach the parent, left child, and right child of index i?

parent = i/2 (integer floor), left child = 2*i, right child = 2*i + 1. Index 0 is left null, so the root sits at index 1 and every move is plain arithmetic, with no node pointers.

Why does add bubble up while removeMin bubbles down?

add appends at the next leaf and swims up against its parent only, until it is no longer smaller. removeMin moves the last element to the root, then sinks down, each level swapping with the smaller of the two children. Each walks one path in one direction.

Module 7 · HashMaps

HashMaps: O(1) average by hashing

What are the two families of collision resolution?

Closed addressing keeps colliding keys at the original index in a chain. That is External or Separate Chaining, a linked list per bucket. Open addressing puts them elsewhere via probing: linear, quadratic, or double hashing. The notes lean hardest on External Chaining.

What is the load factor, and what happens when it crosses the threshold?

Load factor = size / capacity. When it crosses the threshold (typically 0.67 to 0.75), the table resizes: a larger backing array, and every live entry is rehashed with the new capacity. Resize is O(n).

enko