stackademic

The leading education platform for anyone with an interest in software development.

Binary Search Patterns

Apply binary search beyond sorted arrays to find boundaries and answers

Overview

Binary search repeatedly halves a search space, giving O(log n) time. Beyond finding an exact value in a sorted array, the real power is searching for a boundary: the first or last index that satisfies a condition. A related pattern, "binary search on the answer", searches a numeric range instead of an array whenever the feasibility test is monotonic.

Syntax / Usage

Keep a half-open interval [lo, hi) and shrink it based on a predicate. The invariant is that the answer always stays inside the current window, which avoids most off-by-one bugs.

def lower_bound(nums, target):   # O(log n) time, O(1) space
    lo, hi = 0, len(nums)        # half-open interval [lo, hi)
    while lo < hi:
        mid = (lo + hi) // 2     # no overflow risk in Python
        if nums[mid] < target:
            lo = mid + 1         # mid too small, discard left half
        else:
            hi = mid             # mid may be the answer, keep it
    return lo                    # first index where nums[i] >= target

Examples

Find the insertion point (first index not less than the target):

print(lower_bound([1, 3, 3, 5, 8], 3))  # 1  (first 3)
print(lower_bound([1, 3, 3, 5, 8], 4))  # 3  (would insert before 5)

Binary search on the answer: minimum ship capacity to move packages in d days.

def min_capacity(weights, days):     # O(n log(sum)) time
    def feasible(cap):
        need, load = 1, 0
        for w in weights:
            if load + w > cap:
                need, load = need + 1, 0
            load += w
        return need <= days
    lo, hi = max(weights), sum(weights)
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(mid):
            hi = mid
        else:
            lo = mid + 1
    return lo

print(min_capacity([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 5))  # 15

Common Mistakes

  • Mixing closed and half-open intervals, which produces off-by-one errors at the edges.
  • Using mid = (lo + hi) // 2 fine in Python, but overflow-prone in languages with fixed ints.
  • Forgetting to prove the predicate is monotonic before searching on the answer.
  • Looping with lo <= hi and inclusive hi, then updating bounds inconsistently.
  • Returning mid instead of a boundary when the problem wants the first/last match.

See Also

algorithms-searching algorithms-divide-and-conquer algorithms-two-pointers