Variable-Size-Decrease

Introduction

In the variable-size-decrease strategy, each step reduces the problem size by a non-fixed amount—often determined by a selection, partitioning, or transformation process that depends on the input itself. Unlike constant or constant-factor reductions, the size of the next subproblem is not predictable in advance. This variability makes analysis more nuanced, but many powerful algorithms—such as QuickSelect and Quicksort—rely on this approach to achieve optimal or near-optimal performance in practice.

Typically, these algorithms spend \(O(n)\) time on a partitioning or scanning step, followed by a recursive call on a subproblem of size \(k\), where \(k\) may range from 0 to \(n - 1\). In the best case, the reduction is large, leading to fast convergence; in the worst case, the progress is slow—sometimes resulting in poor time complexity unless additional strategies (like randomization or median-of-medians) are applied.

Example 5: Euclidean Algorithm (Variable-Size-Decrease)

The Greatest Common Divisor (GCD) problem asks for the largest integer that divides two positive integers \(a\) and \(b\) without leaving a remainder. If you are unfamiliar with GCD, make sure you read the GCD page. While a brute-force approach would check all numbers from \(\min(a,b)\) down to 1, there is a much faster method based on a variable-size-decrease strategy.

The Euclidean Algorithm for computing \(\gcd(a, b)\) works as follows:

  1. If \(b = 0\), return \(a\). (We are done.)
  2. Otherwise, recursively compute \(\gcd(b, a \bmod b)\).

To understand why this works, recall that any number that divides both \(a\) and \(b\) (where \(a\ge b\)) must also divide the remainder when \(a\) is divided by \(b\)—that is, \(a \bmod b\). This is because we can write \(a = bq + r\), where \(r = a \bmod b\), and any common divisor of \(a\) and \(b\) must also divide \(r\). So the set of common divisors of \((a, b)\) is the same as that of \((b, a \bmod b)\), and thus their greatest common divisor is the same.

Intuitively, the Euclidean Algorithm works by replacing the original problem with a "smaller but equivalent" one: instead of asking "what divides both \(a\) and \(b\)?", we ask "what divides both \(b\) and the leftover part of \(a\) after removing as many full \(b\)'s as possible?"

At each step, the size of the problem decreases from \((a,b)\) to \((b, a \bmod b)\), but the amount it decreases depends on the values of \(a\) and \(b\). In the worst case, it may only shrink slightly; in the best case, it decreases rapidly. This makes it a clear example of a variable-size-decrease algorithm.

Here's more formal pseudocode for the recursive version of the algorithm:

gcd(a, b):
    if b == 0:
      return a
    else:
      return gcd(b, a % b)

The following demonstration shows the algorithm in action.

Time Complexity: Let \(T(a,b)\) be the time to compute \(\gcd(a,b)\). Each recursive call performs a constant-time modulus operation and then recurses on \(\bigl(b,\;a \bmod b\bigr)\). In the worst case—when \((a,b)\) are consecutive Fibonacci numbers—this recursion makes only \(O(\log b)\) steps, so \[ T(a,b) \;=\; O(\log b) \;=\; O\bigl(\log \min(a,b)\bigr). \] Thus the Euclidean Algorithm runs in time logarithmic in the smaller of its two inputs, making it extremely efficient even for very large integers. (We realize we skimped on the details of this analysis. If you are really interested in understanding the details, you can find them in various other places including Euclidean Algorithm (Wikipedia).)

Example 6: Quickselect (Variable-Size-Decrease)

Quickselect solves the kth order statistic problem, which is simply to determine the \(k\)-th smallest element in a (presumably unsorted) array. It works as follows:

  1. Choosing a pivot and partitioning the array into \([\lt pivot \; | \; pivot \; | \gt pivot]\).
  2. Recursing only on the group that contains the \(k\)th element.

Here is a very simple demo showing a high-level view of Quickselect. It glosses over the details of the partition step, but it should give you a general idea of how the algorithm works.

Notice that once the algorithm has finished, the element at the desired index (in this case \(k=5\)) is in the proper location, elements smaller are to the left, and elements larger are to the right, but the array is not totally ordered. Quickselect orders it just enough to be able to put the \(k\)-th element in place.

Time Complexity: On average Quickselect runs in \(O(n)\) time (with careful pivot selection), though in the worst case it can be \(O(n^2)\).

This algorithm is a lot like Quicksort, except it only recurses on one part instead of both. Because there are subtleties that make this algorithm a little more complicated than the other examples in this section, we defer full details, incuding pseudocode, to the Quickselect page.

Algorithms Using This Technique

When to Use

Limitations

Variable-Size-Decrease may be less suitable in these scenarios:

Implementation Tips

Common Pitfalls

Real-World Applications

Summary & Key Takeaways

Reading Comprehension Questions

  1. Euclidean Algorithm Type
    Why is the Euclidean Algorithm classified as variable-size-decrease, and what is its worst-case time complexity in terms of its inputs?
  2. Definition Insight:
    What distinguishes a variable-size-decrease algorithm from decrease-by-a-constant or decrease-by-a-constant-factor?
  3. Quickselect Behavior:
    In Quickselect, why do we only recurse on one partition after choosing the pivot? What is the benefit of this?
  4. GCD Efficiency:
    Why is the Euclidean algorithm so efficient despite the variability in how much the input decreases each step?
  5. BST Height Impact:
    How does the height of a binary search tree impact the performance of search or insert operations in an unbalanced tree?
  6. Performance Spectrum:
    Why are variable-size-decrease algorithms sometimes closer to constant-decrease in the worst case but closer to constant-factor in the average case?

In-Class Activities

  1. Euclidean Algorithm Walkthrough:
    Compute \(\gcd(1071,462)\) by hand using the Euclidean algorithm. Write down each remainder step, and count how many recursive calls occur.
  2. Quickselect Simulation:
    Given the array [17, 32, 8, 24, 13, 29, 41, 19, 5, 11, 27] and \(k = 6\), simulate Quickselect with the first element as the pivot. Show each partitioning step, updated value of \(k\), and which subarray is recursed on.
  3. GCD Variability:
    Compare the number of Euclidean algorithm steps when computing \(\gcd(987, 610)\) and \(\gcd(996, 249)\). Explain why one takes more steps than the other.
  4. Height of BST:
    Construct a binary search tree by inserting [50, 25, 75, 10, 30, 60, 90, 5, 20, 27, 65, 85, 95] in order. Then construct one using [5, 10, 20, 25, 27, 30, 50, 60, 65, 75, 85, 90, 95]. Compare max depth in each.
  5. Pivot Choice Effects:
    Try Quickselect on the array [62, 71, 38, 54, 85, 23, 91, 33, 47, 19, 68, 75, 99] to find the 7th smallest element. Do this twice: once using the first element as the pivot, once using the true median. How many steps does each take?
  6. Pivot Behavior Exploration:
    Consider the array [42, 17, 63, 5, 89, 28, 34, 57, 71, 21, 46, 79]. Apply Quickselect with three pivot strategies to find the 6th smallest element:
    • (a) First element as pivot
    • (b) Last element as pivot
    • (c) Median-of-three (choose median of first, middle, last)
    For each case, record the partitions created and number of steps. Which strategy gives the most balanced partitioning?

Homework Problems

  1. Euclidean Algorithm Walkthrough:
    Compute \(\gcd(119, 34)\) by hand using the Euclidean algorithm. List each remainder step and determine how many recursive calls occur.
  2. Quickselect by Hand:
    Use Quickselect to find the 8th smallest number in the array [48, 15, 67, 34, 92, 10, 56, 74, 23, 29, 88, 12, 61]. Use the first element as the pivot at each step. Show the updated array, partition index, and range for the next recursive call at each step.
  3. Euclidean Edge Case:
    Compute \(\gcd(1597, 987)\) by hand. Count the number of recursive calls. (Hint: These are consecutive Fibonacci numbers.)
  4. Tree Performance:
    Insert the following numbers into a binary search tree in the given order: [500, 200, 800, 100, 300, 700, 900, 50, 150, 250, 350, 600, 750, 850, 950]. Draw the tree and compute the maximum depth, as well as the depth of each leaf node.
  5. Worst-Case Quickselect:
    Construct an array of 9 distinct elements such that Quickselect (using pivot-at-start) takes the maximum number of steps to find the 5th smallest. Explain why this array causes the worst-case behavior.