Greedy Interval Scheduling

Design and Strategy

The goal is to select as many non-overlapping intervals as possible. A natural local rule is: always take the interval that finishes earliest. Intuitively, finishing sooner leaves the largest remaining “window” for whatever comes next. This is a classic Greedy move: make an irrevocable, locally optimal choice and never look back. The optimality of this greedy rule can be proven using a standard exchange argument.

Greedy-Choice (Exchange) Argument. Let E be the interval with the earliest finish time. Suppose some optimal schedule O does not start with E, but with a different interval X. Because E finishes no later than X, we can swap X out and start with E instead; the rest of O still fits (or fits at least as well) after E. Thus there exists an optimal schedule that begins with E. After taking E, the problem reduces to the same question on the intervals that start at or after finish(E). By induction on the number of remaining intervals, repeatedly taking the earliest-finishing interval is optimal.

Why not another rule? If the goal is to fit as many intervals as possible, why not pick the ones that start earliest, or the shortest ones? It turns out those seemingly natural ideas can block better options down the line. Choosing by earliest finish is the only rule that always leaves room for the rest, as the exchange argument confirms.

Algorithm outline. The idea is simple: sort the intervals by finish time, then sweep through them once. For each interval, ask whether it begins after the last selected one ends—if it does, include it; otherwise, skip it. Here is the algorithm in pseudocode:

Sort intervals by nondecreasing finish time (break ties arbitrarily or by later start).
Set \( currentEnd = -\infty \) and initialize an empty selection list.
For each interval \([\,start, end\,]\) in sorted order:
- If \( start \ge currentEnd \), select it and set \( currentEnd = end \).
- Otherwise, skip it.

Worked Example (step-by-step):

Intervals (unsorted):
\(\small A(6,10),\ B(0,6),\ C(5,9),\ D(2,13),\ E(5,7),\ F(3,5),\ G(12,14),\ H(8,12),\ I(8,11),\ J(1,4),\ K(3,8) \).

Sorted by finish time:
\(\small J(1,4),\ F(3,5),\ B(0,6),\ E(5,7),\ K(3,8),\ C(5,9),\ A(6,10),\ I(8,11),\ H(8,12),\ D(2,13),\ G(12,14) \).

\(J(1,4)\): \(1 \ge -\infty\) → select. \( currentEnd = 4 \).
\(F(3,5)\): \(3 < 4\) → overlaps → skip.
\(B(0,6)\): \(0 < 4\) → skip.
\(E(5,7)\): \(5 \ge 4\) → select. \( currentEnd = 7 \).
\(K(3,8)\): \(3 < 7\) → skip.
\(C(5,9)\): \(5 < 7\) → skip.
\(A(6,10)\): \(6 < 7\) → skip.
\(I(8,11)\): \(8 \ge 7\) → select. \( currentEnd = 11 \).
\(H(8,12)\): \(8 < 11\) → skip.
\(D(2,13)\): \(2 < 11\) → skip.
\(G(12,14)\): \(12 \ge 11\) → select. \( currentEnd = 14 \).

Selected set: \( \{E, G, I, J\} \) (4 intervals).

Tie-handling note. If two intervals share the same finish time, either is safe; picking the later-starting one can reduce “wasted” slack before it, but correctness does not depend on this tie-break.

This fits the greedy technique: we commit to the earliest-finishing interval at each step (a local optimum), and an exchange argument shows that doing so never harms the possibility of a globally optimal schedule. It can also be viewed as an instance of transform-and-conquer—the initial presorting of intervals by finish time transforms the data into an order where the greedy rule becomes both natural and provably optimal.

Implementation in Java, C++, Python

The core idea is the same across all three implementations: sort intervals by nondecreasing finish time, then sweep through them, keeping an interval whenever its start is at or after the end of the last chosen interval. Each version maintains a running variable \( \textit{currentEnd} \) (or \( \textit{current_end} \)) and a count of chosen intervals. These snippets intentionally return only the number of non-overlapping intervals, not the schedule itself; returning the actual intervals is left to the reader as a small extension. We assume valid inputs with \( \textit{start} \le \textit{end} \). Tie-handling in these implementations is not explicit: Java (\( \texttt{Arrays.sort} \) on \( \texttt{Comparable} \)) and Python (\( \texttt{list.sort} \)) are stable and will preserve the input order among equal finish times; C++ uses \( \texttt{std::sort} \), which is not stable, so equal-finish intervals may appear in any order.

Java: Defines an Interval class implementing Comparable<Interval> so that Arrays.sort orders intervals by finish time. A simple loop keeps an interval when \( \textit{start} \ge \textit{currentEnd} \), incrementing the count and updating \( \textit{currentEnd} \).
C++: Uses a struct Interval and a compare function on end, passed to std::sort for the presort. The greedy loop applies the same rule \( \textit{start} \ge \textit{currentEnd} \) to decide which intervals to keep.
Python: Represents each interval as a tuple \( (\,\textit{start},\,\textit{end}\,) \) and sorts with list.sort using a small key function \( \textit{end(iv)}=iv[1]\). This function simply returns the second point in the pair. The single-pass loop selects intervals using the same \( \textit{start} \ge \textit{current_end} \) test.

class Interval implements Comparable<Interval> {
  int start, end;
  Interval(int s, int e) { start = s; end = e; }

  @Override
  public int compareTo(Interval other) {
    return Integer.compare(this.end, other.end); // ascending by finish time
  }
}

int intervalSchedule(Interval[] intervals) {
  // Handle empty input early
  if (intervals.length == 0) return 0;
  // Sort by finish time (ascending)
  Arrays.sort(intervals);

  // Seed from the first (earliest-finishing) interval
  int count = 1;
  int currentEnd = intervals[0].end;
  for (int i = 1; i < intervals.length; i++) {
    Interval iv = intervals[i];
    if (iv.start >= currentEnd) {
      count++;
      currentEnd = iv.end;
    }
  }
  return count;
}

#include <algorithm>
#include <vector>

struct Interval {
  int start, end;
};

bool compare(const Interval& a, const Interval& b) {
  return a.end < b.end; // ascending by finish time
}

int intervalSchedule(std::vector<Interval>& intervals) {
  // Handle empty input early
  if (intervals.empty()) return 0;
  std::sort(intervals.begin(), intervals.end(), compare);
  int count = 1;                    // take the first (earliest-finishing) interval
  int currentEnd = intervals[0].end; // seed from its end time
  int n = intervals.size();
  for (int i = 1; i < n; ++i) {
    if (intervals[i].start >= currentEnd) {
      ++count;
      currentEnd = intervals[i].end;
    }
  }
  return count;
}

def end(iv):
    return iv[1]

def interval_schedule(intervals):
    # Handle empty input early
    if not intervals:
        return 0
    # Sort by finish time (ascending)
    intervals.sort(key=end)
    # Seed from the first (earliest-finishing) interval
    count = 1
    current_end = intervals[0][1]
    for start, end_ in intervals[1:]:
        if start >= current_end:
            count += 1
            current_end = end_
    return count

These implementations are intentionally minimal to emphasize the greedy structure and they only determine how many intervals can be scheduled. To return the schedule, record each selected interval (or its index) when chosen and return that list instead of just the count. This change does not affect asymptotic time or space complexity. If desired, you can also standardize tie-breaking (e.g., prefer later starts among equal finishes) without changing correctness.

Variations/Improvements

There is not much to improve upon with this algorithm, other than returning the intervals in the solution. However, there are a wide variety of scheduling problems.

Weighted Interval Scheduling: Assigns each interval a value or profit and seeks the subset with maximum total weight rather than the largest count. The greedy rule no longer works; a dynamic programming approach is required.
Interval Partitioning (Minimizing Resources): Flips the goal—given all intervals, minimize the number of machines or classrooms so that no overlaps share one. This version still uses sorting by start time but applies a different greedy rule.
Scheduling with Overlaps or Buffers: Some real systems permit limited overlaps or require setup gaps between intervals. These extensions often lead to priority-based or constraint-satisfaction formulations rather than pure greedy ones.
Scheduling with Release Times and Deadlines: Each job \( j \) has a release time \( r_j \) and a deadline \( d_j \); the task is to maximize the number of jobs completed within their available windows. This generalization often uses dynamic programming or specialized greedy heuristics.
Online Interval Scheduling: Intervals arrive one by one, and decisions must be made immediately. Algorithms are compared by their competitive ratio against an optimal offline schedule.
Multiple-Machine Scheduling: Extends the single-machine case to \( m \) identical machines, balancing load or maximizing total scheduled time. Greedy or heap-based strategies are common.

Reading Comprehension Questions

Greedy rule: What local decision does the algorithm make at each step, and which quantity does it maintain to support that decision?
Why it works: In one or two sentences, summarize the exchange argument that justifies choosing the earliest-finishing interval.
Transform-and-Conquer: Why can the greedy interval scheduling algorithm also be classified as a transform-and-conquer algorithm? What specific transformation simplifies the problem before applying the greedy rule?
Selection condition: Write the precise condition (in symbols) that determines whether the current interval is taken. What does equality mean in this context?
Complexities: State the overall time complexity and the extra space complexity when sorting is in place. How (if at all) do these change if we also return the actual selected intervals?
Ties & stability: If two intervals share the same finish time, does the choice between them affect correctness? Does the stability of the sorting routine matter here?
Counterexample instinct: Give a small example where sorting by start time and taking compatible intervals greedily produces a suboptimal set.
Edge case check: Suppose the last chosen interval ends at 10. For a candidate interval with start 10 and end 13, is it selected? What about start 9 and end 13? Explain briefly.

In-Class Activities

Hand Trace: Apply the algorithm step by step on 6–10 unsorted intervals, first writing them in their original order and then in finish-time order. Mark which ones are chosen.
Sorting Comparison: Run or simulate the algorithm twice—once sorting by start time and once by finish time—and explain why the results differ.
Code Exercise: Implement the greedy interval scheduler in a language of your choice. Then modify it to also return the selected intervals, not just their count.
Transform Step Focus: Discuss why the presorting step qualifies this as a transform-and-conquer algorithm. How does sorting make the greedy decision rule possible?
Tie Handling: Create an example with two or more intervals that share the same finish time. Try different tie-breaking rules and verify that the total count remains optimal.
Visual Timeline: Draw a horizontal timeline of all intervals and shade in the selected ones to illustrate how the algorithm skips overlapping regions.
Counterexample Design: Construct an instance where a greedy strategy based on earliest start or shortest duration fails to find the optimal schedule.
Real-World Modeling: Model a real scheduling scenario (classrooms, bandwidth, interview slots, etc.) as an interval problem. Identify what "overlap" and "finish time" represent in your context. Discuss whether or not your problem seems like it would be solvable by a greedy algorithm.
Greedy Rule Experiment: Invent your own scheduling rule (for example, choose the shortest interval first) and find a small example where it fails to be optimal.
Greedy Comparison: Compare Interval Scheduling to another greedy algorithm you’ve studied, such as Fractional Knapsack or Kruskal’s Minimum Spanning Tree. What makes each greedy choice "safe" to commit to?
Dual Problem Exploration: Using the same set of intervals, compute both the maximum number of compatible intervals (as in this chapter) and the minimum number of resources needed so that all intervals can run without overlapping. Here, a "resource" could mean a classroom, machine, or worker—each can handle only one interval at a time. The second problem (Interval Partitioning) asks for the fewest such resources required to schedule every interval, and it can be solved with a different greedy rule based on earliest start times. Compare how the two problems—maximizing usage vs. minimizing resources—mirror each other.

Homework Problems

Step-by-Step Selection: Apply the greedy algorithm to the following intervals and show which are selected, in order. Label each step clearly and indicate the value of \( \textit{currentEnd} \) after each selection.
\(\small A(1,4),\ B(3,5),\ C(0,6),\ D(5,7),\ E(3,9),\ F(5,9),\ G(6,10),\ H(8,11),\ I(8,12),\ J(2,14) \)
Proof of Correctness: Prove that always selecting the earliest-finishing interval yields an optimal schedule. Be explicit about your reasoning—state the inductive hypothesis or exchange argument clearly and justify each step.
Interval Partitioning: Given a set of intervals that must all be scheduled, design a greedy algorithm to minimize the number of resources (e.g., machines, classrooms) required so that no two overlapping intervals share one resource. Describe your algorithm precisely, give its time complexity, and explain how it differs from the one used for Interval Scheduling.
Transform Step Analysis: The algorithm begins with a presorting step that costs \(O(n \log n)\). Suppose instead the intervals arrive already sorted by finish time. What is the resulting time complexity, and what part of the algorithm dominates the cost now?
Algorithm Extension: Modify the greedy algorithm so it returns the list of selected intervals instead of just the count. Describe how much additional space and time this requires, and whether it changes the asymptotic complexity.
Dual Perspective: Explain how Interval Scheduling (maximizing one resource’s usage) and Interval Partitioning (minimizing resources for full coverage) represent two sides of the same underlying idea. Provide a short example showing how both algorithms behave on the same set of intervals.