Subset Sum (Exhaustive Search)

Design and Strategy

The exhaustive search algorithm for the Subset Sum problem generates and checks all possible subsets of the input set to determine whether any of them sum to a given target value \( T \). Since a set of size \( n \) has \( 2^n \) subsets, the algorithm examines each one in turn.

There are many ways of generating all subsets. Here we will describe an algorithm that involves using binary masks. The subsets are generated by iterating over all binary masks of length \( n \), where each bit indicates whether the corresponding element is included (1) or excluded (0) from the subset. For each mask, the algorithm constructs the corresponding subset, computes its sum, and checks whether the sum equals \( T \).

Example

Let the input set be \( S = \{2, 5, 8\} \) and the target sum \( T = 10 \).

There are \( 2^3 = 8 \) subsets, corresponding to binary masks from \( 000 \) to \( 111 \). Each bitmask selects a subset by indicating inclusion (1) or exclusion (0) for each element in low-to-high order, where:

Bit 0 (rightmost) corresponds to 2,
Bit 1 to 5,
Bit 2 to 8.

Index	Bitmask	Subset	Sum	Matches Target?
0	000	\(\{\}\)	0	No
1	001	\(\{2\}\)	2	No
2	010	\(\{5\}\)	5	No
3	011	\(\{2, 5\}\)	7	No
4	100	\(\{8\}\)	8	No
5	101	\(\{2, 8\}\)	10	Yes
6	110	\(\{5, 8\}\)	13	No
7	111	\(\{2, 5, 8\}\)	15	No

The subset \(\{2, 8\}\) has sum 10, which matches the target.

In general, the algorithm proceeds as follows:

For each integer \( m \) from \( 0 \) to \( 2^n - 1 \):
1. Interpret the binary representation of \( m \) as a subset mask over the \( n \) elements.
2. Construct the subset by including each element whose corresponding bit is 1.
3. Compute the sum of the elements in the subset.
4. If the sum equals the target \( T \), record the subset as a solution.
After generating all subsets, return the full list of solutions or determine that none exist.

Implementation in Java, C++, Python

These implementations generate all \( 2^n \) subsets using bitmasks and print any subset whose elements sum to the target value.

void subsetSum(int[] S, int T) {
    int n = S.length;
    int total = 1 << n; // 2^n
    for (int mask = 0; mask < total; mask++) {
        int sum = 0;
        List<Integer> subset = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            if ((mask & (1 << i)) != 0) {
                subset.add(S[i]);
                sum += S[i];
            }
        }
        if (sum == T) {
            System.out.println(subset);
        }
    }
}


void subsetSum(int S[], int n, int T) {
    int total = 1 << n; // 2^n
    int* subset = new int[n]; 
    for (int mask = 0; mask < total; mask++) {
        int sum = 0, k = 0;
        for (int i = 0; i < n; i++) {
            if (mask & (1 << i)) {
                subset[k++] = S[i];
                sum += S[i];
            }
        }
        if (sum == T) {
            cout << "{ ";
            for (int j = 0; j < k; j++) {
                cout << subset[j] << " ";
            }
            cout << "}" << endl;
        }
    }
}

def subset_sum(S, T):
    n = len(S)
    for mask in range(1 << n):
        subset = []
        total = 0
        for i in range(n):
            if mask & (1 << i):
                subset.append(S[i])
                total += S[i]
        if total == T:
            print(subset)

Time/Space Analysis

Time Complexity: An input set of size \( n \) has \( 2^n \) subsets. The algorithm loops through all bitmasks from \( 0 \) to \( 2^n - 1 \), and for each mask, it checks which elements are included and computes the sum of the corresponding subset. The inner loop runs in \( O(n) \) time per subset, so the total time complexity is \(O(n \cdot 2^n).\) This is optimal for exhaustive enumeration, since there are \( 2^n \) subsets and each one may contain up to \( n \) elements.

Space Complexity: The algorithm uses \( O(n) \) extra space to store the current subset being considered. No recursion is used, and only a few additional local variables. The total space used is \(O(n)\). If the algorithm also stores all matching subsets, then the output space could be as large as \( O(k \cdot n) \), where \( k \) is the number of solutions.

Variations/Improvements

Dynamic Programming: A pseudo-polynomial solution runs in \(O(n \times T)\) time and \(O(T)\) space by building a table of reachable sums. Efficient when the target \(T\) is not too large.
Meet-in-the-Middle: Split the input into two halves, generate all subset sums of each half, and look for complementary pairs. This reduces time complexity to \(O(2^{n/2})\) with additional sorting or hashing.
Backtracking and Pruning: Use recursion with early exits: stop exploring a subset if its sum exceeds \(T\), or if the remaining elements cannot bridge the gap. Helps in practice but does not improve worst-case complexity.
Special Cases: Subset Sum is a special case of the 0-1 Knapsack problem (with equal weights and values), and also relates to the Partition problem. Techniques for these problems may apply.

Reading Comprehension Questions

How many subsets are there of a set with \( n \) elements? How is this related to bitmasks?
What does each bit in the binary representation of a number represent in this algorithm?
Why does the algorithm need to examine every subset in the worst case?
What does the inner loop do for each bitmask?
How is a subset’s sum computed, and how is it compared to the target?
What is the overall time complexity of the algorithm, and where does each part come from?
What is the space complexity of the algorithm if we only store the current subset?
What would happen if the input list had duplicate numbers? Would the algorithm still work correctly?

In-Class Activities

Bitmask Mechanics: Let \( S = \{4, 7, 9\} \), so \( n = 3 \). There are 8 bitmasks from 000 to 111. For each value of mask from 0 to 7:
- Write its 3-bit binary representation.
- For each index \( i \) from 0 to 2, evaluate (mask & (1 << i)) and explain whether element \( S[i] \) is included.
- List the resulting subset.
What pattern do you notice? How does the expression (mask & (1 << i)) relate to inclusion of each element?
Manual Tracing: Manually trace the bitmask algorithm on the input set \( S = \{1, 3, 4, 7\} \) with target \( T = 8 \). List all 16 subsets and mark which ones are solutions.
Demo Exploration: Use the interactive demo to find all solutions for \( S = \{2, 5, 8, 11\} \) and target \( T = 16 \). Are there a lot or a few solutions, relative to the total number of subsets?
Empty and Oversized Sets: Try running the algorithm when all elements are larger than the target (e.g., \( S = \{10, 12, 15\} \), \( T = 5 \)). What happens and why?
Fixed-Size Subsets: Modify the algorithm or tracing process to only consider subsets of size 3. How many such subsets are there? Do any sum to your chosen target? What are the advantages and disadvantages to an approach like this?
Product-Based Search: Design a brute-force algorithm similar to Subset Sum to find if any subset of numbers has product exactly \( P \). What changes must be made?
Count Instead of Print: Modify the bitmask-based code so that it counts how many subsets sum to the target, rather than printing them.
Scalability Calculation: Suppose the algorithm runs in \( 5n 2^n \) steps for input size \( n \), and assume a computer can process 100 million steps per second.
- About how long (in seconds, hours, or days) would it take to solve the problem for \( n = 25 \)? What about \( n = 30 \)?
- What is the largest value of \( n \) that could reasonably be solved in:
  - 1 second?
  - 1 hour?
  - 1 day?
  - 1 year?
- What does this tell you about the feasibility of brute-force approaches for large input sizes?
Compare Variants: Compare the bitmask-based algorithm to a recursive inclusion–exclusion version. What are the tradeoffs in clarity, performance, and memory usage? What other ideas can you come up with to enumerate all of the subsets?

Homework Problems

Empirical Runtime Study: Implement the bitmask-based subset sum algorithm and measure how long it takes to run for increasing values of \( n \). Try inputs from \( n = 3 \) up to the largest size that completes in under 2 hours on your computer. Plot your results on a graph and describe the growth rate.
Count Matches: Modify your implementation so that it counts how many subsets match the target sum, instead of printing them. Test it on several small inputs and compare your counts with expectations.
Subset Product: Adapt the subset sum algorithm to decide whether any subset multiplies to a target product \( T \). Handle edge cases like 0 and 1 carefully. Test with small sets and analyze any difficulties.
Fixed-Size Subsets: Modify the bitmask algorithm to only consider subsets of size exactly \( k \). Use this to determine whether any \( k \)-element subset sums to a given target. Try different values of \( k \). Give the computational complexity of your algorithm.
Subset of Size 3: Write a brute-force algorithm to find whether any 3-element subset sums to the target value. How is this simpler than the general case? Give the computational complexity of your algorithm.
Weight-Constrained Subsets: Suppose each item has a weight and a value. Given a weight limit \( W \), check whether any subset has total weight at most \( W \) and value at least \( V \). Use a brute-force approach. Give the computational complexity of your algorithm.
Max Solvable Input Size: Assume your algorithm runs in \( 5n \cdot 2^n \) steps and your machine processes 100 million steps/second. Estimate the largest value of \( n \) solvable in 1 second, 1 hour, 1 day, 1 month, and 1 year. Show your work.
Subset Closest to Target: Write an exhaustive search algorithm to find the subset of a given set whose sum is closest to, but does not exceed, a target value \( T \). If multiple subsets tie, return any one. Explain how your approach is similar to the subset sum algorithm. Give the computational complexity of your algorithm.
Subsets with Bounded Range: Design an exhaustive search algorithm that finds all non-empty subsets of a given set \( S \) whose range (maximum minus minimum value) is less than or equal to a given threshold \( d \). For each valid subset, print the elements and their range. Give the computational complexity of your algorithm.