LeetCode #1531 — HARD

String Compression II

Break down a hard problem into reliable checkpoints, edge-case handling, and complexity trade-offs.

Solve on LeetCode
The Problem

Problem Statement

Run-length encoding is a string compression method that works by replacing consecutive identical characters (repeated 2 or more times) with the concatenation of the character and the number marking the count of the characters (length of the run). For example, to compress the string "aabccc" we replace "aa" by "a2" and replace "ccc" by "c3". Thus the compressed string becomes "a2bc3".

Notice that in this problem, we are not adding '1' after single characters.

Given a string s and an integer k. You need to delete at most k characters from s such that the run-length encoded version of s has minimum length.

Find the minimum length of the run-length encoded version of s after deleting at most k characters.

Example 1:

Input: s = "aaabcccd", k = 2
Output: 4
Explanation: Compressing s without deleting anything will give us "a3bc3d" of length 6. Deleting any of the characters 'a' or 'c' would at most decrease the length of the compressed string to 5, for instance delete 2 'a' then we will have s = "abcccd" which compressed is abc3d. Therefore, the optimal way is to delete 'b' and 'd', then the compressed version of s will be "a3c3" of length 4.

Example 2:

Input: s = "aabbaa", k = 2
Output: 2
Explanation: If we delete both 'b' characters, the resulting compressed string would be "a4" of length 2.

Example 3:

Input: s = "aaaaaaaaaaa", k = 0
Output: 3
Explanation: Since k is zero, we cannot delete anything. The compressed string is "a11" of length 3.

Constraints:

  • 1 <= s.length <= 100
  • 0 <= k <= s.length
  • s contains only lowercase English letters.
Patterns Used

Roadmap

  1. Brute Force Baseline
  2. Core Insight
  3. Algorithm Walkthrough
  4. Edge Cases
  5. Full Annotated Code
  6. Interactive Study Demo
  7. Complexity Analysis
Step 01

Brute Force Baseline

Problem summary: Run-length encoding is a string compression method that works by replacing consecutive identical characters (repeated 2 or more times) with the concatenation of the character and the number marking the count of the characters (length of the run). For example, to compress the string "aabccc" we replace "aa" by "a2" and replace "ccc" by "c3". Thus the compressed string becomes "a2bc3". Notice that in this problem, we are not adding '1' after single characters. Given a string s and an integer k. You need to delete at most k characters from s such that the run-length encoded version of s has minimum length. Find the minimum length of the run-length encoded version of s after deleting at most k characters.

Baseline thinking

Start with the most direct exhaustive search. That gives a correctness anchor before optimizing.

Pattern signal: Dynamic Programming

Example 1

"aaabcccd"
2

Example 2

"aabbaa"
2

Example 3

"aaaaaaaaaaa"
0

Related Problems

  • String Compression III (string-compression-iii)
Step 02

Core Insight

What unlocks the optimal approach

  • Use dynamic programming.
  • The state of the DP can be the current index and the remaining characters to delete.
  • Having a prefix sum for each character can help you determine for a certain character c in some specific range, how many characters you need to delete to merge all occurrences of c in that range.
Interview move: turn each hint into an invariant you can check after every iteration/recursion step.
Step 03

Algorithm Walkthrough

Iteration Checklist

  1. Define state (indices, window, stack, map, DP cell, or recursion frame).
  2. Apply one transition step and update the invariant.
  3. Record answer candidate when condition is met.
  4. Continue until all input is consumed.
Use the first example testcase as your mental trace to verify each transition.
Step 04

Edge Cases

Minimum Input
Single element / shortest valid input
Validate boundary behavior before entering the main loop or recursion.
Duplicates & Repeats
Repeated values / repeated states
Decide whether duplicates should be merged, skipped, or counted explicitly.
Extreme Constraints
Largest constraint values
Re-check complexity target against constraints to avoid time-limit issues.
Invalid / Corner Shape
Empty collections, zeros, or disconnected structures
Handle special-case structure before the core algorithm path.
Step 05

Full Annotated Code

Source-backed implementations are provided below for direct study and interview prep.

// Accepted solution for LeetCode #1531: String Compression II
class Solution {
    public int getLengthOfOptimalCompression(String s, int k) {
        // dp[i][k] := the length of the optimal compression of s[i..n) with at most
        // k deletion
        dp = new int[s.length()][k + 1];
        Arrays.stream(dp).forEach(A -> Arrays.fill(A, K_MAX));
        return compression(s, 0, k);
    }

    private static final int K_MAX = 101;
    private int[][] dp;

    private int compression(final String s, int i, int k) {
        if (k < 0) {
            return K_MAX;
        }
        if (i == s.length() || s.length() - i <= k) {
            return 0;
        }
        if (dp[i][k] != K_MAX) {
            return dp[i][k];
        }
        int maxFreq = 0;
        int[] count = new int[128];
        // Make letters in s[i..j] be the same.
        // Keep the letter that has the maximum frequency in this range and remove
        // the other letters.
        for (int j = i; j < s.length(); ++j) {
            maxFreq = Math.max(maxFreq, ++count[s.charAt(j)]);
            dp[i][k] = Math.min(
                dp[i][k], getLength(maxFreq) + compression(s, j + 1, k - (j - i + 1 - maxFreq)));
        }
        return dp[i][k];
    }

    // Returns the length to compress `maxFreq`.
    private int getLength(int maxFreq) {
        if (maxFreq == 1) {
            return 1; // c
        }
        if (maxFreq < 10) {
            return 2; // [1-9]c
        }
        if (maxFreq < 100) {
            return 3; // [1-9][0-9]c
        }
        return 4; // [1-9][0-9][0-9]c
    }
}
Step 06

Interactive Study Demo

Use this to step through a reusable interview workflow for this problem.

Press Step or Run All to begin.
Step 07

Complexity Analysis

Time
O(n × m)
Space
O(n × m)

Approach Breakdown

RECURSIVE
O(2ⁿ) time
O(n) space

Pure recursion explores every possible choice at each step. With two choices per state (take or skip), the decision tree has 2ⁿ leaves. The recursion stack uses O(n) space. Many subproblems are recomputed exponentially many times.

DYNAMIC PROGRAMMING
O(n × m) time
O(n × m) space

Each cell in the DP table is computed exactly once from previously solved subproblems. The table dimensions determine both time and space. Look for the state variables — each unique combination of state values is one cell. Often a rolling array can reduce space by one dimension.

Shortcut: Count your DP state dimensions → that’s your time. Can you drop one? That’s your space optimization.
Coach Notes

Common Mistakes

Review these before coding to avoid predictable interview regressions.

State misses one required dimension

Wrong move: An incomplete state merges distinct subproblems and caches incorrect answers.

Usually fails on: Correctness breaks on cases that differ only in hidden state.

Fix: Define state so each unique subproblem maps to one DP cell.