LeetCode #30 — Hard

Substring with Concatenation
of All Words

A complete visual walkthrough of the sliding window approach with word-sized steps.

The Problem

Problem Statement

You are given a string s and an array of strings words. All the strings of words are of the same length.

A concatenated string is a string that exactly contains all the strings of any permutation of words concatenated.

  • For example, if words = ["ab","cd","ef"], then "abcdef", "abefcd", "cdabef", "cdefab", "efabcd", and "efcdab" are all concatenated strings. "acdbef" is not a concatenated string because it is not the concatenation of any permutation of words.

Return an array of the starting indices of all the concatenated substrings in s. You can return the answer in any order.

Example 1:

Input: s = "barfoothefoobarman", words = ["foo","bar"]

Output: [0,9]

Explanation:

The substring starting at 0 is "barfoo". It is the concatenation of ["bar","foo"] which is a permutation of words.
The substring starting at 9 is "foobar". It is the concatenation of ["foo","bar"] which is a permutation of words.

Example 2:

Input: s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"]

Output: []

Explanation:

There is no concatenated substring.

Example 3:

Input: s = "barfoofoobarthefoobarman", words = ["bar","foo","the"]

Output: [6,9,12]

Explanation:

The substring starting at 6 is "foobarthe". It is the concatenation of ["foo","bar","the"].
The substring starting at 9 is "barthefoo". It is the concatenation of ["bar","the","foo"].
The substring starting at 12 is "thefoobar". It is the concatenation of ["the","foo","bar"].

Constraints:

  • 1 <= s.length <= 10^4
  • 1 <= words.length <= 5000
  • 1 <= words[i].length <= 30
  • s and words[i] consist of lowercase English letters.
Roadmap

  1. Brute Force — Check Every Position
  2. The Core Insight — Sliding Window with Word Steps
  3. Walkthrough: "barfoothefoobarman"
  4. Edge Cases
  5. Full Annotated Code
  6. Interactive Demo
  7. Complexity Analysis
Step 01

Brute Force — Check Every Position

The naive approach: for each starting index in s, extract a substring of length wordCount * wordLen, split it into word-sized chunks, and check if those chunks form a permutation of the target words.

Example Setup

s = "barfoothefoobarman"
words = ["foo", "bar"]
wordLen = 3, wordCount = 2, totalLen = 6

We need to find starting indices where a 6-character substring is a concatenation of "foo" and "bar" (in any order).

Index:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
Char:   b  a  r  f  o  o  t  h  e  f  o  o  b  a  r  m  a  n

Matches at indices [0, 9]

The brute force checks each of the ~n positions, splits into ~wordCount chunks, and compares each chunk. This gives O(n × wordCount × wordLen), which is very slow for large inputs.
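The brute force described above can be sketched as follows (a minimal sketch; the class and method names `BruteForce.findSubstringBrute` are illustrative, not part of the final solution):

```java
import java.util.*;

class BruteForce {
    public static List<Integer> findSubstringBrute(String s, String[] words) {
        List<Integer> result = new ArrayList<>();
        int wordLen = words[0].length();
        int totalLen = wordLen * words.length;

        // Target frequency map of the words array
        Map<String, Integer> target = new HashMap<>();
        for (String w : words) target.merge(w, 1, Integer::sum);

        // Check every starting index independently: slice the candidate
        // substring into word-sized chunks and compare frequency maps
        for (int i = 0; i + totalLen <= s.length(); i++) {
            Map<String, Integer> seen = new HashMap<>();
            for (int j = i; j < i + totalLen; j += wordLen) {
                seen.merge(s.substring(j, j + wordLen), 1, Integer::sum);
            }
            if (seen.equals(target)) result.add(i);
        }
        return result;
    }
}
```

Every position rebuilds its frequency map from scratch, which is exactly the redundancy the sliding window removes.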

💡

The key observation: All words have equal length. This means we can slide a window that moves in word-sized steps, reusing work from previous positions instead of rechecking everything.

Step 02

The Core Insight — Sliding Window with Word Steps

Since all words have the same length w, we can treat the string as a sequence of word-sized slots. We run w different passes (starting at offsets 0, 1, ..., w-1) and in each pass, slide a window by one word at a time.

The Algorithm

1. Build a target frequency map
Count how many times each word appears in the words array.
2. For each offset i = 0 to wordLen-1
Start a sliding window at position i. Extract words at positions i, i+w, i+2w, ...
3. Expand right, shrink left
Add each new word to a window map. If it exceeds the target count, shrink from the left. If an unknown word appears, reset the window entirely. When count == wordCount, record the left index.

Why multiple offsets? Words could start at any character position mod wordLen. For wordLen=3, the word boundaries starting at index 0 are different from those starting at index 1 or 2. We run a pass for each possible alignment to cover all cases.
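The alignment idea can be made concrete with a small helper that lists the word-sized chunks for a given offset (a hedged sketch; `OffsetDemo.chunksAt` is a hypothetical name used only for illustration):

```java
import java.util.*;

class OffsetDemo {
    // List the word-sized chunks of s for a given starting offset.
    // Each offset (0..wordLen-1) produces a different word-boundary alignment.
    static List<String> chunksAt(String s, int offset, int wordLen) {
        List<String> chunks = new ArrayList<>();
        for (int j = offset; j + wordLen <= s.length(); j += wordLen) {
            chunks.add(s.substring(j, j + wordLen));
        }
        return chunks;
    }
}
```

For s = "barfoothefoobarman" and wordLen = 3, offset 0 yields [bar, foo, the, foo, bar, man], offset 1 yields [arf, oot, hef, oob, arm], and offset 2 yields [rfo, oth, efo, oba, rma]. Only the offset-0 alignment can ever contain the target words here, but in general any of the k alignments might.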

Step 03

Walkthrough: "barfoothefoobarman"

Let's trace the algorithm with s = "barfoothefoobarman" and words = ["foo", "bar"]. With wordLen=3, we run 3 passes (offsets 0, 1, 2). The interesting one is offset 0.

Offset 0: positions 0, 3, 6, 9, 12, 15

1
j=0: word = "bar"
Target word! window = {bar:1}, count=1. Need 2 words total.
2
j=3: word = "foo"
Target word! window = {bar:1, foo:1}, count=2. count == wordCount! Record left=0
3
j=6: word = "the"
Not a target word! Reset window. window = {}, count=0, left=9.
4
j=9: word = "foo"
Target word! window = {foo:1}, count=1.
5
j=12: word = "bar"
Target word! window = {foo:1, bar:1}, count=2. count == wordCount! Record left=9
6
j=15: word = "man"
Not a target word! Reset. Done with this offset.
Result: [0, 9]
Step 04

Edge Cases

Tricky Scenarios

Duplicate words
words = ["foo","foo"]. The frequency map has {foo:2}. The window must accumulate foo twice before it counts as a match. If foo appears a third time, shrink from the left.
Excess word count
When a valid word exceeds its target count, we shrink the window from the left until that word's count drops back to the target. This is the classic "shrink" step of sliding window.
Invalid word breaks the chain
When a non-target word appears, we can't include it in any valid concatenation. Clear the entire window and jump past it.
String shorter than total word length
If s.length() < wordCount * wordLen, no valid substring can exist. Return an empty list.
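The last edge case can be handled with an explicit guard before any scanning (a sketch; the full solution's loop conditions already make short inputs a no-op, so this guard is an optional early exit):

```java
class LengthGuard {
    // A valid concatenation needs exactly wordCount * wordLen characters,
    // so anything shorter can be rejected before scanning.
    static boolean canContain(String s, String[] words) {
        return s.length() >= words.length * words[0].length();
    }
}
```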
Step 05

Full Annotated Code

class Solution {
    public List<Integer> findSubstring(String s, String[] words) {
        List<Integer> result = new ArrayList<>();
        int wordLen = words[0].length();
        int wordCount = words.length;
        int totalLen = wordLen * wordCount;

        // Build target frequency map
        Map<String, Integer> target = new HashMap<>();
        for (String w : words)
            target.merge(w, 1, Integer::sum);

        // Try each starting offset (0 to wordLen-1)
        for (int i = 0; i < wordLen; i++) {
            Map<String, Integer> window = new HashMap<>();
            int left = i, count = 0;

            // Slide through word-sized chunks
            for (int j = i; j + wordLen <= s.length(); j += wordLen) {
                String word = s.substring(j, j + wordLen);

                if (target.containsKey(word)) {
                    window.merge(word, 1, Integer::sum);
                    count++;

                    // Shrink if this word exceeds target count
                    while (window.get(word) > target.get(word)) {
                        String leftWord = s.substring(left, left + wordLen);
                        window.merge(leftWord, -1, Integer::sum);
                        count--;
                        left += wordLen;
                    }

                    // All words matched!
                    if (count == wordCount) result.add(left);
                } else {
                    // Unknown word — reset everything
                    window.clear();
                    count = 0;
                    left = j + wordLen;
                }
            }
        }
        return result;
    }
}
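A quick harness can exercise the algorithm against the three examples from the problem statement (the solution body is repeated inline as a static method so the sketch compiles on its own; `Harness` is an illustrative name):

```java
import java.util.*;

class Harness {
    static List<Integer> findSubstring(String s, String[] words) {
        List<Integer> result = new ArrayList<>();
        int wordLen = words[0].length(), wordCount = words.length;

        // Target frequency map
        Map<String, Integer> target = new HashMap<>();
        for (String w : words) target.merge(w, 1, Integer::sum);

        // One sliding-window pass per offset
        for (int i = 0; i < wordLen; i++) {
            Map<String, Integer> window = new HashMap<>();
            int left = i, count = 0;
            for (int j = i; j + wordLen <= s.length(); j += wordLen) {
                String word = s.substring(j, j + wordLen);
                if (target.containsKey(word)) {
                    window.merge(word, 1, Integer::sum);
                    count++;
                    // Shrink until this word's count is back within target
                    while (window.get(word) > target.get(word)) {
                        window.merge(s.substring(left, left + wordLen), -1, Integer::sum);
                        count--;
                        left += wordLen;
                    }
                    if (count == wordCount) result.add(left);
                } else {
                    // Unknown word: reset and jump past it
                    window.clear();
                    count = 0;
                    left = j + wordLen;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(findSubstring("barfoothefoobarman",
                new String[]{"foo", "bar"}));                       // [0, 9]
        System.out.println(findSubstring("wordgoodgoodgoodbestword",
                new String[]{"word", "good", "best", "word"}));     // []
        System.out.println(findSubstring("barfoofoobarthefoobarman",
                new String[]{"bar", "foo", "the"}));                // [6, 9, 12]
    }
}
```

Note that the indices come out grouped by offset pass, not globally sorted; the problem accepts any order.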
Step 06

Interactive Demo

Enter a string and words to find all concatenation substring indices.

Step 07

Complexity Analysis

Time
O(n × k)
Space
O(m)

Where n = string length, m = number of words, and k = word length. We run k offset passes. Within each pass, the sliding window scans about n/k word-sized chunks, and extracting and hashing each substring costs O(k), so each pass costs O(n) and all k passes together cost O(n × k). Space is O(m) entries for the two hash maps (target + window).

Approach Breakdown

BRUTE FORCE
O(n × m × k) time
O(m) space

For each of the n starting positions, extract m word-sized chunks (each costing O(k) to slice and hash), then compare their frequency map against the target. The nested structure (n positions, m words per position, k characters per hash) gives O(n × m × k). A hash map of up to m entries is rebuilt at every position.

OPTIMAL
O(n × k) time
O(m) space

We run k offset passes (0 to k-1). Within each pass the sliding window advances one word at a time across n/k positions, adding one word on the right and removing at most one on the left, which eliminates the factor of m per step. Each substring extraction costs O(k), giving O(n × k) total. The window hash map holds at most m entries.

🎯

The sliding window advances in word-sized steps of k characters, checking one chunk at a time. Instead of re-validating all m words at every starting position, the window adds one word on the right and removes at most one on the left, eliminating the factor of m per step.

Coach Notes

Common Mistakes

Review these before coding to avoid predictable interview regressions.

Shrinking the window only once

Wrong move: Using `if` instead of `while` leaves the window invalid for multiple iterations.

Usually fails on: Over-limit windows stay invalid and produce wrong lengths/counts.

Fix: Shrink in a `while` loop until the invariant is valid again.

Mutating counts without cleanup

Wrong move: Zero-count keys stay in the map and break checks that rely on the number of distinct keys.

Usually fails on: Window/map size checks are consistently off by one.

Fix: Delete keys when count reaches zero.
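For problems whose invariant compares map sizes, the decrement-and-delete pattern looks like this (a generic sketch; the annotated solution above sidesteps the issue by comparing counts rather than `window.size()`, so its `merge(word, -1, ...)` leaving zero-count keys is harmless there):

```java
import java.util.*;

class MapCleanup {
    // Decrement a word's count and drop the key at zero, so that
    // window.size() always equals the number of distinct active words.
    static void decrement(Map<String, Integer> window, String word) {
        int c = window.get(word) - 1;
        if (c == 0) window.remove(word);
        else window.put(word, c);
    }
}
```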