LeetCode #28 — Easy

Find the Index of the
First Occurrence

A visual walkthrough of substring search using a sliding window — from brute force to the KMP algorithm insight.

Solve on LeetCode
The Problem

Problem Statement

Given two strings needle and haystack, return the index of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.

Example 1:

Input: haystack = "sadbutsad", needle = "sad"
Output: 0
Explanation: "sad" occurs at index 0 and 6.
The first occurrence is at index 0, so we return 0.

Example 2:

Input: haystack = "leetcode", needle = "leeto"
Output: -1
Explanation: "leeto" did not occur in "leetcode", so we return -1.

Constraints:

  • 1 <= haystack.length, needle.length <= 104
  • haystack and needle consist of only lowercase English characters.
Patterns Used

Roadmap

  1. Brute Force — Check Every Position
  2. The Core Insight — Sliding Window
  3. Visual Walkthrough
  4. Edge Cases
  5. Full Annotated Code
  6. Interactive Demo
  7. Complexity Analysis
  8. Beyond Brute Force — KMP Preview
Step 01

Brute Force — Check Every Position

The straightforward approach: for each starting position i in the haystack, check if the substring starting at i matches the needle character by character.

Try Every Starting Position

haystack = "sadbutsad", needle = "sad"

0 1 2 3 4 5 6 7 8
haystack
s
a
d
b
u
t
s
a
d
needle
s
a
d
Match at i=0! Return 0

At each position, we compare up to n characters (the length of the needle). If any character mismatches, we move to the next starting position. This is essentially a sliding window of size n.

Step 02

The Core Insight — Sliding Window

We slide a window of size needle.length across the haystack. At each position, we compare the window contents with the needle. The key optimization: we only need to check positions 0 through m - n.

Why Stop at m - n?

If the haystack has m characters and the needle has n, any starting position beyond m - n would not leave enough room for the needle to fit. Checking further is pointless.

i = 0 to m - n
Valid starting positions. The window [i, i+n) fits within the haystack.
i > m - n
Not enough characters left. The needle cannot possibly fit here.
💡

Key idea: At each valid position i, extract the substring haystack[i..i+n] and compare it to the needle. If they match, return i immediately. If no position matches, return -1.

Step 03

Visual Walkthrough

Let's trace through haystack = "hello", needle = "ll":

i=0: Compare "he" with "ll"

0 1 2 3 4
haystack
h
e
l
l
o
needle
l
l

"he" != "ll" → no match. Slide window right.

i=1: Compare "el" with "ll"

0 1 2 3 4
haystack
h
e
l
l
o
needle
l
l

"el" != "ll" → no match. Slide window right.

i=2: Compare "ll" with "ll"

0 1 2 3 4
haystack
h
e
l
l
o
needle
l
l
"ll" == "ll" → Match! Return 2
Step 04

Edge Cases

Empty Needle

needle = "" → return 0

An empty string is found at every position. By convention, return 0. In Java, "hello".indexOf("") returns 0.

Needle Longer Than Haystack

haystack = "ab", needle = "abc"

The loop condition i <= m - n evaluates to i <= -1, so the loop never executes. Return -1.

Needle at the Very End

0 1 2 3 4
haystack
a
b
c
d
e
needle
d
e

needle = "de". Found at the last valid position i = 3. This is why we loop up to m - n inclusive.

Exact Match

a
b
c

haystack = "abc", needle = "abc". Only one valid position: i=0. Full match. Return 0.

No Match Found

haystack = "abcdef", needle = "xyz"

After checking all valid positions, no substring matches. Return -1.

Step 05

Full Annotated Code

class Solution {
    public int strStr(String haystack, String needle) {

        int m = haystack.length();
        int n = needle.length();

        // Try each valid starting position
        for (int i = 0; i <= m - n; i++) {

            // Extract substring of length n starting at i
            // and compare with needle
            if (haystack.substring(i, i + n).equals(needle))
                return i;  // first match found!
        }

        // No match found in any position
        return -1;
    }
}

Alternative without substring: You can avoid creating substring objects by comparing character-by-character. This is more memory-efficient and sometimes faster in practice:

class Solution {
    public int strStr(String haystack, String needle) {
        int m = haystack.length(), n = needle.length();

        for (int i = 0; i <= m - n; i++) {
            int j = 0;
            // Compare character by character
            while (j < n && haystack.charAt(i + j) == needle.charAt(j))
                j++;
            if (j == n) return i; // all n chars matched
        }

        return -1;
    }
}
Step 06

Interactive Demo

Enter a haystack and needle string, then step through the sliding window comparisons character by character.

⚙ Substring Search Visualizer



Step 07

Complexity Analysis

Time
O((n-m)
KMP optimal
Space
O(1)
failure table

The brute force checks up to m - n + 1 starting positions with up to n character comparisons each, giving O(m · n) worst-case time. KMP pre-computes a failure function for the needle in O(n), then scans the haystack in a single pass of O(m) -- the text pointer never backtracks. Total: O(m + n) time with O(n) space for the prefix table.

Approach Breakdown

BRUTE FORCE
O(m · n) time
O(1) space

Tries each of the m-n+1 starting positions and compares up to n characters at each one. On adversarial inputs like "aaa...aab" this hits the full O(m·n) worst case. Only index variables are needed -- no extra allocations.

RABIN-KARP
O(m + n) time
O(1) space

Computes a rolling hash for the needle in O(n), then slides the hash window across the haystack. Each window shift updates the hash in O(1) via arithmetic, giving O(m) for the scan. Only a few integer variables store the hash values.

KMP
O(m + n) time
O(n) space

Pre-computes a failure table for the needle in O(n), then scans the haystack in one pass of O(m) without ever backtracking the text pointer. The failure table requires an O(n) array storing the longest proper prefix-suffix for each position.

🎯

KMP's failure function avoids re-comparing characters -- the text pointer never backtracks. When a mismatch occurs after matching j characters, the failure table tells us the longest proper prefix of the needle that is also a suffix, letting us skip ahead instead of restarting from scratch. In practice the brute force O(m · n) solution is accepted in interviews, but KMP guarantees linear time even on adversarial inputs like "aaa...aab".

Step 08

Beyond Brute Force — KMP Preview

The Knuth-Morris-Pratt (KMP) algorithm achieves O(m + n) by never re-comparing characters we have already matched. It pre-computes a "failure function" (also called the prefix table) for the needle.

The Key Idea Behind KMP

Brute Force Problem
When a mismatch occurs at position j in the needle, brute force restarts from i+1 and j=0 — discarding everything we learned.
KMP Solution
Instead of resetting j to 0, KMP uses the prefix table to jump j to the longest proper prefix that is also a suffix. The haystack pointer i never moves backward.
🎯

When to use KMP: In interviews, KMP is rarely required for this problem. The brute force O(m*n) solution is almost always accepted. KMP becomes essential when you need guaranteed linear-time pattern matching in production systems (text editors, network packet inspection, DNA sequence matching).

Coach Notes

Common Mistakes

Review these before coding to avoid predictable interview regressions.

Moving both pointers on every comparison

Wrong move: Advancing both pointers shrinks the search space too aggressively and skips candidates.

Usually fails on: A valid pair can be skipped when only one side should move.

Fix: Move exactly one pointer per decision branch based on invariant.