LeetCode #49 — Medium

Group Anagrams

Group strings by their anagram equivalence class — using a sorted key and a hash map.

Solve on LeetCode
The Problem

Problem Statement

Given an array of strings strs, group the anagrams together. You can return the answer in any order.

Example 1:

Input: strs = ["eat","tea","tan","ate","nat","bat"]

Output: [["bat"],["nat","tan"],["ate","eat","tea"]]

Explanation:

  • There is no string in strs that can be rearranged to form "bat".
  • The strings "nat" and "tan" are anagrams as they can be rearranged to form each other.
  • The strings "ate", "eat", and "tea" are anagrams as they can be rearranged to form each other.

Example 2:

Input: strs = [""]

Output: [[""]]

Example 3:

Input: strs = ["a"]

Output: [["a"]]

Constraints:

  • 1 <= strs.length <= 104
  • 0 <= strs[i].length <= 100
  • strs[i] consists of lowercase English letters.

Roadmap

  1. Brute Force — Compare Every Pair
  2. The Core Insight — Sorted String as Key
  3. Walkthrough — Grouping Step by Step
  4. Edge Cases
  5. Full Annotated Code
  6. Interactive Demo
  7. Complexity Analysis
Step 01

Brute Force — Compare Every Pair

The naive approach: for each pair of strings, check if they're anagrams (same character counts). Group matching strings together.

Pairwise Comparison

For n strings, we'd need to compare every pair — that's O(n^2) comparisons, each taking O(k) time to check character counts.

"eat" vs "tea" → anagram
"eat" vs "tan" → not anagram
"eat" vs "ate" → anagram
"eat" vs "nat" → not anagram
"eat" vs "bat" → not anagram
... 10 more comparisons ...

Total: O(n^2 * k) time. We also need a union-find or visited set to form groups. This is slow and complicated.

💡

Better idea: Instead of comparing strings to each other, compute a canonical form for each string that's the same for all anagrams. Then just group by canonical form using a hash map — one pass!

Step 02

The Core Insight — Sorted String as Key

Two strings are anagrams if and only if they contain the same characters in the same frequencies. Sorting a string produces a canonical form that's identical for all anagrams.

The Canonical Form

"eat" → sort → "aet"
"tea" → sort → "aet"
"ate" → sort → "aet"
"tan" → sort → "ant"
"nat" → sort → "ant"
"bat" → sort → "abt"

Same sorted form = same anagram group. Use sorted string as the HashMap key.

Alternative Key: Character Count

Instead of sorting, count character frequencies and use the count array as the key. This avoids the O(k log k) sort cost.

// Example: "eat" → "1#0#0#0#1#...#1#0#..." (counts of a,b,c,...)
int[] count = new int[26];
for (char c : s.toCharArray()) count[c - 'a']++;
String key = Arrays.toString(count); // Use as HashMap key

This runs in O(k) per string instead of O(k log k), but the key string is longer (26 numbers). Both approaches work well in practice.

Hash maps turn grouping into a single pass. For each string: compute key, look up in map, append to group list. No pair comparisons needed. The map does the grouping automatically.

Step 03

Walkthrough — Grouping Step by Step

Let's process ["eat","tea","tan","ate","nat","bat"] one string at a time and watch the HashMap build up.

Processing Each String

PROCESS "eat"
"eat" sort "aet" new key, create bucket
"aet"
"eat"
PROCESS "tea"
"tea" "aet" key exists, append
"aet"
"eat" "tea"
PROCESS "tan"
"tan" "ant" new key
"aet"
"eat" "tea"
"ant"
"tan"
PROCESS "ate", "nat", "bat"
"ate" → "aet" (existing), "nat" → "ant" (existing), "bat" → "abt" (new key)

Final HashMap State

"aet"
"eat" "tea" "ate"
"ant"
"tan" "nat"
"abt"
"bat"

Return the values: [["eat","tea","ate"], ["tan","nat"], ["bat"]]

Step 04

Edge Cases

strs = [""]
Empty string: sorted form is "". Result: [["]]. Works naturally.
strs = ["a"]
Single character: sorted form is "a". Result: [["a"]].
strs = ["abc","def","ghi"]
No anagrams: each string gets its own bucket. Result: [["abc"],["def"],["ghi"]].
strs = ["",""]
Multiple empty strings: both map to key "". Result: [["",""]]. They are anagrams of each other.
Step 05

Full Annotated Code

class Solution {
    public List<List<String>> groupAnagrams(String[] strs) {

        // Key: sorted string, Value: list of original strings
        Map<String, List<String>> map = new HashMap<>();

        for (String s : strs) {
            // Compute canonical form by sorting characters
            char[] chars = s.toCharArray();
            Arrays.sort(chars);
            String key = new String(chars);

            // Add to the bucket for this key
            // computeIfAbsent creates the list if key is new
            map.computeIfAbsent(key, k -> new ArrayList<>()).add(s);
        }

        // Return all bucket values as a list of lists
        return new ArrayList<>(map.values());
    }
}
Step 06

Interactive Demo

Enter comma-separated strings and step through the grouping process. Watch the hash map build up entry by entry.

Anagram Grouper


Step 07

Complexity Analysis

Time
O(n × k log k)
Space
O(n × k)

We process n strings. For each string of length k, sorting takes O(k log k). HashMap operations are O(k) amortized (for hashing the key). Total: O(n × k log k). Space stores all strings in the map: O(n × k) for the keys and value lists.

Approach Breakdown

BRUTE FORCE
O(n2 × k) time
O(n × k) space

Compare every pair of n strings to check if they are anagrams. Each comparison builds two frequency arrays of size 26 and compares them in O(k) time. With n(n-1)/2 pairs, total work is O(n2 × k). A union-find or visited set is also needed to form groups.

OPTIMAL
O(n × k log k) time
O(n × k) space

One pass over n strings: sort each string's characters in O(k log k) to produce a canonical key, then insert into a HashMap in O(k) amortized time. The map stores all n strings and their keys, using O(n × k) space. Using a character-count key reduces time to O(n × k).

🎯

Sorting each string gives a canonical key -- all anagrams produce the same sorted string. Alternative: use a character count array as the key for O(n × k) time, avoiding the O(k log k) sort. In practice, the sort approach is simpler and fast enough for typical string lengths.

Coach Notes

Common Mistakes

Review these before coding to avoid predictable interview regressions.

Mutating counts without cleanup

Wrong move: Zero-count keys stay in map and break distinct/count constraints.

Usually fails on: Window/map size checks are consistently off by one.

Fix: Delete keys when count reaches zero.