LeetCode #839 — HARD

Similar String Groups

Break down a hard problem into reliable checkpoints, edge-case handling, and complexity trade-offs.

The Problem

Problem Statement

Two strings, X and Y, are considered similar if either they are identical or we can make them equivalent by swapping at most two letters (in distinct positions) within the string X.

For example, "tars" and "rats" are similar (swapping at positions 0 and 2), and "rats" and "arts" are similar, but "star" is not similar to "tars", "rats", or "arts".

Together, these form two connected groups by similarity: {"tars", "rats", "arts"} and {"star"}. Notice that "tars" and "arts" are in the same group even though they are not similar. Formally, each group is such that a word is in the group if and only if it is similar to at least one other word in the group.

We are given a list strs of strings where every string in strs is an anagram of every other string in strs. How many groups are there?

Example 1:

Input: strs = ["tars","rats","arts","star"]
Output: 2

Example 2:

Input: strs = ["omv","ovm"]
Output: 1

Constraints:

1 <= strs.length <= 300
1 <= strs[i].length <= 300
strs[i] consists of lowercase letters only.
All words in strs have the same length and are anagrams of each other.

Step 01

Brute Force Baseline

Problem summary: Given a list strs of equal-length strings that are all anagrams of each other, count the connected groups, where two strings are directly linked if they are identical or one can be turned into the other by swapping two letters. Groups are transitive: "tars" and "arts" share a group through "rats" even though they are not directly similar.

Baseline thinking

Start with the most direct exhaustive search. That gives a correctness anchor before optimizing.
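
As a correctness anchor, the exhaustive search can be sketched as a depth-first traversal over the implicit similarity graph. This is a minimal sketch, not the accepted solution; the names `similar` and `count_groups_dfs` are illustrative and do not come from the original page.

```python
from typing import List


def similar(a: str, b: str) -> bool:
    """Equal-length anagrams are similar when at most two positions differ."""
    return sum(x != y for x, y in zip(a, b)) <= 2


def count_groups_dfs(strs: List[str]) -> int:
    """Count connected components by exploring the similarity graph."""
    n = len(strs)
    seen = [False] * n
    groups = 0
    for start in range(n):
        if seen[start]:
            continue
        groups += 1  # start belongs to a component not visited yet
        seen[start] = True
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if not seen[j] and similar(strs[i], strs[j]):
                    seen[j] = True
                    stack.append(j)
    return groups
```

Each string is popped from the stack once and compared against all others, so the sketch does the same O(n² × m) comparison work as the union-find versions and is useful mainly for cross-checking their output.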

Pattern signal: Array · Hash Map · Union-Find

Example 1

["tars","rats","arts","star"]

Example 2

["omv","ovm"]

Step 02

Core Insight

What unlocks the optimal approach

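Because every string is an anagram of every other, two strings are similar exactly when they differ in zero or two positions; a single mismatch is impossible. Treat each string as a node, connect every similar pair, and count the connected components.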

Interview move: turn each hint into an invariant you can check after every iteration/recursion step.
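
The similarity test itself reduces to counting mismatched positions. A hedged sketch with an early exit follows; `mismatch_count` and `is_similar` are hypothetical helper names, and the code assumes equal-length anagram inputs, as the constraints guarantee.

```python
def mismatch_count(a: str, b: str, limit: int = 2) -> int:
    """Count differing positions, stopping as soon as the count exceeds limit."""
    diff = 0
    for x, y in zip(a, b):
        if x != y:
            diff += 1
            if diff > limit:
                break  # no need to scan the rest of the strings
    return diff


def is_similar(a: str, b: str) -> bool:
    # Assumption: a and b are anagrams of equal length, so a count of
    # exactly 1 cannot occur and "<= 2" means "0 or 2 mismatches".
    return mismatch_count(a, b) <= 2
```

The early exit does not change the worst-case bound, but it avoids scanning the full 300 characters for pairs that are clearly dissimilar.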

Step 03

Algorithm Walkthrough

Iteration Checklist

Define state (indices, window, stack, map, DP cell, or recursion frame).
Apply one transition step and update the invariant.
Record answer candidate when condition is met.
Continue until all input is consumed.

Use the first example testcase as your mental trace to verify each transition.
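
To make that mental trace concrete, the sketch below records every merge that actually changes the group count. It is an illustration only: `trace_merges` is a hypothetical name, and union by size is omitted to keep the trace short.

```python
from typing import List, Tuple


def trace_merges(strs: List[str]) -> Tuple[int, List[Tuple[str, str]]]:
    """Return the final group count and the list of successful merges."""
    parent = list(range(len(strs)))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    groups = len(strs)  # invariant: groups == number of distinct roots
    merges: List[Tuple[str, str]] = []
    for i in range(len(strs)):
        for j in range(i):
            if sum(a != b for a, b in zip(strs[i], strs[j])) <= 2:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj
                    groups -= 1
                    merges.append((strs[j], strs[i]))
    return groups, merges
```

On Example 1 the trace contains two merges, ("tars", "rats") and then ("rats", "arts"), which takes the count from 4 down to 2; "star" never merges with anything.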

Step 04

Edge Cases

Minimum Input

Single element / shortest valid input

Validate boundary behavior before entering the main loop or recursion.

Duplicates & Repeats

Repeated values / repeated states

Decide whether duplicates should be merged, skipped, or counted explicitly.

Extreme Constraints

Largest constraint values

Re-check complexity target against constraints to avoid time-limit issues.

Invalid / Corner Shape

Empty collections, zeros, or disconnected structures

Handle special-case structure before the core algorithm path.
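
The edge cases above can be turned into a small self-check. The inputs and the helper name `num_similar_groups` below are assumptions chosen for illustration; they stay within the stated constraints (at least one string, all strings anagrams of each other).

```python
from typing import List


def num_similar_groups(strs: List[str]) -> int:
    """Pairwise comparison plus a small union-find (path halving only)."""
    parent = list(range(len(strs)))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    groups = len(strs)
    for i in range(len(strs)):
        for j in range(i):
            if sum(a != b for a, b in zip(strs[i], strs[j])) <= 2:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj
                    groups -= 1
    return groups


# Hypothetical edge-case inputs paired with the expected group count.
EDGE_CASES = [
    (["a"], 1),            # minimum input: one string of length 1
    (["abc", "abc"], 1),   # duplicates are identical, hence similar
    (["ab", "ba"], 1),     # exactly one swap apart
    (["abc", "bca"], 2),   # three mismatches: disconnected
]

for strs, expected in EDGE_CASES:
    assert num_similar_groups(strs) == expected, (strs, expected)
```

Duplicates are worth checking explicitly: they count as separate list entries but always land in the same group, so they never increase the answer.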

Step 05

Full Annotated Code

Reference implementations in Java, Go, Python, Rust, and TypeScript follow. Each compares every pair of strings and merges similar pairs with union-find.

// Java: accepted solution for LeetCode #839: Similar String Groups
class UnionFind {
    private final int[] p;
    private final int[] size;

    public UnionFind(int n) {
        p = new int[n];
        size = new int[n];
        for (int i = 0; i < n; ++i) {
            p[i] = i;
            size[i] = 1;
        }
    }

    public int find(int x) {
        if (p[x] != x) {
            p[x] = find(p[x]);
        }
        return p[x];
    }

    public boolean union(int a, int b) {
        int pa = find(a), pb = find(b);
        if (pa == pb) {
            return false;
        }
        if (size[pa] > size[pb]) {
            p[pb] = pa;
            size[pa] += size[pb];
        } else {
            p[pa] = pb;
            size[pb] += size[pa];
        }
        return true;
    }
}

class Solution {
    public int numSimilarGroups(String[] strs) {
        int n = strs.length, m = strs[0].length();
        UnionFind uf = new UnionFind(n);
        int cnt = n;
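        for (int i = 0; i < n; ++i) {
            for (int j = 0; j < i; ++j) {
                // Count the positions where the two strings differ.
                int diff = 0;
                for (int k = 0; k < m; ++k) {
                    if (strs[i].charAt(k) != strs[j].charAt(k)) {
                        ++diff;
                    }
                }
                // Anagrams are similar when they differ in 0 or 2 positions;
                // each successful union merges two groups into one.
                if (diff <= 2 && uf.union(i, j)) {
                    --cnt;
                }
            }
        }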
        return cnt;
    }
}

// Go: accepted solution for LeetCode #839: Similar String Groups
type unionFind struct {
	p, size []int
}

func newUnionFind(n int) *unionFind {
	p := make([]int, n)
	size := make([]int, n)
	for i := range p {
		p[i] = i
		size[i] = 1
	}
	return &unionFind{p, size}
}

func (uf *unionFind) find(x int) int {
	if uf.p[x] != x {
		uf.p[x] = uf.find(uf.p[x])
	}
	return uf.p[x]
}

func (uf *unionFind) union(a, b int) bool {
	pa, pb := uf.find(a), uf.find(b)
	if pa == pb {
		return false
	}
	if uf.size[pa] > uf.size[pb] {
		uf.p[pb] = pa
		uf.size[pa] += uf.size[pb]
	} else {
		uf.p[pa] = pb
		uf.size[pb] += uf.size[pa]
	}
	return true
}

func numSimilarGroups(strs []string) int {
	n := len(strs)
	uf := newUnionFind(n)
	for i, s := range strs {
		for j, t := range strs[:i] {
			diff := 0
			for k := range s {
				if s[k] != t[k] {
					diff++
				}
			}
			if diff <= 2 && uf.union(i, j) {
				n--
			}
		}
	}
	return n
}

# Python: accepted solution for LeetCode #839: Similar String Groups
from typing import List

class UnionFind:
    def __init__(self, n):
        self.p = list(range(n))
        self.size = [1] * n

    def find(self, x):
        if self.p[x] != x:
            self.p[x] = self.find(self.p[x])
        return self.p[x]

    def union(self, a, b):
        pa, pb = self.find(a), self.find(b)
        if pa == pb:
            return False
        if self.size[pa] > self.size[pb]:
            self.p[pb] = pa
            self.size[pa] += self.size[pb]
        else:
            self.p[pa] = pb
            self.size[pb] += self.size[pa]
        return True


class Solution:
    def numSimilarGroups(self, strs: List[str]) -> int:
        n, m = len(strs), len(strs[0])
        uf = UnionFind(n)
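        for i, s in enumerate(strs):
            for j, t in enumerate(strs[:i]):
                # Anagrams are similar when at most two positions differ;
                # a successful union means two groups merged into one.
                if sum(s[k] != t[k] for k in range(m)) <= 2 and uf.union(i, j):
                    n -= 1
        return n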

// Rust: accepted solution for LeetCode #839: Similar String Groups
pub struct Solution {}

// problem: https://leetcode.com/problems/similar-string-groups/
// discuss: https://leetcode.com/problems/similar-string-groups/discuss/?currentPage=1&orderBy=most_votes&query=

// submission codes start here

struct UnionFind {
    count: usize,
    parent: std::cell::RefCell<Vec<usize>>,
    size: Vec<usize>,
}

impl UnionFind {
    pub fn new(size: usize) -> Self {
        Self {
            count: size,
            parent: std::cell::RefCell::new((0..size).collect()),
            size: vec![1; size],
        }
    }
    pub fn count(&self) -> usize {
        self.count
    }
    pub fn find(&self, p: usize) -> usize {
        let mut root = p;
        while root != self.parent.borrow()[root] {
            root = self.parent.borrow()[root];
        }
        let mut p = p;
        while p != root {
            let next = self.parent.borrow()[p];
            self.parent.borrow_mut()[p] = root;
            p = next;
        }
        root
    }

    pub fn is_connected(&self, p: usize, q: usize) -> bool {
        self.find(p) == self.find(q)
    }
    pub fn connect(&mut self, p: usize, q: usize) {
        let (p_root, q_root) = (self.find(p), self.find(q));
        if p_root == q_root {
            return;
        }
        if self.size[p_root] < self.size[q_root] {
            self.parent.borrow_mut()[p_root] = q_root;
            self.size[q_root] += self.size[p_root];
        } else {
            self.parent.borrow_mut()[q_root] = p_root;
            self.size[p_root] += self.size[q_root];
        }
        self.count -= 1;
    }
}

impl Solution {
    pub fn num_similar_groups(strs: Vec<String>) -> i32 {
        fn is_similar(s1: &str, s2: &str) -> bool {
            s1.chars().zip(s2.chars()).filter(|(a, b)| !a.eq(b)).count() <= 2
        }
        let mut uf = UnionFind::new(strs.len());

        for i in 0..strs.len() {
            let s1 = strs.get(i).unwrap();
            for j in i + 1..strs.len() {
                if uf.is_connected(i, j) {
                    continue;
                }

                let s2 = strs.get(j).unwrap();
                if is_similar(s1, s2) {
                    uf.connect(i, j);
                }
            }
        }

        uf.count() as i32
    }
}

// submission codes end

#[cfg(test)]
mod tests {
    use super::*;

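    #[test]
    fn test_0839_example_1() {
        // Build owned Strings with std only; no external macro is required.
        let strs: Vec<String> = vec!["tars", "rats", "arts", "star"]
            .into_iter()
            .map(String::from)
            .collect();
        let result = 2;

        assert_eq!(Solution::num_similar_groups(strs), result);
    }

    #[test]
    fn test_0839_example_2() {
        let strs: Vec<String> = vec!["omv", "ovm"]
            .into_iter()
            .map(String::from)
            .collect();
        let result = 1;

        assert_eq!(Solution::num_similar_groups(strs), result);
    }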
}

// TypeScript: accepted solution for LeetCode #839: Similar String Groups
class UnionFind {
    private p: number[];
    private size: number[];

    constructor(n: number) {
        this.p = Array.from({ length: n }, (_, i) => i);
        this.size = Array(n).fill(1);
    }

    union(a: number, b: number): boolean {
        const pa = this.find(a);
        const pb = this.find(b);
        if (pa === pb) {
            return false;
        }
        if (this.size[pa] > this.size[pb]) {
            this.p[pb] = pa;
            this.size[pa] += this.size[pb];
        } else {
            this.p[pa] = pb;
            this.size[pb] += this.size[pa];
        }
        return true;
    }

    find(x: number): number {
        if (this.p[x] !== x) {
            this.p[x] = this.find(this.p[x]);
        }
        return this.p[x];
    }
}

function numSimilarGroups(strs: string[]): number {
    const n = strs.length;
    const m = strs[0].length;
    const uf = new UnionFind(n);
    let cnt = n;
    for (let i = 0; i < n; ++i) {
        for (let j = 0; j < i; ++j) {
            let diff = 0;
            for (let k = 0; k < m; ++k) {
                if (strs[i][k] !== strs[j][k]) {
                    diff++;
                }
            }
            if (diff <= 2 && uf.union(i, j)) {
                cnt--;
            }
        }
    }
    return cnt;
}

Step 07

Complexity Analysis

Time

O(n² × (m + α(n))), where n is the number of strings and m is the length of each string

Space

O(n)
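
As a rough sanity check against the constraints, the worst-case number of character comparisons with n = 300 and m = 300 can be worked out directly. This is an upper-bound estimate that ignores the near-constant union-find overhead:

```latex
\[
\binom{n}{2} \cdot m
  = \frac{300 \cdot 299}{2} \cdot 300
  = 44\,850 \cdot 300
  = 13\,455\,000
\]
```

Roughly 13.5 million character comparisons is a small workload for typical time limits, which is why the pairwise approach is accepted here.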

Approach Breakdown

BRUTE FORCE

O(n²) time

O(n) space

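Track components with an array of component labels. Each union may need to relabel up to n elements, giving O(n) per union; with at most n − 1 successful unions, merging costs O(n²) in total. Find is O(1) by direct lookup, so union dominates. These figures cover only the component bookkeeping; the O(n² × m) pairwise string comparisons are needed in either approach.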
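
A minimal sketch of the label-array idea (often called quick-find) is shown below. The class name `QuickFind` and its fields are illustrative assumptions, not taken from the accepted solutions.

```python
class QuickFind:
    """Component labels in a flat list: O(1) find, O(n) union."""

    def __init__(self, n: int) -> None:
        self.label = list(range(n))  # label[i] is the component id of i
        self.count = n               # number of distinct components

    def find(self, x: int) -> int:
        return self.label[x]

    def union(self, a: int, b: int) -> bool:
        la, lb = self.label[a], self.label[b]
        if la == lb:
            return False
        # Relabel every member of a's component: this scan is the O(n) cost.
        for i, current in enumerate(self.label):
            if current == la:
                self.label[i] = lb
        self.count -= 1
        return True
```

For this problem's limits the relabeling scan is cheap, so this structure would also work; union-find mainly matters when the number of elements is much larger.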

UNION-FIND

O(α(n)) time per operation

O(n) space

With path compression and union by rank (or by size, as the implementations above use), each find/union operation takes O(α(n)) amortized time, where α is the inverse Ackermann function, which is effectively constant. Space is O(n) for the parent and rank (or size) arrays. For k operations on n elements: O(k × α(n)) total.

Shortcut: Union-Find with path compression + rank → O(α(n)) per operation ≈ O(1). Just say “nearly constant.”
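
The accepted solutions above merge by size; for comparison, a union-by-rank variant with iterative path compression might look like the sketch below. The class name `UnionFindByRank` is hypothetical.

```python
class UnionFindByRank:
    """Disjoint-set forest with path compression and union by rank."""

    def __init__(self, n: int) -> None:
        self.parent = list(range(n))
        self.rank = [0] * n  # upper bound on the height of each root's tree

    def find(self, x: int) -> int:
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass: point every node on the path directly at the root.
        while x != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a: int, b: int) -> bool:
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra  # make ra the root with the larger rank
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True
```

Rank and size give the same asymptotic guarantee; size has the side benefit of reporting how many strings each group contains.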

Coach Notes

Common Mistakes

Review these before coding to avoid predictable interview regressions.

Off-by-one on range boundaries

Wrong move: Loop endpoints miss first/last candidate.

Usually fails on: Minimal arrays and exact-boundary answers.

Fix: Re-derive loops from inclusive/exclusive ranges before coding.

Mutating counts without cleanup

Wrong move: Zero-count keys stay in map and break distinct/count constraints.

Usually fails on: Window/map size checks are consistently off by one.

Fix: Delete keys when count reaches zero.