LeetCode #1948 — HARD

Delete Duplicate Folders in System

Break down a hard problem into reliable checkpoints, edge-case handling, and complexity trade-offs.

The Problem

Problem Statement

Due to a bug, there are many duplicate folders in a file system. You are given a 2D array paths, where paths[i] is an array representing an absolute path to the ith folder in the file system.

  • For example, ["one", "two", "three"] represents the path "/one/two/three".

Two folders (not necessarily on the same level) are identical if they contain the same non-empty set of identical subfolders and underlying subfolder structure. The folders do not need to be at the root level to be identical. If two or more folders are identical, then mark the folders as well as all their subfolders.

  • For example, folders "/a" and "/b" in the file structure below are identical. They (as well as their subfolders) should all be marked:
    • /a
    • /a/x
    • /a/x/y
    • /a/z
    • /b
    • /b/x
    • /b/x/y
    • /b/z
  • However, if the file structure also included the path "/b/w", then the folders "/a" and "/b" would not be identical. Note that "/a/x" and "/b/x" would still be considered identical even with the added folder.

Once all the identical folders and their subfolders have been marked, the file system will delete all of them. The file system only runs the deletion once, so any folders that become identical after the initial deletion are not deleted.

Return the 2D array ans containing the paths of the remaining folders after deleting all the marked folders. The paths may be returned in any order.

Example 1:

Input: paths = [["a"],["c"],["d"],["a","b"],["c","b"],["d","a"]]
Output: [["d"],["d","a"]]
Explanation: Folders "/a" and "/c" (and their subfolders) are marked for deletion because they both contain an empty folder named "b".

Example 2:

Input: paths = [["a"],["c"],["a","b"],["c","b"],["a","b","x"],["a","b","x","y"],["w"],["w","y"]]
Output: [["c"],["c","b"],["a"],["a","b"]]
Explanation: Folders "/a/b/x" and "/w" (and their subfolders) are marked for deletion because they both contain an empty folder named "y".
Note that folders "/a" and "/c" are identical after the deletion, but they are not deleted because they were not marked beforehand.

Example 3:

Input: paths = [["a","b"],["c","d"],["c"],["a"]]
Output: [["c"],["c","d"],["a"],["a","b"]]
Explanation: All folders are unique in the file system.
Note that the returned array can be in a different order as the order does not matter.

Constraints:

  • 1 <= paths.length <= 2 * 10^4
  • 1 <= paths[i].length <= 500
  • 1 <= paths[i][j].length <= 10
  • 1 <= sum(paths[i][j].length) <= 2 * 10^5
  • paths[i][j] consists of lowercase English letters.
  • No two paths lead to the same folder.
  • For any folder not at the root level, its parent folder will also be in the input.
Patterns Used

Array · Hash Map · Trie

Roadmap

  1. Brute Force Baseline
  2. Core Insight
  3. Algorithm Walkthrough
  4. Edge Cases
  5. Full Annotated Code
  6. Interactive Study Demo
  7. Complexity Analysis
Step 01

Brute Force Baseline

Problem summary: You are given the absolute path of every folder in a file system. Two folders are identical if they contain the same non-empty set of identically structured subfolders, regardless of the folders' own names or levels. Mark every folder that has an identical twin, together with all of its subfolders, delete all marked folders in a single pass, and return the paths that remain.

Baseline thinking

Start with the most direct exhaustive search: build the folder tree, then structurally compare every pair of non-empty folders and mark each matching pair along with its subfolders. That gives a correctness anchor before optimizing (a sketch follows).
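A hedged sketch of that baseline (FolderNode and sameStructure are our own names, not part of the accepted solution): it decides structural equality for two folders; the full brute force would collect every node and run this check over all O(N^2) pairs of non-empty folders.

// Brute-force baseline sketch: recursive structural comparison.
import java.util.*;

class FolderNode {
    Map<String, FolderNode> children = new HashMap<>();
}

public class BruteForceBaseline {
    // Two folders match if they have the same child names and each
    // matching pair of children is itself structurally equal.
    static boolean sameStructure(FolderNode p, FolderNode q) {
        if (!p.children.keySet().equals(q.children.keySet())) return false;
        for (String name : p.children.keySet()) {
            if (!sameStructure(p.children.get(name), q.children.get(name))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // "/a/x" and "/b/x" from the statement: same shape, so they match.
        FolderNode a = new FolderNode(); a.children.put("x", new FolderNode());
        FolderNode b = new FolderNode(); b.children.put("x", new FolderNode());
        System.out.println(sameStructure(a, b)); // true
    }
}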

Pattern signal: Array · Hash Map · Trie

Example 1

[["a"],["c"],["d"],["a","b"],["c","b"],["d","a"]]

Example 2

[["a"],["c"],["a","b"],["c","b"],["a","b","x"],["a","b","x","y"],["w"],["w","y"]]

Example 3

[["a","b"],["c","d"],["c"],["a"]]

Related Problems

  • Find Duplicate File in System (find-duplicate-file-in-system)
  • Find Duplicate Subtrees (find-duplicate-subtrees)
Step 02

Core Insight

What unlocks the optimal approach

  • Can we use a trie to build the folder structure? Yes: one node per folder, with children keyed by folder name.
  • Can we use hashing to compare the folder structures? Yes: serialize each subtree into a canonical string (sort the children first) and map signatures to nodes; a repeated signature exposes duplicate folders, as sketched below.
Interview move: turn each hint into an invariant you can check after every iteration/recursion step.
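A minimal sketch of that insight under our own names (Node, signature; the accepted solution's Trie class appears in Step 05): keeping children in a TreeMap yields them in sorted order, so the rendered string is canonical.

// Canonical-signature sketch: identical subtrees render identically.
import java.util.*;

class Node {
    Map<String, Node> children = new TreeMap<>(); // sorted by folder name
}

public class SignatureSketch {
    static String signature(Node node) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Node> e : node.children.entrySet()) {
            sb.append(e.getKey()).append('(')
              .append(signature(e.getValue())).append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "/a" and "/b" each holding one empty folder "x": same signature.
        Node a = new Node(); a.children.put("x", new Node());
        Node b = new Node(); b.children.put("x", new Node());
        System.out.println(signature(a));                      // x()
        System.out.println(signature(a).equals(signature(b))); // true
    }
}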
Step 03

Algorithm Walkthrough

Iteration Checklist

  1. Build the trie: insert every path; each folder becomes one node keyed by name under its parent.
  2. Post-order DFS: compute each node's signature by sorting and joining its children's name(signature) strings.
  3. Mark duplicates: keep a map from signature to the first node that produced it; on a repeat, mark both nodes.
  4. Pre-order DFS: walk the trie again, skip marked subtrees, and record the path to every surviving folder.
Use the first example testcase as your mental trace to verify each transition; the hand-computed signatures appear below.
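To make that trace concrete, here are Example 1's signatures computed by hand (SignatureTrace is our throwaway name):

// Example 1: hand-computed signatures verifying the checklist above.
public class SignatureTrace {
    public static void main(String[] args) {
        // Leaves serialize to "": b under /a, b under /c, a under /d.
        String sigA = "b" + "(" + "" + ")"; // subtree of /a -> "b()"
        String sigC = "b" + "(" + "" + ")"; // subtree of /c -> "b()"
        String sigD = "a" + "(" + "" + ")"; // subtree of /d -> "a()"
        System.out.println(sigA.equals(sigC)); // true  -> /a and /c marked
        System.out.println(sigA.equals(sigD)); // false -> /d survives
    }
}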
Step 04

Edge Cases

Minimum Input
A single path such as [["a"]]
One empty folder has no non-empty subfolder set, so nothing is marked and the input comes back unchanged.
Duplicates & Repeats
Multiple empty folders of the same shape
Leaves serialize to the empty string; identity requires a non-empty subfolder set, so leaves must never enter the signature map (see the check below).
Extreme Constraints
Up to 2 * 10^4 paths with depth up to 500
Pairwise comparison is quadratic and will time out; bottom-up signatures keep the work proportional to the total serialized size.
Invalid / Corner Shape
Folders that become identical only after deletion
Deletion runs exactly once; duplicates created by the first pass are intentionally kept (Example 2 keeps /a and /c).
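A tiny check of that leaf rule (EmptyFolderEdge is our throwaway name): the empty signature must be skipped, never registered as a duplicate candidate.

// Two empty folders (/a and /b) both serialize to "". The algorithm must
// skip the empty signature, so neither folder is ever marked.
public class EmptyFolderEdge {
    public static void main(String[] args) {
        String sigA = ""; // signature of empty folder /a
        String sigB = ""; // signature of empty folder /b
        boolean marked = !sigA.isEmpty() && sigA.equals(sigB);
        System.out.println(marked); // false -> both folders survive
    }
}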
Step 05

Full Annotated Code

A source-backed, annotated implementation is provided below for direct study and interview prep.

// Accepted solution for LeetCode #1948: Delete Duplicate Folders in System
import java.util.*;
import java.util.function.Function;

// Trie node: children keyed by folder name, plus a deletion mark set
// during the duplicate-detection pass.
class Trie {
    Map<String, Trie> children;
    boolean deleted;

    public Trie() {
        children = new HashMap<>();
        deleted = false;
    }
}

class Solution {
    public List<List<String>> deleteDuplicateFolder(List<List<String>> paths) {
        // Pass 0: build the trie, one node per folder.
        Trie root = new Trie();
        for (List<String> path : paths) {
            Trie cur = root;
            for (String name : path) {
                if (!cur.children.containsKey(name)) {
                    cur.children.put(name, new Trie());
                }
                cur = cur.children.get(name);
            }
        }

        // signature -> first node that produced that signature
        Map<String, Trie> g = new HashMap<>();

        // Pass 1 (post-order DFS): serialize every subtree into a canonical
        // string; a repeated signature marks both nodes as duplicates.
        var dfs = new Function<Trie, String>() {
            @Override
            public String apply(Trie node) {
                if (node.children.isEmpty()) {
                    // Leaf (empty folder): return "" and never register it,
                    // since only non-empty subfolder sets are compared.
                    return "";
                }
                List<String> subs = new ArrayList<>();
                for (var entry : node.children.entrySet()) {
                    subs.add(entry.getKey() + "(" + apply(entry.getValue()) + ")");
                }
                // Sort so the signature is independent of child order.
                Collections.sort(subs);
                String s = String.join("", subs);
                if (g.containsKey(s)) {
                    // Signature seen before: mark both this node and the
                    // first node that produced it.
                    node.deleted = true;
                    g.get(s).deleted = true;
                } else {
                    g.put(s, node);
                }
                return s;
            }
        };

        dfs.apply(root);

        List<List<String>> ans = new ArrayList<>();
        List<String> path = new ArrayList<>();

        // Pass 2 (pre-order DFS): re-walk the trie, skipping marked
        // subtrees, and record the path to every surviving folder.
        var dfs2 = new Function<Trie, Void>() {
            @Override
            public Void apply(Trie node) {
                if (node.deleted) {
                    // A marked folder takes its entire subtree with it.
                    return null;
                }
                if (!path.isEmpty()) { // skip the synthetic root node
                    ans.add(new ArrayList<>(path));
                }
                for (Map.Entry<String, Trie> entry : node.children.entrySet()) {
                    path.add(entry.getKey());
                    apply(entry.getValue());
                    path.remove(path.size() - 1);
                }
                return null;
            }
        };

        dfs2.apply(root);

        return ans;
    }
}
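A quick driver for local study (Demo is our own scaffolding and assumes the Solution class above is compiled alongside it): it runs Example 1 and prints the surviving paths.

// Hypothetical local driver, not part of the LeetCode submission.
import java.util.*;

public class Demo {
    public static void main(String[] args) {
        Solution s = new Solution();
        List<List<String>> ex1 = List.of(
                List.of("a"), List.of("c"), List.of("d"),
                List.of("a", "b"), List.of("c", "b"), List.of("d", "a"));
        // Expected output (any order): [[d], [d, a]]
        System.out.println(s.deleteDuplicateFolder(ex1));
    }
}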
Step 06

Interactive Study Demo

Use this to step through a reusable interview workflow for this problem. As a static trace of Example 1: insert the six paths into the trie; the post-order pass gives /a the signature "b()" (its only child is the empty folder b), then /c produces the same "b()", so both are marked; /d gets the unique signature "a()". The pre-order pass skips the marked subtrees and emits ["d"] and ["d","a"].
Step 07

Complexity Analysis

Time
O(D × S log N) worst case, where S is the total number of input characters (at most 2 * 10^5), D is the maximum folder depth (at most 500), and N is the number of folders
Space
O(D × S)

Building the trie costs O(S). In the signature pass, each input character appears in the signature of each of its at most D ancestors, so the total signature length is bounded by D × S; sorting child signatures adds at most a logarithmic factor. The signature map dominates space, holding up to one signature per folder.

Approach Breakdown

PAIRWISE COMPARISON (baseline)
O(N^2 × S) time
O(S) space

Compare every pair of non-empty folders by recursive structural equality. Correct, but with up to 2 * 10^4 folders the number of pairs alone makes this infeasible at the given constraints.

TRIE + SIGNATURE MAP (accepted)
O(D × S log N) time
O(D × S) space

Build the trie once, serialize each subtree bottom-up into a canonical string, and store each signature in a hash map. Duplicate detection becomes one map lookup per folder, and a final traversal collects the surviving paths.

Shortcut: canonical serialization turns "are these subtrees identical?" into string equality, which the hash map answers in one amortized lookup per node.
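For reference, a back-of-envelope version of that time bound in LaTeX, using our notation (S = total input characters, D = maximum depth, N = number of folders):

% Each input character contributes to the signature of each of its
% at most D ancestors, bounding total signature length by D * S.
\[
\sum_{v \in \text{trie}} \lvert \mathrm{sig}(v) \rvert \;\le\; D \cdot S,
\qquad
T \;=\; \underbrace{O(S)}_{\text{build trie}}
\;+\; \underbrace{O(D \, S \log N)}_{\text{sort and join signatures}}
\;+\; \underbrace{O(S)}_{\text{collect paths}}.
\]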
Coach Notes

Common Mistakes

Review these before coding to avoid predictable interview regressions.

Unsorted child signatures

Wrong move: Joining children in hash-map iteration order, so identical folders can produce different signatures.

Usually fails on: Duplicate subtrees whose siblings were inserted in different orders.

Fix: Sort the child-signature list (or keep children in a sorted map) before joining.

Ambiguous serialization

Wrong move: Concatenating names and child signatures with no delimiters, so different structures collide (see the sketch below).

Usually fails on: Names that concatenate to the same string, e.g. "ab" + "c" vs. "a" + "bc".

Fix: Wrap each child as name(signature) so the string parses unambiguously.

Marking empty folders

Wrong move: Registering the leaf signature "" in the map, which deletes every empty folder.

Usually fails on: Inputs with two or more leaf folders (Example 3 keeps all of them).

Fix: Return early for leaves; only non-empty subfolder sets participate in duplicate detection.
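A minimal illustration of the delimiter pitfall (DelimiterPitfall is our throwaway name; the folder names are invented to force the collision):

// Without delimiters, two different structures serialize identically.
public class DelimiterPitfall {
    public static void main(String[] args) {
        // Folder "ab" containing empty folder "c",
        // versus folder "a" containing empty folder "bc".
        String bad1 = "ab" + "c";  // naive join -> "abc"
        String bad2 = "a" + "bc";  // naive join -> "abc"
        System.out.println(bad1.equals(bad2));   // true: a false duplicate!

        // The name(childSignature) format keeps them apart.
        String good1 = "ab" + "(" + "c" + "()" + ")"; // "ab(c())"
        String good2 = "a" + "(" + "bc" + "()" + ")"; // "a(bc())"
        System.out.println(good1.equals(good2)); // false
    }
}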