LeetCode #8 — Medium

String to Integer
(atoi)

A complete visual walkthrough of the one-pass character scanning approach — from messy edge cases to a clean state machine.

Solve on LeetCode
The Problem

Problem Statement

Implement the myAtoi(string s) function, which converts a string to a 32-bit signed integer.

The algorithm for myAtoi(string s) is as follows:

  1. Whitespace: Ignore any leading whitespace (" ").
  2. Signedness: Determine the sign by checking if the next character is '-' or '+', assuming positivity if neither present.
  3. Conversion: Read the integer by skipping leading zeros until a non-digit character is encountered or the end of the string is reached. If no digits were read, then the result is 0.
  4. Rounding: If the integer is out of the 32-bit signed integer range [-231, 231 - 1], then round the integer to remain in the range. Specifically, integers less than -231 should be rounded to -231, and integers greater than 231 - 1 should be rounded to 231 - 1.

Return the integer as the final result.

Example 1:

Input: s = "42"

Output: 42

Explanation:

The underlined characters are what is read in and the caret is the current reader position.
Step 1: "42" (no characters read because there is no leading whitespace)
         ^
Step 2: "42" (no characters read because there is neither a '-' nor '+')
         ^
Step 3: "42" ("42" is read in)
           ^

Example 2:

Input: s = " -042"

Output: -42

Explanation:

Step 1: "   -042" (leading whitespace is read and ignored)
            ^
Step 2: "   -042" ('-' is read, so the result should be negative)
             ^
Step 3: "   -042" ("042" is read in, leading zeros ignored in the result)
               ^

Example 3:

Input: s = "1337c0d3"

Output: 1337

Explanation:

Step 1: "1337c0d3" (no characters read because there is no leading whitespace)
         ^
Step 2: "1337c0d3" (no characters read because there is neither a '-' nor '+')
         ^
Step 3: "1337c0d3" ("1337" is read in; reading stops because the next character is a non-digit)
             ^

Example 4:

Input: s = "0-1"

Output: 0

Explanation:

Step 1: "0-1" (no characters read because there is no leading whitespace)
         ^
Step 2: "0-1" (no characters read because there is neither a '-' nor '+')
         ^
Step 3: "0-1" ("0" is read in; reading stops because the next character is a non-digit)
          ^

Example 5:

Input: s = "words and 987"

Output: 0

Explanation:

Reading stops at the first non-digit character 'w'.

Constraints:

  • 0 <= s.length <= 200
  • s consists of English letters (lower-case and upper-case), digits (0-9), ' ', '+', '-', and '.'.

Roadmap

  1. The Brute Force Temptation
  2. The Core Insight — A State Machine
  3. Algorithm Walkthrough
  4. Edge Cases
  5. Full Annotated Code
  6. Interactive Demo
  7. Complexity Analysis
Step 01

The Brute Force Temptation

Many developers reach for regex or try to handle every edge case with nested if-else blocks. Let's see why that gets ugly fast:

The "Kitchen Sink" Approach

// Attempt 1: regex (seems clean... until it isn't)
Pattern p = Pattern.compile("^\\s*([+-]?\\d+)");
// But wait — how do you handle overflow?
// Long.parseLong("99999999999999999999") throws!

// Attempt 2: strip + parse (multiple passes)
s = s.trim();                   // pass 1: strip whitespace
// check sign...                 // pass 2: inspect sign
// extract digits...              // pass 3: find digit span
// handle overflow somehow...     // ??? easy to get wrong

The problem is not parsing the happy path. It's handling all the edge cases simultaneously: leading whitespace, optional sign, leading zeros, overflow mid-parse, and non-digit characters appearing anywhere.

💡

The real challenge: You must detect overflow before it happens, during digit accumulation. You can't just parse and then clamp afterward because the intermediate value might already overflow your integer type.

Step 02

The Core Insight — A State Machine

Instead of fighting edge cases, think of the problem as a simple state machine with four states. Process the string character by character in a single left-to-right pass:

The State Machine

SKIP
WHITESPACE
READ
SIGN
READ
DIGITS
DONE

Each state either consumes characters and advances to the next, or transitions directly to DONE. There is no backtracking. The pointer moves forward exactly once per character.

Transition Rules

SKIP WHITESPACE
While s[i] == ' ', advance i++. Any other character transitions to the next state.
READ SIGN
If s[i] is + or -, record the sign and advance. This state consumes at most one character.
READ DIGITS
While s[i] is a digit, accumulate: result = result * 10 + digit. Check for overflow before each multiplication.
DONE
Multiply result by sign and return. Any non-digit character or end of string triggers this.

The overflow trick: Before computing result * 10 + digit, check if result > (MAX_VALUE - digit) / 10. If true, the next multiplication would overflow, so clamp immediately. This avoids ever storing a value that exceeds 32-bit bounds.

Step 03

Algorithm Walkthrough

Let's trace through the input " -42abc" character by character. Watch the pointer advance through each phase.

Input: " -42abc"

Whitespace
Sign
Digits
Stop
-
4
2
a
b
c
0
1
2
3
4
5
6
7
8

Step-by-Step Trace

i=0,1,2 : SKIP WHITESPACE
Characters at indices 0, 1, 2 are all ' '. Skip them. Pointer lands on i = 3.
i=3 : READ SIGN
Character is '-'. Set sign = -1. Advance to i = 4.
i=4 : READ DIGIT '4'
Overflow check: 0 > (2147483647 - 4) / 10? No. result = 0 * 10 + 4 = 4. Advance to i = 5.
i=5 : READ DIGIT '2'
Overflow check: 4 > (2147483647 - 2) / 10? No. result = 4 * 10 + 2 = 42. Advance to i = 6.
i=6 : STOP (non-digit 'a')
Character 'a' is not a digit. Exit the loop.
RETURN
result * sign = 42 * (-1) = -42
Step 04

Edge Cases

This problem is essentially an edge-case gauntlet. Here are all the tricky inputs and what the algorithm returns for each:

"42"
→ 42
Basic case, no tricks
" -42"
→ -42
Leading whitespace + negative sign
"+1"
→ 1
Explicit positive sign
"+-12"
→ 0
Second sign is not a digit, stops
"4193 with words"
→ 4193
Stops at first non-digit
"words and 987"
→ 0
First non-space is not digit/sign
"91283472332"
→ 2147483647
Overflow: clamp to INT_MAX
"-91283472332"
→ -2147483648
Underflow: clamp to INT_MIN
"0032"
→ 32
Leading zeros are just digits
""
→ 0
Empty string, nothing to read
" "
→ 0
Only whitespace, no digits found
"-"
→ 0
Sign only, no digits follow
🎯

No special cases needed. The beauty of the one-pass approach is that every edge case above is handled naturally by the same three-phase loop. There are no separate if branches for empty strings, sign-only inputs, or leading zeros.

Step 05

Full Annotated Code

class Solution {
    public int myAtoi(String s) {
        int i = 0, n = s.length();

        // -------- Phase 1: Skip leading whitespace --------
        while (i < n && s.charAt(i) == ' ') i++;
        if (i == n) return 0;          // entire string was spaces

        // -------- Phase 2: Read optional sign --------
        int sign = 1;
        if (s.charAt(i) == '-' || s.charAt(i) == '+') {
            sign = s.charAt(i) == '-' ? -1 : 1;
            i++;                          // consume the sign character
        }

        // -------- Phase 3: Read digits with overflow guard --------
        int result = 0;
        while (i < n && Character.isDigit(s.charAt(i))) {
            int digit = s.charAt(i) - '0';

            // Will result * 10 + digit overflow int?
            // Rearranged: result > (MAX_VALUE - digit) / 10
            if (result > (Integer.MAX_VALUE - digit) / 10) {
                return sign == 1
                    ? Integer.MAX_VALUE   // 2147483647
                    : Integer.MIN_VALUE;  // -2147483648
            }

            result = result * 10 + digit; // safe to accumulate
            i++;
        }

        return result * sign;
    }
}
💡

Why the overflow check works: We want to know if result * 10 + digit > 2147483647. Rearranging gives result > (2147483647 - digit) / 10. Since all values are positive integers and we're using integer division, this comparison is safe and never overflows itself.

Step 06

Interactive Demo

Enter any string and step through the atoi algorithm character by character. Watch the state machine in action.

⚙ atoi Step-by-Step Visualizer


Try:
Step 07

Complexity Analysis

Time
O(n)
Space
O(1)

We scan each character at most once in a single left-to-right pass. The pointer i never moves backward, and the overflow check (result > (MAX_VALUE - digit) / 10) is O(1) per digit. We use only a fixed number of variables (i, sign, result, digit) regardless of input size.

Approach Breakdown

REGEX / MULTI-PASS
O(n) time
O(n) space

A regex engine compiles the pattern and scans the string in O(n), but internally creates match objects and substring copies that allocate O(n) memory. Multi-pass approaches (trim, then find sign, then extract digits) also traverse the string multiple times, though still linear overall.

SINGLE PASS
O(n) time
O(1) space

A single left-to-right scan with a three-phase state machine (skip whitespace, read sign, accumulate digits) touches each character exactly once for O(n) time. Only a fixed number of integer variables (i, sign, result, digit) are used regardless of input length, giving O(1) space.

🎯

The complexity isn't the hard part here. Both approaches are O(n) in time. The real challenge is detecting overflow before it happens, using the rearranged check result > (MAX_VALUE - digit) / 10. That single line is the difference between a correct solution and a subtle bug.

Coach Notes

Common Mistakes

Review these before coding to avoid predictable interview regressions.

Overflow in intermediate arithmetic

Wrong move: Temporary multiplications exceed integer bounds.

Usually fails on: Large inputs wrap around unexpectedly.

Fix: Use wider types, modular arithmetic, or rearranged operations.