Regex Lookahead and Lookbehind: Complete Examples and Patterns

Quick answer

💡A lookahead (?=...) asserts that the pattern inside matches at the current position without consuming characters. Use it to enforce conditions on a string without changing what the overall match captures. Negative lookahead (?!...) asserts the opposite: the position must not be followed by the pattern. Both forms are zero-width — they never advance the match cursor.

Error symptoms

  • Pattern matches a substring when you expected a full-string validation failure
  • Lookbehind throws 'Invalid regular expression' in Safari before 16.4 or in older Node.js
  • Password regex accepts strings missing a required character class
  • Pattern hangs or causes CPU spike on certain inputs (catastrophic backtracking)
  • Lookbehind pattern that works in Regex101 fails in Python with re.error: look-behind requires fixed-width pattern
  • Global regex with lookahead keeps advancing lastIndex and producing wrong results

Common causes

  • Missing ^ and $ anchors so the lookahead passes on a substring of an invalid string
  • Variable-width lookbehind used in Python re, which only supports fixed-width lookbehind
  • Multiple nested quantifiers inside a lookahead enabling catastrophic backtracking
  • Copying a PCRE lookahead or lookbehind pattern into an RE2-style engine such as Google RE2 or Go regexp, which supports neither
  • Lookahead written without an anchor so it matches the empty string at every position
  • Not using the global flag correctly with stateful RegExp.prototype.test() in JavaScript
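The last cause in the list is easy to reproduce. The following is a minimal sketch, assuming a hypothetical pattern that matches digits followed by px: with the g flag, test() stores lastIndex on the regex object, so repeated calls on the same input alternate between true and false.

```javascript
// Hypothetical pattern: digits that are followed by "px".
const stateful = /\d+(?=px)/g;

const first = stateful.test('12px');   // matches "12" and sets lastIndex to 2
const second = stateful.test('12px');  // resumes at index 2, finds nothing, resets lastIndex to 0
const third = stateful.test('12px');   // starts from index 0 again

// Without the g flag, test() always starts at index 0.
const stateless = /\d+(?=px)/;
const repeated = [stateless.test('12px'), stateless.test('12px')];

console.log(first, second, third); // true false true
console.log(repeated);             // [ true, true ]
```

For validation, dropping the g flag is usually the simplest fix; resetting lastIndex to 0 before each call also works.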

When it happens

  • Validating passwords with multiple character-class requirements
  • Matching filenames or extensions without including the suffix in the captured group
  • Porting patterns from Regex101 (PCRE) to JavaScript or Python
  • Running find-and-replace in VS Code, Sublime Text, or sed with lookahead conditions
  • Building server-side input validation that must match browser-side validation behavior

Positive lookahead (?=...) syntax and how it matches without consuming

A positive lookahead is written as (?=pattern) and asserts that the enclosed pattern matches at the current position in the string, without advancing the match cursor. The term zero-width refers to this non-consuming property: the regex engine checks whether the lookahead pattern could match starting from the current position, then immediately resets the cursor to where it was before the lookahead began. The text that the lookahead examined is not part of the overall match and will not be included in capture groups unless it is also matched by the rest of the expression.

The most common use of positive lookahead is enforcing constraints that cannot be expressed by pure sequential matching. Consider matching a word only when it is followed by a colon, but without including the colon in the match: \w+(?=:). The \w+ part consumes the word characters, and (?=:) asserts that the next character is a colon without consuming it. If you wrote \w+: instead, the colon would be part of the match, which may be undesirable when you want to extract only the word.
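As a small illustration of the difference, the sketch below runs both patterns against a made-up input line (the string and variable names are assumptions for the example only).

```javascript
// Made-up input for illustration.
const line = 'host: example.org';

// The lookahead checks for the colon but leaves it out of the match.
const wordOnly = line.match(/\w+(?=:)/);

// The consuming version includes the colon in the match.
const wordAndColon = line.match(/\w+:/);

console.log(wordOnly[0]);     // host
console.log(wordAndColon[0]); // host:
```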

Lookaheads can be chained to enforce multiple independent conditions simultaneously. The pattern (?=.*[A-Z])(?=.*[0-9]).{8,} enforces that the string contains at least one uppercase letter, at least one digit, and is at least eight characters long. Each lookahead runs independently from the same starting position, which is what makes them composable: each (?=.*[class]) scans the entire remaining string for one required character class without affecting the others.

The key syntactic rule is that lookaheads must be anchored relative to a fixed position to work as intended. Without anchoring, a lookahead like (?=.*[A-Z]) alone matches the empty string at every position that has an uppercase letter somewhere later on the same line, because the engine retries the assertion at each starting position and the lookahead itself consumes nothing. Pairing lookaheads with ^ and $ anchors ensures the entire string is evaluated: ^(?=.*[A-Z])(?=.*[0-9]).{8,}$ validates the full input rather than a substring. Note that anchors only fix where the match starts and ends; the consuming part (.{8,} here) still decides which characters are allowed, and . accepts spaces. Missing anchors are one of the most common reasons a regex appears to validate correctly in a tester but allows invalid inputs in production.
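The effect of anchoring can be seen in a short sketch. The sample strings are assumptions chosen for illustration: the first part lists the positions where a bare lookahead produces an empty match, and the second part shows an unanchored pattern accepting a two-line input that the anchored pattern rejects.

```javascript
// A bare lookahead yields an empty match at each position that still has
// an uppercase letter ahead of it.
const sample = 'abC';
const emptyMatchPositions = [...sample.matchAll(/(?=.*[A-Z])/g)].map((m) => m.index);
console.log(emptyMatchPositions); // [ 0, 1, 2 ]

const unanchored = /(?=.*[A-Z])(?=.*[0-9]).{8,}/;
const anchored = /^(?=.*[A-Z])(?=.*[0-9]).{8,}$/;

// The first line is invalid, but the unanchored match can start on the second line.
const twoLines = 'bad\nGoodPass1';
const unanchoredResult = unanchored.test(twoLines);
const anchoredResult = anchored.test(twoLines);
console.log(unanchoredResult); // true
console.log(anchoredResult);   // false
```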

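Basic lookaheads behave the same way across the major backtracking regex flavors, including PCRE, JavaScript, Python re, Java, and .NET, so a pattern using only positive lookaheads ports well between them. The notable exception is RE2 and engines modeled on it, such as Go's regexp package and the Rust regex crate, which do not support lookahead or lookbehind at all. Among engines that do support lookaround, the portability differences begin with lookbehind assertions and variable-width patterns, which are addressed in later sections.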

Always anchor full-string validation patterns

Without ^ at the start and $ at the end, a lookahead-based validation pattern can pass by matching a valid substring inside an invalid string. Every password, email, or URL validation pattern should be anchored to force evaluation of the complete input.

Negative lookahead (?!...) and lookahead anchoring techniques

A negative lookahead (?!pattern) is the logical complement of positive lookahead: it succeeds only when the enclosed pattern does not match at the current position. Like positive lookahead, it is zero-width and does not consume characters. Negative lookaheads are particularly useful for exclusion patterns — matching something that is not followed by something else — which are awkward or impossible to express with standard character classes or quantifiers.

A practical example is matching a word that is not followed by a specific suffix. To match color but not colorful, the pattern color(?!ful) checks that the text after 'color' is not 'ful'. Another common use is matching a keyword only when it does not appear as part of a longer identifier: \bif(?!\w) matches the keyword 'if' only when it is not immediately followed by another word character. On its own this is equivalent to \bif\b; the lookahead form becomes more precise than a word boundary once the excluded set is widened beyond \w, for example \bif(?![\w$-]) in a language where $ or - can appear inside identifiers.

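Negative lookaheads can also be combined: (?!word1)(?!word2) asserts that the current position is not the start of either word1 or word2, and both assertions run independently from the same position. The same condition can be written with alternation inside one lookahead. A pattern like \b(?!(?:null|undefined|NaN)\b)\w+ matches any word except those three JavaScript special values. Both word boundaries matter: without the leading \b the engine simply retries one character later and matches 'ull' inside 'null', and without the inner \b the pattern would also reject longer identifiers such as 'nullable'.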
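A short sketch of keyword exclusion follows, comparing an unbounded negative lookahead with a version that adds word boundaries. The input strings are made up for illustration.

```javascript
// Without word boundaries the engine retries one character later,
// so the "excluded" word still produces a partial match.
const naive = /(?!null|undefined|NaN)\w+/;
const partial = 'null'.match(naive)[0];
console.log(partial); // ull

// With \b on both sides, whole keywords are skipped and longer identifiers survive.
const bounded = /\b(?!(?:null|undefined|NaN)\b)\w+/g;
const words = 'x null nullable NaN y'.match(bounded);
console.log(words); // [ 'x', 'nullable', 'y' ]
```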

Anchoring is critical for both positive and negative lookaheads, but the failure mode differs. With a missing anchor and a positive lookahead, the regex may match unintended substrings. With a missing anchor and a negative lookahead, the engine skips forward until it finds a position where the assertion holds. Consider (?!.*\.exe)\w+: without a ^ anchor, this pattern still matches inside the filename report.exe, because the assertion fails at every position up to and including the dot but succeeds just after it, so the match is 'exe'. With ^(?!.*\.exe) at the beginning, the entire string is rejected if it contains .exe anywhere. To reject only a trailing extension, anchor the inner pattern as well: ^(?!.*\.exe$).

The technique of placing the negative lookahead at the very start, before any consuming pattern, is called a leading exclusion lookahead. It is a clean way to express rejections at the string level: ^(?!.*forbidden_pattern).*$ rejects any string containing the forbidden pattern, and the rest of the expression matches whatever is left. This is particularly useful in configuration validation and access-control rule patterns where the primary concern is exclusion of bad inputs rather than the positive structure of good ones.
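The difference between an unanchored and a leading negative lookahead can be sketched as follows. The filenames are made up for illustration.

```javascript
// Unanchored: the assertion fails up to and including the dot, then succeeds after it.
const unanchoredExclusion = /(?!.*\.exe)\w+/;
const leftover = 'report.exe'.match(unanchoredExclusion)[0];
console.log(leftover); // exe

// Leading exclusion: the assertion is checked once, at the start of the string.
const leadingExclusion = /^(?!.*\.exe).*$/;
const rejected = leadingExclusion.test('report.exe');
const accepted = leadingExclusion.test('report.txt');
console.log(rejected, accepted); // false true
```

Because the inner pattern is not anchored, a name such as setup.exe.bak is rejected as well, which matches the "contains the forbidden pattern anywhere" behavior described above.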

Example 1

Enforce uppercase, lowercase, digit, and special character requirements with independent lookaheads and proper anchoring.

Password validation with chained positive lookaheads

❌ Wrong

// Missing anchors, and '.' accepts any character including spaces
const re = /(?=.*[A-Z])(?=.*[a-z])(?=.*[0-9]).{8,}/;

// Spaces pass because '.' matches them and nothing forbids whitespace
console.log(re.test('aaBB11!!INVALID PASSWORD WITH SPACES')); // true
// An invalid first line passes because the match can start on the second line
console.log(re.test('bad\nGoodPass1'));                       // true
console.log(re.test('short'));                                // false (good)
console.log(re.test('nouppercase1!'));                        // false (good)
console.log(re.test('NoDigitHere!!'));                        // false (good)

✅ Fixed

// Anchored, whitespace-free, length-bounded, with special character requirement
const re = /^(?!.*\s)(?=.*[A-Z])(?=.*[a-z])(?=.*[0-9])(?=.*[!@#$%^&*]).{12,128}$/;

// Now the full string must satisfy all conditions
console.log(re.test('aaBB11!!INVALID PASSWORD WITH SPACES')); // false (contains spaces)
console.log(re.test('bad\nGoodPass1!x'));                      // false (contains a newline)
console.log(re.test('Dock2026!Secure'));                       // true
console.log(re.test('short1A!'));                              // false (< 12 chars)
console.log(re.test('nouppercase1!longer'));                   // false (no uppercase)

// For user-facing validation, test each requirement separately
const password = 'short1A!'; // sample input; use the submitted value here
const checks = [
  [/[A-Z]/, 'at least one uppercase letter'],
  [/[a-z]/, 'at least one lowercase letter'],
  [/[0-9]/, 'at least one digit'],
  [/[!@#$%^&*]/, 'at least one special character (!@#$%^&*)'],
  [/^\S{12,128}$/, 'between 12 and 128 characters with no whitespace']
];
const failures = checks
  .filter(([pattern]) => !pattern.test(password))
  .map(([, msg]) => msg);

Two separate problems hide in the first pattern. Without ^ and $ anchors, the regex engine scans for a matching substring anywhere in the input, so a string with an invalid first line passes as long as a later line satisfies the lookaheads. And because . matches every character except line terminators, a password with spaces passes whether or not the pattern is anchored. The fixed version anchors the pattern to the full string, rejects whitespace with a leading (?!.*\s), and adds a maximum length of 128 characters to prevent multi-megabyte inputs from consuming excessive hashing resources. The per-requirement checks produce specific error messages for users; note that the length check is anchored too, since an unanchored .{12,128} would accept any input of 12 characters or more.

Lookbehind assertions (?<=...) and (?<!...) with browser compatibility

Lookbehind assertions are the reverse of lookaheads: they assert conditions about what appears before the current position rather than after it. Positive lookbehind (?<=pattern) succeeds when the text immediately preceding the current position matches the enclosed pattern. Negative lookbehind (?<!pattern) succeeds when it does not. Like lookaheads, both forms are zero-width — they check the text behind the cursor without consuming it or including it in the match.

A classic use case for positive lookbehind is extracting a value that follows a known prefix, without including the prefix in the match. The pattern (?<=price: )\d+ matches a number that is immediately preceded by the literal text 'price: ', and the match itself contains only the digits. Without lookbehind, you would need a capturing group, price: (\d+), and then extract group 1, which ties the calling code to a group index.

Negative lookbehind is useful for rejecting patterns that appear in certain contexts. The pattern (?<!\d)\d{3}(?!\d) matches exactly three consecutive digits that are not part of a longer number — for example, finding area codes in text where they are not surrounded by other digits.
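Both lookbehind forms can be tried in a short sketch. The input strings are made up for illustration, and the code assumes a runtime with lookbehind support.

```javascript
// Positive lookbehind: only numbers directly preceded by "price: ".
const listing = 'price: 42, cost: 7, price: 1999';
const prices = listing.match(/(?<=price: )\d+/g);
console.log(prices); // [ '42', '1999' ]

// Negative lookbehind plus negative lookahead: exactly three digits,
// not part of a longer run of digits.
const notes = 'call 415 or 4155, room 12, x650y';
const threeDigit = notes.match(/(?<!\d)\d{3}(?!\d)/g);
console.log(threeDigit); // [ '415', '650' ]
```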

The critical compatibility constraint is that JavaScript regex engines (V8 in Node.js and Chrome, SpiderMonkey in Firefox, JavaScriptCore in Safari) added lookbehind support at different times. V8 shipped it in 2017 (Chrome 62; Node.js gained it in the 8.x line, where early releases needed the --harmony flag). SpiderMonkey added it in Firefox 78, released in June 2020. JavaScriptCore added it in Safari 16.4, released in March 2023. If you are writing frontend JavaScript that must support Safari versions before 16.4, lookbehind assertions will throw a SyntaxError when the pattern is compiled, not at match time. For a regex literal that means the whole script fails to parse, so a try-catch in the same file cannot intercept it; only a pattern built with the RegExp constructor fails at a point your code can catch.

Python's re module supports lookbehind but only for fixed-width patterns. The pattern (?<=ab|abc)x fails with re.error: look-behind requires fixed-width pattern because ab and abc have different lengths. Python's regex module (a third-party package) supports variable-width lookbehind, as does .NET. JavaScript also supports variable-width lookbehind wherever it supports lookbehind at all. PCRE and Perl are more restrictive: PCRE accepts alternatives of different fixed lengths, so (?<=ab|abc) works there, but unbounded quantifiers such as (?<=a+) are rejected, and recent PCRE2 and Perl releases allow only bounded variable-length lookbehind, so check the documentation for your version. RE2 and Go's regexp package do not support lookahead or lookbehind at all, because they guarantee linear-time matching without backtracking and lookaround does not fit that model.
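For comparison, the sketch below shows the same kind of variable-width lookbehind running in JavaScript, assuming a runtime with lookbehind support. The input strings are made up for illustration.

```javascript
// Alternatives of different lengths inside a lookbehind are accepted in JavaScript.
const mixedWidth = /(?<=ab|abc)x/;
const afterAb = mixedWidth.test('abx');
const afterAbc = mixedWidth.test('abcx');
const afterAc = mixedWidth.test('acx');
console.log(afterAb, afterAbc, afterAc); // true true false

// A quantifier inside the lookbehind also works: any amount of whitespace after "USD".
const amounts = 'USD 42, USD   7, EUR 9'.match(/(?<=USD\s+)\d+/g);
console.log(amounts); // [ '42', '7' ]
```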

For maximum portability, prefer reformulating patterns to use lookahead instead of lookbehind where possible. If you need lookbehind for a browser-facing feature, add a feature-detection guard and a fallback. For Node.js-only code, lookbehind is safe on every currently maintained release. For Go or any RE2-based environment, neither lookahead nor lookbehind is available, so restructure the pattern entirely, for example with capturing groups or several separate checks.

Check browser compatibility before using lookbehind

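Lookbehind assertions throw a SyntaxError in Safari before version 16.4 (March 2023). If your frontend JavaScript must support older Safari, use a capturing group instead and extract the relevant group index, or add a try-catch feature detection block that builds the pattern with new RegExp(...) from a string and falls back to a non-lookbehind pattern. A regex literal containing lookbehind cannot be guarded this way, because the error is raised when the script is parsed.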
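One way to write such a guard is sketched below. The function names supportsLookbehind and extractVersion are hypothetical, and the lookbehind pattern is deliberately built from a string so that an unsupported engine throws at a point the try-catch can see.

```javascript
// Hypothetical helper: returns true when the engine can compile a lookbehind.
function supportsLookbehind() {
  try {
    new RegExp('(?<=a)b'); // built from a string, so the error is catchable
    return true;
  } catch (err) {
    return false;
  }
}

// Hypothetical helper: extracts a version number with or without lookbehind.
function extractVersion(text) {
  if (supportsLookbehind()) {
    const viaLookbehind = text.match(new RegExp('(?<=version: )\\d+\\.\\d+\\.\\d+'));
    return viaLookbehind ? viaLookbehind[0] : null;
  }
  // Fallback: consume the prefix and read the capturing group instead.
  const viaGroup = text.match(/version: (\d+\.\d+\.\d+)/);
  return viaGroup ? viaGroup[1] : null;
}

console.log(extractVersion('package version: 2.14.0 released')); // 2.14.0
```

In practice the fallback alone is often enough; the guard is only worthwhile when the lookbehind version is noticeably simpler than the alternative.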

Combining multiple lookaheads for password validation patterns

Password validation is the canonical use case for multiple chained positive lookaheads. A typical enterprise password policy requires a minimum length, at least one uppercase letter, at least one lowercase letter, at least one digit, and at least one special character. Expressing all five constraints in a single regex using only linear matching is either impossible or produces an unreadably complex pattern with factorial alternations. Lookaheads make it declarative and composable.

The pattern for a password with all five requirements is: ^(?=.*[A-Z])(?=.*[a-z])(?=.*[0-9])(?=.*[!@#$%^&*]).{12,128}$. Breaking this down: the ^ anchor starts at the beginning. Each (?=.*[class]) lookahead scans the full remaining string for one required character class without consuming anything. The .{12,128} then consumes the entire string, enforcing the length constraint. The $ anchor ensures the full string is consumed, not just a prefix. All five assertions must pass simultaneously at position 0 for the full match to succeed.

Each lookahead is independent and can be modified, removed, or extended without affecting the others. To add a requirement that the password not start with a digit, add (?![0-9]) immediately after ^: ^(?![0-9])(?=.*[A-Z])(?=.*[a-z]).... To add a requirement that the password not contain spaces, add (?!.* ): ^(?!.* )(?=.*[A-Z]).... This composability makes the pattern easy to update to match policy changes, in contrast to a manually enumerated alternation pattern.
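The composability described above can be sketched by assembling the pattern from lists of assertions. The array names and the sample passwords are assumptions for illustration; the individual assertions are the ones discussed in this section.

```javascript
// Each entry is one independent assertion; add or remove entries to follow policy changes.
const required = ['(?=.*[A-Z])', '(?=.*[a-z])', '(?=.*[0-9])', '(?=.*[!@#$%^&*])'];
const forbidden = ['(?![0-9])', '(?!.* )']; // no leading digit, no spaces

const policy = new RegExp('^' + forbidden.join('') + required.join('') + '.{12,128}$');

const okPassword = policy.test('Dock2026!Secure');
const leadingDigit = policy.test('2026Dock!Secure');
const containsSpace = policy.test('Dock 2026!Secure');
console.log(okPassword, leadingDigit, containsSpace); // true false false
```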

One important practical consideration is the character class for special characters. The pattern [!@#$%^&*] enumerates specific allowed characters, which may not match your policy definition. A broader class like [^A-Za-z0-9] (any character that is not an ASCII letter or digit) is more permissive and avoids the need to enumerate every allowed symbol, but it also counts spaces and non-ASCII letters such as é as special characters. Some policies explicitly forbid certain special characters (for example, characters that have meaning in shell scripts or SQL), in which case explicit enumeration is the right approach.

For user-facing validation, the pattern should be split into individual checks that can provide specific feedback: 'Password must contain at least one uppercase letter', rather than a generic 'Invalid password'. Run each lookahead as a separate regex test and collect which ones fail, then surface the specific requirements that are not met. The combined regex is appropriate for a final gate check, but individual pattern tests produce better error messages for users trying to construct a compliant password.

Always set a maximum length in the quantifier at the end. The pattern .{12,128} enforces both a minimum of 12 and a maximum of 128 characters. Without an upper bound, a user could submit a multi-megabyte string that passes validation and consumes excessive server resources during hashing. NIST SP 800-63B recommends allowing passwords up to at least 64 characters; many implementations cap at 128 or 256 characters to limit hashing cost.

Example 2

Use positive lookbehind to match a value that follows a known prefix, without including the prefix in the match.

Lookbehind to extract a value without capturing the prefix

❌ Wrong

// Without lookbehind: must capture prefix in group 1 and discard it
const re = /(version: )(\d+\.\d+\.\d+)/;
const match = 'package version: 2.14.0 released'.match(re);
const version = match ? match[2] : null;
// Group indexing is fragile — adding a group changes group numbers

# Python version with the same fragility
import re
text = 'package version: 2.14.0 released'
m = re.search(r'(version: )(\d+\.\d+\.\d+)', text)
version = m.group(2) if m else None

✅ Fixed

// With lookbehind: match captures only the version number
const re = /(?<=version: )\d+\.\d+\.\d+/;
const match = 'package version: 2.14.0 released'.match(re);
const version = match ? match[0] : null;
// match[0] is always the full match — no fragile group indexing

# Python: lookbehind with fixed-width prefix (required by re module)
import re
text = 'package version: 2.14.0 released'
m = re.search(r'(?<=version: )\d+\.\d+\.\d+', text)
version = m.group(0) if m else None

// JavaScript note: Safari < 16.4 does not support lookbehind.
// For browser-compatible code, use a capturing group instead:
const text = 'package version: 2.14.0 released';
const safere = /version: (\d+\.\d+\.\d+)/;
const safeversion = text.match(safere)?.[1];

Using a capturing group to consume the prefix requires referencing the correct group index, which breaks if another group is added earlier in the pattern. Lookbehind leaves only the target value in match[0], making the pattern more robust. The browser compatibility note is important: Safari before version 16.4 (March 2023) throws a SyntaxError on lookbehind patterns, so frontend code should either use the single-group fallback shown last or guard the lookbehind with feature detection.

ReDoS: when lookaheads lead to catastrophic backtracking

Regular expression denial of service (ReDoS) is a class of vulnerability where certain regex patterns take exponential time to evaluate against crafted inputs. Lookaheads are not inherently vulnerable, but they participate in backtracking in ways that can create severe performance problems when combined with nested quantifiers over overlapping character sets.

Catastrophic backtracking occurs when the regex engine can reach the same position in the string via many different combinations of quantifier choices, and those paths are explored exhaustively before determining a non-match. The classic example is ^(a+)+$ against a string like aaaaaaaaaaX. The outer + and the inner a+ can split the 'a' characters in exponentially many ways, and the engine must try all of them before failing at the X. With a lookahead like ^(?=.*a)(a+)+$, the lookahead itself is not the source of the exponential behavior, but the consuming pattern after it inherits the same backtracking problem.

Lookaheads become a direct participant in ReDoS when they contain their own nested quantifiers over overlapping sets. The pattern ^(?=(a*)+b) is vulnerable because a* inside the quantified group creates the same exponential ambiguity. An input like aaaaaaaaaaaaaaac triggers the engine to try every way the a characters could be divided across iterations of the outer + before concluding the lookahead fails. Against a 20-character string of 'a's followed by 'c', this can take millions of steps.
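The growth can be observed with a rough timing sketch. The helper timeTest is hypothetical, the input lengths are kept small on purpose, and the timings depend on the engine and machine, so no specific numbers are claimed here. On a backtracking engine the nested version typically slows down sharply as the input grows, while the flat version does not.

```javascript
// Hypothetical helper: time a single test() call. Date.now() is coarse,
// but it is enough to see the trend.
function timeTest(re, input) {
  const start = Date.now();
  const matched = re.test(input);
  return { matched, ms: Date.now() - start };
}

const vulnerable = /^(?=(a*)+b)/; // nested quantifiers inside the lookahead
const safer = /^(?=a*b)/;         // same assertion without the nesting

const timings = [];
for (const n of [10, 15, 20]) {
  const input = 'a'.repeat(n) + 'c'; // contains no "b", so the lookahead must fail
  timings.push({ n, slow: timeTest(vulnerable, input), fast: timeTest(safer, input) });
}
console.log(timings);
```

Keep the lengths small when experimenting: each extra character roughly doubles the work for the nested pattern, so much longer inputs may not finish in a reasonable time.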

The primary defense is to avoid nesting quantifiers over overlapping character classes inside lookaheads. Replace (a+)+ with a+ or a{n,m} with specific bounds. Use atomic groups or possessive quantifiers where your regex engine supports them, since these prevent the engine from revisiting quantifier choices once they are committed. JavaScript does not support atomic groups or possessive quantifiers. Python's re module added both in Python 3.11; on earlier versions, the third-party regex module provides them. PCRE supports both.
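In JavaScript, a widely used workaround is to emulate an atomic group with a lookahead and a backreference: (?=(a+))\1 captures the run inside the lookahead and then consumes exactly what was captured. Because the engine does not backtrack into a lookahead once it has succeeded, the captured run cannot be re-split. The sketch below applies this to the (a+)+ shape discussed above; it is an illustration of the emulation, not a replacement for simply writing a+.

```javascript
// Emulated atomic group: the lookahead captures the run of "a"s,
// and \1 consumes exactly that run.
const emulatedAtomic = /^(?:(?=(a+))\1)+$/;

const allAs = emulatedAtomic.test('aaaa');
// With plain /^(a+)+$/ an input of this length would backtrack exponentially.
const longFailure = emulatedAtomic.test('a'.repeat(40) + 'X');

console.log(allAs, longFailure); // true false
```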

For any pattern that runs against user-controlled input in a production system, check its worst-case behavior before deployment. Tools like safe-regex (Node.js), ReDoS detectors, and the regex checks built into some linters can identify patterns with exponential worst-case time. For maximum safety, use a linear-time regex engine: RE2 (Google's C++ library and its language bindings, such as the re2 Python package) and Go's regexp package guarantee O(n) matching time for all patterns, but they support neither backreferences nor lookahead or lookbehind assertions, so lookahead-based validation has to be split into separate checks there. On backtracking engines, test lookahead-heavy validation patterns with inputs consisting of 50, 100, and 200 characters from the problematic character class to check for performance degradation before deploying, and run those tests under a timeout, since a truly exponential pattern may never finish.

Lookahead in find-and-replace workflows across editors and tools

Lookaheads are particularly powerful in find-and-replace workflows because they allow conditions to be placed around a replacement target without including the surrounding text in the replacement. In VS Code, Sublime Text, JetBrains IDEs, and command-line tools with Perl-compatible regex such as perl -pe, you can use lookaheads and lookbehinds in the search pattern to scope replacements with precision.

A common use case is adding a prefix or suffix to a term only when it appears in a specific context. In VS Code with regex mode enabled, the search pattern (?<=function )\w+ matches a function name only when preceded by the keyword function. The replacement then applies only to the matched name, leaving the keyword in place. Without lookbehind, you would need to capture the keyword in a group and include it in the replacement string: (function )(\w+) with the replacement $1newprefix_$2.
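The same two approaches can be compared in script form. The source string is made up for illustration, and the lookbehind version assumes a runtime that supports it.

```javascript
// Made-up source text for illustration.
const source = 'function load() {}\nconst load2 = 1;';

// Lookbehind: only the name is matched, so only the name is rewritten.
const viaLookbehind = source.replace(/(?<=function )\w+/g, (name) => 'newprefix_' + name);

// Capturing groups: the keyword is matched too and must be put back via $1.
const viaGroups = source.replace(/(function )(\w+)/g, '$1newprefix_$2');

console.log(viaLookbehind);
console.log(viaLookbehind === viaGroups); // true
```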

Negative lookahead enables selective replacement — replacing a term in most contexts while leaving it unchanged in specific contexts. The search pattern color(?!s|ed|ing) matches 'color' only when not followed by a plural or inflection suffix, useful for replacing standalone occurrences without affecting 'colors', 'colored', or 'coloring'. In a large codebase refactoring, this kind of precision prevents unintended replacements that would break neighboring words.
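A quick sketch of this selective replacement, using a made-up sentence and 'colour' as an arbitrary replacement string:

```javascript
const sentence = 'color colors colored coloring color.';

// Replace "color" only when it is not followed by s, ed, or ing.
const replaced = sentence.replace(/color(?!s|ed|ing)/g, 'colour');

console.log(replaced); // colour colors colored coloring colour.
```

The pattern still matches 'color' inside longer words with other endings, such as 'colorful', so add \b or further alternatives if those should be left alone as well.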

In sed, lookaheads are not available. Neither GNU sed (the version on Linux) nor BSD sed on macOS has a PCRE mode: both support only POSIX basic and extended regular expressions (the latter via -E), and neither syntax includes lookaround. Installing GNU sed on macOS via Homebrew (brew install gnu-sed, invoked as gsed) evens out other GNU and BSD differences but does not add lookahead support. For substitutions that need lookaround, use Perl, which is preinstalled on most Linux distributions and, at the time of writing, on macOS: perl -pe 's/(?<=prefix_)\w+/replacement/g'. Assuming that sed understands PCRE syntax is a frequent source of scripting failures when patterns are copied from a regex tester into a shell script.

In grep, the -P flag enables PCRE mode with lookahead support on GNU grep (Linux). BSD grep on macOS does not support -P. Use pcregrep (from the pcre package) on macOS, or use ripgrep (rg), which uses Rust's regex crate — but note that ripgrep's default engine does not support lookahead. The rg --pcre2 flag enables a PCRE2 engine in ripgrep that supports lookaheads, lookbehinds, and backreferences.

For scripted find-and-replace in large codebases, consider using a dedicated tool like ast-grep (which operates on syntax trees rather than text patterns) or a language-specific transformation tool. Text-based regex replacement is error-prone in code files because identifiers, strings, and comments can all match the same pattern with different intended effects. Lookaheads help narrow the scope, but they cannot distinguish semantic contexts the way a syntax-aware tool can.

Frequently asked questions

What does zero-width mean for a lookahead assertion?

Zero-width means the lookahead checks whether the enclosed pattern could match at the current position without advancing the match cursor. The characters examined by the lookahead are not consumed and are not part of the final match result. This allows the same characters to be checked by both the lookahead and the consuming part of the pattern, which is what makes multiple chained lookaheads work from the same starting position.

Why does my password regex accept invalid passwords when tested against real inputs?

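A common cause is missing ^ and $ anchors. Without anchors, the regex engine searches for a matching substring anywhere in the input string, so a valid segment inside a longer invalid string (for example, one line of a multi-line input, or a 16-character window when the pattern caps the length at 16) can satisfy the pattern. Add ^ at the start and $ at the end to force the entire string to satisfy all lookahead conditions. If disallowed characters such as spaces still get through, the cause is the consuming part of the pattern: . matches almost anything, so add a negative lookahead like (?!.*\s) or replace . with an explicit character class.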

Does lookbehind work in all browsers?

Lookbehind assertions work in Chrome and Node.js (V8) since 2017, in Firefox since version 78 (June 2020), and in Safari since version 16.4 (March 2023). For production frontend JavaScript that must support older Safari versions, replace lookbehind with a capturing group and extract the relevant group index, or use a feature detection guard with a fallback.

What is the difference between a lookahead and a capturing group?

A capturing group (pattern) matches and consumes the text, including it in the overall match and making it accessible as a numbered group in the result. A lookahead (?=pattern) checks whether the pattern could match at the current position without consuming text and without creating a capture group. Lookaheads are used when you need a condition rather than a capture — enforcing a requirement without including the checked text in the match result.

Can lookaheads cause performance problems (ReDoS)?

Lookaheads themselves are not inherently slow, but they can participate in catastrophic backtracking when they contain nested quantifiers over overlapping character classes. A pattern like (?=(a*)+b) is vulnerable because the nested quantifiers create exponential ambiguity. The defense is to avoid nesting quantifiers inside lookaheads, use possessive quantifiers or atomic groups where available, and test patterns against long strings of the problematic character class before deploying.

Does Python's re module support variable-width lookbehind?

No. Python's built-in re module only supports fixed-width lookbehind assertions, meaning all alternatives inside the lookbehind must have the same length. A lookbehind like (?<=ab|abc) fails with re.error: look-behind requires fixed-width pattern. Install the third-party regex module (pip install regex) for variable-width lookbehind support. JavaScript and .NET support variable-width lookbehind natively; PCRE accepts alternatives of different fixed lengths but has traditionally rejected unbounded quantifiers inside lookbehind.

How do I use lookaheads in VS Code find and replace?

Enable regex mode in the find widget or search panel (click the .* icon, or press Alt+R on Windows and Linux). Find within a file uses the JavaScript regex engine, and search across files uses ripgrep, which switches to PCRE2 when the pattern needs lookaround or backreferences, so positive and negative lookaheads work in the search field in both places. Use (?=...) in the search field to match text followed by a condition without including the condition in the match. Use lookbehind (?<=...) to match text preceded by a context. The replace field is not a pattern: it supports capture group references like $1 but not lookaheads.

Do lookaheads work in grep and sed?

GNU grep requires the -P flag to enable PCRE mode, which supports lookaheads. BSD grep on macOS does not support -P, and sed has no PCRE mode in either its GNU or BSD implementation. On macOS, use pcregrep for grep-like lookahead matching, and on any platform use perl -pe for sed-style substitutions that need lookaround. The ripgrep tool supports lookaheads when invoked with --pcre2, but its default Rust regex engine does not support lookahead.

What is the difference between (?=...) and (?:...) in regex?

(?:...) is a non-capturing group: it groups pattern elements together without creating a numbered capture group, but it does consume the matched text. (?=...) is a positive lookahead: it checks that the pattern matches at the current position without consuming any text and without creating a capture group. Use (?:...) for grouping and quantification; use (?=...) for asserting a condition ahead of the current position.
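The difference shows up directly in the match result, as in this small sketch with a made-up input. A capturing group is included for comparison.

```javascript
const input = 'foobar';

const nonCapturing = input.match(/foo(?:bar)/); // consumes "bar", creates no group
const lookahead = input.match(/foo(?=bar)/);    // checks for "bar", consumes nothing
const capturing = input.match(/foo(bar)/);      // consumes "bar" and stores it as group 1

console.log(nonCapturing[0], nonCapturing.length); // foobar 1
console.log(lookahead[0], lookahead.length);       // foo 1
console.log(capturing[0], capturing[1]);           // foobar bar
```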

Last updated: 2026-05-07.