Quantifiers¶
Quantifiers specify how many times the preceding token or group must appear.
Quantifier Reference¶
| Quantifier | Meaning | Mode |
|---|---|---|
* |
0 or more | Greedy |
+ |
1 or more | Greedy |
? |
0 or 1 | Greedy |
{n} |
Exactly n times | Greedy |
{n,} |
n or more | Greedy |
{n,m} |
Between n and m | Greedy |
*? |
0 or more | Lazy (minimal) |
+? |
1 or more | Lazy (minimal) |
?? |
0 or 1 | Lazy (prefer skip) |
{n,m}? |
n to m | Lazy (minimal) |
*+ |
0 or more | Possessive (no backtrack) |
++ |
1 or more | Possessive (no backtrack) |
?+ |
0 or 1 | Possessive (no backtrack) |
Greedy, Lazy, and Possessive Compared¶
| Mode | Pattern | Result | Explanation |
|---|---|---|---|
| Greedy | ".*" |
"start" middle "end" |
Consumes max, one big match |
| Lazy | ".*?" |
"start" then "end" |
Two smallest matches |
| Preferred | "[^"]*" |
"start" then "end" |
Faster, no backtrack risk |
Best Practice
Prefer negated character classes ([^"]*, [^>]*) over lazy quantifiers (.*?) wherever the delimiter is a single known character.
They are faster, clearer, and cannot catastrophically backtrack.
Quantifier on Groups vs. Characters¶
\d{4} # exactly 4 digits
(?:ab){3} # "ababab"
(?:red|blue){2} # "redred", "redblue", "bluered", "blueblue"
-? # optional leading minus
\d+\.?\d* # simple decimal: 42, 3.14, 100.
Lazy ≠ Faster¶
Common Misconception
Lazy quantifiers are often assumed to be faster. They're not — they just find a different (smaller) match.
Both greedy and lazy do backtracking. The difference is which direction they explore the search tree.
Possessive quantifiers and atomic groups are the true performance win.
{n,m} Syntax Gotchas¶
{3} # exactly 3 — most engines
{3,} # 3 or more
{3,6} # 3 to 6 — NO space around comma (some engines reject it)
{0,1} # equivalent to ?
{1,} # equivalent to +
{0,} # equivalent to *
# In some engines, a {n} that looks like it can't be a quantifier
# is treated as a literal: "a{b}" might match literally "a{b}"
# Safest: always escape { } when you mean them literally
Possessive Quantifiers (PCRE / Java / Ruby)¶
Possessive quantifiers match as much as possible and never give characters back, even if the overall match fails. This eliminates the backtracking paths that cause exponential blowup.
\d++ # match digits possessively
(?:ab)++ # possessive on a group
[a-z]++[a-z] # will always fail if input is all-alpha — no backtrack
See Atomic Groups & Possessive and Catastrophic Backtracking for why this matters.