Matching a single character |
Characters that otherwise have special regexp meanings |
| \ |
Precedes characters that have a special meaning: \. \+ \* \? \| \{ \( \[ \^ \$ |
Characters that need to be written in a special way |
| \t |
The tab character |
| \n |
The newline (line feed) character |
| \r |
The carriage-return character |
| \f |
The form-feed character |
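The escaping rules above can be tried out with a short sketch. It assumes the table describes Java's java.util.regex flavor (the source does not name the engine), so every regex backslash is doubled inside a Java string literal. The class and method names are hypothetical, chosen only for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; assumes the java.util.regex flavor.
class EscapeDemo {
    // An unescaped dot matches any character, so "1x5" also matches.
    static boolean unescapedDot(String s) {
        return Pattern.matches("1.5", s);
    }

    // "\\." in Java source is the regex \. and matches only a literal dot.
    static boolean escapedDot(String s) {
        return Pattern.matches("1\\.5", s);
    }

    // Pattern.quote makes every character of the given text literal.
    static boolean quoted(String literal, String s) {
        return Pattern.matches(Pattern.quote(literal), s);
    }

    // The regex \t (written "\\t" in Java source) matches a tab character.
    static boolean hasTabBetween(String s) {
        return Pattern.matches("a\\tb", s);
    }

    public static void main(String[] args) {
        System.out.println(unescapedDot("1x5"));     // true
        System.out.println(escapedDot("1x5"));       // false
        System.out.println(escapedDot("1.5"));       // true
        System.out.println(quoted("a+b", "a+b"));    // true
        System.out.println(hasTabBetween("a\tb"));   // true
    }
}
```

When the text to match comes from user input, Pattern.quote is usually less error-prone than escaping each special character by hand.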
Matching a single character with a predefined character class |
| . |
Any character (line terminators are matched only when the engine's "dot matches all" mode is enabled) |
| \d |
A digit: [0-9] |
| \D |
A non-digit: [^0-9] |
| \s |
A whitespace character: [ \t\n\x0B\f\r] |
| \S |
A non-whitespace character: [^\s] |
| \w |
A word character: [a-zA-Z_0-9] |
| \W |
A non-word character: [^\w] |
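A minimal sketch of the predefined classes, again assuming the java.util.regex flavor. The helper findAll and the class name are hypothetical. The last two lines show the dot's behavior around line terminators with and without Pattern.DOTALL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; assumes the java.util.regex flavor.
class PredefinedClassDemo {
    // Collects every non-overlapping match of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    // True if the dot in "a.b" matches the middle character of s.
    static boolean dotMatches(String s, boolean dotAll) {
        int flags = dotAll ? Pattern.DOTALL : 0;
        return Pattern.compile("a.b", flags).matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\d", "a1 b22"));        // [1, 2, 2]
        System.out.println(findAll("\\w+", "foo_1 bar-2"));  // [foo_1, bar, 2]
        System.out.println(findAll("\\S+", "a\tb c"));       // [a, b, c]
        System.out.println(dotMatches("a\nb", false));       // false
        System.out.println(dotMatches("a\nb", true));        // true
    }
}
```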
Defining Character classes (match one character) |
| Character classes specify a set of characters, any one of which may match.
The class specification is enclosed in square brackets, [].
Beginning the set with a caret, "^", negates it,
so the class matches any character that is not listed.
A hyphen, "-", between two characters indicates
a range of character values. Although a character class matches only one character,
a quantifier following it can be used to match multiple characters. |
| [abc] |
a, b, or c (simple class) |
| [^abc] |
Any character except a, b, or c (negation) |
| [a-zA-Z] |
a through z
or A through Z, inclusive (range) |
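The sketch below exercises simple, negated, and range classes, and shows a quantifier applied to a class. It assumes the java.util.regex flavor; the class and method names are hypothetical.

```java
// Hypothetical demo class; assumes the java.util.regex flavor.
class CharClassDemo {
    // Simple class: exactly one of a, b, or c.
    static boolean isAbc(String s) {
        return s.matches("[abc]");
    }

    // Negated class: exactly one character that is not a, b, or c.
    static boolean isNotAbc(String s) {
        return s.matches("[^abc]");
    }

    // Ranges plus a quantifier: exactly two hexadecimal digits.
    static boolean isHexByte(String s) {
        return s.matches("[0-9a-fA-F]{2}");
    }

    // Removes every lowercase vowel.
    static String stripVowels(String s) {
        return s.replaceAll("[aeiou]", "");
    }

    public static void main(String[] args) {
        System.out.println(isAbc("b"));                 // true
        System.out.println(isAbc("d"));                 // false
        System.out.println(isNotAbc("d"));              // true
        System.out.println(isHexByte("7F"));            // true
        System.out.println(isHexByte("7G"));            // false
        System.out.println(stripVowels("hello world")); // hll wrld
    }
}
```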
Position and Boundary patterns (match zero characters) |
| ^ |
The beginning of a line (or only the beginning of the whole input when multiline mode is off) |
| $ |
The end of a line (or only the end of the whole input when multiline mode is off). ^$ matches every empty line. |
| \b |
A word boundary |
| \B |
A non-word boundary |
| \A |
The beginning of the input |
| \G |
The end of the previous match |
| \Z |
The end of the input but for the final
terminator, if any |
| \z |
The end of the input |
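Because these patterns match positions rather than characters, their effect is easiest to see by counting matches. The following is a sketch under the assumption that the table describes java.util.regex, where ^ and $ match at line boundaries only when Pattern.MULTILINE is set. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; assumes the java.util.regex flavor.
class BoundaryDemo {
    // Counts the non-overlapping matches of p in s.
    static int count(Pattern p, String s) {
        Matcher m = p.matcher(s);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    // True if the regex is found anywhere in s.
    static boolean found(String regex, String s) {
        return Pattern.compile(regex).matcher(s).find();
    }

    public static void main(String[] args) {
        // ^$ with MULTILINE finds the single empty line in the middle.
        System.out.println(count(Pattern.compile("^$", Pattern.MULTILINE), "a\n\nb"));  // 1
        // Without MULTILINE, ^ matches only at the start of the input.
        System.out.println(count(Pattern.compile("^\\w"), "a\nb"));                     // 1
        System.out.println(count(Pattern.compile("^\\w", Pattern.MULTILINE), "a\nb"));  // 2
        // \b keeps "cat" from matching inside "concat".
        System.out.println(count(Pattern.compile("\\bcat\\b"), "cat concat cat."));     // 2
        // \Z ignores one final line terminator; \z does not.
        System.out.println(found("abc\\Z", "abc\n"));  // true
        System.out.println(found("abc\\z", "abc\n"));  // false
        // \G anchors each match to the end of the previous one.
        System.out.println("123a45".replaceAll("\\G\\d", "x"));  // xxxa45
    }
}
```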
Quantifiers (repeating the previous element) |
| |
| Greedy quantifiers - Expand as much as possible |
| X? |
X, once or not at all |
| X* |
X, zero or more times |
| X+ |
X, one or more times |
| X{n} |
X, exactly n times |
| X{n,} |
X, at least n times |
| X{n,m} |
X, at least n but not more than m times |
| |
| Reluctant quantifiers - Expand only if forced by later failure to match |
| X?? |
X, once or not at all |
| X*? |
X, zero or more times |
| X+? |
X, one or more times |
| X{n}? |
X, exactly n times |
| X{n,}? |
X, at least n times |
| X{n,m}? |
X, at least n but not more than m times |
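Greedy and reluctant quantifiers accept the same repetition counts; they differ in which match is tried first. A minimal sketch, assuming the java.util.regex flavor, with a hypothetical helper named first that returns the first match found:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; assumes the java.util.regex flavor.
class QuantifierDemo {
    // Returns the first match of regex in input, or null if there is none.
    static String first(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Greedy: .+ grabs as much as it can, then backs off to the last '>'.
        System.out.println(first("<.+>", "<a><b>"));    // <a><b>
        // Reluctant: .+? stops at the first '>' that lets the match succeed.
        System.out.println(first("<.+?>", "<a><b>"));   // <a>
        // The same difference applies to counted repetition.
        System.out.println(first("a{2,3}", "aaaa"));    // aaa
        System.out.println(first("a{2,3}?", "aaaa"));   // aa
    }
}
```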
Other |
| |
| Alternation |
| X|Y |
Tries matching X first; if that fails, tries Y |
| |
| Grouping - Parentheses both group and create a numbered element that can be used later. |
| (X) |
X, as a capturing group. The matched text is remembered so it can be referenced later; groups are numbered starting at 1. |
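Alternation order and group numbering can be illustrated together. This sketch assumes the java.util.regex flavor, where a group is read back with Matcher.group(n), referenced inside the pattern as \n, and referenced in a replacement string as $n. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; assumes the java.util.regex flavor.
class GroupDemo {
    // Returns {year, month} from the first "YYYY-MM" found, or null.
    static String[] yearMonth(String s) {
        Matcher m = Pattern.compile("(\\d{4})-(\\d{2})").matcher(s);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    // \1 in the pattern requires a repeat of whatever group 1 matched;
    // $1 in the replacement inserts that same text.
    static String markDoubles(String s) {
        return s.replaceAll("(\\w)\\1", "<$1$1>");
    }

    // The left alternative is tried first, so "a" wins over "ab" here.
    static String firstAlternative(String s) {
        Matcher m = Pattern.compile("a|ab").matcher(s);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String[] ym = yearMonth("due 2024-05");
        System.out.println(ym[0] + " / " + ym[1]);      // 2024 / 05
        System.out.println(markDoubles("hello"));       // he<ll>o
        System.out.println(firstAlternative("ab"));     // a
        // A full-string match forces backtracking into the second alternative.
        System.out.println("ab".matches("a|ab"));       // true
    }
}
```

When alternatives share a prefix, listing the longer one first (ab|a) is a common way to prefer the longer match.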