Next: Regexp Example, Previous: Regexps, Up: Search
For the most part, ‘\’ followed by any character matches only that character. However, there are several exceptions: two-character sequences starting with ‘\’ that have special meanings. The second character in the sequence is always an ordinary character when used on its own. Here is a table of ‘\’ constructs.
Thus, ‘foo\|bar’ matches either ‘foo’ or ‘bar’ but no other string.
‘\|’ applies to the largest possible surrounding expressions. Only a surrounding ‘\( ... \)’ grouping can limit the grouping power of ‘\|’.
Full backtracking capability exists to handle multiple uses of ‘\|’.
This last application is not a consequence of the idea of a
parenthetical grouping; it is a separate feature that is assigned as a
second meaning to the same ‘\( ... \)’ construct. In practice
there is usually no conflict between the two meanings; when there is
a conflict, you can use a “shy” group.
After the end of a ‘\( ... \)’ construct, the matcher remembers the beginning and end of the text matched by that construct. Then, later on in the regular expression, you can use ‘\’ followed by the digit d to mean “match the same text matched the dth time by the ‘\( ... \)’ construct”.
The strings matching the first nine ‘\( ... \)’ constructs appearing in a regular expression are assigned numbers 1 through 9 in the order that the open-parentheses appear in the regular expression. So you can use ‘\1’ through ‘\9’ to refer to the text matched by the corresponding ‘\( ... \)’ constructs.
For example, ‘\(.*\)\1’ matches any newline-free string that is composed of two identical halves. The ‘\(.*\)’ matches the first half, which may be anything, but the ‘\1’ that follows must match the same exact text.
If a particular ‘\( ... \)’ construct matches more than once
(which can easily happen if it is followed by ‘*’), only the last
match is recorded.
‘\b’ matches at the beginning or end of the buffer
regardless of what text appears next to it.
The constructs that pertain to words and syntax are controlled by the setting of the syntax table. See Syntax Tables.