When used on the right-hand side of the `~' or `!~' operators, a regexp constant merely stands for the regexp that is to be matched.
Regexp constants (such as /foo/) may be used like simple expressions. When a regexp constant appears by itself, it has the same meaning as if it appeared in a pattern, i.e. `($0 ~ /foo/)' (d.c.) (see section Expressions as Patterns).
This means that the two code segments,
if ($0 ~ /barfly/ || $0 ~ /camelot/) print "found"
and
if (/barfly/ || /camelot/) print "found"
are exactly equivalent.
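The equivalence can be checked from the shell. This is only a minimal sketch: the three sample records and the variable names explicit and implicit are made up for illustration, and NR is printed just to show which records matched.

```sh
# Three hypothetical sample records (not taken from the manual).
input='a barfly
nothing here
camelot it is'

# Explicit match of each regexp against $0.
explicit=$(printf '%s\n' "$input" |
    awk '{ if ($0 ~ /barfly/ || $0 ~ /camelot/) print NR ": found" }')

# Bare regexp constants, implicitly matched against $0.
implicit=$(printf '%s\n' "$input" |
    awk '{ if (/barfly/ || /camelot/) print NR ": found" }')

printf '%s\n' "$explicit"
printf '%s\n' "$implicit"
```

Both programs should report records 1 and 3 and nothing else.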
One rather bizarre consequence of this rule is that the following boolean expression is valid, but does not do what the user probably intended:
# note that /foo/ is on the left of the ~
if (/foo/ ~ $1) print "found foo"
This code is "obviously" testing $1 for a match against the regexp /foo/. But in fact, the expression `/foo/ ~ $1' actually means `($0 ~ /foo/) ~ $1'. In other words, first match the input record against the regexp /foo/. The result will be either zero or one, depending upon the success or failure of the match. Then match that result against the first field in the record. Since it is unlikely that you would ever really wish to make this kind of test, gawk will issue a warning when it sees this construct in a program.
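The effect can be seen with a few records whose first field happens to be "0" or "1". The sketch below uses made-up input and a made-up variable name out. Any warning is written to standard error, which is discarded here so that only the program's own output is captured.

```sh
# Hypothetical records, chosen so that $1 is sometimes "0" or "1".
# /foo/ ~ $1 is evaluated as ($0 ~ /foo/) ~ $1.
out=$(printf 'foo bar\n0 bar\n1 foo\n' |
    awk '{ if (/foo/ ~ $1) print NR ": found foo" }' 2>/dev/null)

printf '%s\n' "$out"
```

Record 1 contains "foo" in its first field yet is not reported, because the match result 1 is tested against the regexp "foo". Record 2 is reported even though it contains no "foo" at all: the match result 0 matches its first field, "0". Record 3 is reported because the match result 1 matches its first field, "1".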
Another consequence of this rule is that the assignment statement
matches = /foo/
will assign either zero or one to the variable matches, depending upon the contents of the current input record.
This feature of the language was never well documented until the POSIX specification.
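As a quick illustration of the assignment form, the following sketch feeds two made-up records to awk and prints the value stored in matches for each one. The input text and the shell variable out are assumptions made for the example.

```sh
# The first record contains "foo", the second does not.
out=$(printf 'foo fighters\nbar none\n' |
    awk '{ matches = /foo/; print matches }')

printf '%s\n' "$out"
```

The program prints 1 for the first record and 0 for the second.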
Constant regular expressions are also used as the first argument for the gensub, sub and gsub functions, and as the second argument of the match function (see section Built-in Functions for String Manipulation). Modern implementations of awk, including gawk, allow the third argument of split to be a regexp constant, while some older implementations do not (d.c.).
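A minimal sketch of split with a regexp constant as its third argument follows. It assumes an awk that accepts this form. The sample string and the names parts, n and out are invented for the example.

```sh
# Split a hypothetical string wherever a run of digits occurs.
out=$(printf 'a1b22c333d\n' |
    awk '{
        n = split($0, parts, /[0-9]+/)   # regexp constant as the separator
        printf "%d:", n                  # number of pieces
        for (i = 1; i <= n; i++)
            printf " %s", parts[i]
        printf "\n"
    }')

printf '%s\n' "$out"
```

Here the regexp constant is handed to split as a regexp rather than being matched against $0, so the string is cut into the four pieces a, b, c and d.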
Because the built-in functions accept regexp constants as arguments, confusion can arise when attempting to use regexp constants as arguments to user-defined functions (see section User-defined Functions). For example:
function mysub(pat, repl, str, global)
{
    if (global)
        gsub(pat, repl, str)
    else
        sub(pat, repl, str)
    return str
}

{
    ...
    text = "hi! hi yourself!"
    mysub(/hi/, "howdy", text, 1)
    ...
}
In this example, the programmer wishes to pass a regexp constant to the user-defined function mysub, which will in turn pass it on to either sub or gsub. However, what really happens is that the pat parameter will be either one or zero, depending upon whether or not $0 matches /hi/.
As it is unlikely that you would ever really wish to pass a truth value in this way, gawk will issue a warning when it sees a regexp constant used as a parameter to a user-defined function.
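One common way around this, which the text above does not describe, is to pass the regexp as a string. awk treats a string as a dynamic regexp once it reaches sub or gsub. The sketch below reuses mysub from the example. The BEGIN rule and the shell variable out are assumptions added to make it self-contained.

```sh
out=$(awk '
function mysub(pat, repl, str, global)
{
    if (global)
        gsub(pat, repl, str)
    else
        sub(pat, repl, str)
    return str
}

BEGIN {
    text = "hi! hi yourself!"
    # "hi" is a string, so it reaches sub/gsub unchanged and is
    # used there as a dynamic regexp instead of a truth value.
    print mysub("hi", "howdy", text, 1)
    print mysub("hi", "howdy", text, 0)
}')

printf '%s\n' "$out"
```

The first call replaces every occurrence and the second only the first one. Since str is a scalar parameter, it is passed by value, and the variable text itself is left unchanged.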