A. The Evolution of the awk Language
This Web page describes the GNU implementation of awk, which follows
the POSIX specification.
Many long-time awk users learned awk programming
with the original awk implementation in Version 7 Unix.
(This implementation was the basis for awk in Berkeley Unix,
through 4.3-Reno. Subsequent versions of Berkeley Unix, and systems
derived from 4.4BSD-Lite, use various versions of gawk
for their awk.)
This chapter briefly describes the
evolution of the awk language, with cross references to other parts
of the Web page where you can find more information.
A.1 Major Changes Between V7 and SVR3.1: The major changes between V7 and System V Release 3.1.
A.2 Changes Between SVR3.1 and SVR4: Minor changes between System V Releases 3.1 and 4.
A.3 Changes Between SVR4 and POSIX awk: New features from the POSIX standard.
A.4 Extensions in the Bell Laboratories awk: New features from the Bell Laboratories version of awk.
A.5 Extensions in gawk Not in POSIX awk: The extensions in gawk not in POSIX awk.
A.6 Major Contributors to gawk: The major contributors to gawk.
A.1 Major Changes Between V7 and SVR3.1
The awk language evolved considerably between the release of
Version 7 Unix (1978) and the new version that was first made generally available in
System V Release 3.1 (1987). This section summarizes the changes, with
cross-references to further details:
* The requirement for `;' to separate rules on a line (see section awk Statements Versus Lines).
* User-defined functions and the return statement (see section User-Defined Functions).
* The delete statement (see section The delete Statement).
* The do-while statement (see section The do-while Statement).
* The built-in functions atan2, cos, sin, rand, and srand (see section 9.1.2 Numeric Functions).
* The built-in functions gsub, sub, and match (see section String Manipulation Functions).
* The built-in functions close and system (see section Input/Output Functions).
* The ARGC, ARGV, FNR, RLENGTH, RSTART, and SUBSEP built-in variables (see section 7.5 Built-in Variables).
* C-compatible operator precedence, which breaks some old awk programs (see section Operator Precedence (How Operators Nest)).
* Regexps as the value of FS (see section Specifying How Fields Are Separated) and as the third argument to the split function (see section String Manipulation Functions).
* The escape sequences `\b', `\f', and `\r' (see section 3.2 Escape Sequences). (Some vendors have updated their old versions of awk to recognize `\b', `\f', and `\r', but this is not something you can rely on.)
* Redirection of input for the getline function (see section Explicit Input with getline).
* Multiple BEGIN and END rules (see section The BEGIN and END Special Patterns).
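To make a few of these additions concrete, here is a small sketch that uses a user-defined function with return, the sub, gsub, and match functions, the RSTART and RLENGTH variables, a do-while loop, and two BEGIN rules. The names svr31_demo and trim are hypothetical, chosen only for this illustration, and the program is wrapped in a shell function so that it can be run as is.

```shell
# Exercise several features that first appeared in the SVR3.1 awk.
# The names svr31_demo and trim are hypothetical, chosen for this sketch.
svr31_demo() {
  awk '
  # A user-defined function with a return statement.
  function trim(s) {
      sub(/^ +/, "", s)    # remove leading spaces
      sub(/ +$/, "", s)    # remove trailing spaces
      return s
  }

  BEGIN { n = 0 }          # first BEGIN rule

  BEGIN {                  # second BEGIN rule, run after the first
      s = trim("  foo bar  ")
      print "[" s "]"

      # match sets RSTART and RLENGTH as a side effect.
      if (match(s, /b.r/))
          print RSTART, RLENGTH

      # gsub returns the number of substitutions made.
      count = gsub(/o/, "0", s)
      print count, s

      # A do-while loop always runs its body at least once.
      do {
          n++
      } while (n < 3)
      print n
  }'
}

svr31_demo
```

None of these constructs is accepted by the original Version 7 awk, which is why programs that must run there have to avoid them.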
A.2 Changes Between SVR3.1 and SVR4
The System V Release 4 (1989) version of Unix awk added these features
(some of which originated in gawk):
* The ENVIRON variable (see section 7.5 Built-in Variables).
* A defined return value for the srand built-in function (see section 9.1.2 Numeric Functions).
* The toupper and tolower built-in string functions for case translation (see section String Manipulation Functions).
* A cleaner specification for the `%c' format-control letter in the printf function (see section Format-Control Letters).
* The ability to dynamically pass the field width and precision ("%*.*d") in the argument list of the printf function (see section Format-Control Letters).
* The use of regexp constants, such as /foo/, as expressions, where they are equivalent to using the matching operator, as in `$0 ~ /foo/' (see section Using Regular Expression Constants).
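The following sketch shows how some of the SVR4 additions look in practice: ENVIRON, toupper, a field width and precision passed in the printf argument list, and a regexp constant used as an expression. The function name svr4_demo and the environment variable GREETING are hypothetical and exist only for this example.

```shell
# Exercise a few features added in the SVR4 awk.
# svr4_demo and GREETING are hypothetical names used only in this sketch.
svr4_demo() {
  GREETING=hello awk '
  BEGIN {
      # ENVIRON holds the environment; toupper translates case.
      print toupper(ENVIRON["GREETING"])

      # The width (6) and precision (3) are taken from the argument list.
      printf "%*.*d|\n", 6, 3, 42
  }

  {
      # A bare regexp constant is the same as matching it against $0.
      if (/foo/)      hits++
      if ($0 ~ /foo/) same++
  }

  END { print hits, same }'
}

printf 'food\nbar\nfoo\n' | svr4_demo
```

The two counters always agree, since `/foo/' used as an expression is shorthand for `$0 ~ /foo/'.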
A.3 Changes Between SVR4 and POSIX awk
The POSIX Command Language and Utilities standard for awk (1992)
introduced the following changes into the language:
* The use of CONVFMT for controlling the conversion of numbers to strings (see section Conversion of Strings and Numbers).
The following common extensions are not permitted by the POSIX standard:
* \x escape sequences are not recognized (see section 3.2 Escape Sequences).
* Newlines do not act as whitespace to separate fields when FS is equal to a single space (see section Examining Fields).
* The synonym func for the keyword function is not recognized (see section Function Definition Syntax).
* Specifying `-Ft' on the command line does not set the value of FS to be a single tab character (see section Specifying How Fields Are Separated).
* The fflush built-in function is not supported (see section Input/Output Functions).
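As a brief illustration of the CONVFMT variable that POSIX introduced, the sketch below contrasts it with OFMT. It assumes a POSIX-conforming awk; the function name convfmt_demo is hypothetical.

```shell
# Contrast CONVFMT (number-to-string conversion) with OFMT (print output).
# convfmt_demo is a hypothetical name used only in this sketch.
convfmt_demo() {
  awk 'BEGIN {
      x = 3.14159265

      CONVFMT = "%.2f"
      s = x ""          # concatenation converts x to a string via CONVFMT
      print s

      OFMT = "%.4f"
      print x           # print formats numeric values via OFMT

      i = 17
      print (i "")      # integral values convert as integers regardless
  }'
}

convfmt_demo
```

Before POSIX, OFMT served both purposes, so changing the output format could also change the result of string concatenation or array subscripting.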
A.4 Extensions in the Bell Laboratories awk
Brian Kernighan, one of the original designers of Unix awk,
has made his version available via his home page
(see section Other Freely Available awk Implementations).
This section describes extensions in his version of awk that are
not in POSIX awk.
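* The `-mf N' and `-mr N' command-line options to set the maximum number of fields and the maximum record size, respectively (see section Command-Line Options). As a side note, his awk no longer needs these options; it continues to accept them to avoid breaking old programs.
* The fflush built-in function for flushing buffered output (see section Input/Output Functions).
* The use of func as an abbreviation for function (see section Function Definition Syntax).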
The Bell Laboratories awk also incorporates the following extensions,
originally developed for gawk:
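* The `/dev/stdin', `/dev/stdout', and `/dev/stderr' special files (see section Special File Names in gawk).
* The ability for FS and for the third argument to split to be null strings (see section Making Each Character a Separate Field).
* The nextfile statement (see section Using gawk's nextfile Statement).
* The ability to delete all of an array at once with `delete array' (see section The delete Statement).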
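Two of the extensions shared by gawk and the Bell Laboratories awk can be sketched as follows: a null string as the third argument to split, and deleting a whole array with a single statement. This assumes an awk that supports both extensions; the names bell_labs_demo and chars are hypothetical.

```shell
# Split a string into characters, then empty the array in one statement.
# bell_labs_demo and chars are hypothetical names used only in this sketch.
bell_labs_demo() {
  awk 'BEGIN {
      # A null separator yields one array element per character.
      n = split("awk", chars, "")
      print n, chars[1], chars[3]

      # Delete every element of the array at once.
      delete chars

      left = 0
      for (k in chars)
          left++
      print left
  }'
}

bell_labs_demo
```

In an awk without these extensions, the usual fallback is a loop over substr for the first and `split("", chars)' for the second.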
A.5 Extensions in gawk Not in POSIX awk
The GNU implementation, gawk, adds a large number of features.
This section lists them in the order they were added to gawk.
They can all be disabled with either the `--traditional' or
`--posix' options
(see section Command-Line Options).
Version 2.10 of gawk introduced the following features:
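* The AWKPATH environment variable for specifying a path search for the `-f' command-line option (see section Command-Line Options).
* The IGNORECASE variable and its effects (see section Case Sensitivity in Matching).
* The `/dev/stdin', `/dev/stdout', `/dev/stderr' and `/dev/fd/N' special file names (see section Special File Names in gawk).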
Version 2.13 of gawk introduced the following features:
* The FIELDWIDTHS variable and its effects (see section Reading Fixed-Width Data).
* The systime and strftime built-in functions for obtaining and printing timestamps (see section Using gawk's Timestamp Functions).
Version 2.14 of gawk introduced the following feature:
* The next file statement for skipping to the next data file (see section Using gawk's nextfile Statement).
Version 2.15 of gawk introduced the following features:
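* The ARGIND variable, which tracks the movement of FILENAME through ARGV (see section 7.5 Built-in Variables).
* The ERRNO variable, which contains the system error message when getline returns -1 or when close fails (see section 7.5 Built-in Variables).
* The `/dev/pid', `/dev/ppid', `/dev/pgrpid', and `/dev/user' file name interpretation (see section Special File Names in gawk).
* The ability to delete all of an array at once with `delete array' (see section The delete Statement).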
Version 3.0 of gawk introduced the following features:
* IGNORECASE changed, now applying to string comparison as well as regexp operations (see section Case Sensitivity in Matching).
* The RT variable that contains the input text that matched RS (see section How Input Is Split into Records).
* The gensub function for more powerful text manipulation (see section String Manipulation Functions).
* The strftime function acquired a default time format, allowing it to be called with no arguments (see section Using gawk's Timestamp Functions).
* The ability for FS and for the third argument to split to be null strings (see section Making Each Character a Separate Field).
* The ability for RS to be a regexp (see section How Input Is Split into Records).
* The next file statement became nextfile (see section Using gawk's nextfile Statement).
* The `--lint-old' option to warn about constructs that are not available in the original Version 7 Unix version of awk (see section Major Changes Between V7 and SVR3.1).
* The `-m' option and the fflush function from the Bell Laboratories research version of awk (see section Command-Line Options; also see section Input/Output Functions).
* The use of GNU Autoconf to control the configuration process (see section Compiling gawk for Unix).
* Amiga support (see section Installing gawk on an Amiga).
Version 3.1 of gawk introduced the following features:
* The BINMODE special variable for non-POSIX systems, which allows binary I/O for input and/or output files (see section Using gawk on PC Operating Systems).
* The LINT special variable, which dynamically controls lint warnings (see section 7.5 Built-in Variables).
* The PROCINFO array for providing process-related information (see section 7.5 Built-in Variables).
* The TEXTDOMAIN special variable for setting an application's internationalization text domain (see section 7.5 Built-in Variables, and Internationalization with gawk).
* The ability to use octal and hexadecimal constants in awk program source code (see section Octal and Hexadecimal Numbers).
* The `/inet' special files for TCP/IP networking using `|&' (see section Using gawk for Network Programming).
* The optional second argument to close that allows closing one end of a two-way pipe to a coprocess (see section Two-Way Communications with Another Process).
* The optional third argument to the match function for capturing text-matching subexpressions within a regexp (see section String Manipulation Functions).
* Positional specifiers in printf formats for making translations easier (see section Rearranging printf Arguments).
* The asort function for sorting arrays (see section Sorting Array Values and Indices with gawk).
* The bindtextdomain and dcgettext functions for internationalization (see section Internationalizing awk Programs).
* The extension built-in function and the ability to add new built-in functions dynamically (see section Adding New Built-in Functions to gawk).
* The mktime built-in function for creating timestamps (see section Using gawk's Timestamp Functions).
* The and, or, xor, compl, lshift, rshift, and strtonum built-in functions (see section Using gawk's Bit Manipulation Functions).
* The support for `next file' as two words was removed completely (see section Using gawk's nextfile Statement).
* The `--profile' option and pgawk, the profiling version of gawk, for producing execution profiles of awk programs (see section Profiling Your awk Programs).
* The `--enable-portals' configuration option to enable special treatment of pathnames that begin with `/p' as BSD portals (see section Using gawk with BSD Portals).
* The use of GNU Automake to help in standardizing the configuration process (see section Compiling gawk for Unix).
* The use of GNU gettext for gawk's own message output (see section gawk Can Speak Your Language).
* BeOS support (see section Installing gawk on BeOS).
* Tandem support (see section Installing gawk on a Tandem).
* The Atari port became officially unsupported (see section Installing gawk on the Atari ST).
* The source code now uses new-style function definitions, with ansi2knr to convert the code on systems with old compilers.
A.6 Major Contributors to gawk
Always give credit where credit is due.
Anonymous
This section names the major contributors to gawk
and/or this Web page, in approximate chronological order:
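* Dr. Alfred V. Aho, Dr. Peter J. Weinberger, and Dr. Brian W. Kernighan, all of Bell Laboratories, designed and implemented Unix awk, from which gawk gets the majority of its feature set.
* John Woods contributed parts of the code (mostly fixes) in the initial version of gawk.
* In 1988, David Trueman took over primary maintenance of gawk, making it compatible with "new" awk, and greatly improving its performance.
* Hal Peterson provided help in porting gawk to Cray systems.
* Michal Jaegermann provided the port to Atari systems and its documentation. He continues to provide portability checking with DEC Alpha systems, and has done a lot of work to make sure gawk works on non-32-bit systems.
* Christos Zoulas provided the extension built-in function for dynamically adding new modules.
* Arno Peters did the initial work to convert gawk to use GNU Automake and gettext.
* Alan J. Broder provided the initial version of the asort function as well as the code for the new optional third argument to the match function.
* Arnold Robbins has been working on gawk since 1988, at first helping David Trueman, and as the primary maintainer since around 1994.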