A. The Evolution of the awk Language
This Web page describes the GNU implementation of awk, which follows the POSIX specification. Many long-time awk users learned awk programming with the original awk implementation in Version 7 Unix. (This implementation was the basis for awk in Berkeley Unix, through 4.3-Reno. Subsequent versions of Berkeley Unix, and systems derived from 4.4BSD-Lite, use various versions of gawk for their awk.)

This chapter briefly describes the evolution of the awk language, with cross-references to other parts of the Web page where you can find more information.
A.1 Major Changes Between V7 and SVR3.1        The major changes between V7 and System V Release 3.1.
A.2 Changes Between SVR3.1 and SVR4            Minor changes between System V Releases 3.1 and 4.
A.3 Changes Between SVR4 and POSIX awk         New features from the POSIX standard.
A.4 Extensions in the Bell Laboratories awk    New features from the Bell Laboratories version of awk.
A.5 Extensions in gawk Not in POSIX awk        The extensions in gawk not in POSIX awk.
A.6 Major Contributors to gawk                 The major contributors to gawk.

A.1 Major Changes Between V7 and SVR3.1
The awk language evolved considerably between the release of Version 7 Unix (1978) and the new version that was first made generally available in System V Release 3.1 (1987). This section summarizes the changes, with cross-references to further details:
* The requirement for `;' to separate rules on a line (see section awk Statements Versus Lines).
* User-defined functions and the return statement (see section User-Defined Functions).
* The delete statement (see section The delete Statement).
* The do-while statement (see section The do-while Statement).
* The built-in functions atan2, cos, sin, rand, and srand (see section 9.1.2 Numeric Functions).
* The built-in functions gsub, sub, and match (see section String Manipulation Functions).
* The built-in functions close and system (see section Input/Output Functions).
* The ARGC, ARGV, FNR, RLENGTH, RSTART, and SUBSEP built-in variables (see section 7.5 Built-in Variables).
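* C-compatible operator precedence, which breaks some old awk programs (see section Operator Precedence (How Operators Nest)).
* Regexps as the value of FS (see section Specifying How Fields Are Separated) and as the third argument to the split function (see section String Manipulation Functions).
* The escape sequences `\b', `\f', and `\r' (see section 3.2 Escape Sequences). (Some vendors have updated their old versions of awk to recognize `\b', `\f', and `\r', but this is not something you can rely on.)
* Redirection of input for the getline function (see section Explicit Input with getline).
* Multiple BEGIN and END rules (see section The BEGIN and END Special Patterns).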
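Several of the features listed above can be seen together in one short program. The following is only an illustrative sketch: the sample input line and the function name clamp are made up for the example, and the program is meant to run under any POSIX-style awk.

```shell
# Sketch only: the input text and the clamp() helper are hypothetical.
result=$(printf 'alpha beta gamma\n' | awk '
# A user-defined function with a return statement.
function clamp(n, lo, hi) {
    return n < lo ? lo : (n > hi ? hi : n)
}
{
    # match sets the RSTART and RLENGTH built-in variables.
    if (match($0, /b[a-z]+/))
        print "match:", RSTART, RLENGTH

    # sub replaces the first match; gsub replaces every match
    # and returns the number of replacements made.
    line = $0
    sub(/a/, "A", line)
    print "sub:", line
    n = gsub(/a/, "A", line)
    print "gsub:", n, line

    # A do-while loop runs its body at least once.
    i = 0
    do { i++ } while (i < 3)
    print "loop:", i, clamp(42, 0, 10)
}')
echo "$result"
```

With the sample input, this prints `match: 7 4', `sub: Alpha beta gamma', `gsub: 4 AlphA betA gAmmA', and `loop: 3 10', one per line.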

A.2 Changes Between SVR3.1 and SVR4
The System V Release 4 (1989) version of Unix awk added these features (some of which originated in gawk):
* The ENVIRON variable (see section 7.5 Built-in Variables).
* A defined return value for the srand built-in function (see section 9.1.2 Numeric Functions).
* The toupper and tolower built-in string functions for case translation (see section String Manipulation Functions).
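* A cleaner specification for the `%c' format-control letter in the printf function (see section Format-Control Letters).
* The ability to dynamically pass the field width and precision ("%*.*d") in the argument list of the printf function (see section Format-Control Letters).
* The use of regexp constants, such as /foo/, as expressions, where they are equivalent to using the matching operator, as in `$0 ~ /foo/' (see section Using Regular Expression Constants).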
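A few of these additions can be tried from the command line. The sketch below is illustrative only: the environment variable GREETING and the three input lines are invented for the example, and the `%*.*d' format assumes an awk whose printf accepts `*' for width and precision.

```shell
# Sketch only: GREETING and the input lines are hypothetical.
result=$(GREETING=hello awk 'BEGIN {
    # ENVIRON gives access to the environment; toupper and
    # tolower translate case.
    print toupper(ENVIRON["GREETING"]), tolower("WORLD")

    # The width (6) and precision (3) are taken from the argument list.
    printf "[%*.*d]\n", 6, 3, 7
}')
echo "$result"

# A bare regexp constant used as a pattern behaves like $0 ~ /foo/,
# so both counters end up with the same value.
matches=$(printf 'foo\nbar\nfood\n' |
    awk '/foo/ { a++ } $0 ~ /foo/ { b++ } END { print a, b }')
echo "$matches"
```

The first command prints `HELLO world' followed by `[   007]'; the second prints `2 2'.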

A.3 Changes Between SVR4 and POSIX awk
The POSIX Command Language and Utilities standard for awk (1992) introduced the following changes into the language:
* The use of CONVFMT for controlling the conversion of numbers to strings (see section Conversion of Strings and Numbers).
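The effect of CONVFMT can be sketched as follows. This is an illustration only, with arbitrarily chosen values; it relies on the POSIX default of "%.6g" for CONVFMT and on the rule that integral values are converted as integers regardless of CONVFMT.

```shell
# Sketch only: the numeric values are arbitrary.
result=$(awk 'BEGIN {
    x = 3.14159265
    s1 = x ""            # converted with the default CONVFMT, "%.6g"
    CONVFMT = "%.2f"
    s2 = x ""            # converted with the new CONVFMT
    i = 17 ""            # integral values ignore CONVFMT
    print s1, s2, i
}')
echo "$result"
```

This prints `3.14159 3.14 17'.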
The following common extensions are not permitted by the POSIX standard:
* \x escape sequences are not recognized (see section 3.2 Escape Sequences).
* Newlines do not act as whitespace to separate fields when FS is equal to a single space (see section Examining Fields).
* The synonym func for the keyword function is not recognized (see section Function Definition Syntax).
* Specifying `-Ft' on the command line does not set the value of FS to be a single tab character (see section Specifying How Fields Are Separated).
* The fflush built-in function is not supported (see section Input/Output Functions).

A.4 Extensions in the Bell Laboratories awk
Brian Kernighan, one of the original designers of Unix awk, has made his version available via his home page (see section Other Freely Available awk Implementations). This section describes extensions in his version of awk that are not in POSIX awk.
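* The `-mf N' and `-mr N' command-line options to set the maximum number of fields and the maximum record size, respectively (see section Command-Line Options). As a side note, his awk no longer needs these options; it continues to accept them to avoid breaking old programs.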
* The fflush built-in function for flushing buffered output (see section Input/Output Functions).
* The use of func as an abbreviation for function (see section Function Definition Syntax).
The Bell Laboratories awk also incorporates the following extensions, originally developed for gawk:
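* The special files `/dev/stdin', `/dev/stdout', and `/dev/stderr' (see section Special File Names in gawk).
* The ability for FS and for the third argument to split to be null strings (see section Making Each Character a Separate Field).
* The nextfile statement (see section Using gawk's nextfile Statement).
* The ability to delete all of an array at once with `delete array' (see section The delete Statement).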

A.5 Extensions in gawk Not in POSIX awk
The GNU implementation, gawk, adds a large number of features. This section lists them in the order they were added to gawk. They can all be disabled with either the `--traditional' or `--posix' options (see section Command-Line Options).
Version 2.10 of gawk introduced the following features:
* The AWKPATH environment variable for specifying a path search for the `-f' command-line option (see section Command-Line Options).
* The IGNORECASE variable and its effects (see section Case Sensitivity in Matching).
* The `/dev/stdin', `/dev/stdout', `/dev/stderr', and `/dev/fd/N' special file names (see section Special File Names in gawk).
Version 2.13 of gawk introduced the following features:
* The FIELDWIDTHS variable and its effects (see section Reading Fixed-Width Data).
* The systime and strftime built-in functions for obtaining and printing timestamps (see section Using gawk's Timestamp Functions).
Version 2.14 of gawk introduced the following feature:
* The next file statement for skipping to the next data file (see section Using gawk's nextfile Statement).
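Version 2.15 of gawk introduced the following features:
* The ARGIND variable, which tracks the movement of FILENAME through ARGV (see section 7.5 Built-in Variables).
* The ERRNO variable, which contains the system error message when getline returns -1 or when close fails (see section 7.5 Built-in Variables).
* The `/dev/pid', `/dev/ppid', `/dev/pgrpid', and `/dev/user' file name interpretation (see section Special File Names in gawk).
* The ability to delete all of an array at once with `delete array' (see section The delete Statement).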
Version 3.0 of gawk introduced the following features:
* IGNORECASE changed, now applying to string comparison as well as regexp operations (see section Case Sensitivity in Matching).
* The RT variable that contains the input text that matched RS (see section How Input Is Split into Records).
* The gensub function for more powerful text manipulation (see section String Manipulation Functions).
* The strftime function acquired a default time format, allowing it to be called with no arguments (see section Using gawk's Timestamp Functions).
* The ability for FS and for the third argument to split to be null strings (see section Making Each Character a Separate Field).
* The ability for RS to be a regexp (see section How Input Is Split into Records).
* The next file statement became nextfile (see section Using gawk's nextfile Statement).
* The `--lint-old' option to warn about constructs that are not available in the original Version 7 Unix version of awk (see section Major Changes Between V7 and SVR3.1).
* The `-m' option and the fflush function from the Bell Laboratories research version of awk (see section Command-Line Options; also see section Input/Output Functions).
* The use of GNU Autoconf to control the configuration process (see section Compiling gawk for Unix).
* Amiga support (see section Installing gawk on an Amiga).
Version 3.1 of gawk introduced the following features:
* The BINMODE special variable for non-POSIX systems, which allows binary I/O for input and/or output files (see section Using gawk on PC Operating Systems).
* The LINT special variable, which dynamically controls lint warnings (see section 7.5 Built-in Variables).
* The PROCINFO array for providing process-related information (see section 7.5 Built-in Variables).
* The TEXTDOMAIN special variable for setting an application's internationalization text domain (see section 7.5 Built-in Variables, and Internationalization with gawk).
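* The ability to use octal and hexadecimal constants in awk program source code (see section Octal and Hexadecimal Numbers).
* The `/inet' special files for TCP/IP networking using `|&' (see section Using gawk for Network Programming).
* The optional second argument to close that allows closing one end of a two-way pipe to a coprocess (see section Two-Way Communications with Another Process).
* The optional third argument to the match function for capturing text-matching subexpressions within a regexp (see section String Manipulation Functions).
* Positional specifiers in printf formats for making translations easier (see section Rearranging printf Arguments).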
* The asort function for sorting arrays (see section Sorting Array Values and Indices with gawk).
* The bindtextdomain and dcgettext functions for internationalization (see section Internationalizing awk Programs).
* The extension built-in function and the ability to add new built-in functions dynamically (see section Adding New Built-in Functions to gawk).
* The mktime built-in function for creating timestamps (see section Using gawk's Timestamp Functions).
* The and, or, xor, compl, lshift, rshift, and strtonum built-in functions (see section Using gawk's Bit Manipulation Functions).
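* The support for `next file' as two words was removed completely (see section Using gawk's nextfile Statement).
* The `--profile' option and pgawk, the profiling version of gawk, for producing execution profiles of awk programs (see section Profiling Your awk Programs).
* The `--enable-portals' configuration option to enable special treatment of pathnames that begin with `/p' as BSD portals (see section Using gawk with BSD Portals).
* The use of GNU Automake to help in standardizing the configuration process (see section Compiling gawk for Unix).
* The use of GNU gettext for gawk's own message output (see section gawk Can Speak Your Language).
* BeOS support (see section Installing gawk on BeOS).
* Tandem support (see section Installing gawk on a Tandem).
* The Atari port became officially unsupported (see section Installing gawk on the Atari ST).
* The source code now uses new-style function definitions, with ansi2knr to convert the code on systems with old compilers.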

A.6 Major Contributors to gawk
Always give credit where credit is due.
Anonymous
This section names the major contributors to gawk and/or this Web page, in approximate chronological order:
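* Dr. Alfred V. Aho, Dr. Peter J. Weinberger, and Dr. Brian W. Kernighan, all of Bell Laboratories, designed and implemented Unix awk, from which gawk gets the majority of its feature set.
* John Woods contributed parts of the code (mostly fixes) in the initial version of gawk.
* In 1988, David Trueman took over primary maintenance of gawk, making it compatible with "new" awk, and greatly improving its performance.
* Hal Peterson provided help in porting gawk to Cray systems.
* Michal Jaegermann has done a lot of work to make sure gawk works on non-32-bit systems.
* Christos Zoulas provided the extension built-in function for dynamically adding new modules.
* Arno Peters did the initial work to convert gawk to use GNU Automake and gettext.
* Alan J. Broder provided the initial version of the asort function as well as the code for the new optional third argument to the match function.
* Arnold Robbins has been working on gawk since 1988, at first helping David Trueman, and as the primary maintainer since around 1994.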