Installing gawk
This appendix provides instructions for installing gawk on the
various platforms that are supported by the developers. The primary
developers support Unix (and one day, GNU), while the other ports were
contributed. The file `ACKNOWLEDGMENT' in the gawk distribution lists
the electronic mail addresses of the people who did the respective
ports; they are also provided in section Reporting Problems and Bugs.
The gawk Distribution
This section first describes how to get the gawk distribution, then how
to extract it, and finally what is in the various files and
subdirectories.
Getting the gawk Distribution
There are three ways you can get GNU software.
You can order gawk directly from the Free Software Foundation.
Software distributions are available for Unix, MS-DOS, and VMS, on
tape and CD-ROM. The address is:
Free Software Foundation
59 Temple Place--Suite 330
Boston, MA 02111-1307 USA
Phone: +1-617-542-5942
Fax (including Japan): +1-617-542-2652
E-mail: [email protected]
Ordering from the FSF directly contributes to the support of the foundation and to the production of more free software.
You can get gawk by using anonymous ftp to the Internet host
ftp.gnu.ai.mit.edu, in the directory `/pub/gnu'.
Here is a list of alternate ftp sites from which you can obtain GNU
software. When a site is listed as "site:directory" the
directory indicates the directory where GNU software is kept.
You should use a site that is geographically close to you.
cair-archive.kaist.ac.kr:/pub/gnu
ftp.cs.titech.ac.jp
ftp.nectec.or.th:/pub/mirrors/gnu
utsun.s.u-tokyo.ac.jp:/ftpsync/prep
archie.au:/gnu (archie.oz or archie.oz.au for ACSnet)
ftp.sun.ac.za:/pub/gnu
ftp.technion.ac.il:/pub/unsupported/gnu
archive.eu.net
ftp.denet.dk
ftp.eunet.ch
ftp.funet.fi:/pub/gnu
ftp.ieunet.ie:pub/gnu
ftp.informatik.rwth-aachen.de:/pub/gnu
ftp.informatik.tu-muenchen.de
ftp.luth.se:/pub/unix/gnu
ftp.mcc.ac.uk
ftp.stacken.kth.se
ftp.sunet.se:/pub/gnu
ftp.univ-lyon1.fr:pub/gnu
ftp.win.tue.nl:/pub/gnu
irisa.irisa.fr:/pub/gnu
isy.liu.se
nic.switch.ch:/mirror/gnu
src.doc.ic.ac.uk:/gnu
unix.hensa.ac.uk:/pub/uunet/systems/gnu
ftp.inf.utfsm.cl:/pub/gnu
ftp.unicamp.br:/pub/gnu
ftp.cs.ubc.ca:/mirror2/gnu
col.hp.com:/mirrors/gnu
f.ms.uky.edu:/pub3/gnu
ftp.cc.gatech.edu:/pub/gnu
ftp.cs.columbia.edu:/archives/gnu/prep
ftp.digex.net:/pub/gnu
ftp.hawaii.edu:/mirrors/gnu
ftp.kpc.com:/pub/mirror/gnu
ftp.uu.net:/systems/gnu
gatekeeper.dec.com:/pub/GNU
jaguar.utah.edu:/gnustuff
labrea.stanford.edu
mrcnext.cso.uiuc.edu:/pub/gnu
vixen.cso.uiuc.edu:/gnu
wuarchive.wustl.edu:/systems/gnu
gawk is distributed as a tar file compressed with the
GNU Zip program, gzip.
Once you have the distribution (for example,
`gawk-3.0.3.tar.gz'), first use gzip to expand the
file, and then use tar to extract it. You can use the following
pipeline to produce the gawk distribution:
# Under System V, add 'o' to the tar flags
gzip -d -c gawk-3.0.3.tar.gz | tar -xvpf -
This will create a directory named `gawk-3.0.3' in the current directory.
The distribution file name is of the form `gawk-V.R.n.tar.gz'.
The V represents the major version of gawk,
the R represents the current release of version V, and
the n represents a patch level, meaning that minor bugs have
been fixed in the release. The current patch level is 3,
but when retrieving distributions, you should get the version with the
highest version, release, and patch level. (Note that release levels
greater than or equal to 90 denote "beta," or non-production software;
you may not wish to retrieve such a version unless you don't mind
experimenting.)
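The naming scheme above can be taken apart mechanically. The following is a minimal sketch in POSIX shell; the function name describe_gawk_dist is hypothetical, and the beta test simply applies the "release level of 90 or more" rule as stated in the text.

```shell
#!/bin/sh
# Hypothetical helper: split a file name of the form gawk-V.R.n.tar.gz
# into version, release, and patch level, and flag beta releases.
describe_gawk_dist() {
    name=${1##*/}          # drop any leading directory
    ver=${name#gawk-}      # e.g. "3.0.3.tar.gz"
    ver=${ver%.tar.gz}     # e.g. "3.0.3"
    major=${ver%%.*}       # V
    rest=${ver#*.}         # "R.n"
    release=${rest%%.*}    # R
    patch=${rest#*.}       # n
    # Assumption taken from the text: release >= 90 means beta.
    if [ "$release" -ge 90 ]; then
        kind=beta
    else
        kind=production
    fi
    echo "version=$major release=$release patch=$patch kind=$kind"
}

describe_gawk_dist gawk-3.0.3.tar.gz
# prints: version=3 release=0 patch=3 kind=production
```

When several distribution files are available, comparing the three fields in order (version, then release, then patch) picks out the one the text recommends.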
If you are not on a Unix system, you will need to make other arrangements
for getting and extracting the gawk distribution. You should consult
a local expert.
Contents of the gawk Distribution
The gawk distribution has a number of C source files,
documentation files,
subdirectories and files related to the configuration process
(see section Compiling and Installing gawk on Unix),
and several subdirectories related to different, non-Unix,
operating systems.
gawk source code.
gawk under Unix, and the
rest for the various hardware and software combinations.
gawk has been ported, and which
have successfully run the test suite.
gawk since the last release or patch.
gawk's performance.
Most of these depend on the hardware or operating system software, and
are not limits in gawk itself.
awk is incorrect, and how gawk handles the problem.
gawk is a good language for
AI (Artificial Intelligence) programming.
troff source for a five-color awk reference card.
A modern version of troff, such as GNU Troff (groff), is
needed to produce the color version. See the file `README.card'
for instructions if you have an older troff.
troff source for a manual page describing gawk.
This is distributed for the convenience of Unix users.
makeinfo to produce an Info file.
troff source for a manual page describing the igawk program presented in
section An Easy Way to Use Library Functions.
gawk for various Unix systems. They are explained in detail in
section Compiling and Installing gawk on Unix.
configure uses to generate a `Makefile'.
As part of the process of building gawk, the library functions from
section A Library of awk Functions,
and the igawk program from
section An Easy Way to Use Library Functions,
are extracted into ready-to-use files.
They are installed as part of the installation process.
gawk on an Atari ST.
See section Installing gawk on the Atari ST, for details.
gawk under MS-DOS and OS/2.
See section MS-DOS and OS/2 Installation and Compilation, for details.
gawk under VMS.
See section How to Compile and Install gawk on VMS, for details.
gawk. You can use `make check' from the top level gawk
directory to run your version of gawk against the test suite.
If gawk successfully passes `make check' then you can
be confident of a successful port.
Compiling and Installing gawk on Unix
Usually, you can compile and install gawk by typing only two
commands. However, if you use an unusual system, you may need
to configure gawk for your system yourself.
Compiling gawk for Unix
After you have extracted the gawk distribution, cd
to `gawk-3.0.3'. Like most GNU software, gawk is configured
automatically for your Unix system by running the configure program.
This program is a Bourne shell script that was generated automatically using
GNU autoconf.
(The autoconf software is described fully in
Autoconf--Generating Automatic Configuration Scripts,
which is available from the Free Software Foundation.)
To configure gawk, simply run configure:
sh ./configure
This produces a `Makefile' and `config.h' tailored to your system.
The `config.h' file describes various facts about your system.
You may wish to edit the `Makefile' to
change the CFLAGS variable, which controls
the command line options that are passed to the C compiler (such as
optimization levels, or compiling for debugging).
Alternatively, you can add your own values for most make
variables, such as CC and CFLAGS, on the command line when
running configure:
CC=cc CFLAGS=-g sh ./configure
See the file `INSTALL' in the gawk distribution for
all the details.
After you have run configure, and possibly edited the `Makefile',
type:
make
and shortly thereafter, you should have an executable version of gawk.
That's all there is to it!
(If these steps do not work, please send in a bug report;
see section Reporting Problems and Bugs.)
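The whole sequence can be collected into one small script. This is only a sketch: the function name build_gawk and the RUN variable (set RUN=echo to print the commands instead of running them) are hypothetical conveniences, and the final `make check' step is the test-suite run mentioned in the description of the distribution contents.

```shell
#!/bin/sh
# Sketch: configure, build, and test gawk in an unpacked source tree.
# build_gawk and RUN are illustrative names, not part of the distribution.
build_gawk() {
    srcdir=${1:-gawk-3.0.3}
    run=${RUN:-}               # RUN=echo gives a dry run
    (
        cd "$srcdir" &&
        $run sh ./configure &&  # writes Makefile and config.h
        $run make &&            # builds the gawk executable
        $run make check         # runs the test suite
    )
}
```

Calling `RUN=echo build_gawk gawk-3.0.3' lists the three commands without executing them; each step runs only if the previous one succeeded.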
(This section is of interest only if you know something about using the C language and the Unix operating system.)
The source code for gawk generally attempts to adhere to formal
standards wherever possible. This means that gawk uses library
routines that are specified by the ANSI C standard and by the POSIX
operating system interface standard. When using an ANSI C compiler,
function prototypes are used to help improve the compile-time checking.
Many Unix systems do not support all of either the ANSI or the
POSIX standards. The `missing' subdirectory in the gawk
distribution contains replacement versions of those subroutines that are
most likely to be missing.
The `config.h' file that is created by the configure program
contains definitions that describe features of the particular operating
system where you are attempting to compile gawk. The three things
described by this file are: which header files are available, so that
they can be correctly included;
which (supposedly) standard functions are actually available in your C
libraries; and
other miscellaneous facts about your
variant of Unix. For example, there may not be an st_blksize
element in the stat structure. In this case `HAVE_ST_BLKSIZE'
would be undefined.
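One quick way to see what configure decided is to search the generated file for a given macro. The sketch below assumes the usual autoconf layout, in which a detected feature appears as an active `#define' line and a missing one is commented out or left as `#undef'; the helper name has_feature is hypothetical.

```shell
#!/bin/sh
# Sketch: succeed if MACRO is actively defined in a config.h-style file.
# has_feature is an illustrative helper, not part of the gawk distribution.
has_feature() {
    macro=$1
    file=${2:-config.h}
    # Only an uncommented "#define MACRO" line counts as defined.
    grep -Eq "^#[[:space:]]*define[[:space:]]+$macro([[:space:]]|\$)" "$file"
}
```

For example, `has_feature HAVE_ST_BLKSIZE && echo yes' run in the source directory after configure reports whether the st_blksize check succeeded.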
It is possible for your C compiler to lie to configure. It may
do so by not exiting with an error when a library function is not
available. To get around this, you can edit the file `custom.h'.
Use an `#ifdef' that is appropriate for your system, and either
#define any constants that configure should have defined but
didn't, or #undef any constants that configure defined and
should not have. `custom.h' is automatically included by
`config.h'.
It is also possible that the configure program generated by
autoconf will not work on your system in some other fashion. If you do
have a problem, the file `configure.in' is the input for autoconf.
You may be able to change this file, and generate a new version of
configure that will work on your system. See section Reporting Problems
and Bugs, for information on how to report problems in configuring gawk.
The same mechanism may be used to send in updates to `configure.in'
and/or `custom.h'.
How to Compile and Install gawk on VMS
This section describes how to compile and install gawk under VMS.
Compiling gawk on VMS
To compile gawk under VMS, there is a DCL command procedure that
will issue all the necessary CC and LINK commands, and there is
also a `Makefile' for use with the MMS utility. From the source
directory, use either
$ @[.VMS]VMSBUILD.COM
or
$ MMS/DESCRIPTION=[.VMS]DESCRIP.MMS GAWK
Depending upon which C compiler you are using, follow one of the sets of instructions in this table:
CC/OPTIMIZE=NOLINE, which is essential for Version 3.0.
gawk has been tested under VAX/VMS 5.5-1 using VAX C V3.2,
GNU C 1.40 and 2.3. It should work without modifications for VMS V4.6 and up.
Installing gawk on VMS
To install gawk, all you need is a "foreign" command, which is
a DCL symbol whose value begins with a dollar sign. For example:
$ GAWK :== $disk1:[gnubin]GAWK
(Substitute the actual location of gawk.exe for
`$disk1:[gnubin]'.) The symbol should be placed in the
`login.com' of any user who wishes to run gawk,
so that it will be defined every time the user logs on.
Alternatively, the symbol may be placed in the system-wide
`sylogin.com' procedure, which will allow all users
to run gawk.
Optionally, the help entry can be loaded into a VMS help library:
$ LIBRARY/HELP SYS$HELP:HELPLIB [.VMS]GAWK.HLP
(You may want to substitute a site-specific help library rather than the standard VMS library `HELPLIB'.) After loading the help text,
$ HELP GAWK
will provide information about both the gawk implementation and the
awk programming language.
The logical name `AWK_LIBRARY' can designate a default location
for awk program files. For the `-f' option, if the specified
filename has no device or directory path information in it, gawk
will look in the current directory first, then in the directory specified
by the translation of `AWK_LIBRARY' if the file was not found.
If the file is still not found after searching both directories,
then gawk appends the suffix `.awk' to the filename and
retries the search. If `AWK_LIBRARY' is not defined, that
portion of the file search will fail benignly.
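The search order just described can be illustrated with a short emulation. This is not how the VMS port is implemented; it is a POSIX shell sketch in which an ordinary environment variable named AWK_LIBRARY stands in for the VMS logical name, and find_awk_program is a hypothetical function name.

```shell
#!/bin/sh
# Emulation sketch of the `-f' lookup order: the plain name in the current
# directory, then in AWK_LIBRARY, then the same two places with ".awk"
# appended. Prints the first match and returns non-zero if none exists.
find_awk_program() {
    for name in "$1" "$1.awk"; do
        for dir in . "${AWK_LIBRARY:-}"; do
            # An undefined AWK_LIBRARY is skipped quietly ("fails benignly").
            [ -n "$dir" ] || continue
            if [ -f "$dir/$name" ]; then
                echo "$dir/$name"
                return 0
            fi
        done
    done
    return 1
}
```

With this ordering, a file named exactly as given always wins over one that needs the `.awk' suffix, wherever the two are located.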
Running gawk on VMS
Command line parsing and quoting conventions are significantly different
on VMS, so examples in this book or from other sources often need minor
changes. They are minor though, and all awk programs
should run correctly.
Here are a couple of trivial tests:
$ gawk -- "BEGIN {print ""Hello, World!""}"
$ gawk -"W" version ! could also be -"W version" or "-W version"
Note that upper-case and mixed-case text must be quoted.
The VMS port of gawk includes a DCL-style interface in addition
to the original shell-style interface (see the help entry for details).
One side-effect of dual command line parsing is that if there is only a
single parameter (as in the quoted string program above), the command
becomes ambiguous. To work around this, the normally optional `--'
flag is required to force Unix style rather than DCL parsing. If any
other dash-type options (or multiple parameters such as data files to be
processed) are present, there is no ambiguity and `--' can be omitted.
The default search path when looking for awk program files specified
by the `-f' option is "SYS$DISK:[],AWK_LIBRARY:". The logical
name `AWKPATH' can be used to override this default. The format
of `AWKPATH' is a comma-separated list of directory specifications.
When defining it, the value should be quoted so that it retains a single
translation, and not a multi-translation RMS searchlist.
Building and Using gawk on VMS POSIX
Ignore the instructions above, although `vms/gawk.hlp' should still be made available in a help library. The source tree should be unpacked into a container file subsystem rather than into the ordinary VMS file system. Make sure that the two scripts, `configure' and `vms/posix-cc.sh', are executable; use `chmod +x' on them if necessary. Then execute the following two commands:
psx> CC=vms/posix-cc.sh configure
psx> make CC=c89 gawk
The first command will construct files `config.h' and `Makefile' out
of templates, using a script to make the C compiler fit configure's
expectations. The second command will compile and link gawk using
the C compiler directly; ignore any warnings from make about being
unable to redefine CC. configure will take a very long
time to execute, but at least it provides incremental feedback as it
runs.
This has been tested with VAX/VMS V6.2, VMS POSIX V2.0, and DEC C V5.2.
Once built, gawk will work like any other shell utility. Unlike
the normal VMS port of gawk, no special command line manipulation is
needed in the VMS POSIX environment.
MS-DOS and OS/2 Installation and Compilation
If you have received a binary distribution prepared by the DOS
maintainers, then gawk and the necessary support files will appear
under the `gnu' directory, with executables in `gnu/bin',
libraries in `gnu/lib/awk', and manual pages under `gnu/man'.
This is designed for easy installation to a `/gnu' directory on your
drive, but the files can be installed anywhere provided AWKPATH is
set properly. Regardless of the installation directory, the first line of
`igawk.cmd' and `igawk.bat' (in `gnu/bin') may need to be
edited.
The binary distribution will contain a separate file describing the
contents. In particular, it may include more than one version of the
gawk executable. OS/2 binary distributions may have a
different arrangement, but installation is similar.
The OS/2 and MS-DOS versions of gawk search for program files as
described in section The AWKPATH Environment Variable.
However, semicolons (rather than colons) separate elements
in the AWKPATH variable. If AWKPATH is not set or is empty,
then the default search path is ".;c:/lib/awk;c:/gnu/lib/awk".
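To make the separator rule concrete, here is a small sketch that splits such a search path on semicolons. It runs in a POSIX shell purely as an illustration; the function name list_awkpath_dirs is hypothetical, and the fallback value is the default path quoted in the text.

```shell
#!/bin/sh
# Sketch: print each element of a semicolon-separated AWKPATH-style value,
# one per line, falling back to the documented default when none is given.
list_awkpath_dirs() {
    path=${1:-".;c:/lib/awk;c:/gnu/lib/awk"}
    old_ifs=$IFS
    IFS=';'        # split on semicolons instead of whitespace
    set -f         # do not expand wildcard characters in path elements
    for dir in $path; do
        echo "$dir"
    done
    set +f
    IFS=$old_ifs
}

list_awkpath_dirs
# prints:
# .
# c:/lib/awk
# c:/gnu/lib/awk
```

A colon cannot serve as the separator here because it already appears in drive specifications such as `c:'.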
An sh-like shell (as opposed to command.com under MS-DOS
or cmd.exe under OS/2) may be useful for awk programming.
Ian Stewartson has written an excellent shell for MS-DOS and OS/2, and a
ksh clone and GNU Bash are available for OS/2. The file
`README_d/README.pc' in the gawk distribution contains
information on these shells. Users of Stewartson's shell on DOS should
examine its documentation on handling of command-lines. In particular,
the setting for gawk in the shell configuration may need to be
changed, and the ignoretype option may also be of interest.
gawk can be compiled for MS-DOS and OS/2 using the GNU development tools
from DJ Delorie (DJGPP, MS-DOS-only) or Eberhard Mattes (EMX, MS-DOS and OS/2).
Microsoft C can be used to build 16-bit versions for MS-DOS and OS/2. The file
`README_d/README.pc' in the gawk distribution contains additional
notes, and `pc/Makefile' contains important notes on compilation options.
To build gawk, copy the files in the `pc' directory (except
for `ChangeLog') to the
directory with the rest of the gawk sources. The `Makefile'
contains a configuration section with comments, and may need to be
edited in order to work with your make utility.
The `Makefile' contains a number of targets for building various MS-DOS
and OS/2 versions. A list of targets will be printed if the make
command is given without a target. As an example, to build gawk
using the DJGPP tools, enter `make djgpp'.
Using make to run the standard tests and to install gawk
requires additional Unix-like tools, including sh, sed, and
cp. In order to run the tests, the `test/*.ok' files may need to
be converted so that they have the usual DOS-style end-of-line markers. Most
of the tests will work properly with Stewartson's shell along with the
companion utilities or appropriate GNU utilities. However, some editing of
`test/Makefile' is required. It is recommended that the file
`pc/Makefile.tst' be copied to `test/Makefile' as a
replacement. Details can be found in `README_d/README.pc'.
Installing gawk on the Atari ST
There are no substantial differences when installing gawk on
various Atari models. Compiled gawk executables do not require
a large amount of memory with most awk programs and should run on all
Motorola processor based models (referred to below as ST, even if that
is not exactly right).
In order to use gawk, you need to have a shell, either text or
graphics, that does not map all the characters of a command line to
upper-case. Maintaining case distinction in option flags is very
important (see section Command Line Options).
These days this is the default, and it may only be a problem for some
very old machines. If your system does not preserve the case of option
flags, you will need to upgrade your tools. Support for I/O
redirection is necessary to make it easy to import awk programs
from other environments. Pipes are nice to have, but not vital.
Compiling gawk on the Atari ST
A proper compilation of gawk sources when sizeof(int)
differs from sizeof(void *) requires an ANSI C compiler. An initial
port was done with gcc. You may actually prefer executables
where ints are four bytes wide, but the other variant works as well.
You may need quite a bit of memory when trying to recompile the gawk
sources, as some source files (`regex.c' in particular) are quite
big. If you run out of memory compiling such a file, try reducing the
optimization level for this particular file; this may help.
With a reasonable shell (Bash will do), and in particular if you run
Linux, MiNT or a similar operating system, you have a pretty good
chance that the configure utility will succeed. Otherwise
sample versions of `config.h' and `Makefile.st' are given in the
`atari' subdirectory and can be edited and copied to the
corresponding files in the main source directory. Even if
configure produced something, it might be advisable to compare
its results with the sample versions and possibly make adjustments.
Some gawk source code fragments depend on a preprocessor define
`atarist'. This basically assumes the TOS environment with gcc.
Modify these sections as appropriate if they are not right for your
environment. Also see the remarks about AWKPATH and envsep in
section Running gawk on the Atari ST.
As shipped, the sample `config.h' claims that the system
function is missing from the libraries, which is not true, and an
alternative implementation of this function is provided in
`atari/system.c'. Depending upon your particular combination of
shell and operating system, you may wish to change the file to indicate
that system is available.
Running gawk on the Atari ST
An executable version of gawk should be placed, as usual,
anywhere in your PATH where your shell can find it.
While executing, gawk creates a number of temporary files. When
using gcc libraries for TOS, gawk looks for either of
the environment variables TEMP or TMPDIR, in that order.
If either one is found, its value is assumed to be a directory for
temporary files. This directory must exist, and if you can spare the
memory, it is a good idea to put it on a RAM drive. If neither
TEMP nor TMPDIR is found, then gawk uses the
current directory for its temporary files.
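The lookup order for the temporary directory amounts to a three-way fallback, sketched below in POSIX shell. The function name choose_temp_dir is hypothetical, and treating an empty variable the same as an unset one is an assumption of this sketch rather than something the text specifies.

```shell
#!/bin/sh
# Sketch of the documented order: TEMP first, then TMPDIR, otherwise the
# current directory. Empty values are treated as "not found" (assumption).
choose_temp_dir() {
    if [ -n "${TEMP:-}" ]; then
        echo "$TEMP"
    elif [ -n "${TMPDIR:-}" ]; then
        echo "$TMPDIR"
    else
        echo "."
    fi
}
```

Because TEMP is consulted first, setting it to a RAM drive directory overrides any TMPDIR value that is also present.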
The ST version of gawk searches for its program files as described in
section The AWKPATH Environment Variable.
The default value for the AWKPATH variable is taken from
DEFPATH defined in `Makefile'. The sample gcc/TOS
`Makefile' for the ST in the distribution sets DEFPATH to
".,c:\lib\awk,c:\gnu\lib\awk". The search path can be
modified by explicitly setting AWKPATH to whatever you wish.
Note that colons cannot be used on the ST to separate elements in the
AWKPATH variable, since they have another, reserved, meaning.
Instead, you must use a comma to separate elements in the path. When
recompiling, the separating character can be modified by initializing
the envsep variable in `atari/gawkmisc.atr' to another
value.
Although awk allows great flexibility in doing I/O redirections
from within a program, this facility should be used with care on the ST
running under TOS. In some circumstances the OS routines for file
handle pool processing lose track of certain events, causing the
computer to crash, and requiring a reboot. Often a warm reboot is
sufficient. Fortunately, this happens infrequently, and in rather
esoteric situations. In particular, avoid having one part of an
awk program using print statements explicitly redirected
to "/dev/stdout", while other print statements use the
default standard output, and a calling shell has redirected standard
output to a file.
When gawk is compiled with the ST version of gcc and its
usual libraries, it will accept both `/' and `\' as path separators.
While this is convenient, it should be remembered that this removes one,
technically valid, character (`/') from your file names, and that
it may create problems for external programs, called via the system
function, which may not support this convention. Whenever it is possible
that a file created by gawk will be used by some other program,
use only backslashes. Also remember that in awk, backslashes in
strings have to be doubled in order to get literal backslashes
(see section Escape Sequences).
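The doubling rule is easy to check from any Unix-style shell. In this small illustration the path `c:\lib\awk\prog.awk' is a made-up example; the single quotes keep the shell from touching the backslashes, so only awk's own string escapes are at work.

```shell
#!/bin/sh
# Each "\\" inside the awk string constant yields one literal backslash.
awk 'BEGIN { print "c:\\lib\\awk\\prog.awk" }'
# prints: c:\lib\awk\prog.awk
```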
Installing gawk on an Amiga
You can install gawk on an Amiga system using a Unix emulation
environment available via anonymous ftp from
ftp.ninemoons.com in the directory `pub/ade/current'.
This includes a shell based on pdksh. The primary component of
this environment is a Unix emulation library, `ixemul.lib'.
A more complete distribution for the Amiga is available on the Geek Gadgets CD-ROM from:
CRONUS
1840 E. Warner Road #105-265
Tempe, AZ 85284 USA
US Toll Free: (800) 804-0833
Phone: +1-602-491-0442
FAX: +1-602-491-0048
Email: [email protected]
WWW: http://www.ninemoons.com
Anonymous ftp site: ftp.ninemoons.com
Once you have the distribution, you can configure gawk simply by
running configure:
configure -v m68k-amigaos
Then run make, and you should be all set!
(If these steps do not work, please send in a bug report;
see section Reporting Problems and Bugs.)
Reporting Problems and Bugs
There is nothing more dangerous than a bored archeologist.
The Hitchhiker's Guide to the Galaxy
If you have problems with gawk or think that you have found a bug,
please report it to the developers; we cannot promise to do anything
but we might well want to fix it.
Before reporting a bug, make sure you have actually found a real bug. Carefully reread the documentation and see if it really says you can do what you're trying to do. If it's not clear whether you should be able to do something or not, report that too; it's a bug in the documentation!
Before reporting a bug or trying to fix it yourself, try to isolate it
to the smallest possible awk program and input data file that
reproduces the problem. Then send us the program and data file,
some idea of what kind of Unix system you're using, and the exact results
gawk gave you. Also say what you expected to occur; this will help
us decide whether the problem was really in the documentation.
Once you have a precise problem, there are two e-mail addresses you can send mail to.
Please include the version number of gawk you are using. You can get
this information with the command `gawk --version'.
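The items a report should contain can be gathered in one pass. The following is only a sketch: the function name collect_bug_report and the AWK override variable are hypothetical, and `uname -a' is used here as one plausible way to describe the Unix system.

```shell
#!/bin/sh
# Sketch: print the version, system description, test program, input data,
# and actual output, ready to paste into a bug report.
collect_bug_report() {
    prog=$1                 # smallest awk program that shows the problem
    data=$2                 # smallest input file that shows the problem
    awk_cmd=${AWK:-gawk}    # interpreter under test (override is hypothetical)
    echo "gawk version: $("$awk_cmd" --version 2>&1 </dev/null | head -n 1)"
    echo "system: $(uname -a)"
    echo "--- program ---"
    cat "$prog"
    echo "--- input data ---"
    cat "$data"
    echo "--- actual output ---"
    "$awk_cmd" -f "$prog" "$data" 2>&1
}
```

The expected result still has to be written by hand, since only the reporter knows what the program was supposed to do.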
You should send a carbon copy of your mail to Arnold Robbins, who can
be reached at `[email protected]'.
Important! Do not try to report bugs in gawk by
posting to the Usenet/Internet newsgroup comp.lang.awk.
While the gawk developers do occasionally read this newsgroup,
there is no guarantee that we will see your posting. The steps described
above are the official, recognized ways for reporting bugs.
Non-bug suggestions are always welcome as well. If you have questions about things that are unclear in the documentation or are just obscure features, ask Arnold Robbins; he will try to help you out, although he may not have the time to fix the problem. You can send him electronic mail at the Internet address above.
If you find bugs in one of the non-Unix ports of gawk, please send
an electronic mail message to the person who maintains that port. They
are listed below, and also in the `README' file in the gawk
distribution. Information in the `README' file should be considered
authoritative if it conflicts with this book.
The people maintaining the non-Unix ports of gawk are:
If your bug is also reproducible under Unix, please send copies of your report to the general GNU bug list, as well as to Arnold Robbins, at the addresses listed above.
Other Freely Available awk Implementations
It's kind of fun to put comments like this in your awk code.
// Do C++ comments work? answer: yes! of course
Michael Brennan
There are two other freely available awk implementations.
This section briefly describes where to get them.
awk
awk freely available. You can get it via anonymous ftp
to the host netlib.att.com. Change directory to
`/netlib/research'. Use "binary" or "image" mode, and
retrieve `awk.bundle.Z'.
This is a shell archive that has been compressed with the compress
utility. It can be uncompressed with either uncompress or the
GNU gunzip utility.
This version requires an ANSI C compiler; GCC (the GNU C compiler)
works quite nicely.
mawk
awk, called mawk. It is available under the GPL
(see section GNU GENERAL PUBLIC LICENSE),
just as gawk is.
You can get it via anonymous ftp to the host
ftp.whidbey.net. Change directory to `/pub/brennan'.
Use "binary" or "image" mode, and retrieve `mawk1.3.3.tar.gz'
(or the latest version that is there).
gunzip may be used to decompress this file. Installation
is similar to gawk's
(see section Compiling and Installing gawk on Unix).