The GNU Awk User's Guide

Go to the first, previous, next, last section, table of contents.

Duplicating Output Into Multiple Files

The tee program is known as a "pipe fitting." tee copies its standard input to its standard output, and also duplicates it to the files named on the command line. Its usage is:

tee [-a] file ...

The `-a' option tells tee to append to the named files, instead of truncating them and starting over.

The BEGIN rule first makes a copy of all the command line arguments, into an array named copy. ARGV[0] is not copied, since it is not needed. tee cannot use ARGV directly, since awk will attempt to process each file named in ARGV as input data.

If the first argument is `-a', then the flag variable append is set to true, and both ARGV[1] and copy[1] are deleted. If ARGC is less than two, then no file names were supplied, and tee prints a usage message and exits. Finally, awk is forced to read the standard input by setting ARGV[1] to "-", and ARGC to two.

# tee.awk --- tee in awk
# Arnold Robbins, [email protected], Public Domain
# May 1993
# Revised December 1995

BEGIN    \
{
    for (i = 1; i < ARGC; i++)
        copy[i] = ARGV[i]

    if (ARGV[1] == "-a") {
        append = 1
        delete ARGV[1]
        delete copy[1]
        ARGC--
    }
    if (ARGC < 2) {
        print "usage: tee [-a] file ..." > "/dev/stderr"
        exit 1
    }
    ARGV[1] = "-"
    ARGC = 2
}

The single rule does all the work. Since there is no pattern, it is executed for each line of input. The body of the rule simply prints the line into each file on the command line, and then to the standard output.

{
    # moving the if outside the loop makes it run faster
    if (append)
        for (i in copy)
            print >> copy[i]
    else
        for (i in copy)
            print > copy[i]
    print
}

It would have been possible to code the loop this way:

for (i in copy)
    if (append)
        print >> copy[i]
    else
        print > copy[i]

This is more concise, but it is also less efficient. The `if' is tested for each record and for each output file. By duplicating the loop body, the `if' is only tested once for each input record. If there are N input records and M input files, the first method only executes N `if' statements, while the second would execute N*M `if' statements.

Finally, the END rule cleans up, by closing all the output files.

END    \
{
    for (i in copy)
        close(copy[i])
}

Go to the first, previous, next, last section, table of contents.