The `/dev/user' special file (see section Special File Names in gawk)
provides access to the current user's real and effective user and group id
numbers, and if available, the user's supplementary group set.
However, since these are numbers, they do not provide very useful
information to the average user. There needs to be some way to find the
user information associated with the user and group numbers. This
section presents a suite of functions for retrieving information from the
user database. See section Reading the Group Database
for a similar suite that retrieves information from the group database.
The POSIX standard does not define the file where user information is
kept. Instead, it provides the <pwd.h> header file
and several C language subroutines for obtaining user information.
The primary function is getpwent, for "get password entry."
The "password" comes from the original user database file,
`/etc/passwd', which kept user information, along with the
encrypted passwords (hence the name).
An awk program could simply read `/etc/passwd' directly
(the format is well known). However, because of the way password
files are handled on networked systems,
this file may not contain complete information about the system's set of users.
To be sure of producing a readable, complete version of the user database, it is
necessary to write a small C program that calls getpwent.
getpwent is defined to return a pointer to a struct passwd.
Each time it is called, it returns the next entry in the database.
When there are no more entries, it returns NULL, the null pointer.
When this happens, the C program should call endpwent to close the
database.
Here is pwcat, a C program that "cats" the password database.

    /*
     * pwcat.c
     *
     * Generate a printable version of the password database
     *
     * Arnold Robbins
     * [email protected]
     * May 1993
     * Public Domain
     */

    #include <stdio.h>
    #include <stdlib.h>     /* declares exit() */
    #include <pwd.h>

    int
    main(int argc, char **argv)
    {
        struct passwd *p;

        /* print one colon-separated line per database entry */
        while ((p = getpwent()) != NULL)
            printf("%s:%s:%ld:%ld:%s:%s:%s\n",
                p->pw_name, p->pw_passwd, (long) p->pw_uid,
                (long) p->pw_gid, p->pw_gecos, p->pw_dir, p->pw_shell);

        endpwent();
        exit(0);
    }

If you don't understand C, don't worry about it.
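The output from pwcat is the user database, in the traditional
`/etc/passwd' format of colon-separated fields. The fields are, in order:
the login name; the encrypted password; the numeric user-id; the numeric
group-id; the user's full name; the login directory (usually the user's
home directory, $HOME); and the login shell.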
Here are a few lines representative of pwcat's output.

    $ pwcat
    -| root:3Ov02d5VaUPB6:0:1:Operator:/:/bin/sh
    -| nobody:*:65534:65534::/:
    -| daemon:*:1:1::/:
    -| sys:*:2:2::/:/bin/csh
    -| bin:*:3:3::/bin:
    -| arnold:xyzzy:2076:10:Arnold Robbins:/home/arnold:/bin/sh
    -| miriam:yxaay:112:10:Miriam Robbins:/home/miriam:/bin/sh
    -| andy:abcca2:113:10:Andy Jacobs:/home/andy:/bin/sh
    ...

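Because every line uses the same colon-separated layout, an awk program can pick a line apart with split. The following is only a minimal sketch and is not part of the library presented in this section; the function name pw_field is hypothetical, and the sample line is the arnold entry from the output shown above.

```awk
# pw_field --- return field number n of a passwd-format line
# (hypothetical helper for illustration; not part of passwd.awk)
function pw_field(line, n,    f, nf)
{
    nf = split(line, f, ":")    # f[1] is the name, f[3] the uid, and so on
    if (n < 1 || n > nf)
        return ""               # no such field
    return f[n]
}

BEGIN {
    line = "arnold:xyzzy:2076:10:Arnold Robbins:/home/arnold:/bin/sh"
    printf "user %s has uid %d and home directory %s\n",
        pw_field(line, 1), pw_field(line, 3), pw_field(line, 6)
}
```

An empty field, such as the missing shell in the nobody entry, still counts as a field, so the field positions stay the same from line to line.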
With that introduction, here is a group of functions for getting user information. There are several functions here, corresponding to the C functions of the same name.

    # passwd.awk --- access password file information
    # Arnold Robbins, [email protected], Public Domain
    # May 1993

    BEGIN {
        # tailor this to suit your system
        _pw_awklib = "/usr/local/libexec/awk/"
    }

    function _pw_init(    oldfs, oldrs, olddol0, pwcat)
    {
        if (_pw_inited)
            return
        oldfs = FS
        oldrs = RS
        olddol0 = $0
        FS = ":"
        RS = "\n"
        pwcat = _pw_awklib "pwcat"
        while ((pwcat | getline) > 0) {
            _pw_byname[$1] = $0
            _pw_byuid[$3] = $0
            _pw_bycount[++_pw_total] = $0
        }
        close(pwcat)
        _pw_count = 0
        _pw_inited = 1
        FS = oldfs
        RS = oldrs
        $0 = olddol0
    }

The BEGIN rule sets a private variable to the directory where
pwcat is stored. Since it is used to help out an awk library
routine, we have chosen to put it in `/usr/local/libexec/awk'.
You might want it to be in a different directory on your system.
The function _pw_init keeps three copies of the user information
in three associative arrays. The arrays are indexed by user name
(_pw_byname), by user-id number (_pw_byuid), and by order of
occurrence (_pw_bycount).
The variable _pw_inited is used for efficiency; _pw_init only
needs to read the database once, on its first call.
Since this function uses getline to read information from
pwcat, it first saves the values of FS, RS, and $0.
Doing so is necessary, since these functions could be called
from anywhere within a user's program, and the user may have his or her
own values for FS and RS. In addition, getline used in this way
overwrites $0 with each line it reads.
The main part of the function uses a loop to read database lines, split
the line into fields, and then store the line into each array as necessary.
When the loop is done, _pw_init cleans up by closing the pipeline,
setting _pw_inited to one, and restoring FS, RS, and $0.
The use of _pw_count will be explained below.
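The save-and-restore idiom can be tried in isolation. The following is a minimal sketch under stated assumptions, not part of the library: the function name first_field is hypothetical, and the echo command merely stands in for pwcat. It reads one line from a command with its own FS and RS, then puts the caller's values back.

```awk
# first_field --- run cmd and return the first ":"-separated field
# of its first output line, leaving the caller's FS, RS, and $0 alone
# (hypothetical helper for illustration; not part of passwd.awk)
function first_field(cmd,    oldfs, oldrs, olddol0, result)
{
    oldfs = FS
    oldrs = RS
    olddol0 = $0
    FS = ":"
    RS = "\n"
    if ((cmd | getline) > 0)    # this form of getline overwrites $0
        result = $1
    close(cmd)
    FS = oldfs                  # restore FS first, so that ...
    RS = oldrs
    $0 = olddol0                # ... $0 is resplit with the caller's FS
    return result
}

BEGIN {
    FS = ","
    $0 = "x,y,z"
    print first_field("echo a:b:c")
    print $2, NF                # the caller's record is intact
}
```

The order of the restores matters: FS is put back before $0 is reassigned, so the record is split again using the caller's field separator rather than the colon.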

    function getpwnam(name)
    {
        _pw_init()
        if (name in _pw_byname)
            return _pw_byname[name]
        return ""
    }

The getpwnam function takes a user name as a string argument. If that
user is in the database, it returns the appropriate line. Otherwise it
returns the null string.

    function getpwuid(uid)
    {
        _pw_init()
        if (uid in _pw_byuid)
            return _pw_byuid[uid]
        return ""
    }

Similarly, the getpwuid function takes a user-id number argument. If that
user number is in the database, it returns the appropriate line. Otherwise it
returns the null string.

    function getpwent()
    {
        _pw_init()
        if (_pw_count < _pw_total)
            return _pw_bycount[++_pw_count]
        return ""
    }

The getpwent function simply steps through the database, one entry at
a time. It uses _pw_count to track its current position in the
_pw_bycount array.

    function endpwent()
    {
        _pw_count = 0
    }

The endpwent function resets _pw_count to zero, so that
subsequent calls to getpwent will start over again.
A conscious design decision in this suite is that each subroutine calls
_pw_init to initialize the database arrays. The overhead of running
a separate process to generate the user database, and the I/O to scan it,
will only be incurred if the user's main program actually calls one of these
functions. If this library file is loaded along with a user's program, but
none of the routines are ever called, then there is no extra run-time overhead.
(The alternative would be to move the body of _pw_init into a
BEGIN rule, which would always run pwcat. This simplifies the
code but runs an extra process that may never be needed.)
In turn, calling _pw_init is not too expensive, since the
_pw_inited variable keeps the program from reading the data more than
once. If you are worried about squeezing every last cycle out of your
awk program, the check of _pw_inited could be moved out of
_pw_init and duplicated in all the other functions. In practice,
this is not necessary, since most awk programs are I/O bound, and it
would clutter up the code.
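The initialize-on-first-use design can be seen in miniature in the following sketch. It is not the library itself: the names _demo_init, _demo_uid, _demo_loads, and demo_uid are hypothetical, and a short built-in string stands in for the output of pwcat. The counter _demo_loads exists only to show how often the expensive loading step actually runs.

```awk
# _demo_init --- load the table on the first call only
# (hypothetical names throughout; a stand-in for _pw_init)
function _demo_init(    n, i, entries, f)
{
    if (_demo_inited)
        return                  # later calls cost one test
    _demo_loads++               # count runs of the expensive part
    n = split("root:0 daemon:1 arnold:2076", entries, " ")
    for (i = 1; i <= n; i++) {
        split(entries[i], f, ":")
        _demo_uid[f[1]] = f[2]
    }
    _demo_inited = 1
}

# demo_uid --- return the uid for name, or "" if there is none
function demo_uid(name)
{
    _demo_init()
    if (name in _demo_uid)
        return _demo_uid[name]
    return ""
}

BEGIN {
    print demo_uid("arnold")
    print demo_uid("root")
    print "loads:", _demo_loads
}
```

However many lookups the program makes, the table is built once. A program that never calls demo_uid never builds it at all.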
The id program in section Printing Out User Information
uses these functions.