@UNREVISED
GNU tar is distributed along with the scripts which the Free
Software Foundation uses for performing backups. There are no
corresponding scripts available yet for restoring files. Although
these scripts may well meet your needs, they are not the only scripts
or methods available for doing backups and restores. You may well
create your own, or use more sophisticated packages dedicated to
that purpose.
Some users are enthusiastic about Amanda (The Advanced Maryland
Automatic Network Disk Archiver), a backup system developed by James
da Silva and available on many Unix systems. This is free software,
and it is available at these places:
http://www.cs.umd.edu/projects/amanda/amanda.html
ftp://ftp.cs.umd.edu/pub/amanda
Here is a possible plan for future documentation of the backup
scripts which are provided with the GNU tar distribution.
* dumps
  + what are dumps
  + different levels of dumps
    - full dump = dump everything
    - level 1, level 2 dumps etc.: a level n dump dumps everything changed since the last level n-1 dump (?)
  + how to use scripts for dumps (i.e., the concept)
    - scripts to run after editing backup specs (details)
  + Backup Specs, what is it
    - how to customize
    - actual text of script [/sp/dump/backup-specs]
  + Problems
    - rsh doesn't work
    - rtape isn't installed
    - (others?)
  + the --incremental option of tar
  + tapes
    - write protection
    - types of media
      : different sizes and types, useful for different things
    - files and tape marks: one tape mark between files, two at end
    - positioning the tape: MT writes two at end of write, backspaces over one when writing again
This chapter documents both the provided FSF scripts and the tar
options which are more specific to usage as a backup tool.
To back up a file system means to create archives that contain all the files in that file system. Those archives can then be used to restore any or all of those files (for instance if a disk crashes or a file is accidentally deleted). File system backups are also called dumps.
Using tar to Perform Full Dumps
@UNREVISED
Full dumps should only be made when no other people or programs
are modifying files in the filesystem. If files are modified while
tar is making the backup, they may not be stored properly in
the archive, in which case you won't be able to restore them if you
have to. (Files not being modified are written with no trouble, and do
not corrupt the entire archive.)
You will want to use the --label=archive-label (-V archive-label) option to give the archive a volume label, so you can tell what this archive is even if the label falls off the tape, or anything like that.
Unless the filesystem you are dumping is guaranteed to fit on one volume, you will need to use the --multi-volume (-M) option. Make sure you have enough tapes on hand to complete the backup.
If you want to dump each filesystem separately you will need to use
the --one-file-system (-l) option to prevent tar from crossing
filesystem boundaries when storing (sub)directories.
The --incremental (-G) option is not needed, since this is a complete copy of everything in the filesystem, and a full restore from this backup would only be done onto a completely empty disk.
Unless you are in a hurry, and trust the tar program (and your
tapes), it is a good idea to use the --verify (-W) option, to make
sure your files really made it onto the dump properly. This will
also detect cases where the file was modified while (or just after)
it was being archived. Not all media (notably cartridge tapes) are
capable of being verified, unfortunately.
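The options discussed above can be tried out without a tape drive. The following is a minimal sketch, assuming GNU tar is the tar found on the path; it dumps a scratch directory into an archive file instead of onto a tape. The directory name fs, the archive name full.tar, and the label text are invented for the example. Long option names are used because the single-letter forms have not stayed the same across tar releases.

```shell
#!/bin/sh
# Sketch only: full dump of a scratch directory into an archive file.
# The names fs, full.tar and the label text are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
mkdir -p fs/home
echo "hello" > fs/home/notes.txt

# --label names the volume, --one-file-system keeps tar from crossing
# mount points, and --verify re-reads the archive after writing it.
tar --create \
    --file=full.tar \
    --label="full-dump-fs" \
    --one-file-system \
    --verify \
    fs

# The volume label is listed ahead of the archived files.
first_entry=$(tar --list --file=full.tar | head -n 1)
echo "$first_entry"
```

On a real tape the --file argument would name the tape device, and --multi-volume would be added if the dump might not fit on one volume.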
--listed-incremental=snapshot-file (-g snapshot-file) always takes a file name argument. If the file doesn't exist, tar runs a level zero dump and creates the file. If the file exists, tar uses that file to see what has changed.
--incremental (-G) @FIXME{look it up}
--incremental (-G) handles old GNU-format incremental backups.
This option should only be used when creating an incremental backup of
a filesystem. When the --incremental (-G) option is used, tar
writes, at the beginning of the archive, an entry for each of the
directories that will be operated on. The entry for a directory
includes a list of all the files in the directory at the time the
dump was done, and a flag for each file indicating whether the file
is going to be put in the archive. This information is used when
doing a complete incremental restore.
Note that this option causes tar to create a non-standard
archive that may not be readable by non-GNU versions of the tar
program.
The --incremental (-G) option means the archive is an incremental backup. Its meaning depends on the command that it modifies.
If the --incremental (-G) option is used with --list (-t), tar
will list, for each directory in the archive, the list of files in
that directory at the time the archive was created. This information
is put out in a format that is not easy for humans to read, but which
is unambiguous for a program: each file name is preceded by either a
`Y' if the file is present in the archive, an `N' if the
file is not included in the archive, or a `D' if the file is
a directory (and is included in the archive). Each file name is
terminated by a null character. The last file is followed by an
additional null and a newline to indicate the end of the data.
If the --incremental (-G) option is used with --extract (--get, -x), then when the entry for a directory is found, all files that currently exist in that directory but are not listed in the archive are deleted from the directory.
This behavior is convenient when you are restoring a damaged file system from a succession of incremental backups: it restores the entire file system to the state it was in when the backup was made. If you don't use --incremental (-G), the file system will probably fill up with files that shouldn't exist any more.
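The effect of extracting with --incremental can be seen on a scratch directory. The following is a minimal sketch, assuming GNU tar is the tar found on the path; all file and directory names (data, restore, level0.tar, level1.tar, data.snar) are invented for the example. It makes a full dump and one incremental dump, then restores both in order into an empty directory.

```shell
#!/bin/sh
# Sketch only: restore from a full dump plus one incremental dump.
# All file and directory names here are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
mkdir data restore
echo "keep" > data/keep.txt
echo "doomed" > data/doomed.txt
tar --create --file=level0.tar --listed-incremental=data.snar data

sleep 1                      # make the later changes clearly newer
rm data/doomed.txt
echo "new" > data/new.txt
tar --create --file=level1.tar --listed-incremental=data.snar data

# Restore in order, oldest archive first. Extracting the second
# archive with --incremental also deletes doomed.txt, because the
# directory listing stored in that archive does not mention it.
cd restore
tar --extract --incremental --file=../level0.tar
tar --extract --incremental --file=../level1.tar
ls data
```

Running the second extraction without --incremental would leave doomed.txt in place, which is the "files that shouldn't exist any more" case.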
--listed-incremental=snapshot-file (-g snapshot-file) handles new GNU-format incremental backups. It has much the same effect as --incremental (-G), but it also writes the time when the dump is done and the list of directories dumped to the given file. When restoring, only files newer than the saved time are restored, and the directory list is used to speed up operations.
--listed-incremental=snapshot-file (-g snapshot-file) acts like
--incremental (-G), but when used in conjunction with --create (-c)
will also cause tar to use the file snapshot-file, which contains
information about the state of the filesystem at the time of the last
backup, to decide which files to include in the archive being created.
That file will then be updated by tar. If the file snapshot-file does
not exist when this option is specified, tar will create it, and
include all appropriate files in the archive.
The file, which is archive independent, contains the date it was last
modified and a list of devices, inode numbers and directory names.
tar will archive files with newer modification dates or inode change
times, and directories with an unchanged inode number and device but
a changed directory name. The file is updated after the files to
be archived are determined, but before the new archive is actually
created.
GNU tar actually writes the file twice: once before the data
is written, and once after.
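The snapshot-file mechanism described above can be tried out on a scratch directory. The following is a minimal sketch, assuming GNU tar is the tar found on the path; the names data, level0.tar, level1.tar and data.snar are invented for the example.

```shell
#!/bin/sh
# Sketch only: a level zero dump followed by an incremental dump,
# both driven by one snapshot file. All names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
mkdir data
echo "first" > data/a.txt

# The snapshot file does not exist yet, so this is a level zero dump.
tar --create --file=level0.tar --listed-incremental=data.snar data

# Wait so that the next change gets a clearly newer timestamp.
sleep 1
echo "second" > data/b.txt

# The snapshot file exists now, so only what has changed is archived.
tar --create --file=level1.tar --listed-incremental=data.snar data

level1_list=$(tar --list --file=level1.tar)
echo "$level1_list"
```

The second archive lists the directory and the new file, but not a.txt, which has not changed since the level zero dump.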
Using tar to Perform Incremental Dumps
@UNREVISED
Performing incremental dumps is similar to performing full dumps, although a few more options will usually be needed.
You will need to use the `-N date' option to tell tar to only
store files that have been modified since date. Here date should be
the date and time of the last full or incremental dump.
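As an illustration of the date option, the following minimal sketch archives only what changed after a recorded time. It assumes GNU tar is the tar found on the path and uses --newer, the long form of -N; the variable last_dump and the file names are invented for the example, and the date is written in a year-month-day form to keep parsing unambiguous.

```shell
#!/bin/sh
# Sketch only: archive just the files changed since a recorded time.
# The names data, since.tar and last_dump are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
mkdir data
echo "old" > data/old.txt

sleep 1
# Stand-in for the date and time of the previous dump.
last_dump=$(date '+%Y-%m-%d %H:%M:%S')
sleep 1
echo "new" > data/new.txt

# --newer is the long form of -N.
tar --create --file=since.tar --newer="$last_dump" data

since_list=$(tar --list --file=since.tar)
echo "$since_list"
```

A real dump script would save the new date only after tar finishes successfully, so that a failed run does not hide changes from the next dump.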
A standard scheme is to do a monthly (full) dump once a month, a weekly dump once a week of everything since the last monthly and a daily every day of everything since the last (weekly or monthly) dump.
Here is a copy of the script used to dump the filesystems of the
machines here at the Free Software Foundation. This script is run via
cron late at night when people are least likely to be using the
machines. This script dumps several filesystems from several machines
at once (via NFS). The operator is responsible for ensuring that all
the machines will be up at the time the dump happens. If a machine is
not running, its files will not be dumped, and the next day's
incremental dump will not store files that would have gone onto
that dump.
#!/bin/csh
# Dump thingie
set now = `date`
set then = `cat date.nfs.dump`
/u/hack/bin/tar -c -G -v\
 -f /dev/rtu20\
 -b 126\
 -N "$then"\
 -V "Dump from $then to $now"\
 /alpha-bits/gp\
 /gnu/hack\
 /hobbes/u\
 /spiff/u\
 /sugar-bombs/u
echo $now > date.nfs.dump
mt -f /dev/rtu20 rew
Output from this script is stored in a file, for the operator to read later.
This script uses the file `date.nfs.dump' to store the date/time of the last dump.
Since this is a streaming tape drive, no attempt is made to verify
the archive. This is also why the high blocking factor (126) is used.
The tape drive must also be rewound by the mt command after
the dump is made.
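To make the blocking factor concrete: tar blocks are 512 bytes, so a blocking factor of 126 gives records of 126 * 512 = 64512 bytes. The following is a minimal sketch, assuming GNU tar is the tar found on the path and writing to an ordinary file rather than a tape; the file names are invented for the example.

```shell
#!/bin/sh
# Sketch only: a blocking factor of 126 means records of 126 * 512 bytes.
# The names data and dump.tar are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
mkdir data
echo "payload" > data/file.txt

record_size=$((126 * 512))
tar --create --blocking-factor=126 --file=dump.tar data

# tar writes whole records, so the archive size is a multiple of the
# record size even for a very small dump.
archive_size=$(wc -c < dump.tar)
echo "record size:  $record_size"
echo "archive size: $archive_size"
```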
@UNREVISED
--incremental (-G) is used in conjunction with --create (-c), --extract (--get, -x) or --list (-t) when backing up and restoring file systems. An archive cannot be extracted or listed with the --incremental (-G) option specified unless it was created with the option specified. This option should only be used by a script, not by the user, and is usually disregarded in favor of --listed-incremental=snapshot-file (-g snapshot-file), which is described below.
--incremental (-G) in conjunction with --create (-c) causes
tar to write, at the beginning of the archive, an entry for
each of the directories that will be archived. The entry for a
directory includes a list of all the files in the directory at the
time the archive was created and a flag for each file indicating
whether or not the file is going to be put in the archive.
Note that this option causes tar to create a non-standard
archive that may not be readable by non-GNU versions of the tar
program.
--incremental (-G) in conjunction with --extract (--get, -x) causes
tar to read the lists of directory contents previously stored
in the archive, delete files in the file system that did not
exist in their directories when the archive was created, and then
extract the files in the archive.
This behavior is convenient when restoring a damaged file system from a succession of incremental backups: it restores the entire file system to the state it was in when the backup was made. If --incremental (-G) isn't specified, the file system will probably fill up with files that shouldn't exist any more.
--incremental (-G) in conjunction with --list (-t) causes
tar to print, for each directory in the archive, the list of
files in that directory at the time the archive was created. This
information is put out in a format that is not easy for humans to
read, but which is unambiguous for a program: each file name is
preceded by either a `Y' if the file is present in the archive,
an `N' if the file is not included in the archive, or a `D'
if the file is a directory (and is included in the archive). Each
file name is terminated by a null character. The last file is followed
by an additional null and a newline to indicate the end of the data.
--listed-incremental=snapshot-file (-g snapshot-file) acts like
--incremental (-G), but when used in conjunction with --create (-c)
will also cause tar to use the file snapshot-file, which contains
information about the state of the file system at the time of the last
backup, to decide which files to include in the archive being created.
That file will then be updated by tar. If the file snapshot-file does
not exist when this option is specified, tar will create it, and
include all appropriate files in the archive.
The file snapshot-file, which is archive independent, contains the
date it was last modified and a list of devices, inode numbers and
directory names. tar will archive files with newer modification dates
or inode change times, and directories with an unchanged inode number
and device but a changed directory name. The file is updated after
the files to be archived are determined, but before the new archive is
actually created.
Although one would expect a device number to stay constant, NFS
device numbers are not dependable when an automounter is involved.
This led to a great deal of spurious redumping in incremental dumps,
so comparing two NFS device numbers over time is of little use.
tar therefore treats all NFS devices as equal when comparing
directories; this is fairly gross, but there does not seem to be a
better way to go.
@FIXME{this section needs to be written}
@UNREVISED
An archive containing all the files in the file system is called a full backup or full dump. You could protect your data by creating a full dump every day. This strategy, however, would waste a substantial amount of archive media and user time, because unchanged files would be re-archived every day.
It is more efficient to do a full dump only occasionally. To back up files between full dumps, you can perform an incremental dump. A level one dump archives all the files that have changed since the last full dump.
A typical dump strategy would be to perform a full dump once a week, and a level one dump once a day. This means some versions of files will in fact be archived more than once, but this dump strategy makes it possible to restore a file system to within one day of accuracy by extracting only two archives: the last weekly (full) dump and the last daily (level one) dump. The only information lost would be in files changed or created since the last daily backup. (Doing dumps more than once a day is usually not worth the trouble.)
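The weekly and daily scheme can be sketched with the snapshot-file option alone. The following is a minimal illustration, not the scripts shipped with tar: it assumes GNU tar is the tar found on the path, and the function run_dump, the dumps directory, and all file names are invented for the example. Each level one dump starts from a copy of the full dump's snapshot file, so every daily dump stays relative to the last full dump.

```shell
#!/bin/sh
# Sketch only: weekly full dump plus daily level one dumps.
# run_dump and every file name here are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
mkdir data dumps
echo "base" > data/base.txt

# run_dump LEVEL: level 0 starts a fresh snapshot file; level 1 works
# on a copy of it, so each daily dump is relative to the full dump.
run_dump() {
    if [ "$1" -eq 0 ]; then
        rm -f dumps/level0.snar
        tar --create --file=dumps/full.tar \
            --listed-incremental=dumps/level0.snar data
    else
        cp dumps/level0.snar dumps/level1.snar
        tar --create --file=dumps/daily.tar \
            --listed-incremental=dumps/level1.snar data
    fi
}

run_dump 0                      # the weekly full dump
sleep 1
echo "monday" > data/mon.txt
run_dump 1                      # first daily dump
sleep 1
echo "tuesday" > data/tue.txt
run_dump 1                      # second daily dump, still since the full dump

daily_list=$(tar --list --file=dumps/daily.tar)
echo "$daily_list"
```

The last daily archive holds both mon.txt and tue.txt, so a restore needs only full.tar and the most recent daily.tar.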
GNU tar comes with scripts you can use to do full and level-one
dumps. Using scripts (shell programs) to perform backups and
restoration is a convenient and reliable alternative to typing out
file name lists and tar commands by hand.
Before you use these scripts, you need to edit the file `backup-specs', which specifies parameters used by the backup scripts and by the restore script. @FIXME{There is no such restore script!}. @FIXME-xref{Script Syntax}. Once the backup parameters are set, you can perform backups or restoration by running the appropriate script.
The name of the restore script is restore. @FIXME{There is
no such restore script!}. The names of the level one and full backup
scripts are, respectively, level-1 and level-0. The level-0 script
also exists under the name weekly, and the level-1 script under the
name daily; these additional names can be changed according to your
backup schedule. @FIXME-xref{Scripted Restoration}, for more
information on running the restoration script.
@FIXME-xref{Scripted Backups}, for more information on running the
backup scripts.
Please Note: The backup scripts and the restoration scripts are
designed to be used together. While it is possible to restore files by
hand from an archive which was created using a backup script, and to create
an archive by hand which could then be extracted using the restore script,
it is easier to use the scripts. @FIXME{There is no such restore script!}.
See section Using tar to Perform Incremental Dumps, before making
such an attempt.
@FIXME{shorten node names}
@UNREVISED
The file `backup-specs' specifies backup parameters for the
backup and restoration scripts provided with tar. You must
edit `backup-specs' to fit your system configuration and schedule
before using these scripts.
@FIXME{This about backup scripts needs to be written: BS is a shell script .... thus ... `backup-specs' is in shell script syntax.}
@FIXME-xref{Script Syntax}, for an explanation of this syntax.
@FIXME{Whats a parameter .... looked at by the backup scripts ... which will be expecting to find ... now syntax ... value is linked to lame ... `backup-specs' specifies the following parameters:}
TAPE_FILE
The device tar writes the archive to. This device should be
attached to the host on which the dump scripts are run.
@FIXME{examples for all ...}
BLOCKING
The blocking factor tar will use when writing the dump archive.
See section The Blocking Factor of an Archive.
BACKUP_DIRS
The file systems to be dumped, each written as host:directory. The
host name specifies which host to run tar on, and should
normally be the host that actually contains the file system. However,
the host machine must have GNU tar installed, and must be able
to access the directory containing the backup scripts and their
support files using the same file name that is used on the machine
where the scripts are run (i.e., what pwd will print when in that
directory on that machine). If the host that contains the file system
does not have this capability, you can specify another host as long as
it can access the file system through NFS.
@UNREVISED
The following is the text of `backup-specs' as it appears at FSF:
# site-specific parameters for file system backup.

ADMINISTRATOR=friedman
BACKUP_HOUR=1
TAPE_FILE=/dev/nrsmt0
TAPE_STATUS="mts -t $TAPE_FILE"
BLOCKING=124
BACKUP_DIRS="
        albert:/fs/fsf
        apple-gunkies:/gd
        albert:/fs/gd2
        albert:/fs/gp
        geech:/usr/jla
        churchy:/usr/roland
        albert:/
        albert:/usr
        apple-gunkies:/
        apple-gunkies:/usr
        gnu:/hack
        gnu:/u
        apple-gunkies:/com/mailer/gnu
        apple-gunkies:/com/archive/gnu"

BACKUP_FILES="/com/mailer/aliases /com/mailer/league*[a-z]"
@UNREVISED
`backup-specs' is in shell script syntax. The following conventions should be considered when editing the script: @FIXME{"conventions?"}
A quoted string is considered to be contiguous, even if it is on more than one line. Therefore, you cannot include commented-out lines within a multi-line quoted string. BACKUP_FILES and BACKUP_DIRS are the two most likely parameters to be multi-line.
A quoted string typically cannot contain wildcards. In `backup-specs', however, the parameters BACKUP_DIRS and BACKUP_FILES can contain wildcards.
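The multi-line quoted string convention can be illustrated with a short loop. The following is a minimal sketch of how a script might walk a BACKUP_DIRS value; it is not the code of the provided backup scripts. The two entries are taken from the sample file, and the loop and the variables entry, host, dir and count are invented for the example.

```shell
#!/bin/sh
# Sketch only: walking a multi-line BACKUP_DIRS value.
# The loop and its variable names are hypothetical.
set -e
BACKUP_DIRS="
    albert:/fs/fsf
    gnu:/hack"

# Left unquoted, the value splits on spaces and newlines, giving one
# host:directory entry per word. A commented-out line inside the
# quotes would be split into words too, not skipped.
count=0
for entry in $BACKUP_DIRS; do
    host=${entry%%:*}       # text before the first colon
    dir=${entry#*:}         # text after the first colon
    echo "would dump $dir on host $host"
    count=$((count + 1))
done
```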
@UNREVISED
The syntax for running a backup script is:
`script-name' [time-to-be-run]
where time-to-be-run can be a specific system time, or can be now. If you do not specify a time, the script runs at the time specified in `backup-specs' (@FIXME-pxref{Script Syntax}).
You should start a script with a tape or disk mounted. Once you
start a script, it prompts you for new tapes or disks as it
needs them. Media volumes don't have to correspond to archive
files; a multi-volume archive can be started in the middle of a
tape that already contains the end of another multi-volume archive.
The restore script prompts for media by its archive volume,
so to avoid an error message you should keep track of which tape
(or disk) contains which volume of the archive. @FIXME{There is
no such restore script!}. @FIXME-xref{Scripted Restoration}.
@FIXME{Have file names changed?}
The backup scripts write two files on the file system. The first is a record file in `/etc/tar-backup/', which is used by the scripts to store and retrieve information about which files were dumped. This file is not meant to be read by humans, and should not be deleted by them. @FIXME-xref{incremental and listed-incremental}, for a more detailed explanation of this file.
The second file is a log file containing the names of the file systems and files dumped, what time the backup was made, and any error messages that were generated, as well as how much space was left in the media volume after the last volume of the archive was written. You should check this log file after every backup. The file name is `log-mmm-ddd-yyyy-level-1' or `log-mmm-ddd-yyyy-full'.
The script also prints the name of each system being dumped to the standard output.
@UNREVISED
Warning: The GNU tar distribution does not provide any such restore
script yet. This section is only listed here for documentation maintenance purposes. In any case, all contents are subject to change as things develop.
@FIXME{A section on non-scripted restore may be a good idea.}
To restore files that were archived using a scripted backup, use the
restore script. The syntax for the script is:
where ***** are the file systems to restore from, and ***** is a regular expression which specifies which files to restore. If you specify --all, the script restores all the files in the file system.
You should start the restore script with the media containing the first volume of the archive mounted. The script will prompt for other volumes as they are needed. If the archive is on tape, you don't need to rewind the tape to its beginning: if the tape head is positioned past the beginning of the archive, the script will rewind the tape as needed. @FIXME-xref{Media}, for a discussion of tape positioning.
If you specify `--all' as the files argument, the restore
script extracts all the files in the archived file
system into the active file system.
Warning: The script will delete files from the active file system if they were not in the file system when the archive was made.
See section Using tar to Perform Incremental Dumps,
for an explanation of how the script makes that determination.
@FIXME{this may be an option, not a given}