Here are some ideas for improving GNU diff and patch. The GNU project has identified some improvements as potential programming projects for volunteers. You can also help by reporting any bugs that you find.
If you are a programmer and would like to contribute something to the GNU project, please consider volunteering for one of these projects. If you are seriously contemplating work, please write to [email protected] to coordinate with other volunteers.
18.1 Suggested Projects for Improving GNU diff and patch
18.2 Reporting Bugs
18.1 Suggested Projects for Improving GNU diff and patch
One should be able to use GNU diff to generate a patch from any pair of directory trees, and given the patch and a copy of one such tree, use patch to generate a faithful copy of the other. Unfortunately, some changes to directory trees cannot be expressed using current patch formats; also, patch does not handle some of the existing formats. These shortcomings motivate the following suggested projects.
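For the simple case that already works, the round trip described above can be sketched as follows. This is only an illustration: the directory names old, new and copy and the file tree.patch are hypothetical, and the sketch assumes GNU diff and patch are installed.

```shell
# Sketch only: old/ and new/ are hypothetical trees created for this demo.
work=$(mktemp -d) && cd "$work"
mkdir old new
printf 'one\ntwo\n' > old/a.txt
printf 'one\nTWO\n' > new/a.txt
printf 'added\n' > new/b.txt        # exists only in the new tree

# -r recurses, -u selects unified format, -N treats absent files as empty.
# diff exits with status 1 when it finds differences, hence "|| true".
diff -ruN old new > tree.patch || true

# Apply the patch to a copy of the old tree, then compare with the new tree.
cp -R old copy
(cd copy && patch -p1 < ../tree.patch)
diff -r copy new && echo "trees match"
```

The projects below cover the cases where this round trip does not yield a faithful copy.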
18.1.1 Handling Multibyte and Varying-Width Characters
18.1.2 Handling Changes to the Directory Structure
18.1.3 Files that are Neither Directories Nor Regular Files: Handling symbolic links, device special files, etc.
18.1.4 File Names that Contain Unusual Characters
18.1.5 Outputting Diffs in Time Stamp Order
18.1.6 Ignoring Certain Changes: Ignoring certain changes while showing others.
18.1.1 Handling Multibyte and Varying-Width Characters
diff, diff3 and sdiff treat each line of input as a string of unibyte characters. This can mishandle multibyte characters in some cases. For example, when asked to ignore spaces, diff does not properly ignore a multibyte space character.
Also, diff currently assumes that each byte is one column wide, and this assumption is incorrect in some locales, e.g., locales that use UTF-8 encoding. This causes problems with the `-y' or `--side-by-side' option of diff.
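To see why counting bytes is not the same as counting characters or columns, consider the following small illustration. It only measures byte lengths with wc and does not run diff; the two characters were chosen for this example and do not come from the diff sources.

```shell
# "é" (U+00E9) is the two bytes 0xC3 0xA9 in UTF-8, yet it is one character
# and occupies one screen column.
bytes=$(printf '\303\251' | wc -c | tr -d ' ')
echo "bytes in e-acute: $bytes"

# The ideographic space (U+3000) is the three bytes 0xE3 0x80 0x80 in UTF-8.
# A byte-oriented comparison does not see it as white space.
space_bytes=$(printf '\343\200\200' | wc -c | tr -d ' ')
echo "bytes in ideographic space: $space_bytes"
```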
These problems need to be fixed without unduly affecting the performance of the utilities in unibyte environments.
The IBM GNU/Linux Technology Center Internationalization Team has proposed some patches to support internationalized diff (http://oss.software.ibm.com/developer/opensource/linux/patches/i18n/diffutils-2.7.2-i18n-0.1.patch.gz). Unfortunately, these patches are incomplete and apply to an older version of diff, so more work needs to be done in this area.
18.1.2 Handling Changes to the Directory Structure
diff and patch do not handle some changes to directory structure. For example, suppose one directory tree contains a directory named `D' with some subsidiary files, and another contains a file with the same name `D'. `diff -r' does not output enough information for patch to transform the directory subtree into the file.
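The `D' example above can be reproduced with a short sketch. The tree names a and b and the file contents are hypothetical, and the exact wording of the report may vary between diff versions.

```shell
# Sketch only: a/ and b/ are hypothetical trees created for this demo.
work=$(mktemp -d) && cd "$work"
mkdir -p a/D b
printf 'inside\n' > a/D/file.txt   # in tree a, D is a directory
printf 'plain\n' > b/D             # in tree b, D is a regular file

# diff exits with status 1 when the trees differ, hence "|| true".
report=$(diff -ru a b) || true
printf '%s\n' "$report"
```

The report is a one-line remark about the differing file types; it contains no hunks, so patch has nothing to apply.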
There should be a way to specify that a file has been removed without having to include its entire contents in the patch file. There should also be a way to tell patch that a file was renamed, even if there is no way for diff to generate such information.
There should be a way to tell patch that a file's time stamp has changed, even if its contents have not changed.
These problems can be fixed by extending the diff output format to represent changes in directory structure, and extending patch to understand these extensions.
18.1.3 Files that are Neither Directories Nor Regular Files
Some files are neither directories nor regular files: they are unusual files like symbolic links, device special files, named pipes, and sockets. Currently, diff treats symbolic links like regular files; it treats other special files like regular files if they are specified at the top level, but simply reports their presence when comparing directories. This means that patch cannot represent changes to such files. For example, if you change which file a symbolic link points to, diff outputs the difference between the two files, instead of the change to the symbolic link.
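The symbolic link case can be seen in a small sketch. The names a, b, targets and link are hypothetical, and the sketch assumes a file system that supports symbolic links.

```shell
# Sketch only: all names below are hypothetical and created for this demo.
work=$(mktemp -d) && cd "$work"
mkdir a b targets
printf 'alpha\n' > targets/one.txt
printf 'beta\n'  > targets/two.txt
ln -s ../targets/one.txt a/link    # a/link points to one.txt
ln -s ../targets/two.txt b/link    # b/link points to two.txt

# diff follows both links and compares the contents of the files they name.
# It exits with status 1 when they differ, hence "|| true".
out=$(diff -ru a b) || true
printf '%s\n' "$out"
```

The output is a content diff between the two target files; nothing in it records that the link itself now points elsewhere.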
diff should optionally report changes to special files specially, and patch should be extended to understand these extensions.
18.1.4 File Names that Contain Unusual Characters
When a file name contains an unusual character like a newline or white space, `diff -r' generates a patch that patch cannot parse. The problem is with the format of diff output, not just with patch, because with odd enough file names one can cause diff to generate a patch that is syntactically correct but patches the wrong files. The format of diff output should be extended to handle all possible file names.
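A quick way to inspect what a given diff writes for such names is to print only the header lines of a patch. In this sketch the file name `my file' and the trees a and b are hypothetical; whether and how the name is quoted in the headers depends on the diff version in use, so no particular output is claimed here.

```shell
# Sketch only: a/ and b/ are hypothetical trees created for this demo.
work=$(mktemp -d) && cd "$work"
mkdir a b
printf 'x\n' > 'a/my file'
printf 'y\n' > 'b/my file'

# diff exits with status 1 when the trees differ, hence "|| true".
diff -ru a b > names.patch || true

# Show only the "---" and "+++" header lines, where the file names appear.
grep -E '^(---|\+\+\+) ' names.patch
```

A program that reads these headers must decide where the file name ends and the time stamp begins, which is where unusual names cause trouble.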
18.1.5 Outputting Diffs in Time Stamp Order
Applying patch to a multiple-file diff can result in files whose time stamps are out of order. GNU patch has options to restore the time stamps of the updated files (see section 10.5 Updating Time Stamps on Patched Files), but sometimes it is useful to generate a patch that works even if the recipient does not have GNU patch, or does not use these options. One way to do this would be to implement a diff option to output diffs in time stamp order.
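Until such an option exists, the effect can be approximated outside diff. The following is a rough sketch, not a proposed implementation: it assumes two flat directories, old and new (both hypothetical), that contain the same file names, and it relies on parsing ls output, which fails for unusual file names.

```shell
# Sketch only: old/ and new/ are hypothetical flat directories for this demo.
work=$(mktemp -d) && cd "$work"
mkdir old new
printf '1\n' > old/a.txt; printf '2\n' > new/a.txt
printf '1\n' > old/b.txt; printf '2\n' > new/b.txt
touch -t 202001010000 new/b.txt    # b.txt was modified first
touch -t 202101010000 new/a.txt    # a.txt was modified later

# "ls -tr" lists the oldest file first, so the diffs are emitted in time
# stamp order. diff exits with status 1 on differences, hence "|| true".
for name in $(cd new && ls -tr); do
    diff -u "old/$name" "new/$name" || true
done > ordered.patch

grep '^+++ ' ordered.patch
```

Because patch updates files in the order they appear in its input, the patched files end up with modification times in the same relative order as the originals.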
18.1.6 Ignoring Certain Changes
It would be nice to have a feature for specifying two strings, one in from-file and one in to-file, which should be considered to match. Thus, if the two strings are `foo' and `bar', then if two lines differ only in that `foo' in file 1 corresponds to `bar' in file 2, the lines are treated as identical.
It is not clear how general this feature can or should be, or what syntax should be used for it.
A partial substitute is to filter one or both files before comparing, e.g.:
sed 's/foo/bar/g' file1 | diff - file2
However, this outputs the filtered text, not the original.
18.2 Reporting Bugs
If you think you have found a bug in GNU cmp, diff, diff3, or sdiff, please report it by electronic mail to the GNU utilities bug report mailing list [email protected]. Please send bug reports for GNU patch to [email protected]. Send as precise a description of the problem as you can, including the output of the `--version' option and sample input files that produce the bug, if applicable. If you have a nontrivial fix for the bug, please send it as well. If you have a patch, please send it too. It may simplify the maintainer's job if the patch is relative to a recent test release, which you can find in the directory ftp://alpha.gnu.org/gnu/diffutils/.