From: Tom Lord
Subject: Re: [Gnu-arch-users] Re: Binary Diff System in Arch
Date: Thu, 5 Feb 2004 08:38:15 -0800 (PST)
> From: Martin Pool <address@hidden>
>> xdiff3 is easy, given xdiff. If ANCESTOR and MERGED-TO are exactly
>> the same, then the merged file output should just be a copy of
>> MERGED-FROM. Otherwise, the output should be a copy of MERGED-TO
>> and "xdiff ANCESTOR MERGED-FROM" stored as a .rej file. (Note --
>> tla will need a slight modification to how it handles the .rej
>> file in this case but nothing major.)
> I think it might be more useful to produce three files: MINE, OTHER,
> and ANCESTOR (as Subversion does). The .rej for an xdelta is not very
> useful to the user, because they cannot fix or apply it by hand the
> way they can with a textual delta. Also it is not possible for only
> some chunks of an xdelta to fail. The only useful thing for arch to
> do with the delta is to apply it.
Leaving those files is a fine idea and easy to do -- but xdiff3
shouldn't do that itself. As a stand-alone tool, it would be silly
for xdiff3 to leave those files: the user has provided the three files
as input in the first place.
arch should be the one to store those three files in project trees
(which is a pretty trivial change). In this case, it's arch, not the
user, that has the files on-hand.
I think there are four distinct behaviors that would be useful from
tla (when doing 3-way merges):
~ default
Text files: arch just leaves conflict markers (if any) in the
merged-into file and creates a ".rej" file that says "Hey, there
are conflict markers in that file."
Binary files: arch can, indeed, leave the three files and create
a ".rej" file that says "binary merge conflicts occurred".
~ --no-markers
All files: do the merges that are conflictless. If a merge would
have conflicts, leave the merged-into unmodified and store all
three files. Store a ".rej" that says: "merge would have
conflicts".
~ --no-inexact
All files: if the merged-to file matches the ancestor precisely,
replace it with merged-from. Otherwise, leave the three files
and a .rej that says "merge needed".
~ --merge-data-only
All files: leave the three files. Leave merged-to unmodified.
Make .rej that says "merge needed".
The options --no-markers, --no-inexact, and --merge-data-only are
useful in combination with external tools such as a 3-way merge GUI.
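As a rough illustration only, the four behaviors above can be written
down as a small decision table. This is a sketch, not tla code: the
names plan_merge and Outcome, the mode strings, and the state labels
are all hypothetical. Two points are assumptions on my reading rather
than things stated above: that the default binary case and the
non-matching --no-inexact case both leave merged-to unmodified.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Outcome:
    merged_to: str              # what ends up in the merged-to file
    keep_three_files: bool      # store ANCESTOR, MERGED-TO, MERGED-FROM in the tree
    rej_message: Optional[str]  # text of the .rej file, or None if no .rej


def plan_merge(mode, would_conflict, merged_to_equals_ancestor, can_hold_markers):
    """Map one file's merge situation to the behavior described for each mode."""
    if mode == "merge-data-only":
        # Never touch merged-to; always hand the data to an external tool.
        return Outcome("unmodified", True, "merge needed")
    if mode == "no-inexact":
        if merged_to_equals_ancestor:
            return Outcome("replaced-with-merged-from", False, None)
        # Assumption: merged-to is left alone when the exact match fails.
        return Outcome("unmodified", True, "merge needed")
    if mode == "no-markers":
        if not would_conflict:
            return Outcome("merged", False, None)
        return Outcome("unmodified", True, "merge would have conflicts")
    if mode == "default":
        if not would_conflict:
            return Outcome("merged", False, None)
        if can_hold_markers:
            return Outcome("merged-with-markers", False,
                           "there are conflict markers in that file")
        # Assumption: for binary files merged-to keeps its own contents.
        return Outcome("unmodified", True, "binary merge conflicts occurred")
    raise ValueError("unknown mode: %r" % (mode,))
```

A 3-way merge GUI wrapper would typically ask for "merge-data-only"
or "no-markers" and then look only at keep_three_files.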
Now, what does all this imply for xdiff3? Note that for the default
behavior of arch, binary and text files have to be treated differently
by tla. If there are conflicts with a text file, arch has to assume
that there are markers in the merged-to file. If there are conflicts
with a binary file, arch has to know to store the three files.
I don't think that that distinction is specific to "text" vs. "binary"
files. It strikes me as a distinction that might apply to other file
types treated specially by xdiff3 as well --- it applies to any file
type for which xdiff3 can't store conflict markers for some or all of
the changes. Therefore, I don't want to teach tla about the
difference between "text" and "binary" files.
So, I'm saying that we should extend the "calling conventions" for
diff3 when making xdiff3. If xdiff3 leaves behind a .rej file (and
exits with status 1), that means "xdiff3 couldn't record conflicts as
markers in the merged-to file: they're stored here instead." That
convention can be applied to other file types as well.
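To make the convention concrete, here is a minimal sketch of the
xdiff3 behavior quoted at the top of this message, combined with the
exit-status rule just described. It is an illustration under
assumptions, not an implementation: no xdiff tool is invoked, and
make_delta is a hypothetical callable standing in for
"xdiff ANCESTOR MERGED-FROM". The ".rej" name is formed by appending
to the output name, which is also an assumption.

```python
from pathlib import Path


def xdiff3(ancestor, merged_to, merged_from, output, make_delta):
    """Return 0 for a clean merge, 1 if a .rej file was left behind.

    make_delta(old_bytes, new_bytes) -> bytes is a hypothetical
    stand-in for a binary delta generator such as xdiff.
    """
    ancestor_data = Path(ancestor).read_bytes()
    merged_to_data = Path(merged_to).read_bytes()
    merged_from_data = Path(merged_from).read_bytes()
    output = Path(output)

    if ancestor_data == merged_to_data:
        # MERGED-TO never diverged from ANCESTOR: take MERGED-FROM wholesale.
        output.write_bytes(merged_from_data)
        return 0

    # Otherwise the output is a copy of MERGED-TO, and the
    # ANCESTOR -> MERGED-FROM delta is recorded beside it.
    output.write_bytes(merged_to_data)
    rej = output.with_name(output.name + ".rej")
    rej.write_bytes(make_delta(ancestor_data, merged_from_data))
    return 1
```

The caller (arch, in this discussion) only needs the return value and
the presence of the .rej file; it never has to ask whether the file
was "text" or "binary".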
I agree that the xdelta-diff in the .rej is not likely to be directly
useful. The user has the ANCESTOR and MERGED-FROM on-hand and can
create it using xdiff if he really wants to. However, it is
indirectly useful -- or at least consistent: .rej files created by
patch contain "diff hunks" -- a .rej created by xdiff3 should be no
different.
(It's more obvious that xpatch should store the xdelta-diff in the
.rej file. I agree it will not _often_ be useful but sometimes it
will be. For example, I might have an editor-backup file to which it
applies cleanly.)
Another way to look at all of this is as a distinction between
"extensional" and "intensional" file types:
We would have "intensional" (explicitly declared) file types if arch
were modified so that every file has an official "type". The type
property of a file might be something like "text", "binary", or
"open-office". For every new file type, arch would have to learn the
special rules for handling that file type. Arch would have to learn
how to store file types; how to do merges when file types change;
and so forth.
We will have "extensional" file types if arch's notion of file-type
(for merging purposes) is determined entirely by how diff, patch, and
diff3 treat the files:
~ files that can't be diffed: those are stored in changesets as
whole-text copies of ORIG and MOD
~ files that diff3 merges without creating a .rej file: those are
file types presumed to be "change-marker-friendly".
~ files for which diff3 _does_ create a .rej file: those are file types
for which 3-way merging may, nevertheless, result in an external
record of some or all of the conflicts that occurred
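The three extensional cases can be sketched as a function of what the
tools were observed to do, with no declared type anywhere. The
function name, its two boolean inputs, and the returned labels are
hypothetical; how arch would actually detect "can't be diffed" is not
specified here.

```python
def classify(diff_succeeded, diff3_left_rej):
    """Infer a file's merge-time 'type' purely from tool behavior."""
    if not diff_succeeded:
        # Stored in changesets as whole copies of ORIG and MOD.
        return "whole-copy"
    if diff3_left_rej:
        # Some or all conflicts were recorded outside the merged file.
        return "external-conflict-record"
    # Conflicts, if any, are presumed to be markers inside the file.
    return "marker-friendly"
```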
-t