[Quilt-dev] [OT] Improving diffstat bar graphs


From: Jean Delvare
Subject: [Quilt-dev] [OT] Improving diffstat bar graphs
Date: Fri, 12 Aug 2005 21:41:10 +0200

Hi all,

I am not totally happy with the bar graphs generated by diffstat, and am
trying to make them reflect patch "profiles" better.

The first thing I dislike is that diffstat is currently designed to get
the length of each bar as correct as possible, and tends to sacrifice
the contents of that bar to reach this objective. In order to get the
correct length, the rounding error made on the '+' part is carried over
to the '-' part. In some cases this makes a patch with more insertions
than deletions show a bar suggesting the opposite (e.g. "++---").
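
To make the carry-over effect concrete, here is a minimal Python sketch.
It is a simplified model written for illustration, not code taken from
diffstat: the function name carry_over_bar, the round-half-up helper and
the scale of 1/10 character per changed line are all assumptions.

```python
from fractions import Fraction
import math


def round_half_up(x):
    """Round to the nearest integer, with halves going up."""
    return math.floor(x + Fraction(1, 2))


def carry_over_bar(added, removed, scale):
    """Hypothetical model: the '+' rounding error is pushed onto '-'."""
    exact_plus = added * scale
    plus = round_half_up(exact_plus)
    error = exact_plus - plus              # what the '+' part lost or gained
    minus = round_half_up(removed * scale + error)
    return "+" * plus + "-" * minus


# 24 insertions, 23 deletions, one character per 10 changed lines.
print(carry_over_bar(24, 23, Fraction(1, 10)))   # ++---
```

The bar has the expected length (4.7 rounds to 5), but it shows more
'-' than '+' although the file gained a line.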

The second thing I don't like is that, when one file has many more
changes than all the others, the others might end up having no bar at
all, pretty much voiding the point of the bar graph. This is even worse
than it needs to be because the total length of a bar is currently
computed without proper rounding, but even with it, the problem exists.
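
The effect can be sketched in a few lines of Python. This is only a
model under assumptions: "without proper rounding" is taken to mean
truncation, the largest file is scaled to the full graph width, and the
function name bar_lengths and the width of 40 are made up for the
example.

```python
def bar_lengths(totals, width, rounded):
    """Scale each file's change count so the largest fills `width`."""
    largest = max(totals)
    if rounded:
        # Integer round-half-up of total * width / largest.
        return [(2 * t * width + largest) // (2 * largest) for t in totals]
    # Truncation: any fractional part is simply dropped.
    return [t * width // largest for t in totals]


totals = [400, 6, 2]                          # changed lines per file
print(bar_lengths(totals, 40, rounded=False))  # [40, 0, 0]
print(bar_lengths(totals, 40, rounded=True))   # [40, 1, 0]
```

Rounding rescues the file with 6 changes, but the file with 2 changes
still gets no bar at all.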

As I have been working on large kernel cleanup patches these days, it is
particularly important for me that the diffstat bar graphs represent the
changes accurately, especially from a qualitative point of view. If I am
adding 40 lines of code to one file and removing two lines in each of 50
files, I want it to show in the bar graph. If it doesn't,
reviewers will start wondering whether my cleanup patch really is one.

The design rules I think should be used for a better bar graph
algorithm are:

1* The "allocation" of character slots in the bar should be as
symmetrical as possible. Rather than carrying the error over from one
set of characters to the next one, use a largest remainder algorithm to
allocate the "extra" slot to the set of changes which deserves it the
most.

2* Equilibrium between '+' and '-' is more important than bar length
accuracy. If a patch adds as many lines to a given file as it removes
from it, I do not want it to be represented with "+++----" or "++++---". I
prefer "+++---" even if the bar is now slightly shorter than it should
have been in the first place. Likewise, if a patch adds more to a file
than it removes from it, I do not want the bar to be symmetrical.

3* Each file should have at least one character representing the changes
if the number of insertions differs from the number of deletions. In
other words, if the total bar length after scaling is 0, and there are
more insertions than deletions, I want the bar to be '+' (and '-' if
there are more deletions than insertions.) That way, removing (or
adding) a small number of lines from (to) a large number of files will
be properly represented.
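
One way the three rules could fit together is sketched below in Python.
This is not the attached patch, only an illustration under assumptions:
the name balanced_bar, the use of exact fractions, and the choice to
shorten the smaller side when an unequal change would otherwise come out
symmetrical are all guesses at one reasonable reading of the rules.

```python
from fractions import Fraction
import math


def balanced_bar(added, removed, scale):
    """Bar for one file; `scale` is bar characters per changed line."""
    exact = {"+": added * scale, "-": removed * scale}
    # Rule 1: floor both parts, then give leftover slots to the
    # parts with the largest fractional remainders.
    slots = {sign: math.floor(value) for sign, value in exact.items()}
    target = math.floor(exact["+"] + exact["-"] + Fraction(1, 2))
    leftover = target - sum(slots.values())
    by_remainder = sorted(
        exact, key=lambda sign: exact[sign] - slots[sign], reverse=True
    )
    for sign in by_remainder[:leftover]:
        slots[sign] += 1
    # Rule 2: balance matters more than length. Equal changes get a
    # symmetrical bar; unequal changes must not look symmetrical.
    if added == removed:
        slots["+"] = slots["-"] = min(slots.values())
    elif slots["+"] == slots["-"] and slots["+"] > 0:
        slots["-" if added > removed else "+"] -= 1
    # Rule 3: an unbalanced file never ends up with an empty bar.
    if slots["+"] + slots["-"] == 0 and added != removed:
        slots["+" if added > removed else "-"] = 1
    return "+" * slots["+"] + "-" * slots["-"]


tenth = Fraction(1, 10)
print(balanced_bar(24, 23, tenth))   # +++--
print(balanced_bar(35, 35, tenth))   # +++---
print(balanced_bar(0, 2, tenth))     # -
```

With these choices, a file that only loses two lines still shows a
single '-', so a cleanup touching many files stays visible next to one
large addition.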

Attached is a patch to make diffstat 1.39 behave according to the three
rules above. I am using it locally and am rather satisfied with the
results so far.

I would appreciate it if quilt users, most of whom I suspect are
diffstat users, could comment on this. More precisely, I would welcome
feedback on each of the three design rules above. Each could be
implemented independently of the other ones if a majority seems to agree
that not all rules are correct.

Thanks,
-- 
Jean Delvare

Attachment: diffstat-1.39-better-bar-graph-2.diff
Description: Text document
