bug#9780: sort -u throws out non-duplicates
From: Jim Meyering
Subject: bug#9780: sort -u throws out non-duplicates
Date: Fri, 17 Aug 2012 21:53:06 +0200

Paul Eggert wrote:
> On 08/17/2012 12:36 PM, Jim Meyering wrote:
>> The first time the safe_text buffer is allocated
>> it will have to be disjoint from the line.text buffer
>> and from the buffer into which we're about to fread.
>> Thereafter, regardless of reallocation, overlap should
>> always be false.
>
> I haven't thought it through entirely, but I was
> worried about the case where there is a saved line
> but no saved_text, the buffer is reallocated, and

That is precisely what happens when this "(unique && ..." condition
is true for the first time (presuming you mean s/saved_text/safe_text/):

  /* With --unique, when we're about to read into a buffer that
     overlaps the saved "preceding" line (saved_line), copy the line's
     .text member to a realloc'd-as-needed temporary buffer and adjust
     the line's key-defining members if they're set.  */
  if (unique && overlap (ptr, readsize, &saved_line))
    {
      /* Copy saved_line.text into a buffer where it won't be clobbered
         and if KEY is non-NULL, adjust saved_line.key* to match.  */
      static char *safe_text;
      static size_t safe_text_n_alloc;
      if (safe_text_n_alloc < saved_line.length)
        {
          safe_text_n_alloc = saved_line.length;
          safe_text = x2nrealloc (safe_text, &safe_text_n_alloc, 1);
        }
      memcpy (safe_text, saved_line.text, saved_line.length);
      if (key)
        {
#define s saved_line
          s.keybeg = safe_text + (s.keybeg - s.text);
          s.keylim = safe_text + (s.keylim - s.text);
#undef s
        }
      saved_line.text = safe_text;
    }

safe_text is initially NULL and we enter that block
only when we're about to fread into a buffer that overlaps
the current saved_line.text buffer.
In that case, we allocate an initial safe_text buffer,
copy saved_line.text into it, and update saved_line.text
to point to the just-allocated/initialized buffer.
Any test of overlap that compares that just-allocated
(or realloc'd) buffer with the about-to-be-fread-into
buffer will return false.
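
To make that argument concrete, here is a minimal standalone sketch of the
same idea. It is not the sort.c code: `struct line` is cut down to the
members used above, `ranges_overlap` is a hypothetical stand-in for the
real overlap test (whose body is not shown in this message), and plain
`realloc` replaces `x2nrealloc`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for sort.c's struct line (assumption: only the
   members referenced in the excerpt above).  */
struct line
{
  char *text;
  size_t length;
  char *keybeg;
  char *keylim;
};

/* Hypothetical overlap test: nonzero if [buf, buf + size) intersects
   [l->text, l->text + l->length).  Addresses are compared as integers
   so that unrelated buffers can be compared without pointer UB.  */
static int
ranges_overlap (char const *buf, size_t size, struct line const *l)
{
  uintptr_t b = (uintptr_t) buf;
  uintptr_t t = (uintptr_t) l->text;
  return b < t + l->length && t < b + size;
}

/* Copy L's text into the caller-owned buffer *SAFE (grown as needed),
   rebase the key pointers if HAVE_KEY, then point L at the copy.
   Intended to be called only while L->text still lies in the I/O
   buffer, i.e. when the overlap test above is true.  */
static void
relocate_line (struct line *l, char **safe, size_t *safe_alloc,
               int have_key)
{
  size_t need = l->length ? l->length : 1;
  if (*safe_alloc < need)
    {
      char *p = realloc (*safe, need);
      if (!p)
        abort ();
      *safe = p;
      *safe_alloc = need;
    }
  memcpy (*safe, l->text, l->length);
  if (have_key)
    {
      /* Offsets are taken relative to the old text pointer, so this
         must happen before l->text is updated.  */
      l->keybeg = *safe + (l->keybeg - l->text);
      l->keylim = *safe + (l->keylim - l->text);
    }
  l->text = *safe;
}
```

Once `relocate_line` has run, the line's text lives in heap storage that
is separate from the I/O buffer, so the overlap test against that buffer
is false and the block is not entered again for the same saved line.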
> then we test for overlap. If the reallocated buffer
> does not overlap the original buffer, the test for
> overlap will fail even though the saved line needs
> to be copied into a new saved_text buffer.
>
> I'll stare at the code some more....