--- Begin Message ---
Subject: Re: Slight bug in split :-)
Date: Fri, 22 Jun 2012 00:12:00 +0200
François Pinard wrote:
> Hi, Jim.
>
> I was looking for a problematic spot from a big file, and to isolate it,
> used "split" repeatedly as a way to zoom into the proper place. Just to
> try, I used "split -C 100000 xad" at one place (after saving "xad"
> first, of course). "split" interrupted itself, producing less output
> than input.
>
> My suggestion would be that split moans in some way before it destroys
> its own input. :-)
>
> François
Hi François!
Thank you for reporting that.
That's definitely a bug.
For the record, here's a quick reproducer:
$ seq 10 > xaa
$ split -C 6 xaa
$ wc -c x??
6 xaa
1 xab
7 total
$ head x??
==> xaa <==
1
2
3
==> xab <==
3$
I've Cc'd the bug list, in case someone would like to write
the patch (fix, NEWS and test) before I get to it.
I may not have time tomorrow.
--- End Message ---
--- Begin Message ---
Subject: Re: bug#11761: Slight bug in split :-)
Date: Fri, 22 Jun 2012 09:47:59 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:6.0) Gecko/20110816 Thunderbird/6.0
On 06/22/2012 08:56 AM, Jim Meyering wrote:
> Pádraig Brady wrote:
> ...
>> diff --git a/src/split.c b/src/split.c
>> index 53ee271..3e3313a 100644
>> --- a/src/split.c
>> +++ b/src/split.c
>> @@ -92,6 +92,9 @@ static char const *additional_suffix;
>> /* Name of input file. May be "-". */
>> static char *infile;
>>
>> +/* stat buf for input file. */
>> +static struct stat in_stat_buf;
>> +
>> /* Descriptor on which output file is open. */
>> static int output_desc = -1;
>>
>> @@ -362,6 +365,17 @@ create (const char *name)
>> {
>> if (verbose)
>> fprintf (stdout, _("creating file %s\n"), quote (name));
>> +
>> + struct stat out_stat_buf;
>> + if (stat (name, &out_stat_buf) == 0)
>> + {
>> + if (SAME_INODE (in_stat_buf, out_stat_buf))
>> + error (EXIT_FAILURE, 0, _("%s would overwrite input. Aborting."),
>> + quote (name));
>> + }
>> + else if (errno != ENOENT)
>> + error (EXIT_FAILURE, errno, _("cannot stat %s"), quote (name));
>> +
>> return open (name, O_WRONLY | O_CREAT | O_TRUNC | O_BINARY,
>> (S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH));
>> }
>
> Hi Pádraig,
>
> Thanks for taking this on.
> That introduces a minor TOCTOU race.
> It would probably never matter in practice,
> but who knows... if we can avoid it, why not?
> What do you think about something like this?
>
> int fd = open (name, (... as above, but without O_TRUNC...)...
> if (fd < 0)
> return fd;
> if (fstat (fd, &out_stat_buf) != 0)
> error (EXIT_FAILURE, errno, _("failed to fstat %s"), quote (name));
> if (SAME_INODE (in_stat_buf, out_stat_buf))
> error (EXIT_FAILURE, 0, _("%s would overwrite input. Aborting."),
> quote (name));
> if (ftruncate (fd, 0) != 0)
> error ...
> return fd;
>
> The above might even be a tiny bit faster for long names,
> since it resolves each name only once.
Well, probably slower due to the extra ftruncate syscall,
but point taken on the unlikely TOCTOU race.
I'll push the attached in a while.
cheers,
Pádraig.
split-input-guard.diff
Description: Text document
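
For readers without the attachment, the ordering discussed above (open without O_TRUNC, fstat the descriptor, compare against the input, and only then truncate) can be sketched as a standalone function. This is not the attached patch: the name open_output_guarded, the use of EINVAL to signal "would overwrite input", and the plain st_dev/st_ino comparison (standing in for the SAME_INODE macro) are all assumptions made for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not the split.c function: open NAME for writing
   unless it is the same file as the one described by IN_ST.
   Returns a descriptor, or -1 with errno set.  EINVAL is this sketch's
   own choice for "NAME would overwrite the input".  */
int
open_output_guarded (const char *name, const struct stat *in_st)
{
  /* No O_TRUNC here: the old contents must survive until the check.  */
  int fd = open (name, O_WRONLY | O_CREAT, 0666);
  if (fd < 0)
    return -1;

  /* fstat returns 0 on success, so a nonzero result is the error case.
     Checking the descriptor rather than the name avoids the race.  */
  struct stat out_st;
  if (fstat (fd, &out_st) != 0)
    {
      int saved_errno = errno;
      close (fd);
      errno = saved_errno;
      return -1;
    }

  /* Same device and same inode number means the same file.  */
  if (in_st->st_dev == out_st.st_dev && in_st->st_ino == out_st.st_ino)
    {
      close (fd);
      errno = EINVAL;
      return -1;
    }

  /* Only now is it safe to discard any previous contents.  */
  if (ftruncate (fd, 0) != 0)
    {
      int saved_errno = errno;
      close (fd);
      errno = saved_errno;
      return -1;
    }

  return fd;
}
```

The caller would stat or fstat the input once at startup and pass that buffer in for every output file. Returning -1 instead of calling error() keeps the sketch free of coreutils dependencies; the real code would presumably exit with a diagnostic instead.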
--- End Message ---