bug-coreutils

Re: split behavior


From: Pádraig Brady
Subject: Re: split behavior
Date: Mon, 14 Sep 2009 22:31:36 +0100
User-agent: Thunderbird 2.0.0.6 (X11/20071008)

Pádraig Brady wrote:
> Roger McNichols wrote:
>> I found a machine with the old version of split.
>>
>> home:~> uname -a
>> Linux home 2.2.13 #4 Thu May 8 23:11:31 CDT 2003 i686 unknown
>> home:~>
>> home:~> split --version
>> split (GNU textutils) 1.22
>> home:~>
>>
>>
>> Here's the result of 
>> home:~> cat /var/log/messages | split -2 - /tmp/x.
>>
>> not exactly as I recalled. instead of adding zz first time, adds za but ends 
>> with yz,
>> then starts adding zz...  Anyway:
>>
>> x.aa
>> x.ab
>> ...
>> x.yz
>> x.zaaa
>> x.zaab
>> ...
>> x.zyzz
>> x.zzaaaa
>> x.zzaaab
> 
> Interesting. I can confirm that textutils-1.22 behaves as above.
> http://ftp.gnu.org/old-gnu/textutils/textutils-1.22.tar.gz
> 
> I'll have a look later this evening to see when/why this changed.

The -a option and the fixed-length suffix behaviour were added
in 2002 (2.0.21) to conform to POSIX:
http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commit;h=65cbf7d1

So you'll be able to get the old behaviour by using split from:
http://alpha.gnu.org/gnu/coreutils/textutils-2.0.20.tar.bz2

POSIX seems to consider only fixed-length suffixes, saying:

  split [-l line_count] [-a suffix_length] [file[name]]
  split -b n[k|m] [-a suffix_length] [file[name]]

  The suffix shall consist of exactly suffix_length lowercase letters

  By default, the names of the output files shall be 'x', followed
  by a two-character suffix from the character set as described
  above, starting with "aa", "ab", "ac", and so on, and continuing
  until the suffix "zz", for a maximum of 676 files. If the number
  of files required exceeds the maximum allowed by the suffix
  length provided, the split utility shall fail

  The -a option was added to overcome the limitation of being able
  to create only 676 files.

The last statement is ironic in this context, since the old
variable-length suffixes had no such limit. I would think that
the old behaviour is still desirable if -a is not specified and
POSIXLY_CORRECT is not set?
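
For comparison, here is a rough Python sketch of the two naming schemes.
It is only an illustration: the function names fixed_suffix and
growing_suffix are made up, and the growing scheme is inferred solely
from the file listing quoted above (aa..yz, then zaaa..zyzz, then
zzaaaa...), not from the textutils source.

```python
import string

ALPHABET = string.ascii_lowercase


def _encode(index, width):
    """Write index as a base-26 number using exactly `width` letters."""
    letters = []
    for _ in range(width):
        index, rem = divmod(index, 26)
        letters.append(ALPHABET[rem])
    return "".join(reversed(letters))


def fixed_suffix(index, suffix_length=2):
    """POSIX-style suffix: exactly suffix_length letters, else fail."""
    if index >= 26 ** suffix_length:
        # POSIX says split shall fail once the suffixes run out.
        raise OverflowError("output file suffixes exhausted")
    return _encode(index, suffix_length)


def growing_suffix(index):
    """Suffix that widens as in the listing above (assumed scheme).

    Level k uses k leading 'z's followed by k + 2 letters whose first
    letter is in a..y, so plain lexical order matches creation order.
    """
    level = 0
    while True:
        width = level + 2
        count = 25 * 26 ** (width - 1)  # first letter limited to a..y
        if index < count:
            return "z" * level + _encode(index, width)
        index -= count
        level += 1
```

With the defaults, fixed_suffix stops after 676 names, while
growing_suffix reserves the leading "z" as an escape to a longer
suffix and so never runs out.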

cheers,
Pádraig.



