bug-coreutils

split behavior


From: Roger McNichols
Subject: split behavior
Date: Fri, 11 Sep 2009 19:51:36 -0500 (CDT)


Version 5.2.1 of the coreutils 'split' command currently produces files
with 'intelligent' suffixes.  That is, the number of letters (or digits)
used is based on the known number of output files that will be required.

An OLD version of split (I don't know which one, because I no longer have
it) used 'dumb' suffixes.  That is, it would start with aa, ab, ac, ...,
ba, bb, bc, ... until it got to zz, and then would jump to zzaa, zzab,
zzac, ..., and then on to zzaaaa, zzaaab, zzaaac, etc.

While the OLD version may have been annoying, its behavior had two distinct
advantages:
FIRST, corresponding sections of files would have the same suffixes,
regardless of the overall file size.
SECOND, the OLD version of split was capable of reading an arbitrary
stream of input from stdin, whereas the NEW version complains when it gets
to zz that output suffixes are exhausted (and fails).
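
To make both points concrete, here is a minimal Python sketch of a
fixed-width suffix scheme.  It is only an illustration, not split's actual
implementation: the helper name fixed_width_suffix and its 0-based index
argument are hypothetical, and the alphabet is assumed to be the 26
lowercase letters.  It shows that the same chunk gets a different name once
the width changes, and that a two-letter width runs out after 26 * 26 = 676
files.

```python
import string

ALPHABET = string.ascii_lowercase  # assumed suffix alphabet: 'a' through 'z'


def fixed_width_suffix(index: int, width: int) -> str:
    """Return the suffix of the index-th output file (0-based) at a fixed width.

    Hypothetical helper for illustration; it is not taken from split's source.
    """
    capacity = len(ALPHABET) ** width  # number of distinct suffixes available
    if not 0 <= index < capacity:
        raise ValueError(f"output file suffixes exhausted at width {width}")
    letters = []
    for _ in range(width):
        # Peel off the least significant base-26 digit each time.
        index, digit = divmod(index, len(ALPHABET))
        letters.append(ALPHABET[digit])
    return "".join(reversed(letters))


if __name__ == "__main__":
    # The first chunk is named differently depending on the chosen width.
    print(fixed_width_suffix(0, 2), fixed_width_suffix(0, 3))
    # The last name available at width 2.
    print(fixed_width_suffix(675, 2))
    # One more chunk does not fit at width 2.
    try:
        fixed_width_suffix(676, 2)
    except ValueError as err:
        print(err)
```

Under these assumptions, any scheme that must pick the width up front either
needs to know the number of output files in advance or must stop when the
width is used up, which is the failure seen with unbounded stdin input.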

A final problem is that there does not appear to be a way to get the NEW
version to behave like the OLD version.  So an application created years
ago with the OLD version, which used the suffixes as a reference for
corresponding data tags, can no longer be augmented without rebuilding the
entire related database.

It would be HIGHLY convenient if split could be made to operate in the OLD
non-presumptive manner for creating suffixes, at least for the
"split -N - prefix." stdin invocation.

I can provide more detail if it is helpful.  I hope that I have made the
difficulty clear enough.  Thanks for considering.

-roger

___________________________
Roger J. McNichols, Ph.D.
Chief Scientist
BioTex, Inc.
8058 El Rio St.
Houston, TX  77054
713.741.0111 (o)
713.741.0122 (f)
832.338.4371 (m)



