bug-findutils

Re: cygwin xargs limitation: ARG_MAX depends on command


From: James Youngman
Subject: Re: cygwin xargs limitation: ARG_MAX depends on command
Date: Tue, 6 Sep 2005 09:29:13 +0100
User-agent: Mutt/1.5.9i

On Mon, Sep 05, 2005 at 02:01:53PM -0600, Eric Blake wrote:

> Cygwin has an interesting situation affecting xargs 4.2.25, where the
> maximum command-line length of a command is determined by the command
> being executed.  If the command lives on a cygwin-executable mount, the
> command line is passed using cygwin internal memory, and can easily be
> several megabytes.  But in the typical situation, the command lives on a
> normal Windows mount, where the command line goes through the Windows API
> CreateProcess which has a 32k limit.  

I believe that Cygwin should then define ARG_MAX to be 32K.  It's no
good indicating a value for a limit that the system then cannot
honour.

> xargs, by default, wants to use 128k + environment_size, which means
> that on a non-cygexec mount it has a good chance of exceeding the
> max argument limitation.

xargs uses 128KiB+(size of environment) for the command line length,
unless this exceeds (ARG_MAX-2048), in which case the value is
reduced.  This can be overridden via the -s option.  The relevant code
in xargs is overcomplex, uses too many variables, and could do with
being simplified.
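
As a rough illustration of that rule (not the actual xargs code), the
calculation can be sketched in Python.  The function names are made up
for this example, and two things are assumptions: that the environment
is counted as each NAME=value string plus a terminating NUL, and that
"reduced" means clamped to exactly ARG_MAX-2048.

```python
import os

DEFAULT_SIZE = 128 * 1024   # 128 KiB, the figure quoted above
HEADROOM = 2048             # margin kept below ARG_MAX


def environment_size(env):
    """Approximate bytes used by the environment (assumed layout:
    NAME=value plus one terminating NUL per entry)."""
    return sum(len(os.fsencode(k)) + 1 + len(os.fsencode(v)) + 1
               for k, v in env.items())


def default_command_size(arg_max, env):
    """Return 128 KiB + environment size, clamped to ARG_MAX - 2048.

    The clamp target is an assumption drawn from the description above.
    """
    wanted = DEFAULT_SIZE + environment_size(env)
    return min(wanted, arg_max - HEADROOM)
```

With a 1 MiB limit and a 4-byte environment the default is left alone
(131076 bytes); with a 32 KiB limit the result drops to 30720 bytes.
On a POSIX system the real limit could be passed in as
os.sysconf("SC_ARG_MAX").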

> The problem is that POSIX requires that ARG_MAX, if defined, be a
> constant, and that sysconf(_SC_ARG_MAX) be constant for the life of a
> process, as well as being no less than ARG_MAX if it was defined.  Cygwin
> 1.5.18 does not define ARG_MAX, and sysconf(_SC_ARG_MAX) returns 1 meg,
> meaning that xargs violates the 32k limit of non-cygexec mounts.  

To me, that sounds like a bug in Cygwin.

> But even if cygwin 1.5.19 were to define ARG_MAX and change
> sysconf(_SC_ARG_MAX) to 32k instead of 1 meg, this would unfairly
> penalize cygexec mounts, which can handle much bigger command lines.

As Bob said, though, there is no great advantage in larger command lines
in terms of performance.  The main reason that xargs uses even the
value it does is to support inputs which contain large single
arguments (with "xargs -i", for example).

> Would you accept a patch that changes lib/buildcmd.c to accept the
> command to be exec'd as a parameter, so that on cygwin it can return
> either 32k or 4 meg depending on whether cygwin detects that the
> file to be exec'd is mounted cygexec?

Such a patch should be workable because {}-substitution is forbidden
in the command name itself, so that is static for the life of an xargs
invocation.  
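
Because the command name is fixed, the limit would only need to be
chosen once per invocation.  A minimal Python sketch of the idea
follows; it is not a proposed patch.  The two sizes are the "32k" and
"4 meg" figures from the question, while command_line_limit and the
is_cygexec predicate are hypothetical names (how Cygwin would actually
detect a cygexec mount is not covered here).

```python
CREATEPROCESS_LIMIT = 32 * 1024      # "32k" for ordinary Windows mounts
CYGEXEC_LIMIT = 4 * 1024 * 1024      # "4 meg" for cygexec mounts


def command_line_limit(command_path, is_cygexec):
    """Pick the limit for one command.

    is_cygexec is a caller-supplied predicate (hypothetical); it stands
    in for whatever check Cygwin would provide.
    """
    if is_cygexec(command_path):
        return CYGEXEC_LIMIT
    return CREATEPROCESS_LIMIT
```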

However, such a patch could be somewhat problematic since currently
xargs.c decides the maximum command line length before it calls
getopt_long(), and hence before it knows what the name of the command
is.  The second likely problem with the patch is that xargs would then
have to simulate the path-searching behaviour of execvp() in order to
locate the binary that would be used.
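
To give a feel for what that simulation involves, here is a simplified
Python sketch of an execvp()-style search.  It is illustrative only:
find_executable is a made-up name, and falling back to os.defpath when
PATH is unset is an assumption, since the real default is
implementation-dependent.

```python
import os


def find_executable(name, path=None):
    """Return the first executable match for name, or None.

    Roughly mirrors an execvp()-style search: a name containing a slash
    is used as-is, otherwise each PATH directory is tried in order.
    """
    if "/" in name:
        return name if os.access(name, os.X_OK) else None
    if path is None:
        # Assumption: use os.defpath when PATH is unset; the behaviour of
        # a real execvp() in that case varies between systems.
        path = os.environ.get("PATH", os.defpath)
    for directory in path.split(os.pathsep):
        # An empty PATH entry conventionally means the current directory.
        candidate = os.path.join(directory or os.curdir, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

Even this short version leaves out details such as error handling for
unreadable directories, which is part of why the extra code is a
maintenance concern.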

To be honest, I'm not certain that the benefit of the change would
outweigh the disadvantages of having a big load of additional code
which searches $PATH (or, if it is not set, does whatever matches the
implementation-dependent behaviour of execvp() on that system) and
then checks the binary.  I wouldn't want to invoke that code on all
platforms, which therefore also poses a maintenance problem - we'd
have a nontrivial chunk of code that I simply couldn't test.

To summarise my comments, I'm not convinced that the benefit of the
longer command line really outweighs the problems and disadvantages of
doing this.  The folks who maintain and package Cygwin may be keener
on this kind of thing.  Nevertheless, if you do provide a patch I would
certainly take a look at it.

Regards,
James.



