bug-wget
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Bug-wget] [PATCH] New option: --rename-output: modify output filena


From: Andrew Cady
Subject: Re: [Bug-wget] [PATCH] New option: --rename-output: modify output filename with perl
Date: Thu, 1 Aug 2013 15:22:56 -0400
User-agent: Mutt/1.5.20 (2009-06-14)

On Thu, Aug 01, 2013 at 10:36:00AM -0700, Micah Cowan wrote:
> On Wed, Jul 31, 2013 at 06:32:18PM -0400, Andrew Cady wrote:
> > By that, I assume you mean to execute the option in the shell.  So the
> > existing usage:
> > 
> >   --rename-output=s/x/y/
> > 
> > would (almost) become:
> > 
> >   --rename-output='perl -lpe "BEGIN{\$|++}" -e s/x/y/'
> 
> It COULD, sure, but why on earth would someone choose to use that
> instead of 'sed s/x/y/'?

With sed, you still need -u, or else there is a deadlock.  This
knowledge should be embedded into wget because most people don't have
it.

> >   --name-filter=sed   --rename='sed code here'
> >   --name-filter=perl  --rename='perl code here'
> >   --name-filter=shell --rename='shell code here'
> >   --name-filter=pcre  --rename='regex here'
> 
> All of the above seems like over-engineering to me, especially just
> to avoid the occasional backslash... don't forget, the more complexity
> you introduce, the more bugs as well (and wget already rather suffers
> from such a scenario).

Well, I actually went ahead and implemented it already.  I even did
pcre, although that's not done (/g does not work).

It's not complex.  The only difference between sed, perl, and shell is
the command line that is given to execvp.  There's a simple switch() to
decide which argument list to use.  It's basically just a simple alias
mechanism.  Here's the relevant code:

  pipe2_t init_pipe_filter()
  {
    pipe2_t res;
    char *code = opt.rename_output;
    char *perl[] = { "perl", "perl", "-0lpe", "BEGIN{$|++}", "-e", code, 0 };
    char *sed[] = { "sed", "sed", "-ue", code, 0 };
    char *shprog = getshell();
    char *shell[] = { shprog, shprog, "-c", code, 0 };

    char **cmd = 0;
    switch (filter_type) {
      case perl_filter:
        cmd = perl, delim = 0;
        break;
      case sed_filter:
        cmd = sed, delim = '\n';
        break;
      case shell_filter:
        cmd = shell, delim = getdelimopt();
        break;
      default:
        error(1, 0, "unpossible!");
    }

    int in = -1, out = -1; /* initialize to silence warning */
    if (open2(cmd, &in, &out, &res.pid) < 0)
      error(1, errno, "init_name_filter: open2");
    if ((res.out = fdopen(out, "w")) < 0)
      error(1, errno, "init_name_filter: fdopen");
    if ((res.in  = fdopen(in, "r")) < 0)
      error(1, errno, "init_name_filter: fdopen");

    return res;
  }


It's very simple, but a big win I think.  I can use perl without any
hassle.  Other people can use sed without even knowing about '-u'.
Anybody who wants anything else can call it through the shell.  And
platforms without pipe() will eventually get pcre.

> Sh is already well-tested...

Different systems have different shells.  When you have to try to escape
for the system shell, you run into portability problems, and general
confusion regarding double-escaping.  If you sit in freenode #openssh
for a while, you can see these problems routinely, resulting from the
fact that ssh remote commands are executed through the remote system
shell.

You cannot impose the need to double-escape on the user in the name of
avoiding bugs.  That is not avoiding bugs.  That is just making the user
write the bugs for you!



reply via email to

[Prev in Thread] Current Thread [Next in Thread]