Re: [bug-gawk] Can a GAWK script access its full path\name at "run-time?


From: Aharon Robbins
Subject: Re: [bug-gawk] Can a GAWK script access its full path\name at "run-time?
Date: Tue, 12 May 2015 14:47:33 +0300
User-agent: Heirloom mailx 12.5 6/20/10

Hello. Re this:

> Date: Tue, 12 May 2015 10:11:48 +0900
> From: green fox <address@hidden>
> To: address@hidden
> Cc: address@hidden
> Subject: Re: [bug-gawk] Can a GAWK script access its full path\name at
>       "run-time?
>
> Command line parsing of gawk  works in strange ways
>
> Say there is something like
> gawk -i script1.awk -i script2.awk -- "$@"
> and lets say script 1 parses ARGC/ARGV.
>
> When we parse ARGC _before_ the script2 is called, we get the path of
> script2.awk,
> however, we are unable to call functions declared within script2.
>
> On the other hand, if ARGC/ARGV was parsed from script2,
> "-i script1.awk" (probably intentionally) disappears from ARGV, and we are
> able to call script1 from within script2.
> Same goes for -f option.
>
> The only reliable way at the moment is to load all scripts, then at
> the last script, call the cmdline parser. But then the argv is
> trimmed.
> Being able to parse the raw argv would be a win imho.
>
> On a *nix system with /proc/ , one would lookup
> "/proc/"PROCINFO["pid"]"/cmdline"
> But afaik, cygwin did not have the full capability of proc, so ymmv.
>
> Tested on Linux 3.2.29/gcc 4.7.1/awk version 4.1.60(git b035244efcc)
> -[foo.awk]------------------------------------------------
> function foo(){print "bar";};
> -[test.sh]------------------------------------------------
> #!/bin/bash
> function main(){
>  gawk 'BEGIN{ exit main( ARGC , ARGV );}
> function main( argc ,argv , this , ignore){
>   if( argc ){
>     ignore = 1;
>     for( i = 1  ; i <= argc ; i++ ){
>       if( ignore ){
>         if( argv[i] == "--" ) ignore = 0;
>         print "skipped["argv[i]"]";
>         continue;
>       }
>     }
>   }
> }
> END{foo();}
> ' -i foo.awk -- "$@"
>  return;
> }
> main "$@";
> -------------------------------------------------
> Output:
> skipped[-i]
> skipped[foo.awk]
> skipped[--]
> gawk: cmd. line:6: fatal: function `foo' not defined
> -------------------------------------------------
>
> I can not come up with any good solutions for this, but it would be
> nice to have some sort of consistency/programmatic way to detect/avoid
> this behavior.
>
> fox

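Your test program does not test what you claim. You supplied the first
part of the program as text on the command line, and as documented,
when gawk finds that the first non-option, non -f/-i argument is the
program, no further options are processed.  Your `-i foo.awk' therefore
ended up in ARGV as ordinary arguments (as your own output shows), and
foo.awk was never loaded, which is why `foo' is reported as undefined.
When -i is used consistently, correct results are produced: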

        $ cat foo0.awk 
        BEGIN{ exit main( ARGC , ARGV );}
        function main( argc ,argv , this , ignore){
          if( argc ){
            ignore = 1;
            for( i = 1  ; i < argc ; i++ ){
              if( ignore ){
                if( argv[i] == "--" ) ignore = 0;
                print "skipped["argv[i]"]";
                continue;
              }
            }
          }
        }
        END{foo();}

        $ cat foo.awk 
        function foo(){print "bar";};

        $ ./gawk -i foo0.awk -i foo.awk --  a b c
        skipped[b]
        skipped[c]
        bar
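
To see the option-processing rule in isolation, here is a minimal sketch.
It assumes the awk on your PATH stops parsing options at the program
text the way gawk does; the `-v x=1' and `a' arguments are made up for
the illustration.

```shell
# Everything after the program text is left alone: "-v", "x=1" and "--"
# are not treated as options, they simply appear in ARGV.
# Assumption: this awk stops option parsing at the program text (gawk does).
out=$(awk 'BEGIN {
    for (i = 1; i < ARGC; i++)
        printf "[%s]", ARGV[i]
    print ""
}' -v x=1 -- a)
echo "$out"    # with gawk: [-v][x=1][--][a]
```

The same thing happened to `-i foo.awk' in your test.sh.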

You also had a small bug; you should test `i < argc', not `i <= argc'.
The elements of ARGV are numbered from 0 through ARGC - 1, so ARGV[ARGC]
does not exist.
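
A quick way to check the loop bound is the sketch below; it should work
with any POSIX awk, and the arguments `a' and `b' are only placeholders.

```shell
# ARGV[0] holds the command name; the arguments occupy ARGV[1]..ARGV[ARGC-1].
out=$(awk 'BEGIN {
    printf "ARGC=%d\n", ARGC
    for (i = 0; i < ARGC; i++)
        printf "ARGV[%d] is set\n", i
    # "in" tests for the index without creating the element.
    if (!(ARGC in ARGV))
        print "ARGV[ARGC] is not set"
}' a b)
echo "$out"
```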

> The only reliable way at the moment is to load all scripts, then at
> the last script, call the cmdline parser. But then the argv is
> trimmed.

This is a misunderstanding. Gawk parses ALL the scripts provided before
starting to run any code.  The effect is as if all the files and command
line scripts (-f, -i, -e) were concatenated together into a single
program, and that program is then run.
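
A minimal sketch of that behavior follows. It uses two -f files so that
it runs with any POSIX awk, not only gawk; the file names first.awk and
second.awk are invented for the example.

```shell
# The BEGIN rule in the first file calls a function defined in the second.
# This works because all source files are parsed before any code runs.
tmp=$(mktemp -d)
printf '%s\n' 'BEGIN { foo() }' > "$tmp/first.awk"
printf '%s\n' 'function foo() { print "bar" }' > "$tmp/second.awk"
out=$(awk -f "$tmp/first.awk" -f "$tmp/second.awk")
echo "$out"    # bar
rm -rf "$tmp"
```

Reversing the order of the two -f options gives the same result.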

While access to the raw argv would be nice, that's not how awk has ever
worked, nor is that how POSIX defines it.  Sorry.

Note that even with such access, there's no easy way to reconstitute the
absolute path of any given script, since pathnames could be relative,
and also since scripts can be found via a path search.

I respectfully suggest studying the sections in the manual that
describe all this. Everything is clearly documented.

Arnold
