bug-gawk

Re: [bug-gawk] gawk 4.x series mmap attempts to allocate 32GB of memory


From: Aharon Robbins
Subject: Re: [bug-gawk] gawk 4.x series mmap attempts to allocate 32GB of memory and fails when using printf("%c") supplied with large floating point value.
Date: Thu, 10 Jul 2014 22:42:02 -0700
User-agent: Heirloom mailx 12.5 6/20/10

Hi.

> Date: Fri, 11 Jul 2014 09:47:52 +0900
> Subject: Re: [bug-gawk] gawk 4.x series mmap attempts to allocate 32GB of
>  memory and fails when using printf("%c") supplied with large floating point
> value.
> From: green fox <address@hidden>

Do you have a real name?  Just wondering.

> To: Aharon Robbins <address@hidden>
> Cc: address@hidden
>
> Just a thought: _if_ I were to write code, which patch would you prefer
> to accept?
>
> A) Routines to address the issue of handling UTF-8 strings when -b is in
> effect.
>
> B) Provide length(), substr(), index(), print() with extended capability to
>    handle raw single-byte data (even when one is on a UTF-8 system).
>
> The reason I ask is that when one is reading from a (disk / server)
> whose character set does not match the local one, the current gawk setup
> fails really badly.

I'm aware of this. I don't have a good solution to this very thorny problem.

I would actually prefer that instead of a patch, you write a loadable
extension using the API defined for that purpose in the 4.1 release.
You could then contribute it to the gawkextlib project.

The manual fully documents how to write extensions.  I believe that
the API gives you everything you need to write the extended
versions of the functions you desire, without having to have them
built into the core gawk interpreter.  (If not, then that should
be discussed separately, in terms of enhancing the API.)
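
As a rough illustration of the byte-oriented logic such an extension might
wrap, here is a minimal sketch in plain C. The function names (utf8_length,
byte_substr, byte_index) are hypothetical and are not part of gawk or
gawkextlib. The glue that registers functions through the extension API is
deliberately left out; the manual is the authority for that part. The sketch
assumes the caller passes explicit byte lengths, so embedded NUL bytes in the
input are tolerated.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers for illustration only; not gawk or gawkextlib code. */

/* Count characters in UTF-8 data by counting every byte that is not a
 * continuation byte (10xxxxxx). The byte length is simply nbytes. */
size_t utf8_length(const char *s, size_t nbytes)
{
    size_t n = 0;
    for (size_t i = 0; i < nbytes; i++)
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            n++;
    return n;
}

/* Byte-oriented substr with an awk-style 1-based start position.
 * Copies at most len bytes into out (capacity outsz, NUL-terminated)
 * and returns the number of bytes copied. */
size_t byte_substr(const char *s, size_t nbytes, size_t start, size_t len,
                   char *out, size_t outsz)
{
    if (outsz == 0)
        return 0;
    if (start < 1 || start > nbytes) {
        out[0] = '\0';
        return 0;
    }
    size_t avail = nbytes - (start - 1);
    if (len > avail)
        len = avail;
    if (len > outsz - 1)
        len = outsz - 1;
    memcpy(out, s + (start - 1), len);
    out[len] = '\0';
    return len;
}

/* Byte-oriented index: 1-based byte position of the first occurrence of
 * needle in hay, or 0 if it does not occur (or needle is empty). */
size_t byte_index(const char *hay, size_t hlen,
                  const char *needle, size_t nlen)
{
    if (nlen == 0 || nlen > hlen)
        return 0;
    for (size_t i = 0; i + nlen <= hlen; i++)
        if (memcmp(hay + i, needle, nlen) == 0)
            return i + 1;
    return 0;
}
```

For the 6-byte string "h\xC3\xA9llo" ("héllo" in UTF-8), utf8_length reports
5 characters, while byte_index locates "llo" at byte position 4, which is the
kind of distinction the extended functions would need to expose.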

I think such an extension would be a valuable thing to have.

HTH,

Thanks,

Arnold


