bug-gawk

Re: [bug-gawk] Memory leak


From: Stephane Delsert
Subject: Re: [bug-gawk] Memory leak
Date: Thu, 30 Mar 2017 14:49:19 +0000

Hi,

I ran the script with --show-reachable=yes for 1MM & 2MM records (see the
attached reports: REPORT_1MM_V2.txt and REPORT_2MM_V2.txt). I'm sorry, my test
last night with 100MM records failed; I had to kill the job this morning. I
will run another test with 50MM records tonight.


I also ran tests with a modified test.awk, named test2.awk, that tracks the
size of the array tab_store and checks that every line of the file has the
same number of fields. The maximum size of the array is 6 (see cnt_gawk.1MM.txt
& cnt_gawk.2MM.txt and the reports: REPORT_1MM_V3.txt and REPORT_2MM_V3.txt).

Thanks,

Stéphane.


-----Original Message-----
From: Andrew J. Schorr [mailto:address@hidden]
Sent: Thursday, 30 March 2017 14:56
To: address@hidden
Cc: Stephane Delsert <address@hidden>; Vihan_Sharma - Vihan Sharma (LiveRamp) 
<address@hidden>; Fatima Aliane <address@hidden>; address@hidden
Subject: Re: [bug-gawk] Memory leak

I think that's probably true, but my concern is based on some strange results 
in the REPORT_1MM.txt and REPORT_2MM.txt valgrind logs. In both cases, only 800 
NODE objects are allocated, but at exit, the 1MM case reports "in use at exit:
5,749,914 bytes in 59,366 blocks", whereas the 2MM case says "in use at exit:
11,416,868 bytes in 118,397 blocks".  So we have an extra 59,031 blocks in use, 
but what are they if not NODEs or BUCKETs?
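
As a quick sanity check on those figures, here is a minimal sketch in Python that recomputes the difference between the two "in use at exit" summaries. The dictionary layout and the helper name `delta` are made up for illustration; only the byte and block counts come from the valgrind output quoted above.

```python
# Figures copied from the two valgrind "in use at exit" summaries quoted above.
# The structure and names here are illustrative assumptions, not gawk or
# valgrind identifiers.
in_use = {
    "1MM": {"bytes": 5_749_914, "blocks": 59_366},
    "2MM": {"bytes": 11_416_868, "blocks": 118_397},
}


def delta(small, big):
    """Return (extra blocks, extra bytes, average bytes per extra block)."""
    extra_blocks = big["blocks"] - small["blocks"]
    extra_bytes = big["bytes"] - small["bytes"]
    return extra_blocks, extra_bytes, extra_bytes / extra_blocks


extra_blocks, extra_bytes, avg = delta(in_use["1MM"], in_use["2MM"])
print(f"extra blocks: {extra_blocks:,}")
print(f"extra bytes:  {extra_bytes:,}")
print(f"average bytes per extra block: {avg:.1f}")
```

This only restates the arithmetic: the extra blocks average roughly 96 bytes each. It says nothing about what those blocks are, which is still a question for the --show-reachable=yes output.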

It seems impossible to answer that question unless Stephane runs valgrind in 
those 2 cases with --show-reachable=yes.

Regards,
Andy

On Thu, Mar 30, 2017 at 01:04:22AM -0600, address@hidden wrote:
> Make that 263 meg.  Just checked.
> 
> I am comfortable that we don't have a true memory leak.
> 
> It's on my TODO list to try to reduce the overhead of array storage, 
> but that won't be in time for the next release.
> 
> Thanks,
> 
> Arnold
> 
> address@hidden wrote:
> 
> > "Andrew J. Schorr" <address@hidden> wrote:
> >
> > > You might need to run valgrind with --leak-check=full 
> > > --show-reachable=yes to get to the bottom of this. I don't see any 
> > > obvious leaks when I run that on the 344-record file that you sent.
> >
> > That would be helpful.
> >
> > I ran gawk on a ~ 20 megabyte file and it hit a steady size as shown 
> > by top. I think that there aren't any real leaks here. Valgrind is 
> > generally good about reporting real leaks as "definitely lost" and I 
> > have yet to see that in this instance.
> >
> > It may be that we could reduce gawk's memory usage for arrays, but 
> > that's a different issue from a leak.
> >
> > Thanks,
> >
> > Arnold

Attachment: REPORT_2MM_V2.txt
Description: REPORT_2MM_V2.txt

Attachment: REPORT_1MM_V2.txt
Description: REPORT_1MM_V2.txt

Attachment: cnt_gawk.2MM.txt
Description: cnt_gawk.2MM.txt

Attachment: REPORT_2MM_V3.txt
Description: REPORT_2MM_V3.txt

Attachment: cnt_gawk.1MM.txt
Description: cnt_gawk.1MM.txt

Attachment: REPORT_1MM_V3.txt
Description: REPORT_1MM_V3.txt

Attachment: test2.awk
Description: test2.awk

