From: Arthur Maciel
Subject: Re: [Chicken-users] Segfault with large data-structures (bug)
Date: Sun, 3 Feb 2013 10:46:01 -0200
Jim, that's great! Thank you so much!
I've read that Facebook has reached billions of users. As I'm testing graph implementations to create a graph database, do you believe this code could handle billions of nodes, or would I need a lot more RAM to run it?
I'm not an experienced programmer, so I don't know whether there is a way to handle this problem, such as sharing the graph data structure across many machines, serializing it to disk and keeping only part of it in memory, etc. Do you know of any applicable strategy to make it work properly?
Thanks again!
Arthur
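(As a rough illustration of the "serialize it to disk" idea mentioned above: assuming the graph is a SRFI-69 hash table mapping nodes to edge lists, it could be dumped to disk and reloaded on demand along these lines. This is only a sketch, not code from this thread, and save-graph!/load-graph are made-up names.)

;; Sketch: dump the graph as an association list and read it back
;; later, so only the part currently being worked on lives in RAM.
(use srfi-69)

(define (save-graph! graph path)
  (with-output-to-file path
    (lambda () (write (hash-table->alist graph)))))

(define (load-graph path)
  (alist->hash-table (with-input-from-file path read)))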
OK, I patched the core and the program runs to completion. Patch forthcoming.
$ ./list-partials -:d -:hm16G
[debug] application startup...
[debug] heap resized to 1048576 bytes
[debug] stack bottom is 0x7fff6f80f4b0.
[debug] entering toplevel toplevel...
[debug] entering toplevel library_toplevel...
[debug] entering toplevel build_2dversion_toplevel...
[debug] entering toplevel eval_toplevel...
[debug] resizing mutation-stack from 8k to 16k ...
[debug] entering toplevel expand_toplevel...
[debug] entering toplevel modules_toplevel...
[debug] resizing mutation-stack from 16k to 32k ...
[debug] entering toplevel chicken_2dsyntax_toplevel...
[debug] entering toplevel srfi_2d69_toplevel...
Hash-tables - Inserting edges
[debug] resizing heap dynamically from 1024k to 2048k ...
[debug] resizing heap dynamically from 2048k to 4096k ...
[debug] resizing heap dynamically from 4096k to 8192k ...
[debug] resizing heap dynamically from 8192k to 16384k ...
[debug] resizing heap dynamically from 16384k to 32768k ...
[debug] resizing heap dynamically from 32768k to 65536k ...
[debug] resizing heap dynamically from 65536k to 131072k ...
[debug] resizing heap dynamically from 131072k to 262144k ...
5000 nodes inserted
[debug] resizing heap dynamically from 262144k to 524288k ...
10000 nodes inserted
[debug] resizing heap dynamically from 524288k to 1048576k ...
15000 nodes inserted
20000 nodes inserted
[debug] resizing heap dynamically from 1048576k to 2097152k ...
25000 nodes inserted
30000 nodes inserted
35000 nodes inserted
40000 nodes inserted
[debug] resizing heap dynamically from 2097152k to 4194304k ...
45000 nodes inserted
50000 nodes inserted
55000 nodes inserted
60000 nodes inserted
65000 nodes inserted
70000 nodes inserted
75000 nodes inserted
80000 nodes inserted
85000 nodes inserted
[debug] resizing heap dynamically from 4194304k to 8388608k ...
90000 nodes inserted
95000 nodes inserted
100000 nodes inserted
105000 nodes inserted
110000 nodes inserted
115000 nodes inserted
120000 nodes inserted
125000 nodes inserted
130000 nodes inserted
135000 nodes inserted
140000 nodes inserted
145000 nodes inserted
150000 nodes inserted
155000 nodes inserted
160000 nodes inserted
165000 nodes inserted
170000 nodes inserted
175000 nodes inserted
[debug] resizing heap dynamically from 8388608k to 16777216k ...
180000 nodes inserted
185000 nodes inserted
190000 nodes inserted
195000 nodes inserted
200000 nodes inserted
205000 nodes inserted
210000 nodes inserted
215000 nodes inserted
220000 nodes inserted
225000 nodes inserted
230000 nodes inserted
235000 nodes inserted
240000 nodes inserted
245000 nodes inserted
1042.938s CPU time, 50.714s GC time (major), 250066522 mutations, 44/851838 GCs (major/minor)
[debug] forcing finalizers...
[debug] application terminated normally
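(Side note, not from the thread: to see how much of that 16GB heap is actually in use at any point during the run above, the program itself can be instrumented. In CHICKEN 4, memory-statistics reports the total heap size and the bytes currently used; a sketch:)

;; Print heap usage, e.g. after every batch of node inserts.
;; (memory-statistics) returns a vector: total heap size, bytes in
;; use, and nursery size.
(define (report-heap)
  (let ((stats (memory-statistics)))
    (print "heap: " (vector-ref stats 0)
           " bytes, in use: " (vector-ref stats 1) " bytes")))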
On Feb 2, 2013, at 8:06 PM, Jim Ursetto wrote:
> (bug found -- tl;dr see end of message)
>
> Figured it out: you're exceeding the default maximal heap size, which is 2GB. For whatever reason, Chicken doesn't terminate reliably with an error in this situation; it just tries to continue.
>
> Simply run your program with -:d to see:
>
> $ ./list-partials -:d
> Hash-tables - Inserting edges
> [debug] resizing heap dynamically from 1024k to 2048k ...
> [debug] resizing heap dynamically from 2048k to 4096k ...
> [debug] resizing heap dynamically from 4096k to 8192k ...
> [debug] resizing heap dynamically from 8192k to 16384k ...
> [debug] resizing heap dynamically from 16384k to 32768k ...
> [debug] resizing heap dynamically from 32768k to 65536k ...
> [debug] resizing heap dynamically from 65536k to 131072k ...
> [debug] resizing heap dynamically from 131072k to 262144k ...
> 5000 nodes inserted
> [debug] resizing heap dynamically from 262144k to 524288k ...
> 10000 nodes inserted
> [debug] resizing heap dynamically from 524288k to 1048576k ...
> 15000 nodes inserted
> 20000 nodes inserted
> [debug] resizing heap dynamically from 1048576k to 2097151k ...
> 25000 nodes inserted
> 30000 nodes inserted
> 35000 nodes inserted
> 40000 nodes inserted
>
> Error: segmentation violation
>
>
>
> The easy fix is to increase the max heap size, say to 8GB:
>
> $ ./list-partials -:d -:hm8192000k
> [debug] application startup...
> [debug] heap resized to 1048576 bytes
> [debug] stack bottom is 0x7fff693fa4b0.
> [debug] entering toplevel toplevel...
> [debug] entering toplevel library_toplevel...
> [debug] entering toplevel build_2dversion_toplevel...
> [debug] entering toplevel eval_toplevel...
> [debug] resizing mutation-stack from 8k to 16k ...
> [debug] entering toplevel expand_toplevel...
> [debug] entering toplevel modules_toplevel...
> [debug] resizing mutation-stack from 16k to 32k ...
> [debug] entering toplevel chicken_2dsyntax_toplevel...
> [debug] entering toplevel srfi_2d69_toplevel...
>
> Hash-tables - Inserting edges
> [debug] resizing heap dynamically from 1024k to 2048k ...
> [debug] resizing heap dynamically from 2048k to 4096k ...
> [debug] resizing heap dynamically from 4096k to 8192k ...
> [debug] resizing heap dynamically from 8192k to 16384k ...
> [debug] resizing heap dynamically from 16384k to 32768k ...
> [debug] resizing heap dynamically from 32768k to 65536k ...
> [debug] resizing heap dynamically from 65536k to 131072k ...
> [debug] resizing heap dynamically from 131072k to 262144k ...
> 5000 nodes inserted
> [debug] resizing heap dynamically from 262144k to 524288k ...
> 10000 nodes inserted
> [debug] resizing heap dynamically from 524288k to 1048576k ...
> 15000 nodes inserted
> 20000 nodes inserted
> [debug] resizing heap dynamically from 1048576k to 2097152k ...
> 25000 nodes inserted
> 30000 nodes inserted
> 35000 nodes inserted
> 40000 nodes inserted
> [debug] resizing heap dynamically from 2097152k to 4194304k ...
> 45000 nodes inserted
> 50000 nodes inserted
> 55000 nodes inserted
> 60000 nodes inserted
> 65000 nodes inserted
> 70000 nodes inserted
> 75000 nodes inserted
> 80000 nodes inserted
> 85000 nodes inserted
> [debug] resizing heap dynamically from 0k to 1024k ...
> [panic] out of memory - heap full while resizing - execution terminated
>
> Uh oh, we've hit an actual bug now. Although we can get up to
> 85000 nodes by increasing the max heap size from 2GB to 8GB, it appears to bomb
> once the heap exceeds 4GB, maybe indicating some 32-bit sizes
> left lying around in the code.
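(A back-of-the-envelope illustration of that hypothesis, not the actual runtime code: 4GB is 2^32 bytes, one more than the largest value a 32-bit unsigned size field can hold, so such a field would wrap to 0. That would match the "resizing heap dynamically from 0k to 1024k" line just before the panic.)

;; On a 64-bit CHICKEN this evaluates to 0: a 4GB size stored in a
;; 32-bit unsigned field wraps around.
(modulo (* 4 1024 1024 1024) (expt 2 32))   ; => 0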
>
> Jim
>
> On Feb 2, 2013, at 7:18 PM, Jim Ursetto wrote:
>
>> On Feb 2, 2013, at 3:46 PM, Kristian Lein-Mathisen wrote:
>>
>>> I'm getting the same result here, when I run it through csc. When I run it through csi, though, it never seems to finish - is the task that big? I had to kill it after 2-3 hours.
>>
>> It's a hash table with 250,000 entries and 1,000 items per entry. That's going to be tens of gigabytes for heap (memory usage at 40,000 nodes just before it dies is about 2GB on my machine). It's 250,000 node inserts (O(V)), 250 million edge inserts (O(V*E)), and 125 billion operations (O(V*E^2)) to traverse the edge lists (since you are using member). To me that is pretty big.
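(To make the O(V*E^2) term concrete, here is a hypothetical CHICKEN sketch, not the actual list-partials code: checking each new edge with member rescans the node's whole edge list, whereas a nested SRFI-69 hash table makes the same check a constant-time lookup. The add-edge/... names are invented for the example.)

;; Adjacency list checked with `member': every insert scans the list,
;; so inserting E edges for a node costs O(E^2).
(use srfi-69)

(define (add-edge/list! graph node edge)
  (let ((edges (hash-table-ref/default graph node '())))
    (unless (member edge edges)
      (hash-table-set! graph node (cons edge edges)))))

;; Nested hash table: membership is an O(1) expected-time lookup, so
;; inserting E edges for a node costs O(E).
(define (add-edge/table! graph node edge)
  (let ((edges (hash-table-ref/default graph node #f)))
    (unless edges
      (set! edges (make-hash-table))
      (hash-table-set! graph node edges))
    (hash-table-set! edges edge #t)))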
>>
>> csi doesn't crash but appears to be stuck after 40,000 nodes inserted, maybe the same place that csc crashes. It's not swap-related; I had plenty of memory, so something may be up with the hash table or GC. I can't explain why csi does not crash.
>>
>> This machine is 64-bit OS X with 16GB main memory and no obvious limit to the process data segment.
>>
>> Probably a stupid shot in the dark: what happens when the Chicken heap size exceeds 4GB on a 64-bit machine?
>>
>> Jim
>