Re: GC Warning related to large mem block allocation - Help needed


From: Daniel Llorens
Subject: Re: GC Warning related to large mem block allocation - Help needed
Date: Mon, 1 Jan 2018 23:05:37 +0100

On 01 Jan 2018, at 15:11, Freja Nordsiek <address@hidden> wrote:

> The only worry then would be that it would get collected while still
> being used. I think most cases, this would not be a problem. However, if
> someone makes a new bytevector from an existing one from somewhere in
> the middle, it is possible that the new one would only point to the
> middle and not the head and thus could be collected prematurely (would
> need to do some more digging to see if the new one would be allocated
> using make_bytevector_from_buffer). Or, if someone was using C code to
> say take the norm of the vector (very common operation often done with
> BLAS) and the scheme code wasn't going to use the bytevector anymore,
> there might only be a pointer on the stack pointing to the current
> element that the C code is reading and as soon as it gets past the 512
> byte mark, the bytearray might get collected while it is still being
> worked on which would be a disaster. So I am not sure that the
> allocation could be safely changed to use
> GC_MALLOC_ATOMIC_IGNORE_OFF_PAGE if the bytevector is large. I do not
> know enough about Guile internals yet to know if typical pure scheme
> operations would run into problems. I think it is definitely possible
> that there are FFI cases where problems could be run into, which would
> then mean the coder has to take extra precautions to prevent collection,
> which could be a major problem for changing the allocation Guile 2.0.x
> and 2.2.x since it would be a major API change. Wouldn't be such an
> issue for 3.x series since the API could be changed but it would be a
> bit of a surprising result for people to have to worry about if using
> FFI. I could be wrong on this - a pointer to the head might still be
> kept on the stack and then there is no problem.

Hi Freja,

Thanks for these comments. I know too little about GC, but I've dug around in
the bytevector / array code now and then. As a user of large arrays, I'm
interested in making them more usable.

The primary means for indirect access to bytevectors / arrays in Guile is the 
array API. All array objects contain a reference to their ‘backing store’, and 
array handles keep a second copy of this reference, plus a pointer to the head 
of the backing store (not to the head of the array itself). The array API uses 
a get/release mechanism whose only purpose is to keep those pointers on the 
stack. This makes using arrays harder and I've found it annoying, so I've asked 
Andy about it a couple times.

AFAIU, the only functions in the public API that can create a vector/bytevector 
object from raw memory are the scm_take_xxx series (which are used internally 
by pointer->bytevector). Those functions are clearly meant to be used with 
‘foreign’ storage, but if one were to use them with Guile-managed storage, then 
I think it's understood that the user is responsible for retaining the relevant 
pointers.

That's also how I understand the comments above make_bytevector_from_buffer.

Also AFAIU, the only ways to get a pointer to bytevector storage using the 
public API are 1) scm_xxx_elements / scm_xxx_writable_elements, 2) the macro 
SCM_BYTEVECTOR_CONTENTS, or 3) the FFI function bytevector->pointer.

The scm_xxx_elements functions take an array handle and enforce the get/release 
interface, so they should cause no issues.

I think it would be fair to add a warning to the manual that the user is 
responsible for retaining the pointer obtained from SCM_BYTEVECTOR_CONTENTS 
around any calls. I think that's what the scm_remember_upto_here functions are 
for, although I've never had to use them myself. I've also never used 
SCM_BYTEVECTOR_CONTENTS directly. I'm not sure it belongs in the public API to 
be honest.

Then bytevector->pointer uses SCM_BYTEVECTOR_CONTENTS internally, but it also 
keeps (SCM_BYTEVECTOR_CONTENTS(bv) + offset) in a ‘pointer_weak_refs’ table. So 
it seems possible for this to happen:

y = make_bytevector
x = bytevector->pointer(y, offset = 512+1)
do_stuff_with(x /* y is collected */)

either in Scheme or in C. But it would be solved if bytevector->pointer kept 
just SCM_BYTEVECTOR_CONTENTS(bv) instead of (SCM_BYTEVECTOR_CONTENTS(bv) + 
offset) in the table. I'm not sure why it keeps (... + offset). Does this make 
sense?
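
To make the argument concrete, here is a toy model in plain C. It is not 
libguile or libgc code: every toy_ name is made up for illustration, and the 
512-byte window is simply the figure quoted earlier in this thread (the real 
threshold may differ). It only assumes that, under 
GC_MALLOC_ATOMIC_IGNORE_OFF_PAGE, a retained pointer keeps an object alive 
only if it points near the head of that object.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed size of the window at the head of an object inside which a
   retained pointer still counts as a reference (figure from the thread). */
#define TOY_HEAD_WINDOW 512u

/* Hypothetical stand-in for a bytevector's backing store. */
typedef struct {
    uintptr_t base;   /* address of the first byte of the storage */
    size_t    size;   /* length of the storage in bytes */
} toy_object;

/* True if `retained` would keep `obj` alive in this toy model: it must
   point into the object AND fall within the first TOY_HEAD_WINDOW bytes. */
bool toy_pointer_retains(const toy_object *obj, uintptr_t retained)
{
    if (retained < obj->base || retained - obj->base >= obj->size)
        return false;   /* not a pointer into obj at all */
    return retained - obj->base < TOY_HEAD_WINDOW;
}

/* Address recorded today, as described above: contents + offset. */
uintptr_t toy_key_current(const toy_object *bv, size_t offset)
{
    return bv->base + offset;
}

/* Address recorded under the proposed fix: the head, whatever the offset. */
uintptr_t toy_key_proposed(const toy_object *bv, size_t offset)
{
    (void)offset;   /* deliberately ignored */
    return bv->base;
}
```

With a 4096-byte object and offset 513, toy_pointer_retains() is false for 
toy_key_current() and true for toy_key_proposed(), which is the whole point 
of the proposed change.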

So I'd be interested in trying out GC_MALLOC_ATOMIC_IGNORE_OFF_PAGE, if this 
fix to bytevector->pointer makes sense and no one can point out another 
*concrete* situation where it would result in a GC bug.

Regards

        Daniel



