


From: Paolo Bonzini
Subject: Re: [Qemu-devel] Java volatile vs. C11 seq_cst (was Re: [PATCH v2 1/2] add a header file for atomic operations)
Date: Wed, 19 Jun 2013 11:31:09 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130514 Thunderbird/17.0.6

Il 18/06/2013 18:38, Torvald Riegel ha scritto:
> I don't think that this is the conclusion here.  I strongly suggest to
> just go with the C11/C++11 model, instead of rolling your own or trying
> to replicate the Java model.  That would also allow you to just point to
> the C11 model and any information / tutorials about it instead of having
> to document your own (see the patch), and you can make use of any
> (future) tool support (e.g., race detectors).

I'm definitely not rolling my own, but I think there is some value in
using the Java model.  Warning: the explanation came out quite
verbose... tl;dr at the end.


One reason is that implementing SC for POWER is quite expensive, while
this is not the case for Java volatile (which I'm still not convinced is
acq-rel, because it also orders volatile stores and volatile loads).
People working on QEMU are often used to manually placed barriers on
Linux, and Linux barriers do not fully give you seq-cst semantics.  They
give you something much more similar to the Java model.

The Java model gives good performance and is easier to understand than
the non-seqcst modes of atomic builtins.  It is pretty much impossible
to understand the latter without a formal model; I see the importance of
a formal model, but at the same time it is hard not to appreciate the
detailed-but-practical style of the Linux documentation.


Second, the Java model has very good "practical" documentation from
sources I trust.  Note the part about trust: I found way too many Java
tutorials, newsgroup posts, and blogs that say Java is SC, when it is not.

Paul's Linux docs are a source I trust, and the JSR-133 FAQ/cookbook too
(especially now that Richard and Paul completed my understanding of
them).  There are substantially fewer practical documents for C11/C++11
that are similarly authoritative.  I obviously trust Cambridge for
C11/C++11, but their material is very concise or just refers to the
formal model.  The formal model is not what I want when my question is
simply "why is lwsync good for acquire and release, but not for
seqcst?", for example.  And the papers sometimes refer to "private
communication" between the authors and other people, which can be
annoying.  Hans Boehm and Herb Sutter have good poster and slide
material, but they do not have the same level of completeness as Paul's
Linux documentation.  Paul _really_ has spoiled us "pure practitioners"...


Third, we must support old GCC (even as old as 4.2), so we need
hand-written assembly for atomics anyway.  This again goes back to
documentation and the JSR-133 cookbook.  It not only gives you
instructions on how to implement the model (which is also true for the
Cambridge web pages on C11/C++11), but is also a good base for writing
our own documentation.  It helped me understand existing code that uses
barriers, optimize it, and put this knowledge into words.  I just
couldn't find anything as useful for C11/C++11.


In short, the C11/C++11 model is not what most developers are used to
here, hardware is not 100% mature for it (for example ARMv8 has seqcst
load/store; perhaps POWER will grow that in time), it is harder to
optimize, and it has (as of 2013) less "practical" documentation from
sources I trust.

Besides, since what I'm using is weaker than SC, there's always the
possibility of switching to SC in the future when enough of these issues
are solved.  If you really need SC _now_, it is easy to get it using
fetch-and-add (for loads) or xchg (for stores).
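
To make the fetch-and-add/xchg remark concrete, here is a minimal sketch
written against C11 <stdatomic.h>.  The names sc_load, sc_store and
sc_demo are made up for illustration and are not QEMU API.  It assumes a
C11 compiler; with an old GCC the same idea would need hand-written
assembly instead.  A fetch-and-add of zero returns the current value
while acting as a seq_cst read-modify-write, and an exchange stores the
new value the same way.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical helper: SC load done as a read-modify-write.
 * Adding 0 leaves the value unchanged and returns what was there;
 * atomic_fetch_add defaults to memory_order_seq_cst. */
static inline int sc_load(atomic_int *p)
{
    return atomic_fetch_add(p, 0);
}

/* Hypothetical helper: SC store done as an exchange.
 * The old value is discarded; atomic_exchange is seq_cst by default. */
static inline void sc_store(atomic_int *p, int v)
{
    (void)atomic_exchange(p, v);
}

/* Single-threaded check of the values only; it says nothing
 * about ordering between threads. */
static void sc_demo(void)
{
    atomic_int x;

    atomic_init(&x, 0);
    sc_store(&x, 42);
    assert(sc_load(&x) == 42);
}
```

Note that the "load" here writes to the cache line, so it is more
expensive than a plain load; that cost is the price of getting SC on top
of the weaker primitives.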

>> I will just not use __atomic_load/__atomic_store to implement the 
>> primitives, and always express them in terms of memory barriers.
> 
> Why?  (If there's some QEMU-specific reason, just let me know; I know
> little about QEMU..)

I guess I mentioned the QEMU-specific reasons above.

> I would assume that using the __atomic* builtins is just fine if they're
> available.

The primitives would then have slightly different semantics depending
on the compiler version, so I think it's dangerous.
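
As a rough illustration of what "expressed in terms of memory barriers"
can look like, here is a deliberately conservative sketch.  All names
(full_barrier, load_with_barrier, store_with_barrier, publish, consume)
are hypothetical and not taken from the patch.  It assumes GCC or a
compatible compiler, uses the legacy __sync_synchronize() builtin as a
full barrier, and uses volatile accesses in the Linux style; a real
implementation would likely pick weaker, per-architecture barriers.

```c
#include <assert.h>

/* Hypothetical macro: full memory barrier via the legacy GCC builtin. */
#define full_barrier()  __sync_synchronize()

/* Load, then barrier: later accesses are not reordered before the load.
 * A full barrier is stronger than strictly needed here. */
static inline int load_with_barrier(const volatile int *p)
{
    int v = *p;
    full_barrier();
    return v;
}

/* Barrier, then store: earlier accesses are not reordered after the store. */
static inline void store_with_barrier(volatile int *p, int v)
{
    full_barrier();
    *p = v;
}

static int payload;          /* data being handed over */
static volatile int ready;   /* flag that publishes the data */

static void publish(int v)
{
    payload = v;
    store_with_barrier(&ready, 1);
}

/* Returns the payload if it has been published, -1 otherwise. */
static int consume(void)
{
    if (load_with_barrier(&ready)) {
        return payload;
    }
    return -1;
}

/* Single-threaded walk through the hand-over; checks values only. */
static void barrier_demo(void)
{
    assert(consume() == -1);
    publish(5);
    assert(consume() == 5);
}
```

Because the barrier is spelled out by hand, the semantics are the same
whatever the compiler version, which is the property I care about here.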

Paolo


