qemu-devel


From: Paolo Bonzini
Subject: Re: [Qemu-devel] Java volatile vs. C11 seq_cst (was Re: [PATCH v2 1/2] add a header file for atomic operations)
Date: Thu, 20 Jun 2013 09:53:49 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130514 Thunderbird/17.0.6

Il 19/06/2013 22:25, Torvald Riegel ha scritto:
> On Wed, 2013-06-19 at 17:14 +0200, Paolo Bonzini wrote:
>> (1) I don't care about relaxed RMW ops (loads/stores occur in hot paths,
>> but RMW shouldn't be that bad.  I don't care if reference counting is a
>> little slower than it could be, for example);
> 
> I doubt relaxed RMW ops are sufficient even for reference counting.

They are enough on the increment side, or so says boost...

http://www.chaoticmind.net/~hcb/projects/boost.atomic/doc/atomic/usage_examples.html#boost_atomic.usage_examples.example_reference_counters
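For illustration, here is a minimal C11 sketch of the pattern that Boost example describes: a relaxed increment, and a release decrement followed by an acquire fence when the count drops to zero. The struct and function names (obj, obj_ref, obj_unref) are hypothetical and are not taken from QEMU or Boost.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference-counted object; names are illustrative only. */
struct obj {
    atomic_int refcount;
    int payload;
};

static void obj_init(struct obj *o, int payload)
{
    atomic_init(&o->refcount, 1);   /* the creator holds the first reference */
    o->payload = payload;
}

/* Increment side: the caller already owns a reference, so the object
 * cannot disappear underneath it and no ordering is required. */
static void obj_ref(struct obj *o)
{
    atomic_fetch_add_explicit(&o->refcount, 1, memory_order_relaxed);
}

/* Decrement side: release makes this thread's earlier accesses to the
 * object visible before the count drops; the acquire fence makes the
 * thread that sees the count hit zero observe everyone else's accesses
 * before it tears the object down.  Returns true for the last reference. */
static bool obj_unref(struct obj *o)
{
    if (atomic_fetch_sub_explicit(&o->refcount, 1,
                                  memory_order_release) == 1) {
        atomic_thread_fence(memory_order_acquire);
        return true;    /* caller may now free the object */
    }
    return false;
}
```

So relaxed is enough only for the increment; the decrement still needs release/acquire ordering.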

>>    [An aside: Java guarantees that volatile stores are not reordered
>>    with volatile loads.  This is not guaranteed by just using release
>>    stores and acquire loads, and is why IIUC acq_rel < Java < seq_cst].
>
> Or maybe Java volatile is acq for loads and seq_cst for stores...

Perhaps (but I'm not 100% sure).

>> As long as you only have a producer and a consumer, C11 is fine, because
>> all you need is load-acquire/store-release.  In fact, if it weren't for
>> the experience factor, C11 is easier than manually placing acquire and
>> release barriers.  But as soon as two or more threads are reading _and_
>> writing the shared memory, it gets complicated and I want to provide
>> something simple that people can use.  This is the reason for (2) above.
> 
> I can't quite follow you here.  There is a total order for all
> modifications to a single variable, and if you use acq/rel combined with
> loads and stores on this variable, then you basically can make use of
> the total order.  (All loads that read-from a certain store get a
> synchronized-with (and thus happens-before edge) with the store, and the
> stores are in a total order.)  This is independent of the number of
> readers and writers.  The difference starts once you want to sync with
> more than one variable, and need to establish an order between those
> accesses.

You're right of course.  More specifically when there is a thread where
some variables are stored while others are loaded.
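As a sketch of the simple producer/consumer case, where store-release and load-acquire on a single flag are all that is needed, something like the following should work. It assumes POSIX threads are available; the variable and function names are made up for this example.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static int data;            /* plain, non-atomic payload */
static atomic_int ready;    /* flag that publishes the payload */

static void *producer(void *arg)
{
    (void)arg;
    data = 42;
    /* Release: the write to data cannot be reordered after this store. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

static void *consumer(void *arg)
{
    int *out = arg;
    /* Acquire: once the flag is seen, the write to data is visible too. */
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        ;   /* spin until the producer publishes */
    *out = data;
    return NULL;
}

/* Runs one producer and one consumer; returns what the consumer read. */
int run_message_passing(void)
{
    pthread_t p, c;
    int seen = -1;

    data = 0;
    atomic_store(&ready, 0);
    pthread_create(&c, NULL, consumer, &seen);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

The harder cases are the ones where a thread stores to one variable and then loads from another; acquire/release alone does not order that store before that load.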

>> There will still be a few cases that need to be optimized, and here is
>> where the difficult requirements come:
>>
>> (R1) the primitives *should* not be alien to people who know Linux.
>>
>> (R2) those optimizations *must* be easy to do and review; at least as
>> easy as these things go.
>>
>> The two are obviously related.  Ease of review is why it is important to
>> make things familiar to people who know Linux.
>>
>> In C11, relaxing SC loads and stores is complicated, and more
>> specifically hard to explain!
> 
> I can't see why that would be harder than reasoning about equally weaker
> Java semantics.  But you obviously know your community, and I don't :)

Because Java semantics are "almost" SC, and as Paul mentioned the
difference doesn't matter in practice (IRIW/RWC is where it matters, WRC
works even on Power; see
http://www.cl.cam.ac.uk/~pes20/ppc-supplemental/ppc051.html#toc5, row
WRC+lwsyncs).  It hasn't ever mattered for Linux, at least.

>> By contrast, Java volatile semantics are easily converted to a sequence
>> of relaxed loads, relaxed stores, and acq/rel/sc fences.
> 
> The same holds for C11/C++11.  If you look at either the standard or the
> Batty model, you'll see that for every pair like store(rel)--load(acq),
> there is also store(rel)--fence(acq)+load(relaxed),
> store(relaxed)+fence(rel)--fence(acq)+load(relaxed), etc. defined,
> giving the same semantics.  Likewise for SC.

Do you have a pointer to that?  It would help.
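To make the mapping concrete, here is a hedged C11 sketch of the same publish/consume pair written twice: once with the ordering attached to the store and load, once with relaxed accesses plus standalone fences. The channel struct and function names are hypothetical. The fence form gives at least the same synchronization for this pattern; it is not claimed to be identical in every respect.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct channel {
    int payload;        /* plain data, published through the flag */
    atomic_bool full;   /* true once payload is valid */
};

/* Variant A: store(rel) -- load(acq) */
static void publish_ops(struct channel *ch, int v)
{
    ch->payload = v;
    atomic_store_explicit(&ch->full, true, memory_order_release);
}

static bool consume_ops(struct channel *ch, int *out)
{
    if (!atomic_load_explicit(&ch->full, memory_order_acquire))
        return false;
    *out = ch->payload;
    return true;
}

/* Variant B: store(relaxed)+fence(rel) -- fence(acq)+load(relaxed) */
static void publish_fences(struct channel *ch, int v)
{
    ch->payload = v;
    atomic_thread_fence(memory_order_release);  /* fence before the store */
    atomic_store_explicit(&ch->full, true, memory_order_relaxed);
}

static bool consume_fences(struct channel *ch, int *out)
{
    if (!atomic_load_explicit(&ch->full, memory_order_relaxed))
        return false;
    atomic_thread_fence(memory_order_acquire);  /* fence after the load */
    *out = ch->payload;
    return true;
}
```

Note where the fences go: the release fence precedes the relaxed store, and the acquire fence follows the relaxed load.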

> You can also build Dekker with SC stores and acq loads, if I'm not
> mistaken.  Typically one would probably use SC fences and relaxed
> stores/loads.

Yes.
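A rough sketch of the SC-fence variant, reduced to the core of Dekker's entry protocol (a non-blocking "try" version without the turn variable, so it is not the full algorithm). The names try_enter and leave are made up for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int want[2];  /* want[i] != 0: thread i wants to enter */

/* Thread `me` (0 or 1) tries to enter; returns true on success. */
static bool try_enter(int me)
{
    atomic_store_explicit(&want[me], 1, memory_order_relaxed);

    /* The SC fence keeps the store above from being reordered after
     * the load below.  With only acquire/release, both threads could
     * read 0 here and enter together. */
    atomic_thread_fence(memory_order_seq_cst);

    if (atomic_load_explicit(&want[1 - me], memory_order_relaxed) == 0) {
        /* Pairs with the release store in leave() so that the previous
         * owner's critical section happens-before ours. */
        atomic_thread_fence(memory_order_acquire);
        return true;
    }

    /* Contention: back off.  Nothing was protected, so relaxed is fine. */
    atomic_store_explicit(&want[me], 0, memory_order_relaxed);
    return false;
}

static void leave(int me)
{
    atomic_store_explicit(&want[me], 0, memory_order_release);
}
```

If both threads call try_enter at the same time, both may fail, but the two SC fences rule out both succeeding. A complete Dekker implementation adds the turn variable to guarantee progress.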

>>> I guess so.  But you also have to consider the legacy that you create.
>>> I do think the C11/C++11 model will be used widely, and more and more
>>> people will get used to it.
>>
>> I don't think many people will learn how to use the various non-seqcst
>> modes...  At least so far I punted. :)
> 
> But you already use similarly weaker orderings that the other
> abstractions provide (e.g., Java), so you're half-way there :)

True.  On the other hand you can treat Java like "kinda SC but don't
worry, you won't see the difference".  It is both worrisome and appealing...

Paolo


