Re: [Qemu-devel] [RFC v3 PATCH 01/14] Introduce TCGOpcode for memory bar
From: Pranith Kumar
Subject: Re: [Qemu-devel] [RFC v3 PATCH 01/14] Introduce TCGOpcode for memory barrier
Date: Tue, 21 Jun 2016 10:52:08 -0400

Hi Sergey,
On Mon, Jun 20, 2016 at 5:21 PM, Sergey Fedorov <address@hidden> wrote:
> On 18/06/16 07:03, Pranith Kumar wrote:
>> diff --git a/tcg/tcg.h b/tcg/tcg.h
>> index db6a062..36feca9 100644
>> --- a/tcg/tcg.h
>> +++ b/tcg/tcg.h
>> @@ -408,6 +408,20 @@ static inline intptr_t QEMU_ARTIFICIAL
>> GET_TCGV_PTR(TCGv_ptr t)
>> #define TCG_CALL_DUMMY_TCGV MAKE_TCGV_I32(-1)
>> #define TCG_CALL_DUMMY_ARG ((TCGArg)(-1))
>>
>> +typedef enum {
>> + TCG_MO_LD_LD = 1,
>> + TCG_MO_ST_LD = 2,
>> + TCG_MO_LD_ST = 4,
>> + TCG_MO_ST_ST = 8,
>> + TCG_MO_ALL = 0xF, // OR of all above
>
> So TCG_MO_ALL specifies a so called "full" memory barrier?
This enum only specifies which loads and stores need to be ordered.
TCG_MO_ALL specifies that we need to order both previous loads and
stores with later loads and stores. To get a full memory barrier you
need to pair it with TCG_BAR_SC:
    TCG_MO_ALL | TCG_BAR_SC
>
>> +} TCGOrder;
>> +
>> +typedef enum {
>> + TCG_BAR_ACQ = 32,
>> + TCG_BAR_REL = 64,
>
> I'm convinced that the only practical way to represent a standalone
> acquire memory barrier is to order all previous loads with all
> subsequent loads and stores. Similarly, a standalone release memory
> barrier would order all previous loads and stores with all subsequent
> stores. [1]
Yes, here acquire would be:
(TCG_MO_LD_ST | TCG_MO_LD_LD) | TCG_BAR_ACQ
and release would be:
(TCG_MO_ST_ST | TCG_MO_LD_ST) | TCG_BAR_REL
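
As a rough sketch of how these masks compose, using the values from the
quoted hunk. The enum name TCGBar and the mb_* helper names are
assumptions made for illustration; they are not part of the patch.

```c
/* Which pairs of accesses must stay ordered (values from the hunk). */
typedef enum {
    TCG_MO_LD_LD = 1,   /* earlier loads  before later loads  */
    TCG_MO_ST_LD = 2,   /* earlier stores before later loads  */
    TCG_MO_LD_ST = 4,   /* earlier loads  before later stores */
    TCG_MO_ST_ST = 8,   /* earlier stores before later stores */
    TCG_MO_ALL   = 0xF, /* OR of all of the above */
} TCGOrder;

/* What kind of barrier is requested. The enum name is assumed here;
 * the quoted hunk does not show it. */
typedef enum {
    TCG_BAR_ACQ = 32,
    TCG_BAR_REL = 64,
    TCG_BAR_SC  = 128,
} TCGBar;

/* Hypothetical helper: standalone acquire barrier.
 * Orders earlier loads against later loads and stores. */
static inline unsigned mb_standalone_acquire(void)
{
    return (TCG_MO_LD_ST | TCG_MO_LD_LD) | TCG_BAR_ACQ;
}

/* Hypothetical helper: standalone release barrier.
 * Orders earlier loads and stores against later stores. */
static inline unsigned mb_standalone_release(void)
{
    return (TCG_MO_ST_ST | TCG_MO_LD_ST) | TCG_BAR_REL;
}

/* Hypothetical helper: full barrier, everything ordered. */
static inline unsigned mb_full(void)
{
    return TCG_MO_ALL | TCG_BAR_SC;
}
```

Note that neither standalone mask includes TCG_MO_ST_LD; only the full
barrier orders earlier stores against later loads.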
>
> On the other hand, acquire or release semantic associated with a memory
> operation itself can be directly mapped into e.g. AArch64's Load-Acquire
> (LDAR) and Store-Release (STLR) instructions. A standalone barrier
> adjacent to a memory operation shouldn't be mapped this way because it
> should provide more strict guarantees than e.g. AArch64 instructions
> mentioned above.
You are right. That is why the load-acquire operation generates the
stronger barrier TCG_MO_ALL | TCG_BAR_ACQ, and not the standalone
acquire barrier above. The same applies to store-release.
>
> Therefore, I advocate for clear distinction between standalone memory
> barriers and implicit memory ordering semantics associated with memory
> operations themselves.
Any suggestions on how to make the distinction clearer? I will add a
detailed comment like the above but please let me know if you have
anything in mind.
>
> [1] http://preshing.com/20130922/acquire-and-release-fences/
>
>> + TCG_BAR_SC = 128,
>
> How's that different from TCG_MO_ALL?
TCG_BAR_* tells us what kind of ordering is enforced. TCG_MO_* tells us
on which operations that ordering is to be enforced.
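
To illustrate the split, here is a minimal sketch of how a consumer of
the combined argument might separate the two parts. The values come from
the quoted hunk; the MB_BAR_MASK macro and the mb_* function names are
hypothetical and only serve the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Values from the quoted hunk, flattened into one anonymous enum. */
enum {
    TCG_MO_LD_LD = 1,
    TCG_MO_ST_LD = 2,
    TCG_MO_LD_ST = 4,
    TCG_MO_ST_ST = 8,
    TCG_MO_ALL   = 0xF,
    TCG_BAR_ACQ  = 32,
    TCG_BAR_REL  = 64,
    TCG_BAR_SC   = 128,
};

/* Hypothetical mask covering the TCG_BAR_* bits. */
#define MB_BAR_MASK (TCG_BAR_ACQ | TCG_BAR_REL | TCG_BAR_SC)

/* The "on which operations" part: the TCG_MO_* bits. */
static inline unsigned mb_order_bits(unsigned mb)
{
    /* Reject bits that belong to neither group. */
    assert((mb & ~(unsigned)(TCG_MO_ALL | MB_BAR_MASK)) == 0);
    return mb & TCG_MO_ALL;
}

/* The "what kind of ordering" part: the TCG_BAR_* bits. */
static inline unsigned mb_bar_bits(unsigned mb)
{
    return mb & MB_BAR_MASK;
}

/* True if barrier a orders every access pair that barrier b orders.
 * Only the TCG_MO_* bits are compared. */
static inline bool mb_orders_at_least(unsigned a, unsigned b)
{
    return (mb_order_bits(a) & mb_order_bits(b)) == mb_order_bits(b);
}
```

With this, mb_orders_at_least(TCG_MO_ALL | TCG_BAR_ACQ,
TCG_MO_LD_LD | TCG_MO_LD_ST | TCG_BAR_ACQ) holds, which matches the
point that the load-acquire operation gets a stronger barrier than the
standalone acquire barrier.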
Thanks,
--
Pranith