Re: [Qemu-devel] [RFC v3 PATCH 01/14] Introduce TCGOpcode for memory bar
From: Sergey Fedorov
Subject: Re: [Qemu-devel] [RFC v3 PATCH 01/14] Introduce TCGOpcode for memory barrier
Date: Tue, 21 Jun 2016 00:21:58 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0
On 18/06/16 07:03, Pranith Kumar wrote:
> diff --git a/tcg/tcg.h b/tcg/tcg.h
> index db6a062..36feca9 100644
> --- a/tcg/tcg.h
> +++ b/tcg/tcg.h
> @@ -408,6 +408,20 @@ static inline intptr_t QEMU_ARTIFICIAL GET_TCGV_PTR(TCGv_ptr t)
> #define TCG_CALL_DUMMY_TCGV MAKE_TCGV_I32(-1)
> #define TCG_CALL_DUMMY_ARG ((TCGArg)(-1))
>
> +typedef enum {
> + TCG_MO_LD_LD = 1,
> + TCG_MO_ST_LD = 2,
> + TCG_MO_LD_ST = 4,
> + TCG_MO_ST_ST = 8,
> + TCG_MO_ALL = 0xF, // OR of all above
So TCG_MO_ALL specifies a so-called "full" memory barrier?
> +} TCGOrder;
> +
> +typedef enum {
> + TCG_BAR_ACQ = 32,
> + TCG_BAR_REL = 64,
I'm convinced that the only practical way to represent a standalone
acquire memory barrier is to order all previous loads with all
subsequent loads and stores. Similarly, a standalone release memory
barrier would order all previous loads and stores with all subsequent
stores. [1]
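To make the mapping concrete, here is a minimal sketch of how the two standalone barriers would decompose into the ordering bits from the quoted patch. The helper names standalone_acquire_order() and standalone_release_order() are hypothetical and only serve the illustration; they are not part of the patch.

```c
#include <assert.h>

/* Ordering bits copied from the quoted patch. */
typedef enum {
    TCG_MO_LD_LD = 1,   /* earlier loads  before later loads  */
    TCG_MO_ST_LD = 2,   /* earlier stores before later loads  */
    TCG_MO_LD_ST = 4,   /* earlier loads  before later stores */
    TCG_MO_ST_ST = 8,   /* earlier stores before later stores */
    TCG_MO_ALL   = 0xF, /* OR of all above */
} TCGOrder;

/* Sanity check: TCG_MO_ALL really is the OR of the four bits. */
static_assert((TCG_MO_LD_LD | TCG_MO_ST_LD | TCG_MO_LD_ST | TCG_MO_ST_ST)
              == TCG_MO_ALL, "TCG_MO_ALL must cover every ordering bit");

/* Hypothetical helper: a standalone acquire barrier orders all previous
 * loads with all subsequent loads and stores. */
static inline int standalone_acquire_order(void)
{
    return TCG_MO_LD_LD | TCG_MO_LD_ST;
}

/* Hypothetical helper: a standalone release barrier orders all previous
 * loads and stores with all subsequent stores. */
static inline int standalone_release_order(void)
{
    return TCG_MO_LD_ST | TCG_MO_ST_ST;
}
```

Note that even the union of the two leaves TCG_MO_ST_LD unset, so acquire plus release is still weaker than TCG_MO_ALL.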
On the other hand, acquire or release semantics associated with a memory
operation itself can be directly mapped into e.g. AArch64's Load-Acquire
(LDAR) and Store-Release (STLR) instructions. A standalone barrier
adjacent to a memory operation shouldn't be mapped this way because it
must provide stricter guarantees than the AArch64 instructions
mentioned above.
Therefore, I advocate a clear distinction between standalone memory
barriers and the implicit memory ordering semantics associated with memory
operations themselves.
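The same distinction exists in C11, which may serve as an analogy. The sketch below is not TCG code; the names flag, payload, publish() and the two reader functions are made up for illustration. Reader A attaches acquire semantics to the load itself, while reader B uses a relaxed load followed by a standalone acquire fence.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int flag;  /* hypothetical "payload is ready" flag */
static int payload;      /* plain data published through the flag */

/* Writer: release semantics attached to the store itself (the kind of
 * operation that can map to a single store-release instruction).
 * Negative values are reserved as the readers' "not ready" result. */
static void publish(int value)
{
    assert(value >= 0);
    payload = value;
    atomic_store_explicit(&flag, 1, memory_order_release);
}

/* Reader A: acquire semantics attached to the load itself. Only this
 * one load is ordered before the accesses that follow it. */
static int read_with_load_acquire(void)
{
    if (atomic_load_explicit(&flag, memory_order_acquire)) {
        return payload;
    }
    return -1;
}

/* Reader B: relaxed load plus a standalone acquire fence. The fence is
 * not tied to one access: it orders every earlier load against all
 * later loads and stores, which is the stricter guarantee. */
static int read_with_acquire_fence(void)
{
    if (atomic_load_explicit(&flag, memory_order_relaxed)) {
        atomic_thread_fence(memory_order_acquire);
        return payload;
    }
    return -1;
}
```

Both readers are correct for this pattern, but the two forms are not interchangeable in general, which is why I would keep them apart in the TCG representation as well.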
[1] http://preshing.com/20130922/acquire-and-release-fences/
> + TCG_BAR_SC = 128,
How's that different from TCG_MO_ALL?
> +} TCGBar;
> +
> /* Conditions. Note that these are laid out for easy manipulation by
> the functions below:
> bit 0 is used for inverting;
Kind regards,
Sergey