Re: [Qemu-devel] Atomic instruction.

From:	Peter Maydell
Subject:	Re: [Qemu-devel] Atomic instruction.
Date:	Fri, 4 Jul 2014 17:02:24 +0100

On 4 July 2014 16:35, Frederic Konrad <address@hidden> wrote:
> Hi everybody,
>
> We are experimenting with multi-core QEMU. We have multiple QEMU cores
> running on multiple host threads and we are now looking at the issue of
> ‘atomic’ instructions.
>
> Our initial thought was to push some sort of set of flags out to the memory
> chain (e.g. ATOMIC or NORMAL, much as the H/W would do). However,
> this is a lot of changes, and there seem to be a number of
> different ways of doing this.

Yes, I don't think you can emulate load-store exclusive instructions
like this. (Also, these days hardware doesn't generally use that sort
of "lock the bus" signal, which is awkward for SMP; indeed that's
why load-store exclusive paired instructions have taken over from
SWP on ARM.)

> We think the best approach overall is to leave the current mechanisms for
> guaranteeing the functionality of e.g. load/store exclusive in place.
> That is to say, right now, for instance for ARM, QEMU stores the addr/val of
> loads, and compares them to ensure they have not been changed on store.
> Effectively it does a load-compare-store for the store.
> This is a ‘belt and braces’ implementation of the H/W, but it’s good.

Actually what we do at the moment isn't architecturally valid for ARM.
If any other core writes the same value to the memory location
between the LDREX and the STREX the architecture says that we
must fail the STREX, but our implementation does not. The
architecture also says that plain stores by other cores should
break the lock, and we don't implement that (we only handle STREX
specially). I recommend reading the ARM ARM sections on
synchronization and exclusive accesses (though they are rather
heavy going...)
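
To make the first problem concrete, here is a minimal single-threaded
sketch of a value-comparison scheme. It is not QEMU's actual code: the
VCpuCmp type and the emu_ldrex/emu_strex helpers are hypothetical names
made up for illustration, and the "other core" is simply replayed inline.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-vCPU state for a value-comparison scheme. */
typedef struct {
    uint32_t *excl_addr; /* address recorded by the last LDREX */
    uint32_t excl_val;   /* value observed by the last LDREX */
} VCpuCmp;

/* LDREX: remember the address and the value that was loaded. */
static uint32_t emu_ldrex(VCpuCmp *cpu, uint32_t *addr)
{
    cpu->excl_addr = addr;
    cpu->excl_val = *addr;
    return cpu->excl_val;
}

/* STREX: succeed only if memory still holds the remembered value.
 * Returns true when the store was performed. */
static bool emu_strex(VCpuCmp *cpu, uint32_t *addr, uint32_t newval)
{
    bool ok = cpu->excl_addr == addr && *addr == cpu->excl_val;
    if (ok) {
        *addr = newval;
    }
    cpu->excl_addr = NULL; /* the exclusive state is consumed either way */
    return ok;
}

/* Replays the interleaving: LDREX, another core writes the SAME value,
 * then STREX. Returns whether the STREX (wrongly) succeeded. */
static bool same_value_write_goes_unnoticed(void)
{
    uint32_t mem = 5;
    VCpuCmp cpu0 = {0};

    emu_ldrex(&cpu0, &mem);
    mem = 5; /* plain store by "another core", same value as before */
    return emu_strex(&cpu0, &mem, 6);
}
```

In this sketch the STREX succeeds after the intervening same-value
store, which is exactly the case the architecture says must fail.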

In a multi-threaded TCG world I would be inclined actually to
implement this in a manner somewhat closer to what hardware
does: on LDREX you mark the page as read-only, and (using a
similar method to what we do for watchpoints) arrange that if
some other core writes to that page then we remove the read-only
protection and note that the STREX should fail.
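
A rough single-threaded model of that idea, under heavy assumptions: the
page protection is stood in for by a mon_store() hook that every guest
store is assumed to pass through, tracking is per address rather than
per page, and Monitor, monitors[], and the mon_* helpers are all
hypothetical names, not existing QEMU functions. A real multi-threaded
version would also need these steps to be atomic with respect to each
other.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NCPUS 2

/* Hypothetical per-vCPU exclusive monitor. */
typedef struct {
    uint32_t *addr; /* location marked by the last LDREX */
    bool armed;     /* false once another core has written to it */
} Monitor;

static Monitor monitors[NCPUS];

/* LDREX: arm this vCPU's monitor on the address and load the value. */
static uint32_t mon_ldrex(int cpu, uint32_t *addr)
{
    monitors[cpu].addr = addr;
    monitors[cpu].armed = true;
    return *addr;
}

/* Every store goes through here (standing in for the write fault on a
 * read-only page). A store by one core breaks other cores' monitors. */
static void mon_store(int cpu, uint32_t *addr, uint32_t val)
{
    for (int i = 0; i < NCPUS; i++) {
        if (i != cpu && monitors[i].armed && monitors[i].addr == addr) {
            monitors[i].armed = false; /* note that its STREX must fail */
        }
    }
    *addr = val;
}

/* STREX: store only if our monitor is still armed for this address. */
static bool mon_strex(int cpu, uint32_t *addr, uint32_t val)
{
    bool ok = monitors[cpu].armed && monitors[cpu].addr == addr;
    monitors[cpu].armed = false;
    if (ok) {
        mon_store(cpu, addr, val);
    }
    return ok;
}
```

Because the monitor tracks writes rather than values, a store of the
same value by another core still makes the STREX fail, and plain stores
break the lock just as STREX does.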

> I believe it is valid to say that - so long as each core can guarantee that
> the ‘load-compare-store’ is somehow atomic, then it is perfectly satisfactory
> for each QEMU core to hold its own value for the ‘old value’ etc, and this
> mechanism will still work.
>
> The issue will only be to ensure that the load-compare-store is atomic - and
> only in the ‘store exclusive’.

You also need some means of ensuring that atomicity is preserved
in other cases: for instance, for ARM guests with LPAE, 64-bit loads
and stores must be atomic, but we don't currently guarantee that
in TCG.
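
As an illustration of what "atomic 64-bit access" means on the host
side, here is a small sketch using C11 <stdatomic.h>. It assumes guest
memory can be accessed through an _Atomic uint64_t pointer, which is a
simplification, and the helper names are hypothetical. On hosts where
the type is not lock-free the compiler may fall back to a lock-based
implementation, which is the kind of guest/host mismatch at issue.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: a 64-bit load that cannot be torn into two
 * 32-bit halves. Relaxed ordering: atomicity only, no extra fences. */
static uint64_t load64_atomic(_Atomic uint64_t *p)
{
    return atomic_load_explicit(p, memory_order_relaxed);
}

/* Hypothetical helper: the matching single-copy-atomic 64-bit store. */
static void store64_atomic(_Atomic uint64_t *p, uint64_t v)
{
    atomic_store_explicit(p, v, memory_order_relaxed);
}

/* Reports whether the host does this without an internal lock. */
static bool host_has_lock_free_64(_Atomic uint64_t *p)
{
    return atomic_is_lock_free(p);
}
```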

> Overall, this mechanism does not actually mirror the hardware we are
> modelling, so we think it would be easier and more re-usable to provide
> two new ops in the TCG, one to lock a mutex, one to release it.

What in particular are you proposing that these mutexes should
protect against? I suspect you may want to describe the semantics
at a slightly higher level (perhaps "do not allow any other vCPU
to execute while these TCG instructions are executing" markers?)

It would probably also be useful if you were to sketch out how
you would expect this to work for:
 * a simple atomic operation (e.g. ARM SWP or some of the x86
   LOCK-prefixed insns)
 * LDREX/STREX or load-locked/store-conditional pairs
 * any other atomicity requirements that might differ between
   guest and host, like whether 64-bit accesses are atomic
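
For the first case in the list, one possible shape (a sketch only, not a
proposal for the actual TCG interface) is to map the guest operation
directly onto a host atomic. The helper names below are hypothetical,
and treating guest memory as _Atomic uint32_t is an assumption.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper for ARM SWP: atomically write newval and
 * return the value that was previously in memory. */
static uint32_t emu_swp(_Atomic uint32_t *addr, uint32_t newval)
{
    return atomic_exchange(addr, newval);
}

/* Hypothetical helper for an x86 LOCK XADD-style insn: atomically
 * add to memory and return the old value. */
static uint32_t emu_lock_xadd(_Atomic uint32_t *addr, uint32_t inc)
{
    return atomic_fetch_add(addr, inc);
}
```

This covers only single read-modify-write operations; load-locked /
store-conditional pairs span several guest instructions and do not map
onto a single host atomic in the same way.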

thanks
-- PMM
