
[Qemu-devel] [PATCH] Add a memory barrier to DMA functions


From: Benjamin Herrenschmidt
Subject: [Qemu-devel] [PATCH] Add a memory barrier to DMA functions
Date: Tue, 22 May 2012 14:34:38 +1000

The emulated devices can run simultaneously with the guest, so
we need to be careful with the ordering of loads and stores they
perform on guest system memory, which must be observed in the
right order by the guest operating system.

This adds a barrier call to the basic DMA read/write ops; it is
currently implemented as an smp_mb(), but could later be improved
for more fine-grained control of barriers.

Additionally, a _relaxed() variant of the accessors is provided
to make it easy to convert devices that are performance sensitive
and negatively impacted by the change.

Signed-off-by: Benjamin Herrenschmidt <address@hidden>
---

So here's the latest try :-) I've kept it simple: I don't
add anything to map/unmap at this stage, so we might still
have a problem with devices that use those a lot without any
explicit barrier (AHCI?). My preference is to also add
barriers to map/unmap by default, but we can discuss it.

Note that I've put the barrier in an inline "helper" so
we can argue about the type of barriers, try to be smart,
or use flags in the DMAContext or whatever we want, reasonably
easily, since we have a single spot to modify.

I'm now going to see if I can measure a performance hit on
my x86 laptop, but if somebody who has existing x86 guest setups
wants to help, that would be very welcome :-)

 dma.h |   54 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/dma.h b/dma.h
index f1fcb71..0d57e50 100644
--- a/dma.h
+++ b/dma.h
@@ -13,6 +13,7 @@
 #include <stdio.h>
 #include "hw/hw.h"
 #include "block.h"
+#include "kvm.h"
 
 typedef struct DMAContext DMAContext;
 typedef struct ScatterGatherEntry ScatterGatherEntry;
@@ -70,6 +71,30 @@ typedef struct DMAContext {
     DMAUnmapFunc *unmap;
 } DMAContext;
 
+static inline void dma_barrier(DMAContext *dma, DMADirection dir)
+{
+    /*
+     * This is called before DMA read and write operations
+     * unless the _relaxed form is used and is responsible
+     * for providing some sane ordering of accesses vs
+     * concurrently running VCPUs.
+     *
+     * Users of map(), unmap() or lower level st/ld_*
+     * operations are responsible for providing their own
+     * ordering via barriers.
+     *
+     * This primitive implementation does a simple smp_mb()
+     * before each operation which provides pretty much full
+     * ordering.
+     *
+     * A smarter implementation can be devised if needed to
+     * use lighter barriers based on the direction of the
+     * transfer, the DMA context, etc...
+     */
+    if (kvm_enabled())
+        smp_mb();
+}
+
 static inline bool dma_has_iommu(DMAContext *dma)
 {
     return !!dma;
@@ -93,8 +118,9 @@ static inline bool dma_memory_valid(DMAContext *dma,
 
 int iommu_dma_memory_rw(DMAContext *dma, dma_addr_t addr,
                         void *buf, dma_addr_t len, DMADirection dir);
-static inline int dma_memory_rw(DMAContext *dma, dma_addr_t addr,
-                                void *buf, dma_addr_t len, DMADirection dir)
+static inline int dma_memory_rw_relaxed(DMAContext *dma, dma_addr_t addr,
+                                        void *buf, dma_addr_t len,
+                                        DMADirection dir)
 {
     if (!dma_has_iommu(dma)) {
         /* Fast-path for no IOMMU */
@@ -106,6 +132,28 @@ static inline int dma_memory_rw(DMAContext *dma, dma_addr_t addr,
     }
 }
 
+static inline int dma_memory_read_relaxed(DMAContext *dma, dma_addr_t addr,
+                                          void *buf, dma_addr_t len)
+{
+    return dma_memory_rw_relaxed(dma, addr, buf, len, DMA_DIRECTION_TO_DEVICE);
+}
+
+static inline int dma_memory_write_relaxed(DMAContext *dma, dma_addr_t addr,
+                                           const void *buf, dma_addr_t len)
+{
+    return dma_memory_rw_relaxed(dma, addr, (void *)buf, len,
+                                 DMA_DIRECTION_FROM_DEVICE);
+}
+
+static inline int dma_memory_rw(DMAContext *dma, dma_addr_t addr,
+                                void *buf, dma_addr_t len,
+                                DMADirection dir)
+{
+    dma_barrier(dma, dir);
+
+    return dma_memory_rw_relaxed(dma, addr, buf, len, dir);
+}
+
 static inline int dma_memory_read(DMAContext *dma, dma_addr_t addr,
                                   void *buf, dma_addr_t len)
 {
@@ -124,6 +172,8 @@ int iommu_dma_memory_set(DMAContext *dma, dma_addr_t addr, uint8_t c,
 static inline int dma_memory_set(DMAContext *dma, dma_addr_t addr,
                                  uint8_t c, dma_addr_t len)
 {
+    dma_barrier(dma, DMA_DIRECTION_FROM_DEVICE);
+
     if (!dma_has_iommu(dma)) {
         /* Fast-path for no IOMMU */
         cpu_physical_memory_set(addr, c, len);




