Re: [Qemu-devel] race between tb_gen_code() and qemu_ram_free()


From: Peter Maydell
Subject: Re: [Qemu-devel] race between tb_gen_code() and qemu_ram_free()
Date: Mon, 23 Apr 2018 17:47:54 +0100

On 20 April 2018 at 19:20, Paolo Bonzini <address@hidden> wrote:
> Note that qemu_ram_free() is _also_ RCU-freeing the RAMBlock.  If it is
> not found, it means that codegen is not running within
> rcu_read_lock()/rcu_read_unlock().  In fact it's not.

This seems to not be the only problem. We get into difficulties
somewhat earlier, because:
 * get_page_addr_code() calls memory_region_request_mmio_ptr()
 * memory_region_request_mmio_ptr() calls the xilinx-spips device's
   request_ptr function
 * the xilinx-spips code calls memory_region_invalidate_mmio_ptr()
   on the old MMIO pointer, since it only supports one at a time
 * at this point the device expects that the memory is not
   being used by anything: it always uses the same host buffer,
   so as soon as it fills it with the new page's data the
   content being used for the old page will be gone
 * but memory_region_invalidate_mmio_ptr() has merely deferred
   the invalidate for later
 * memory_region_request_mmio_ptr() now creates a new mmio-interface
   device, which maps itself into the device's memory region
   using memory_region_add_subregion()
 * we'll only unmap the old mmio-interface device in
   memory_region_do_invalidate_mmio_ptr() at some later date
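
To make the ordering above concrete, here is a rough toy model in C.
It is only a sketch: ToyDevice, ToyMapping and every toy_* function
are names made up for illustration and are not QEMU APIs. The one
thing it tries to capture is that the buffer is refilled and a second
mapping is created while the first mapping is still live, because the
invalidate has only been queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model only: these types and functions are hypothetical, not QEMU APIs. */
#define TOY_PAGE_SIZE 16
#define TOY_MAX_MAPS  4

typedef struct {
    bool live;            /* still mapped into the device's region */
    bool pending_unmap;   /* invalidate requested but not yet performed */
    int page;             /* device page this mapping claims to expose */
    const char *host;     /* host memory backing the mapping */
} ToyMapping;

typedef struct {
    char buffer[TOY_PAGE_SIZE];      /* the device's single host buffer */
    ToyMapping maps[TOY_MAX_MAPS];
    int current;                     /* index of the newest mapping, -1 if none */
} ToyDevice;

static void toy_init(ToyDevice *d)
{
    memset(d, 0, sizeof(*d));
    d->current = -1;
}

/* Stand-in for the deferred invalidate: it only flags the mapping. */
static void toy_invalidate_deferred(ToyDevice *d, int idx)
{
    d->maps[idx].pending_unmap = true;
}

/* Stand-in for the later step that really unmaps flagged mappings. */
static void toy_do_invalidate(ToyDevice *d)
{
    for (int i = 0; i < TOY_MAX_MAPS; i++) {
        if (d->maps[i].pending_unmap) {
            d->maps[i].live = false;
            d->maps[i].pending_unmap = false;
        }
    }
}

/* Stand-in for the request path; returns the new mapping's index or -1. */
static int toy_request_ptr(ToyDevice *d, int page)
{
    if (d->current >= 0) {
        toy_invalidate_deferred(d, d->current);  /* old mapping stays live */
    }
    /* The device reuses its only buffer, overwriting the old page's data. */
    memset(d->buffer, 'A' + page, TOY_PAGE_SIZE);
    for (int i = 0; i < TOY_MAX_MAPS; i++) {
        if (!d->maps[i].live) {
            d->maps[i] = (ToyMapping){ .live = true, .page = page,
                                       .host = d->buffer };
            d->current = i;
            return i;
        }
    }
    return -1;
}

/* Number of mappings that are currently live. */
static int toy_live_count(const ToyDevice *d)
{
    int n = 0;
    for (int i = 0; i < TOY_MAX_MAPS; i++) {
        n += d->maps[i].live ? 1 : 0;
    }
    return n;
}
```

Requesting page 0 and then page 1 in this model leaves two live
mappings that point at the same buffer, and the mapping that still
claims to be page 0 now reads back page 1's bytes until
toy_do_invalidate() runs.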

Problems:
(1) if you flip between pages too quickly you'll hit the
    old stale mmio-interface MemoryRegion rather than creating a
    fresh one

(2) since we create the new mmio-interface before the old
    one is deleted, there is an interim period when there
    are two RAMBlocks in the list with the same host address.
    Which one you get back from qemu_ram_addr_from_host()
    or qemu_ram_block_from_host() is unclear. This is actually
    the primary cause of the crash here -- get_page_addr_code()
    hits the new MemoryRegion in the TLB, but then when it
    tries to get the ram_addr from the host address that's
    also in the TLB it gets given back the ram_addr for
    the about-to-die RAMBlock rather than the new one.
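
    A minimal sketch of why the answer is unclear, assuming a lookup
    that walks a list and returns the first block whose host range
    contains the pointer. ToyBlock and the toy_* functions are
    hypothetical and only loosely modelled on the idea of a host
    address to RAMBlock lookup; the real list ordering and lookup
    code in QEMU are more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: ToyBlock and toy_* are hypothetical, not QEMU code. */
typedef struct ToyBlock {
    uint8_t *host;          /* start of the host memory for this block */
    size_t len;             /* length of the block in bytes */
    uint64_t offset;        /* stands in for the block's ram_addr base */
    struct ToyBlock *next;
} ToyBlock;

/* True if ptr falls inside the block's host range. */
static int toy_block_contains(const ToyBlock *b, const void *ptr)
{
    uintptr_t p = (uintptr_t)ptr;
    uintptr_t start = (uintptr_t)b->host;
    return p >= start && p - start < b->len;
}

/* Return the first block in list order that contains ptr, else NULL. */
static ToyBlock *toy_block_from_host(ToyBlock *head, const void *ptr)
{
    for (ToyBlock *b = head; b != NULL; b = b->next) {
        if (toy_block_contains(b, ptr)) {
            return b;
        }
    }
    return NULL;
}

/* Count how many blocks claim ptr; anything above 1 breaks the invariant. */
static int toy_count_matches(const ToyBlock *head, const void *ptr)
{
    int n = 0;
    for (const ToyBlock *b = head; b != NULL; b = b->next) {
        n += toy_block_contains(b, ptr);
    }
    return n;
}
```

    With two blocks over the same buffer, toy_count_matches()
    returns 2 and toy_block_from_host() hands back whichever block
    the list happens to place first, so the caller can end up with
    the offset of the block that is about to be freed.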

The memory system design fairly strongly assumes that one
host address is only associated with one MemoryRegion at
a time, so we need to avoid breaking that with mmio-interface...

thanks
-- PMM


