
Re: [Qemu-devel] [patch v4 12/16] e1000: apply fine lock on e1000


From: Avi Kivity
Subject: Re: [Qemu-devel] [patch v4 12/16] e1000: apply fine lock on e1000
Date: Thu, 25 Oct 2012 19:02:01 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:16.0) Gecko/20121016 Thunderbird/16.0.1

On 10/25/2012 06:39 PM, Jan Kiszka wrote:
>> 
>> That doesn't work cross-thread.
>> 
>> vcpu A: write to device X, dma-ing to device Y
>> vcpu B: write to device Y, dma-ing to device X
> 
> We will deny DMA-ing from device X on behalf of a VCPU, i.e. in dispatch
> context, to Y.
> 
> What we do not deny, though, is DMA-ing from an I/O thread that
> processes an event for device X. 

I would really like to avoid depending on the context.  In real hardware, there 
is no such thing.
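The cross-thread hazard above can be sketched as follows (hypothetical `lock_x`/`lock_y` names; device and DMA details elided):

```python
import threading

# Hypothetical per-device locks for devices X and Y.
lock_x = threading.Lock()
lock_y = threading.Lock()

def vcpu_a_write_to_x():
    # vcpu A: MMIO write dispatched to device X...
    with lock_x:          # X taken first
        # ...whose handler DMAs into device Y's region, which
        # needs Y's lock: acquisition order is X -> Y.
        with lock_y:
            pass          # DMA to Y

def vcpu_b_write_to_y():
    # vcpu B: MMIO write dispatched to device Y...
    with lock_y:          # Y taken first
        # ...whose handler DMAs into device X: order is Y -> X.
        with lock_x:
            pass          # DMA to X

# Run concurrently, the opposite acquisition orders (X->Y vs. Y->X)
# form the classic ABBA cycle: each thread can end up holding one
# lock while waiting forever on the other.
```

Run from a single thread the calls complete, of course; the cycle only closes when both orders are in flight at once.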

> If the invoked callback of device X
> holds the device lock across some DMA request to Y, then we risk running
> into the same ABBA issue. Hmm...

Yup.

> 
>> 
>> My suggestion was to drop the locks around DMA, then re-acquire the lock
>> and re-validate data.
> 
> Maybe possible, but hairy depending on the device model.

It's unpleasant, yes.

Note that depending on the device, we may not need to re-validate data; it may 
be sufficient to load it into local variables so we know it is consistent at 
some point.  But all those solutions suffer from requiring device model authors 
to understand all those issues, rather than just add a simple lock around 
access to their data structures.
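Both workarounds (snapshot-then-drop-the-lock, and re-validate after re-acquiring) might look roughly like this; the field names and the generation counter are hypothetical stand-ins for whatever per-device validity check fits the model:

```python
import threading

class Dev:
    def __init__(self):
        self.lock = threading.Lock()
        self.generation = 0       # bumped on every guest-visible change
        self.dma_addr = 0x1000
        self.dma_len = 64

def do_dma(addr, length):
    pass                          # stand-in for the DMA into another device

def dev_start_dma(dev, dma=do_dma):
    # Snapshot state under the lock so the DMA parameters are
    # consistent at one point in time...
    with dev.lock:
        addr, length, gen = dev.dma_addr, dev.dma_len, dev.generation
    # ...then perform the DMA with the lock dropped, so a DMA back
    # into this same device cannot deadlock on dev.lock.
    dma(addr, length)
    # Re-acquire and re-validate: if the registers changed while the
    # lock was dropped, the result may need to be discarded or retried.
    with dev.lock:
        return dev.generation == gen   # True -> snapshot still valid
```

As the message says, the real cost is not the pattern itself but that every device author has to get it right.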

>>>>> I see that we have an all-or-nothing problem here: to address this
>>>>> properly, we need to convert the IRQ path to lock-less (or at least
>>>>> compatible with holding per-device locks) as well.
>>>>
>>>> There is a transitional path where writing to a register that can cause
>>>> IRQ changes takes both the big lock and the local lock.
>>>>
>>>> Eventually, though, of course all inner subsystems must be threaded for
>>>> this work to have value.
>>>>
>>>
>>> But that transitional path must not introduce regressions. Opening a
>>> race window between IRQ cause update and event injection is such a
>>> thing, just like dropping concurrent requests on the floor.
>> 
>> Can you explain the race?
> 
> Context A                             Context B
> 
> device.lock
> ...
> device.set interrupt_cause = 0
> lower_irq = true
> ...
> device.unlock
>                                       device.lock
>                                       ...
>                                       device.interrupt_cause = 42
>                                       raise_irq = true
>                                       ...
>                                       device.unlock
>                                       if (raise_irq)
>                                               bql.lock
>                                               set_irq(device.irqno)
>                                               bql.unlock
> if (lower_irq)
>       bql.lock
>       clear_irq(device.irqno)
>       bql.unlock
> 
> 
> And there it goes, our interrupt event.

Obviously you'll need to reacquire the device lock after taking the bql and 
revalidate stuff.  But that is not what I am suggesting.  Instead, any path 
that can lead to an irq update (or timer update, etc.) will take both the bql 
and the device lock.  After the first pass this leaves only side-effect-free 
register reads and writes outside the bql, which is silly if we keep it that 
way, but we will follow with a threaded timer and irq subsystem and peel away 
those big locks.

  device_mmio_write:
    if register is involved in irq or timers or block layer or really anything that matters:
      bql.acquire
    device.lock.acquire
    do stuff
    device.lock.release
    if that big condition from above was true:
      bql.release
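Rendered with actual locks, that transitional scheme might look like the sketch below (the register names are hypothetical, and `bql` stands in for QEMU's global mutex). Note the fixed order, bql before device lock, which keeps the two locks from forming an ABBA cycle:

```python
import threading

bql = threading.Lock()            # stand-in for the big QEMU lock

class E1000:
    # Hypothetical set of registers whose writes can change irq state
    # (or touch timers, the block layer, etc.).
    IRQ_REGS = {"ICR", "IMS", "IMC"}

    def __init__(self):
        self.lock = threading.Lock()
        self.regs = {}
        self.irq_level = 0

    def mmio_write(self, name, val):
        needs_bql = name in self.IRQ_REGS
        if needs_bql:
            bql.acquire()         # always taken *before* the device lock
        self.lock.acquire()
        self.regs[name] = val
        if needs_bql:
            # irq update happens under both locks during the transition
            self.irq_level = 1 if val else 0
        self.lock.release()
        if needs_bql:
            bql.release()
```

Plain data registers take only the device lock; anything with side effects pays for both, until the irq/timer subsystems are threaded and the bql half can be peeled away.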

-- 
error compiling committee.c: too many arguments to function
