Re: [Qemu-devel] [patch v4 12/16] e1000: apply fine lock on e1000
From: liu ping fan
Subject: Re: [Qemu-devel] [patch v4 12/16] e1000: apply fine lock on e1000
Date: Mon, 29 Oct 2012 13:24:15 +0800
On Fri, Oct 26, 2012 at 2:48 AM, Jan Kiszka <address@hidden> wrote:
> On 2012-10-25 19:02, Avi Kivity wrote:
>> On 10/25/2012 06:39 PM, Jan Kiszka wrote:
>>>>
>>>> That doesn't work cross-thread.
>>>>
>>>> vcpu A: write to device X, dma-ing to device Y
>>>> vcpu B: write to device Y, dma-ing to device X
>>>
>>> We will deny DMA-ing from device X to Y on behalf of a VCPU, i.e. in
>>> dispatch context.
>>>
>>> What we do not deny, though, is DMA-ing from an I/O thread that
>>> processes an event for device X.
>>
>> I would really like to avoid depending on the context. In real hardware,
>> there is no such thing.
>
> The point is how we deal with any kind of access to a device that
> requires taking that device's lock while holding another lock, provided
> that scenario can also take place in reverse order at the same time.
> Known scenarios are:
>
> - vcpu 1 -> access device A -> access device B
> - vcpu 2 -> access device B -> access device A
>
> - event 1 -> device A event processing -> access device B
> - event 2 -> device B event processing -> access device A
>
> and combinations of those pairs.
>
>>
>>> If the invoked callback of device X
>>> holds the device lock across some DMA request to Y, then we risk running
>>> into the same ABBA issue. Hmm...
>>
>> Yup.
>>
>>>
>>>>
>>>> My suggestion was to drop the locks around DMA, then re-acquire the lock
>>>> and re-validate data.
>>>
>>> Maybe possible, but hairy depending on the device model.
>>
>> It's unpleasant, yes.
>>
>> Note that, depending on the device, we may not need to re-validate data; it
>> may be sufficient to load it into local variables so we know it is
>> consistent at some point. But all those solutions suffer from requiring
>> device model authors to understand all those issues, rather than just add a
>> simple lock around access to their data structures.
>
> Right. And therefore it is a suboptimal way to start (patching).
>
>>
>>>>>>> I see that we have a all-or-nothing problem here: to address this
>>>>>>> properly, we need to convert the IRQ path to lock-less (or at least
>>>>>>> compatible with holding per-device locks) as well.
>>>>>>
>>>>>> There is a transitional path where writing to a register that can cause
>>>>>> IRQ changes takes both the big lock and the local lock.
>>>>>>
>>>>>> Eventually, though, of course all inner subsystems must be threaded for
>>>>>> this work to have value.
>>>>>>
>>>>>
>>>>> But that transitional path must not introduce regressions. Opening a
>>>>> race window between IRQ cause update and event injection is such a
>>>>> thing, just like dropping concurrent requests on the floor.
>>>>
>>>> Can you explain the race?
>>>
>>> Context A                           Context B
>>>
>>> device.lock
>>> ...
>>> device.interrupt_cause = 0
>>> lower_irq = true
>>> ...
>>> device.unlock
>>>                                     device.lock
>>>                                     ...
>>>                                     device.interrupt_cause = 42
>>>                                     raise_irq = true
>>>                                     ...
>>>                                     device.unlock
>>>                                     if (raise_irq)
>>>                                         bql.lock
>>>                                         set_irq(device.irqno)
>>>                                         bql.unlock
>>> if (lower_irq)
>>>     bql.lock
>>>     clear_irq(device.irqno)
>>>     bql.unlock
>>>
>>> And there it goes, our interrupt event.
>>
>> Obviously you'll need to reacquire the device lock after taking the bql and
>> revalidate stuff. But that is not what I am suggesting. Instead, any path
>> that can lead to an irq update (or timer update etc.) will take both the
>> bql and the device lock. After the first pass this leaves only
>> side-effect-free register reads and writes outside the big lock, which is
>> silly if we keep it that way, but we will follow with a threaded timer and
>> irq subsystem and peel away those big locks.
>>
>> device_mmio_write:
>>     if register is involved in irq or timers or block layer or really
>>        anything that matters:
>>         bql.acquire
>>     device.lock.acquire
>>     do stuff
>>     device.lock.release
>>     if that big condition from above was true:
>>         bql.release
>
> Looks simpler than it is as you cannot wrap complete handlers with that
> pattern. An example where it would fail (until we solved the locking
> issues above):
>
> mmio_write:
>     bql.conditional_lock
>     device.lock
>     device.check_state
>     issue_dma
>     device.update_state
>     update_irq, play_with_timers, etc.
>     device.unlock
>     bql.conditional_unlock
>
> If that DMA request hits an unconverted MMIO region or one that takes
> BQL conditionally as above, we will lock up (or bail out as our mutexes
> detect the error). E1000's start_xmit looks like this so far, and that's
> a pretty important service.
>
> Moreover, I prefer having a representative cut-through over merging a
> first step that excludes some 80% of the problems. For that reason I
> would even be inclined to address the IRQ injection topic first
> (patch-wise), then the other necessary backend services for the e1000
> or whatever, and convert some device(s) last.
>
> IOW: cut out anything from this series that touches the e1000 until the
> building blocks for converting it reasonably are finished. Carrying an
> experimental, partially broken conversion on top is fine; trying to
> merge pieces of it is not, IMHO.
>
Agreed. I just want to take this opportunity to discuss what comes
next and what does not.
Regards,
pingfan
> Jan
>
> --
> Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
> Corporate Competence Center Embedded Linux