qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/1] hw/ide/core: terminate in-flight DMA on IDE bus reset


From: John Snow
Subject: Re: [PATCH 1/1] hw/ide/core: terminate in-flight DMA on IDE bus reset
Date: Tue, 3 Oct 2023 13:05:47 -0400



On Tue, Oct 3, 2023, 10:07 AM Niklas Cassel <Niklas.Cassel@wdc.com> wrote:
On Mon, Sep 25, 2023 at 03:53:23PM -0400, John Snow wrote:
> Niklas, I'm sorry to lean on you here a little bit - You've been
> working on the SATA side of this a bit more often, can you let me know
> if you think this patch is safe?

FWIW, I prefer Fiona's patch series which calls blk_aio_cancel() before
calling ide_reset():
https://lore.kernel.org/qemu-devel/20230906130922.142845-1-f.ebner@proxmox.com/T/#u

Patch 2/2 also adds a qtest which fails before patch 1/2, and passes after
patch 1/2.

I think I also prefer Fiona's patch. Simon makes a good point (I think) that it feels correct to dump the DMA as fast as possible, but Fiona's patch seems more justifiably and obviously correct.

I think my preference is for Fiona's patches first, and if we want to optimize cancellation thereafter we can attempt to do so.



>
> I'm not immediately sure what the impact of applying it is, but I have
> some questions about it:
>
> (1) When does ide_dma_cb get invoked when DRQ_STAT is set but the
> return code we were passed is not < 0, and
>
> (2) what's the impact of just *not* executing the end-of-transfer
> block here; what happens to the state machine?
>
> On Thu, Sep 21, 2023 at 12:07 PM Simon Rowe <simon.rowe@nutanix.com> wrote:
> >
> > When an IDE controller is reset, its internal state is being cleared
> > before any outstanding I/O is cancelled. If a response to DMA is
> > received in this window, the aio callback will incorrectly continue
> > with the next part of the transfer (now using sector 0 from
> > the cleared controller state).
>
> Eugh, yikes. It feels like we should fix the cancellation ... I'm
> worried that if we've reset the state machine and we need to bail on a
> DMA callback that the heuristics we use for that will eventually be
> wrong, unless I am mistaken and this is totally safe and reliable for
> a reason I don't intuitively see right away.
>
> Thoughts?

Since Simon has a reliable reproducer, and has offered to test Fiona's patch,
perhaps we should simply take him up on that offer :)

Yes! I just wanted to understand the differences between the two approaches since it looked like it was going to be my job to decide between them 😅

Simon, can you confirm that Fiona's patches are appropriate for your reproducer? In the meantime I'll do my own audit for the problem as you described it (thank you very much for that) and see if there's anything else that needs to be addressed.



Kind regards,
Niklas

Thank you everyone. 

--js

reply via email to

[Prev in Thread] Current Thread [Next in Thread]