qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 1/2] hw/nvme: Implement shadow doorbell buffer support


From: Klaus Jensen
Subject: Re: [PATCH v3 1/2] hw/nvme: Implement shadow doorbell buffer support
Date: Mon, 12 Dec 2022 16:18:17 +0100

On Dec 12 06:27, Guenter Roeck wrote:
> On 12/12/22 05:45, Klaus Jensen wrote:
> > On Dec 12 05:39, Guenter Roeck wrote:
> > > On 12/12/22 01:58, Klaus Jensen wrote:
> > > > On Dec  8 12:39, Guenter Roeck wrote:
> > > > > On Thu, Dec 08, 2022 at 12:13:55PM -0800, Guenter Roeck wrote:
> > > > > > On Thu, Dec 08, 2022 at 10:47:42AM -0800, Guenter Roeck wrote:
> > > > > > > > 
> > > > > > > > A cq head doorbell mmio is skipped... And it is not the fault 
> > > > > > > > of the
> > > > > > > > kernel. The kernel is in it's good right to skip the mmio since 
> > > > > > > > the cq
> > > > > > > > eventidx is not properly updated.
> > > > > > > > 
> > > > > > > > Adding that and it boots properly on riscv. But I'm perplexed 
> > > > > > > > as to why
> > > > > > > > this didnt show up on our regularly tested platforms.
> > > > > > > > 
> > > > > > > > Gonna try to get this in for 7.2!
> > > > > > > 
> > > > > > > I see another problem with sparc64.
> > > > > > > 
> > > > > > > [    5.261508] could not locate request for tag 0x0
> > > > > > > [    5.261711] nvme nvme0: invalid id 0 completed on queue 1
> > > > > > > 
> > > > > > > That is seen repeatedly until the request times out. I'll test 
> > > > > > > with
> > > > > > > your patch to see if it resolves this problem as well, and will 
> > > > > > > bisect
> > > > > > > otherwise.
> > > > > > > 
> > > > > > The second problem is unrelated to the doorbell problem.
> > > > > > It is first seen in qemu v7.1. I'll try to bisect.
> > > > > > 
> > > > > 
> > > > > Unfortunately, the problem observed with sparc64 also bisects to this
> > > > > patch. Making things worse, "hw/nvme: fix missing cq eventidx update"
> > > > > does not fix it (which is why I initially thought it was unrelated).
> > > > > 
> > > > > I used the following qemu command line.
> > > > > 
> > > > > qemu-system-sparc64 -M sun4v -cpu "TI UltraSparc IIi" -m 512 
> > > > > -snapshot \
> > > > >       -device nvme,serial=foo,drive=d0,bus=pciB \
> > > > >       -drive file=rootfs.ext2,if=none,format=raw,id=d0 \
> > > > >       -kernel arch/sparc/boot/image -no-reboot \
> > > > >       -append "root=/dev/nvme0n1 console=ttyS0" \
> > > > >       -nographic -monitor none
> > > > > 
> > > > 
> > > > Hi Guenter,
> > > > 
> > > > Thank you very much for the detailed reports and I apologize for the
> > > > fallout of this.
> > > > 
> > > > I think this is related to missing endian conversions when handling the
> > > > shadow doorbells. I'm not sure if there is a kernel issue here as well,
> > > > because as far as I can tell, the shadow doorbells are updated like so:
> > > > 
> > > >     old_value = *dbbuf_db;
> > > >     *dbbuf_db = value;
> > > > 
> > > > (where `value` is the new head/tail value depending on if this is an sq
> > > > or cq).
> > > > 
> > > > Keith, is the kernel doing something magically here I am not aware of,
> > > > or isn't this missing some cpu_to_le32() / le32_to_cpu() calls as well?
> > > 
> > > Wouldn't that mean that nvme doorbell support would be broken in Linux
> > > on all big endian systems ? Maybe it is, but it seems a bit unlikely.
> > > 
> > 
> > No, not broken in general - only for shadow doorbells. On regular MMIO,
> > the linux helpers takes care of endianness (and so does the MMIO
> > callbacks in QEMU). However, for shadow doorbells, the doorbell value is
> > written to host memory - and Linux (and QEMU) does not handle that
> > correctly wrt. endianness.
> > 
> > I got it all working with the series I just posted for QEMU (v4) and
> > fixing endianness conversion for the above in the kernel (patch pending
> > for linux-nvme).
> 
> Hmm, interesting. I guess I'll wait for the Linux patch to be posted.
> 
> That makes me wonder, though: Are the Linux changes really necessary ?
> If this never worked, would it be possible to adjust the qemu code
> in a way that it just works with the existing Linux code ?
> 

That would mean that hw/nvme would be non-compliant on big-endian and
I'd rather not leave it in that state.

> Alternatively, would it be possible to add a runtime flag to qemu
> that would let me disable shadow doorbell support ? I am testing
> all Linux kernel branches, and without such a flag I'd have to carry
> a patch just disabling it in qemu until all kernel branches are fixed
> (if that ever happens).
> 

YES. The fallout of this has been bad, so I'll cook up a patch to
disable shadow doorbells at runtime.

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]