qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 1/2] hw/nvme: Implement shadow doorbell buffer support


From: Guenter Roeck
Subject: Re: [PATCH v3 1/2] hw/nvme: Implement shadow doorbell buffer support
Date: Mon, 12 Dec 2022 06:27:08 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2

On 12/12/22 05:45, Klaus Jensen wrote:
On Dec 12 05:39, Guenter Roeck wrote:
On 12/12/22 01:58, Klaus Jensen wrote:
On Dec  8 12:39, Guenter Roeck wrote:
On Thu, Dec 08, 2022 at 12:13:55PM -0800, Guenter Roeck wrote:
On Thu, Dec 08, 2022 at 10:47:42AM -0800, Guenter Roeck wrote:

A cq head doorbell mmio is skipped... And it is not the fault of the
kernel. The kernel is in it's good right to skip the mmio since the cq
eventidx is not properly updated.

Adding that and it boots properly on riscv. But I'm perplexed as to why
this didnt show up on our regularly tested platforms.

Gonna try to get this in for 7.2!

I see another problem with sparc64.

[    5.261508] could not locate request for tag 0x0
[    5.261711] nvme nvme0: invalid id 0 completed on queue 1

That is seen repeatedly until the request times out. I'll test with
your patch to see if it resolves this problem as well, and will bisect
otherwise.

The second problem is unrelated to the doorbell problem.
It is first seen in qemu v7.1. I'll try to bisect.


Unfortunately, the problem observed with sparc64 also bisects to this
patch. Making things worse, "hw/nvme: fix missing cq eventidx update"
does not fix it (which is why I initially thought it was unrelated).

I used the following qemu command line.

qemu-system-sparc64 -M sun4v -cpu "TI UltraSparc IIi" -m 512 -snapshot \
      -device nvme,serial=foo,drive=d0,bus=pciB \
      -drive file=rootfs.ext2,if=none,format=raw,id=d0 \
      -kernel arch/sparc/boot/image -no-reboot \
      -append "root=/dev/nvme0n1 console=ttyS0" \
      -nographic -monitor none


Hi Guenter,

Thank you very much for the detailed reports and I apologize for the
fallout of this.

I think this is related to missing endian conversions when handling the
shadow doorbells. I'm not sure if there is a kernel issue here as well,
because as far as I can tell, the shadow doorbells are updated like so:

    old_value = *dbbuf_db;
    *dbbuf_db = value;

(where `value` is the new head/tail value depending on if this is an sq
or cq).

Keith, is the kernel doing something magically here I am not aware of,
or isn't this missing some cpu_to_le32() / le32_to_cpu() calls as well?

Wouldn't that mean that nvme doorbell support would be broken in Linux
on all big endian systems ? Maybe it is, but it seems a bit unlikely.


No, not broken in general - only for shadow doorbells. On regular MMIO,
the linux helpers takes care of endianness (and so does the MMIO
callbacks in QEMU). However, for shadow doorbells, the doorbell value is
written to host memory - and Linux (and QEMU) does not handle that
correctly wrt. endianness.

I got it all working with the series I just posted for QEMU (v4) and
fixing endianness conversion for the above in the kernel (patch pending
for linux-nvme).

Hmm, interesting. I guess I'll wait for the Linux patch to be posted.

That makes me wonder, though: Are the Linux changes really necessary ?
If this never worked, would it be possible to adjust the qemu code
in a way that it just works with the existing Linux code ?

Alternatively, would it be possible to add a runtime flag to qemu
that would let me disable shadow doorbell support ? I am testing
all Linux kernel branches, and without such a flag I'd have to carry
a patch just disabling it in qemu until all kernel branches are fixed
(if that ever happens).

Thanks,
Guenter




reply via email to

[Prev in Thread] Current Thread [Next in Thread]