qemu-devel

From: Guenter Roeck
Subject: Re: [PATCH v3 1/2] hw/nvme: Implement shadow doorbell buffer support
Date: Fri, 9 Dec 2022 03:00:22 -0800

On Thu, Dec 08, 2022 at 12:39:57PM -0800, Guenter Roeck wrote:
> On Thu, Dec 08, 2022 at 12:13:55PM -0800, Guenter Roeck wrote:
> > On Thu, Dec 08, 2022 at 10:47:42AM -0800, Guenter Roeck wrote:
> > > > 
> > > > A cq head doorbell mmio is skipped... And it is not the fault of the
> > > > kernel. The kernel is well within its rights to skip the mmio, since
> > > > the cq eventidx is not properly updated.
> > > > 
> > > > Adding that, it boots properly on riscv. But I'm perplexed as to why
> > > > this didn't show up on our regularly tested platforms.
> > > > 
> > > > Gonna try to get this in for 7.2!
> > > 
> > > I see another problem with sparc64.
> > > 
> > > [    5.261508] could not locate request for tag 0x0
> > > [    5.261711] nvme nvme0: invalid id 0 completed on queue 1
> > > 
> > > That is seen repeatedly until the request times out. I'll test with
> > > your patch to see if it resolves this problem as well, and will bisect
> > > otherwise.
> > > 
> > The second problem is unrelated to the doorbell problem.
> > It is first seen in qemu v7.1. I'll try to bisect.
> > 
> 
> Unfortunately, the problem observed with sparc64 also bisects to this
> patch. Making things worse, "hw/nvme: fix missing cq eventidx update"
> does not fix it (which is why I initially thought it was unrelated).
> 
> I used the following qemu command line.
> 
> qemu-system-sparc64 -M sun4v -cpu "TI UltraSparc IIi" -m 512 -snapshot \
>     -device nvme,serial=foo,drive=d0,bus=pciB \
>     -drive file=rootfs.ext2,if=none,format=raw,id=d0 \
>     -kernel arch/sparc/boot/image -no-reboot \
>     -append "root=/dev/nvme0n1 console=ttyS0" \
>     -nographic -monitor none
> 
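
For context on the "kernel is well within its rights to skip the mmio"
remark above: the shadow doorbell check in the Linux driver works roughly
like this (a self-contained paraphrase of nvme_dbbuf_need_event() in
drivers/nvme/host/pci.c, not the literal driver code):

#include <stdbool.h>
#include <stdint.h>

/*
 * Wraparound-safe eventidx check, paraphrased from the Linux NVMe
 * driver: the doorbell mmio is only issued when the device's
 * eventidx lies between the old and the new doorbell value, i.e.
 * when the device has asked to be notified about this update.
 */
static bool dbbuf_need_event(uint16_t event_idx, uint16_t new_idx,
                             uint16_t old_idx)
{
    return (uint16_t)(new_idx - event_idx - 1) <
           (uint16_t)(new_idx - old_idx);
}

If the device never advances the cq eventidx, event_idx goes stale, this
check returns false, and the driver legitimately skips the cq head
doorbell mmio.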

With the tests completed, it turns out that the problem is seen on various
emulated big-endian CPUs.

Example from arm64be:

[    4.736752] nvme nvme0: pci function 0000:00:02.0
[    4.737829] nvme 0000:00:02.0: enabling device (0000 -> 0002)
[    4.774673] nvme nvme0: 2/0/0 default/read/poll queues
[    4.779331] nvme nvme0: Ignoring bogus Namespace Identifiers
[    4.799400] could not locate request for tag 0x0
[    4.799533] nvme nvme0: invalid id 0 completed on queue 2
[    4.799612] could not locate request for tag 0x0
[    4.799676] nvme nvme0: invalid id 0 completed on queue 2
[    4.799744] could not locate request for tag 0x0

powerpc:

could not locate request for tag 0x0
nvme nvme0: invalid id 0 completed on queue 1
could not locate request for tag 0x0
nvme nvme0: invalid id 0 completed on queue 1
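
For reference, both messages come from the completion path in the Linux
driver; roughly (paraphrasing nvme_find_rq() and nvme_handle_cqe(), not
the literal code):

/*
 * Paraphrased from the Linux NVMe driver: the command id from each
 * completion entry is looked up in the block layer tag set. A
 * spurious completion for a cid that was never issued produces
 * exactly this pair of messages.
 */
req = blk_mq_tag_to_rq(nvme_queue_tagset(nvmeq), nvme_tag_from_cid(cid));
if (unlikely(!req)) {
        pr_err("could not locate request for tag %#x\n",
               nvme_tag_from_cid(cid));
        dev_warn(nvmeq->dev->ctrl.device,
                 "invalid id %d completed on queue %d\n",
                 cid, le16_to_cpu(cqe->sq_id));
        return;
}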

trace log (arm64be, good, qemu v7.0):

pci_nvme_admin_cmd cid 2864 sqid 0 opc 0x6 opname 'NVME_ADM_CMD_IDENTIFY'
pci_nvme_identify cid 2864 cns 0x5 ctrlid 0 csi 0x0
pci_nvme_identify_ns_csi nsid=1, csi=0x0
pci_nvme_map_prp trans_len 4096 len 4096 prp1 0x44a84000 prp2 0x0 num_prps 2
pci_nvme_map_addr addr 0x44a84000 len 4096
pci_nvme_enqueue_req_completion cid 2864 cqid 0 dw0 0x0 dw1 0x0 status 0x0
pci_nvme_irq_msix raising MSI-X IRQ vector 0
pci_nvme_mmio_write addr 0x1004 data 0x10 size 4
pci_nvme_mmio_doorbell_cq cqid 0 new_head 16
pci_nvme_mmio_write addr 0x1008 data 0x1 size 4
pci_nvme_mmio_doorbell_sq sqid 1 new_tail 1
pci_nvme_io_cmd cid 32770 nsid 0x1 sqid 1 opc 0x2 opname 'NVME_NVM_CMD_READ'
pci_nvme_read cid 32770 nsid 1 nlb 8 count 4096 lba 0x0
pci_nvme_map_prp trans_len 4096 len 4096 prp1 0x44879000 prp2 0x0 num_prps 2
pci_nvme_map_addr addr 0x44879000 len 4096
pci_nvme_rw_cb cid 32770 blk 'd0'
pci_nvme_rw_complete_cb cid 32770 blk 'd0'
pci_nvme_enqueue_req_completion cid 32770 cqid 1 dw0 0x0 dw1 0x0 status 0x0
pci_nvme_irq_msix raising MSI-X IRQ vector 1
pci_nvme_mmio_write addr 0x100c data 0x1 size 4
pci_nvme_mmio_doorbell_cq cqid 1 new_head 1
pci_nvme_mmio_write addr 0x1008 data 0x2 size 4
pci_nvme_mmio_doorbell_sq sqid 1 new_tail 2

trace log (arm64be, bad, qemu v7.2):

pci_nvme_admin_cmd cid 5184 sqid 0 opc 0x6 opname 'NVME_ADM_CMD_IDENTIFY'
pci_nvme_identify cid 5184 cns 0x5 ctrlid 0 csi 0x0
pci_nvme_identify_ns_csi nsid=1, csi=0x0
pci_nvme_map_prp trans_len 4096 len 4096 prp1 0x44e56000 prp2 0x0 num_prps 2
pci_nvme_map_addr addr 0x44e56000 len 4096
pci_nvme_enqueue_req_completion cid 5184 cqid 0 dw0 0x0 dw1 0x0 status 0x0
pci_nvme_update_sq_eventidx sqid 0 new_eventidx 18
pci_nvme_update_sq_tail sqid 0 new_tail 18
pci_nvme_update_cq_eventidx cqid 0 new_eventidx 16
pci_nvme_update_cq_head cqid 0 new_head 16
pci_nvme_irq_msix raising MSI-X IRQ vector 0
pci_nvme_mmio_write addr 0x1004 data 0x11 size 4
pci_nvme_mmio_doorbell_cq cqid 0 new_head 17
pci_nvme_mmio_write addr 0x1010 data 0x1 size 4
pci_nvme_mmio_doorbell_sq sqid 2 new_tail 1
pci_nvme_update_sq_tail sqid 2 new_tail 0
pci_nvme_io_cmd cid 16384 nsid 0x1 sqid 2 opc 0x2 opname 'NVME_NVM_CMD_READ'
pci_nvme_read cid 16384 nsid 1 nlb 8 count 4096 lba 0x0
pci_nvme_map_prp trans_len 4096 len 4096 prp1 0x44cea000 prp2 0x0 num_prps 2
pci_nvme_map_addr addr 0x44cea000 len 4096
pci_nvme_update_sq_eventidx sqid 2 new_eventidx 0
pci_nvme_update_sq_tail sqid 2 new_tail 0
pci_nvme_io_cmd cid 0 nsid 0x0 sqid 2 opc 0x0 opname 'NVME_NVM_CMD_FLUSH'
pci_nvme_enqueue_req_completion cid 0 cqid 2 dw0 0x0 dw1 0x0 status 0x400b
pci_nvme_err_req_status cid 0 nsid 0 status 0x400b opc 0x0
pci_nvme_update_sq_eventidx sqid 2 new_eventidx 0
pci_nvme_update_sq_tail sqid 2 new_tail 0

[flush command repeated many times]
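
Note the suspicious sequence in the bad trace: the doorbell mmio reports
sqid 2 new_tail 1, but the shadow buffer read that follows it reports
new_tail 0, after which the controller fetches an all-zeroes SQE (cid 0,
opc 0x0, which decodes as FLUSH) over and over. That pattern would be
consistent with a byte order mix-up in the shadow doorbell accesses: the
NVMe specification defines the shadow values as little-endian, so raw
byte copies only work by accident on little-endian guests. Purely as a
sketch of the kind of conversion that would be needed in hw/nvme/ctrl.c
(a hypothesis, not a confirmed fix):

/*
 * Hypothetical sketch: load the shadow tail as an explicitly
 * little-endian 32-bit value instead of doing a raw byte copy,
 * so the result does not depend on guest byte order.
 */
static void nvme_update_sq_tail(NvmeSQueue *sq)
{
    ldl_le_pci_dma(&sq->ctrl->parent_obj, sq->db_addr, &sq->tail,
                   MEMTXATTRS_UNSPECIFIED);

    trace_pci_nvme_update_sq_tail(sq->sqid, sq->tail);
}

The eventidx updates would presumably need the matching stl_le_pci_dma()
treatment.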

Guenter


