qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] virtio-net sporadic error with QNX 7.0 guest: virtio-ne


From: Claudio
Subject: Re: [Qemu-devel] virtio-net sporadic error with QNX 7.0 guest: virtio-net ctrl missing headers
Date: Wed, 12 Sep 2018 19:12:58 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

Hi Michael,

On 09/12/2018 05:31 PM, Michael S. Tsirkin wrote:
> On Wed, Sep 12, 2018 at 05:16:38PM +0200, Claudio wrote:
>> Thank you both for your responses,
>>
>> and ciao Paolo,
>>
>> On 09/12/2018 02:37 PM, Michael S. Tsirkin wrote:
>>> On Wed, Sep 12, 2018 at 10:01:34AM +0200, Claudio wrote:
>>>> Hello Michael, Jason and all,
>>>>
>>>> I am currently using latest mainline QEMU on x86_64 to run a QNX 7 guest.
>>>>
>>>> QNX 7 is not free software anymore unfortunately, with the
>>>> the last open source versions in the 6.x range.
>>>>
>>>> I am using the official virtio-net guest driver from QNX 7.
>>>>
>>>> During initialization I sporadically get this error message:
>>>>
>>>> $ qemu-system-x86_64 -machine pc,accel=kvm,kernel_irqchip=on -smp 2 -m 
>>>> 2048 -display none -nodefconfig -nodefaults -chardev stdio,mux=on,id=char0 
>>>> -serial chardev:char0 -monitor none -mon chardev=char0,mode=readline 
>>>> -netdev user,id=user0,hostfwd=udp::9004-:9004 -device 
>>>> virtio-net,netdev=user0 qnx.img
>>>>
>>>> virtio-net ctrl missing headers
>>>
>>> This means a control buffer is sent either without the input element
>>> or without an output element, or with a single byte output element.
>>>
>>>> and following that my host->guest UDP port forwarding does not work, that 
>>>> is,
>>>> the qemu process shows up as listening on the interface, but no packets 
>>>> appear in the guest.
>>>>
>>>> This error during initialization does not appear every time I launch QEMU.
>>>> It appears to be more or less random.
>>>>
>>>> Whenever the error does not appear, the interface works as expected, and 
>>>> port forwarding works.
>>>>
>>>> Latest commit is 
>>>>
>>>> 19b599f7664b ("Merge remote-tracking branch 
>>>> 'remotes/armbru/tags/pull-error-2018-08-27-v2'")
>>>>
>>>> Thanks a lot for any advice!
>>>>
>>>> Ciao,
>>>>
>>>> Claudio
>>>
>>> If it's easy to reproduce, you can try printing out all commands.
>>> E.g.:
>>>
>>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>>> index f154756e85..34251273a9 100644
>>> --- a/hw/net/virtio-net.c
>>> +++ b/hw/net/virtio-net.c
>>> @@ -987,6 +987,10 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, 
>>> VirtQueue *vq)
>>>          if (!elem) {
>>>              break;
>>>          }
>>> +        fprintf(stderr, "%s: in %d/%d, out %d/%d\n",
>>> +                elem->in_num, iov_size(elem->in_sg, elem->in_num),
>>> +                elem->out_num, iov_size(elem->out_sg, elem->out_num));
>>> +
>>>          if (iov_size(elem->in_sg, elem->in_num) < sizeof(status) ||
>>>              iov_size(elem->out_sg, elem->out_num) < sizeof(ctrl)) {
>>>              virtio_error(vdev, "virtio-net ctrl missing headers");
>>> @@ -1014,6 +1018,9 @@ static void virtio_net_handle_ctrl(VirtIODevice 
>>> *vdev, VirtQueue *vq)
>>>          } else if (ctrl.class == VIRTIO_NET_CTRL_GUEST_OFFLOADS) {
>>>              status = virtio_net_handle_offloads(n, ctrl.cmd, iov, iov_cnt);
>>>          }
>>> +        fprintf(stderr, "%s: class 0x%x cmd 0x%x cnt %d status 0x%x\n",
>>> +                ctrl.class, ctrl.cmd, iov_cnt, status);
>>> +
>>>  
>>>          s = iov_from_buf(elem->in_sg, elem->in_num, 0, &status, 
>>> sizeof(status));
>>>          assert(s == sizeof(status));
>>>
>>>
>>> You can also try replacing virtio_error with fprintf - that will
>>> avoid wedging the device on an error and let you proceed
>>> with debugging.
>>>
>>
>> I managed to automate the detection of the error, and it does appear to be a 
>> regression from my point of view.
>>
>> I did a git bisect, and you can see the result and log as follows:
>>
>> >From 4a3f03ba8dbf53fce36d0c1dd5d0cc0f340fe5f3 Mon Sep 17 00:00:00 2001      
>> >             
>> From: Paolo Bonzini <address@hidden>                                         
>>        
>> Date: Wed, 11 Jan 2017 09:38:15 +0100                                        
>>             
>> Subject: [PATCH] virtio-net: enable ioeventfd even if vhost=off              
>>             
>>                                                                              
>>             
>> virtio-net-pci does not enable ioeventfd for historical reasons (and         
>>             
>> nobody ever checked whether it should be revisited).  Note that other        
>>             
>> backends do enable ioeventfd for virtio-net.                                 
>>             
>>                                                                              
>>             
>> However, it has a major effect on performance.  On Windows, throughput is    
>>             
>> _multiplied_ by 2 or 3 on TCP_STREAM (on small packets it is "only" a 30%    
>>             
>> improvement) and a little less so on TCP_MAERTS albeit still very much       
>>             
>> statistically significant.  Latency also has a single digit improvement.     
>>             
>>                                                                              
>>             
>> This is not visible when using vhost, which forces ioeventfd=on, but it      
>>             
>> is substantial without vhost.  In addition, also on Windows and with the     
>>             
>> RHEL 7.3 kernel, APICv seems to slow down virtio-net performance a bit,      
>>             
>> but the penalty with this patch goes from -25% to -7%.                       
>>             
>>                                                                              
>>             
>> Signed-off-by: Paolo Bonzini <address@hidden>                                
>>        
>> Reviewed-by: Michael S. Tsirkin <address@hidden>                             
>>             
>> Signed-off-by: Michael S. Tsirkin <address@hidden>                           
>>             
>> ---
>>  hw/virtio/virtio-pci.c | 2 +-                                               
>>             
>>  1 file changed, 1 insertion(+), 1 deletion(-)                               
>>             
>>                                                                              
>>             
>> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c                 
>>             
>> index 854b8f2..8baaf2b 100644                                                
>>             
>> --- a/hw/virtio/virtio-pci.c                                                 
>>             
>> +++ b/hw/virtio/virtio-pci.c                                                 
>>             
>> @@ -2278,7 +2278,7 @@ static const TypeInfo virtio_serial_pci_info = {
>>                                                                              
>>             
>>  static Property virtio_net_properties[] = {                                 
>>             
>>      DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags,                     
>>             
>> -                    VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, false),              
>>             
>> +                    VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true),               
>>             
>>      DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 3),             
>>             
>>      DEFINE_PROP_END_OF_LIST(),                                              
>>             
>>  };                                                                          
>>             
>> --                                                                           
>>             
>> 2.7.4
>>
>>
>>
>> git bisect start
>> # good: [ec57db1630f9cdcd13c8c55acbc8daf5237aabf1] virtio-net: unbreak the 
>> minix guest
>> git bisect good ec57db1630f9cdcd13c8c55acbc8daf5237aabf1
>> # good: [ec57db1630f9cdcd13c8c55acbc8daf5237aabf1] virtio-net: unbreak the 
>> minix guest
>> git bisect good ec57db1630f9cdcd13c8c55acbc8daf5237aabf1
>> # good: [ec57db1630f9cdcd13c8c55acbc8daf5237aabf1] virtio-net: unbreak the 
>> minix guest
>> git bisect good ec57db1630f9cdcd13c8c55acbc8daf5237aabf1
>> # bad: [19b599f7664b2ebfd0f405fb79c14dd241557452] Merge remote-tracking 
>> branch 'remotes/armbru/tags/pull-error-2018-08-27-v2' into staging
>> git bisect bad 19b599f7664b2ebfd0f405fb79c14dd241557452
>> # good: [c0f3f6754a120abc3451ebf7a7c703fbaf7b29e7] block: m25p80: 4byte 
>> address mode
>> git bisect good c0f3f6754a120abc3451ebf7a7c703fbaf7b29e7
>> # bad: [67b9c5d4f37ea373ebf9aad251883886e34bf2e1] Merge remote-tracking 
>> branch 'remotes/bonzini/tags/for-upstream' into staging
>> git bisect bad 67b9c5d4f37ea373ebf9aad251883886e34bf2e1
>> # good: [6bbcb76301a72dc80c8d29af13d40bb9a759c9c6] MAINTAINERS: Remove 
>> obsolete stable branches
>> git bisect good 6bbcb76301a72dc80c8d29af13d40bb9a759c9c6
>> # bad: [ff79d5e939c38677a575e3493eb9b4d36eb21865] Merge remote-tracking 
>> branch 'remotes/xtensa/tags/20170306-xtensa' into staging
>> git bisect bad ff79d5e939c38677a575e3493eb9b4d36eb21865
>> # bad: [a0def594286d9110a6035e02eef558cf3cf5d847] Merge remote-tracking 
>> branch 'remotes/bonzini/tags/for-upstream' into staging
>> git bisect bad a0def594286d9110a6035e02eef558cf3cf5d847
>> # good: [b6c08970bc989bfddcf830684ea7a96b7a4d62a7] Merge remote-tracking 
>> branch 'remotes/bkoppelmann/tags/pull-tricore-2017-01-11-2' into staging
>> git bisect good b6c08970bc989bfddcf830684ea7a96b7a4d62a7
>> # good: [c5fc89b36c0a167548ae7af40dc085707a7756d2] hw/intc/arm_gicv3: 
>> Implement gicv3_cpuif_virt_update()
>> git bisect good c5fc89b36c0a167548ae7af40dc085707a7756d2
>> # bad: [009fad7f4ccbbf3ea115e9196edde88c0a17c1c2] migration: Change name of 
>> live migration thread
>> git bisect bad 009fad7f4ccbbf3ea115e9196edde88c0a17c1c2
>> # bad: [ebe9383caefd56d519e965a5d87bca29f0aeffe3] target-hppa: Implement 
>> floating-point insns
>> git bisect bad ebe9383caefd56d519e965a5d87bca29f0aeffe3
>> # bad: [db655a9653af05aaa90f41e55433d41e398d4b75] Merge remote-tracking 
>> branch 'remotes/jasowang/tags/net-pull-request' into staging
>> git bisect bad db655a9653af05aaa90f41e55433d41e398d4b75
>> # bad: [2943b53f682f54548e7ddcf2ebb6c6d12d8dc821] virtio: force 
>> VIRTIO_F_IOMMU_PLATFORM
>> git bisect bad 2943b53f682f54548e7ddcf2ebb6c6d12d8dc821
>> # bad: [a5b3ebfd23bc70fa68461dff1d7145ff65e07150] fw-cfg: bump 
>> "x-file-slots" to 0x20 for 2.9+ machine types
>> git bisect bad a5b3ebfd23bc70fa68461dff1d7145ff65e07150
>> # bad: [c471ad0e9bd46ca5f5c9c796e727230e043a091d] vhost_net: device IOTLB 
>> support
>> git bisect bad c471ad0e9bd46ca5f5c9c796e727230e043a091d
>> # bad: [332fa82d0963409fa14997a02639289afa226596] Revert "virtio: turn 
>> vq->notification into a nested counter"
>> git bisect bad 332fa82d0963409fa14997a02639289afa226596
>> # bad: [4a3f03ba8dbf53fce36d0c1dd5d0cc0f340fe5f3] virtio-net: enable 
>> ioeventfd even if vhost=off
>> git bisect bad 4a3f03ba8dbf53fce36d0c1dd5d0cc0f340fe5f3
>> # first bad commit: [4a3f03ba8dbf53fce36d0c1dd5d0cc0f340fe5f3] virtio-net: 
>> enable ioeventfd even if vhost=off
>>
>>
>> I reverted the commit on master, and this solves the problem for me.
>>
>> Ciao,
>>
>> Claudio
> 
> Well this seems to tell us this is some kind of race.
> Does it change anything if we just retry?
> 
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index f154756e85..8d4e0cce61 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -987,12 +987,22 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, 
> VirtQueue *vq)
>          if (!elem) {
>              break;
>          }
> +        fprintf(stderr, "%s: in %d/%d, out %d/%d\n",
> +                elem->in_num, iov_size(elem->in_sg, elem->in_num),
> +                elem->out_num, iov_size(elem->out_sg, elem->out_num));
> +
>          if (iov_size(elem->in_sg, elem->in_num) < sizeof(status) ||
>              iov_size(elem->out_sg, elem->out_num) < sizeof(ctrl)) {
> +            virtqueue_unpop(vq, elem, 0);
> +            g_free(elem);
> +            continue;
> +#if 0
> +
>              virtio_error(vdev, "virtio-net ctrl missing headers");
>              virtqueue_detach_element(vq, elem, 0);
>              g_free(elem);
>              break;
> +#endif
>          }
>  
>          iov_cnt = elem->out_num;
> @@ -1014,6 +1024,9 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, 
> VirtQueue *vq)
>          } else if (ctrl.class == VIRTIO_NET_CTRL_GUEST_OFFLOADS) {
>              status = virtio_net_handle_offloads(n, ctrl.cmd, iov, iov_cnt);
>          }
> +        fprintf(stderr, "%s: class 0x%x cmd 0x%x cnt %d status 0x%x\n",
> +                ctrl.class, ctrl.cmd, iov_cnt, status);
> +
>  
>          s = iov_from_buf(elem->in_sg, elem->in_num, 0, &status, 
> sizeof(status));
>          assert(s == sizeof(status));
> 

I applied a slight variation of the above to make gcc happy:

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index f154756..74bfc39 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -987,12 +987,22 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, 
VirtQueue *vq)
         if (!elem) {
             break;
         }
+        fprintf(stderr, "%s (%lu): in %u/%lu, out %u/%lu\n",
+                __func__, time(NULL),
+                elem->in_num, iov_size(elem->in_sg, elem->in_num),
+                elem->out_num, iov_size(elem->out_sg, elem->out_num));
+
         if (iov_size(elem->in_sg, elem->in_num) < sizeof(status) ||
             iov_size(elem->out_sg, elem->out_num) < sizeof(ctrl)) {
+            virtqueue_unpop(vq, elem, 0);
+            g_free(elem);
+            continue;
+#if 0
             virtio_error(vdev, "virtio-net ctrl missing headers");
             virtqueue_detach_element(vq, elem, 0);
             g_free(elem);
             break;
+#endif
         }
 
         iov_cnt = elem->out_num;
@@ -1014,6 +1024,9 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, 
VirtQueue *vq)
         } else if (ctrl.class == VIRTIO_NET_CTRL_GUEST_OFFLOADS) {
             status = virtio_net_handle_offloads(n, ctrl.cmd, iov, iov_cnt);
         }
+        fprintf(stderr, "%s (%lu): class 0x%x cmd 0x%x cnt %d status 0x%x\n",
+                __func__, time(NULL),
+                ctrl.class, ctrl.cmd, iov_cnt, status);
 
         s = iov_from_buf(elem->in_sg, elem->in_num, 0, &status, 
sizeof(status));
         assert(s == sizeof(status));
--

The result is actually that it loops forever. I left it running for some 
minutes.

If there is a race condition, maybe it already happened before we reach here?
The "out" parameters are 0/0 on unsuccessful runs, while they are 2/400 on 
success.

Here is a successful run (it sometimes happens). 

virtio_net_handle_ctrl (1536771438): in 1/1, out 2/400
virtio_net_handle_ctrl (1536771438): class 0x0 cmd 0x0 cnt 1 status 0x0
virtio_net_handle_ctrl (1536771438): in 1/1, out 2/400
virtio_net_handle_ctrl (1536771438): class 0x0 cmd 0x0 cnt 1 status 0x0
virtio_net_handle_ctrl (1536771438): in 1/1, out 2/400
virtio_net_handle_ctrl (1536771438): class 0x1 cmd 0x0 cnt 1 status 0x1
virtio_net_handle_ctrl (1536771438): in 1/1, out 2/400
virtio_net_handle_ctrl (1536771438): class 0x0 cmd 0x0 cnt 1 status 0x0

Here is an unsuccessful run - when it happens it then seems to loop forever 
though - retrying does not help:

virtio_net_handle_ctrl (1536771600): in 1/1, out 0/0
virtio_net_handle_ctrl (1536771600): in 1/1, out 0/0
virtio_net_handle_ctrl (1536771601): in 1/1, out 0/0
...
(repeats forever)

Thank you,

Ciao,

Claudio



reply via email to

[Prev in Thread] Current Thread [Next in Thread]