Re: [Qemu-devel] [GPU and VFIO] qemu hang at startup, VFIO_IOMMU_MAP_DMA is extremely slow


From: Bob Chen
Subject: Re: [Qemu-devel] [GPU and VFIO] qemu hang at startup, VFIO_IOMMU_MAP_DMA is extremely slow
Date: Tue, 2 Jan 2018 15:04:37 +0800

Ping...

Was it because VFIO_IOMMU_MAP_DMA needs contiguous memory, and my host was
not able to provide it immediately?
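
For reference, VFIO_IOMMU_MAP_DMA takes a range of the caller's virtual
address space plus a target IOVA; the type1 backend pins the pages backing
that range and programs the IOMMU. A minimal userspace sketch, assuming an
already-opened type1 container fd (map_dma and its arguments are
placeholders for illustration, not QEMU code):

    #include <linux/vfio.h>
    #include <stdint.h>
    #include <sys/ioctl.h>

    /* Map `len` bytes of user memory at `buf` to device address `iova`
     * through an open VFIO type1 container fd. */
    static int map_dma(int container_fd, void *buf, uint64_t len,
                       uint64_t iova)
    {
        struct vfio_iommu_type1_dma_map dma_map = {
            .argsz = sizeof(dma_map),
            .flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
            .vaddr = (uintptr_t)buf, /* user virtual address; its pages get pinned */
            .iova  = iova,           /* address the device will use */
            .size  = len,            /* must be page-aligned */
        };
        return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
    }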

2017-12-26 19:37 GMT+08:00 Bob Chen <address@hidden>:

>
>
> 2017-12-26 18:51 GMT+08:00 Liu, Yi L <address@hidden>:
>
>> > -----Original Message-----
>> > From: Qemu-devel [mailto:qemu-devel-bounces+yi.l.liu=
>> address@hidden
>> > On Behalf Of Bob Chen
>> > Sent: Tuesday, December 26, 2017 6:30 PM
>> > To: address@hidden
>> > Subject: [Qemu-devel] [GPU and VFIO] qemu hang at startup,
>> > VFIO_IOMMU_MAP_DMA is extremely slow
>> >
>> > Hi,
>> >
>> > I have a host server with multiple GPU cards, and was assigning them to
>> > qemu with VFIO.
>> >
>> > I found that when setting up the last free GPU, the qemu process would
>> > hang
>>
>> Are all the GPUs in the same iommu group?
>>
>
> Each of them is in its own group.
>
>
>>
>> > there and took almost 10 minutes to finish startup. I did some digging
>> > with gdb, and found that the slowest part was the
>> > hw/vfio/common.c:vfio_dma_map function call.
>>
>> This is to set up the mapping, and it takes time. This function is called
>> multiple times, so some time spent there is expected. By "the slowest
>> part", do you mean that a single vfio_dma_map() call takes a long time, or
>> that the whole passthrough setup spends a lot of time creating mappings?
>> If a single call takes a long time, then it may be a problem.
>>
>
> Each vfio_dma_map() call takes 3 to 10 minutes.
>
>
>>
>> Pasting your QEMU command line might help. The dmesg output on the host
>> would also help.
>>
>
> cmd line:
> After adding -device vfio-pci,host=09:00.0,multifunction=on,addr=0x15,
> qemu would hang.
> Without this option, it could start immediately.
>
> dmesg:
> [Tue Dec 26 18:39:50 2017] vfio-pci 0000:09:00.0: enabling device (0400 -> 0402)
> [Tue Dec 26 18:39:51 2017] vfio_ecap_init: 0000:09:00.0 hiding ecap address@hidden
> [Tue Dec 26 18:39:51 2017] vfio_ecap_init: 0000:09:00.0 hiding ecap address@hidden
> [Tue Dec 26 18:39:55 2017] kvm: zapping shadow pages for mmio generation wraparound
> [Tue Dec 26 18:39:55 2017] kvm: zapping shadow pages for mmio generation wraparound
> [Tue Dec 26 18:40:03 2017] kvm [74663]: vcpu0 ignored rdmsr: 0x345
>
> Kernel:
> 3.10.0-514.16.1  CentOS 7.3
>
>
>>
>> >
>> >
>> > static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
>> >                         ram_addr_t size, void *vaddr, bool readonly)
>> > {
>> >     ...
>> >     if (ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0 ||
>> >         (errno == EBUSY && vfio_dma_unmap(container, iova, size) == 0 &&
>> >          ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0)) {
>> >         return 0;
>> >     }
>> >     ...
>> > }
>> >
>> >
>> > The hang was reproducible on one of my hosts. I was setting up a VM
>> > with 4GB of memory, while the host still had 16GB free. The GPU's
>> > physical memory is 8GB.
>>
>> Does it happen when you only assign a single GPU?
>>
>
> Not sure. I didn't try with a single GPU.
>
>
>>
>> > Also, this phenomenon was occasionally observed on other hosts, and the
>> > common point is that it always happened on the last free GPU.
>> >
>> >
>> > The full stack trace file is attached. Looking forward to your help,
>> > thanks
>> >
>> >
>> > - Bob
>>
>> Regards,
>> Yi L
>>
>
>

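To confirm whether individual calls are the bottleneck, one option is to
time the ioctl directly; a minimal standalone sketch (not part of the
original thread; timed_map_dma is a hypothetical helper, not QEMU code):

    #include <linux/vfio.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <time.h>

    /* Wrap the map ioctl with a monotonic-clock timer to measure how
     * long each individual call takes. */
    static int timed_map_dma(int container_fd,
                             struct vfio_iommu_type1_dma_map *map)
    {
        struct timespec t0, t1;
        int ret;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        ret = ioctl(container_fd, VFIO_IOMMU_MAP_DMA, map);
        clock_gettime(CLOCK_MONOTONIC, &t1);

        fprintf(stderr, "map iova=0x%llx size=0x%llx: %.3f s (ret=%d)\n",
                (unsigned long long)map->iova,
                (unsigned long long)map->size,
                (t1.tv_sec - t0.tv_sec) +
                (t1.tv_nsec - t0.tv_nsec) / 1e9, ret);
        return ret;
    }

On an already-running qemu process, strace -T -e trace=ioctl -p <qemu-pid>
gives similar per-call timings without rebuilding anything.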
