qemu-devel


From: Rafael David Tinoco
Subject: Re: [Qemu-devel] [Bug 1626972] Re: [PATCH] util: secure memfd_create fallback mechanism
Date: Fri, 21 Oct 2016 01:03:15 -0200

Hello again. I finally got back to this, and...
 
I was finishing a patch that adds the open+truncate+mmap+unlink mechanism for 
files specified by a "vhostlog" parameter on tap devices. The patch is done; 
the problem is that "memfd" appears to be used only for shared logs, AND 
vhost-net (used for tap devices) doesn't use a shared log.
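
For reference, the open+truncate+mmap+unlink sequence can be sketched roughly 
as below. This is only an illustration, not the code from my patch: the names 
demo_log_mmap_alloc/demo_log_mmap_free and the "vhostlog-XXXXXX" file name are 
made up for the example, and the temporary file is created with mkstemp().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a QEMU function): create a random file inside
 * `dir`, size it, map it shared, then unlink it so that it only lives as
 * long as the descriptor and the mapping do.
 * Returns the mapping (and the fd through fd_out), or NULL on failure.
 */
void *demo_log_mmap_alloc(const char *dir, size_t size, int *fd_out)
{
    char path[4096];
    void *ptr = MAP_FAILED;
    int fd, n;

    assert(dir != NULL && fd_out != NULL);

    /* "vhostlog-XXXXXX" is an assumed name; mkstemp fills in the X's */
    n = snprintf(path, sizeof(path), "%s/vhostlog-XXXXXX", dir);
    if (n < 0 || (size_t)n >= sizeof(path)) {
        return NULL;
    }

    /* open: mkstemp creates the file exclusively, mode 0600 */
    fd = mkstemp(path);
    if (fd < 0) {
        return NULL;
    }

    /* truncate + mmap */
    if (ftruncate(fd, (off_t)size) == 0) {
        ptr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    }

    /* unlink: done on success and on failure, so no file is left behind */
    unlink(path);

    if (ptr == MAP_FAILED) {
        close(fd);
        return NULL;
    }

    *fd_out = fd;
    return ptr;
}

/* Release the mapping and the descriptor returned by the helper above. */
void demo_log_mmap_free(void *ptr, size_t size, int fd)
{
    if (ptr != NULL) {
        munmap(ptr, size);
    }
    if (fd >= 0) {
        close(fd);
    }
}
```

The descriptor is kept open on purpose: it is what would be handed to the 
other side when the log has to be shared.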

In the following...

(scenario 1)

Linux kvm01 4.8.0-22-generic #24-Ubuntu SMP Sat Oct 8 09:15:00 UTC 2016 x86_64 
x86_64 x86_64 GNU/Linux

with:
-netdev tap,id=net0,vhost=on
-device 
virtio-net-pci,netdev=net0,id=net0,mac=52:54:00:20:c5:42,bus=pci.0,addr=0x3

## kvm01

$ ./instance.sh
qemu_memfd_check
qemu_memfd_alloc: enter
qemu_memfd_alloc: memfd_create with no sealing
qemu_memfd_alloc: memfd_create worked, truncating...
qemu_memfd_alloc: mmaping
qemu_memfd_free: enter
qemu_memfd_check: ok
vhost_dev_start: enter
vhost_log_get: enter
vhost_log_alloc: enter
vhost_log_alloc: local
vhost_log_get: not shared
vhost_log_put: enter
vhost_log_put: enter
vhost_log_put: local free

(qemu) migrate -d tcp:kvm02:4444
(qemu) info migrate
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off 
compres
Migration status: completed
total time: 14586 milliseconds
downtime: 10 milliseconds
setup: 20 milliseconds
transferred ram: 377224 kbytes
throughput: 212.02 mbps
remaining ram: 0 kbytes
total ram: 4001544 kbytes
duplicate: 908879 pages
skipped: 0 pages
normal: 92129 pages
normal bytes: 368516 kbytes
dirty sync count: 4

## kvm02

$ ./instance.sh
qemu_memfd_check
qemu_memfd_alloc: enter
qemu_memfd_alloc: memfd_create with no sealing
qemu_memfd_alloc: memfd_create worked, truncating...
qemu_memfd_alloc: mmaping
qemu_memfd_free: enter
qemu_memfd_check: ok
vhost_dev_start: enter

(scenario 2)

Linux kvm01 3.13.0-99-generic #146-Ubuntu SMP Wed Oct 12 20:56:26 UTC 2016 
x86_64 x86_64 x86_64 GNU/Linux

with:
-netdev tap,id=net0,vhost=on
-device 
virtio-net-pci,netdev=net0,id=net0,mac=52:54:00:20:c5:42,bus=pci.0,addr=0x3

## kvm01

$ ./instance.sh
qemu_memfd_check
qemu_memfd_alloc: enter
qemu_memfd_alloc: memfd_create with no sealing
qemu_memfd_alloc: memfd_create failed #2
qemu_memfd_alloc: fallback
qemu_memfd_alloc: fname = /tmp/memfd-XXXXXX
qemu_memfd_alloc: fallback truncating
qemu_memfd_alloc: mmaping
qemu_memfd_free
qemu_memfd_check: ok
vhost_dev_start: enter
vhost_log_get: enter
vhost_log_alloc: enter
vhost_log_alloc: local
vhost_log_get: not shared
vhost_log_put: enter
vhost_log_put: enter
vhost_log_put: local free

(qemu) migrate -d tcp:kvm02:4444
(qemu) info migrate
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off 
compres
Migration status: completed
total time: 15400 milliseconds
downtime: 9 milliseconds
setup: 5 milliseconds
transferred ram: 375812 kbytes
throughput: 199.99 mbps
remaining ram: 0 kbytes
total ram: 4001544 kbytes
duplicate: 909186 pages
skipped: 0 pages
normal: 91776 pages
normal bytes: 367104 kbytes
dirty sync count: 3

## kvm02

$ ./instance.sh
qemu_memfd_check
qemu_memfd_alloc: enter
qemu_memfd_alloc: memfd_create with no sealing
qemu_memfd_alloc: memfd_create failed #2
qemu_memfd_alloc: fallback
qemu_memfd_alloc: fname = /tmp/memfd-XXXXXX
qemu_memfd_alloc: fallback truncating
qemu_memfd_alloc: mmaping
qemu_memfd_free
qemu_memfd_check: ok
vhost_dev_start: enter

For kvm01, we have 2 parts:

(1) From "-netdev tap,id=net0,vhost=on":
  - net_init_clients()
  - net_init_client()
  - net_client_init()
  - net_client_init1()
  - net_client_init_fun() .. net_init_tap() in my case
  - net_init_tap_one()
  - vhost_net_init()
  - vhost_dev_init()
  - migration checks (host feature, memfd functional test)

(2) From "-device virtio-net-pci,netdev=net0...":
  - virtio_pci_device_plugged()
  - virtio_pci_modern_regions_init()
  - virtio_pci_common_write()
  - virtio_set_status()
  - virtio_net_set_status()
  - virtio_net_vhost_status()
  - vhost_net_start()
  - vhost_net_start_one()
  - vhost_dev_start()
  - does the log allocation logic

It looks like "vhost_requires_shm_log" isn't defined by my underlying vhost 
driver (vhost-net in my case). It seems that vhost-user defines it (in 
VhostOps user_ops).

Judging by the outputs above, it looks like vhost_dev_log_is_shared is 
returning false, so (2), vhost_dev_start, uses a different log allocation 
(malloc) than the one that was tested at (1), vhost_dev_init, to decide 
whether migration is allowed.

Question: why check for "memfd" before we know whether a shared descriptor 
and memory pointer will be needed for the migration at all? Do you want me to 
change that? If memfd fails but the guest in question is using regular 
"malloc" for the vhost log, we mistakenly mark it as unable to live migrate. 
I could check the vhost_requires_shm_log pointer during vhost_dev_init 
(coming from tap).
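
To illustrate the idea, here is a small self-contained model of that check. 
The demo_* structs and functions are stand-ins invented for this example (only 
the vhost_requires_shm_log name comes from the real code), so please read it 
as a sketch of the intended logic and not as the actual change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_vhost_dev;

/* Stand-in for the backend ops table; only the one hook that matters here. */
struct demo_vhost_ops {
    /* NULL when the backend does not define the hook */
    bool (*vhost_requires_shm_log)(struct demo_vhost_dev *dev);
};

struct demo_vhost_dev {
    const struct demo_vhost_ops *ops;
};

/* The log is shared only if the backend defines the hook and it says so. */
bool demo_log_is_shared(struct demo_vhost_dev *dev)
{
    assert(dev != NULL && dev->ops != NULL);
    return dev->ops->vhost_requires_shm_log != NULL &&
           dev->ops->vhost_requires_shm_log(dev);
}

/*
 * Proposed decision at init time: a failing memfd test blocks migration
 * only when the device will actually need a shared log. With a local
 * (malloc) log the memfd result is irrelevant.
 */
bool demo_should_block_migration(struct demo_vhost_dev *dev, bool memfd_works)
{
    if (!demo_log_is_shared(dev)) {
        return false;
    }
    return !memfd_works;
}

/* Example hook for a backend that always wants a shared log. */
bool demo_always_requires_shm_log(struct demo_vhost_dev *dev)
{
    (void)dev;
    return true;
}
```

With a backend that leaves the hook NULL (my vhost-net case), 
demo_should_block_migration() returns false even when the memfd test fails, 
which is the behaviour I am asking about.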

Also, if possible, I would like comments about a draft:

https://pastebin.canonical.com/168579/
(please disregard printfs and minor problems)

Note: I'm basically removing the fallback mechanism from memfd, creating a 
generic qemu_mmap_XXX implementation, adding a vhostlog parameter to the tap 
cmdline, AND changing the decision on what to use. If vhostlog is present in 
the cmdline, qemu_mmap_XXX on vhostlog is used: if it is a directory, a random 
file is created inside it; if it is a file, that file is used. If no vhostlog 
is given (the default while libvirt isn't changed), it first tries memfd 
(available on all newer kernels) and, if that is not possible, falls back to 
the qemu_mmap mechanism, creating random files in the "tmp" directory.
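
The decision order described above could be modelled like this. It is a 
sketch only: the enum and demo_pick_log_backing() are names invented for the 
example, and treating any non-directory vhostlog value as a file path is an 
assumption of the sketch, not necessarily what the draft does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>

/* Possible backings for the vhost log, in the order they are considered. */
enum demo_log_backing {
    DEMO_LOG_FILE_IN_DIR, /* vhostlog is a directory: random file inside it */
    DEMO_LOG_GIVEN_FILE,  /* vhostlog is a file path: use that file */
    DEMO_LOG_MEMFD,       /* no vhostlog, memfd available */
    DEMO_LOG_TMP_FILE     /* no vhostlog, no memfd: random file in "tmp" */
};

/*
 * Hypothetical policy function: `vhostlog` is the value of the tap
 * parameter, or NULL when it was not given on the command line.
 */
enum demo_log_backing demo_pick_log_backing(const char *vhostlog,
                                            bool memfd_available)
{
    struct stat st;

    /* an empty path would be a caller bug */
    assert(vhostlog == NULL || vhostlog[0] != '\0');

    if (vhostlog != NULL) {
        /* an explicit vhostlog always wins, even if memfd would work */
        if (stat(vhostlog, &st) == 0 && S_ISDIR(st.st_mode)) {
            return DEMO_LOG_FILE_IN_DIR;
        }
        return DEMO_LOG_GIVEN_FILE;
    }

    return memfd_available ? DEMO_LOG_MEMFD : DEMO_LOG_TMP_FILE;
}
```

Making an explicit vhostlog take priority over memfd is what lets a management 
layer force a file in a path it controls.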

PS: Remember that this is because of SELinux/AppArmor labelling on tmp files 
(and because file descriptors can be passed around, as we discussed before).

If that is okay I'll provide a patch asap. Let me know if you prefer something 
else.

Thank you,
Rafael

> On Oct 04, 2016, at 12:29, Rafael David Tinoco <address@hidden> wrote:
> 
> 
>> On Oct 04, 2016, at 10:50, Marc-André Lureau <address@hidden> wrote:
>> 
>> What about having a single config parameter as a place to put all vhost logs 
>> for all drives for a single instance ? Remove the memfd implementation with 
>> all the memfd shared_memory option ? Replace it with a 
>> open+unlink+ftruncate+mmap approach only.
>> 
>> 
>> I fail to see your point, memfd is superior to open+unlink and has other 
>> advantages with sealing etc.
> 
> I was just summarising needs based on previous statement from Daniel:
> 
>> This makes me wonder about the memfd_create() code path too - we'll
>> again not want that external process to be granted access to arbitrary
>> FDs of QEMU's and I'm not sure of a way to get the memfd  FD to have
>> a specific label. So I think it is possible that when using libvirt
>> we'll want the ability to tell QEMU to *always* use an explicit file
>> in a path libvirt specifies, and never use memfd even if available.
>> 
>> Regards,
>> Daniel
