[Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support


From: Anthony Liguori
Subject: [Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support
Date: Sun, 28 Feb 2010 10:02:29 -0600
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.5) Gecko/20091209 Fedora/3.0-4.fc12 Lightning/1.0pre Thunderbird/3.0

On 02/27/2010 01:38 PM, Michael S. Tsirkin wrote:
On Fri, Feb 26, 2010 at 09:18:03AM -0600, Anthony Liguori wrote:
On 02/26/2010 08:49 AM, Michael S. Tsirkin wrote:
KVM code needs all kinds of work-arounds for KVM-specific issues.
It also assumes that KVM is registered at startup, so it
does not try to optimize finding slots.

No, the slot mapping changes dynamically so KVM certainly needs to
optimize this.
Maybe, but it does not; KVM's algorithms are O(n^2) or worse.

But n is small and the mappings don't change frequently.

More importantly, they change at the exact same times for vhost as they do for kvm. So even if vhost has an O(n) algorithm, the KVM code gets executed either immediately before or immediately after the vhost code, and your optimizations are lost in KVM's O(n^2) algorithm.
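To put the complexity argument in concrete terms: with a handful of slots, the lookup is a few comparisons either way.  A minimal sketch of an O(n) slot walk (made-up structure and function names, not the actual kvm or vhost code):

#include <stddef.h>
#include <stdint.h>

/* Illustrative only: a tiny slot table with a linear lookup. */
struct mem_slot {
    uint64_t guest_phys_addr;
    uint64_t size;
    void *host_addr;
};

struct slot_table {
    struct mem_slot slots[64];
    unsigned int nslots;
};

/* Return the slot covering a guest physical address, or NULL. */
static struct mem_slot *slot_find(struct slot_table *t, uint64_t gpa)
{
    unsigned int i;

    for (i = 0; i < t->nslots; i++) {
        struct mem_slot *s = &t->slots[i];
        if (gpa >= s->guest_phys_addr &&
            gpa < s->guest_phys_addr + s->size) {
            return s;
        }
    }
    return NULL;
}

Even if this walk happens on every mapping change, it is dwarfed by the O(n^2) work the KVM code does at the same points.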

Mine has no bugs, let's switch to it!

Seriously, need to tread very carefully here.
This is why I say: merge it, then look at how to reuse code.

Once it's merged, there's no incentive to look at reusing code.
Again, I don't think this is a huge burden to vhost.  The two bits of code
literally do exactly the same thing.  They just use different data
structures that ultimately contain the same values.
Not exactly. For example, kvm tracks ROM and video ram addresses.

KVM treats ROM and RAM the same (it even maps ROM as RAM). There is no special handling for video ram addresses.

There is some magic in the VGA code to switch the VGA LFB from mmio to ram when possible but that happens at a higher layer.

'++i' is an odd thing to do in C in a for() loop.  We're not explicit
about it in Coding Style but the vast majority of code just does
'i++'.
Ugh. Do we really need to specify every little thing?

I don't care that much about coding style. I don't care if there are curly brackets on single line ifs.

However, it's been made very clear to me that most other people do and that it's something that's important to enforce.
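
For concreteness, all the convention amounts to is the postfix form in the loop header; a trivial standalone example:

#include <stdio.h>

int main(void)
{
    int i;

    /* '++i' and 'i++' behave identically as the loop increment; the
     * prevailing form in the tree is simply the postfix one. */
    for (i = 0; i < 10; i++) {
        printf("%d\n", i);
    }
    return 0;
}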

Hmm. I'll look into it.
I actually think that for functions that just do a list of things
unconditionally, without branches or loops, or with just error handling
as here, it is perfectly fine for them to be of any length.

Like I said, just a suggestion.

You never unmap() the mapped memory and you're cheating by assuming that
the virtio rings have a constant mapping for the lifetime of a guest.
That's not technically true.  My concern is that since a guest can
trigger remappings (by adjusting PCI mappings), badness can ensue.

I do not know how this can happen. What do PCI mappings have to do with this?
Please explain. If it can, vhost will need notification to update.

If a guest modifies the BAR for an MMIO region such that it happens to
exist in RAM, while this is a bad thing for the guest to do, I don't
think we do anything to stop it.  When the region gets remapped, the
result will be that the mapping will change.
So IMO this is the bug. If there's a BAR that matches a RAM
physical address, it should never get mapped. Any idea how
to check this?

We could check it when the BAR is mapped in the PCI layer. I'm suspicious there are other ways a guest can enforce/determine mappings though.
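
Something along these lines in the PCI layer would do it.  This is only a sketch; range_overlaps_ram() is a hypothetical helper standing in for whatever would actually query the registered RAM regions, not an existing qemu function:

#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in: true if [addr, addr + size) intersects any
 * region currently registered as RAM.  Stubbed here as if RAM were
 * the first 256MB. */
static bool range_overlaps_ram(uint64_t addr, uint64_t size)
{
    return size != 0 && addr < 256ULL * 1024 * 1024;
}

/* Called when the guest programs a BAR: refuse mappings over RAM. */
static int pci_check_bar_mapping(uint64_t new_addr, uint64_t size)
{
    if (range_overlaps_ram(new_addr, size)) {
        fprintf(stderr, "pci: BAR at 0x%" PRIx64 " overlaps RAM, rejecting\n",
                new_addr);
        return -1;
    }
    return 0;
}

int main(void)
{
    /* A BAR programmed at 0xe0000000 is outside the pretend RAM range. */
    return pci_check_bar_mapping(0xe0000000ULL, 0x1000) ? 1 : 0;
}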

Generally speaking, I think it's necessary to assume that a guest can manipulate memory mappings. If we can prove that a guest cannot, it would definitely simplify the code a lot. I'd love to make the same assumptions in virtio userspace because it's actually a big source of overhead.

I'm pretty sure, though, that we have to let a guest control mappings.

Within qemu, because we hold the qemu_mutex, we know that the mappings
are fixed as long as we're in qemu.  We're very careful not to rely on
a mapping after we drop the qemu_mutex.

With vhost, you register a slot table and update it whenever mappings
change.  I think that's good enough for dealing with ram addresses.  But
you pass the virtual address for the rings and assume those mappings
never change.
So, the issue IMO is that an MMIO address gets passed instead of RAM.
There's no reason to put virtio rings anywhere but RAM; we just need to
verify this.

Yes, but we don't always map PCI IO regions as MMIO or PIO. In particular, for VGA devices (particularly VMware VGA), we map certain IO regions as RAM because that's how the device is designed. Likewise, if we do shared memory PCI devices using IO regions as the ram contents, we would be mapping those as ram too.

So just checking to see if the virtio ring area is RAM or not is not enough. A guest may do something that causes the virtio ring to still be RAM, but at a different ram address. Now the vhost code is writing to RAM that it thinks is guest physical address X but is really guest physical address Y.

This is not something that a guest can use to break into qemu, but it is an emulation bug and, depending on the guest OS, it may be possible to use it to do a privilege escalation within the guest.

I think the only way to handle this is to explicitly check for changes in the physical addresses the rings are mapped at and do the appropriate ioctls to vhost to let it know if the ring's address has changed.
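
Roughly, that would mean re-translating the ring's addresses whenever its mapping changes and pushing them to the kernel again with VHOST_SET_VRING_ADDR.  A sketch, where lookup_hva() is a hypothetical stand-in for qemu's guest-physical-to-host-virtual translation:

#include <linux/vhost.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical stand-in for the gpa -> hva translation; not a real
 * qemu function. */
extern void *lookup_hva(uint64_t guest_phys_addr);

/* Re-tell vhost where a ring lives after its mapping has changed. */
static int vhost_update_vring_addr(int vhost_fd, unsigned int index,
                                   uint64_t desc_gpa, uint64_t avail_gpa,
                                   uint64_t used_gpa)
{
    struct vhost_vring_addr addr;

    memset(&addr, 0, sizeof(addr));
    addr.index = index;
    addr.desc_user_addr  = (uintptr_t)lookup_hva(desc_gpa);
    addr.avail_user_addr = (uintptr_t)lookup_hva(avail_gpa);
    addr.used_user_addr  = (uintptr_t)lookup_hva(used_gpa);

    return ioctl(vhost_fd, VHOST_SET_VRING_ADDR, &addr);
}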

I'm pretty sure a guest can cause those to change.  I'm not 100% sure,
but I think it's a potential source of exploits if you assume a mapping.
At the very least, a guest can trick vhost into writing to RAM that it
wouldn't normally write to.
This seems harmless. A guest can write anywhere in RAM anyway.

Not all guest code is created equal and if we're writing to the wrong guest ram location, it can potentially circumvent the guest's security architecture.

Regards,

Anthony Liguori




