Re: [Qemu-devel] Re: [PATCH 26/35] kvm: Eliminate KVMState arguments


From: Jan Kiszka
Subject: Re: [Qemu-devel] Re: [PATCH 26/35] kvm: Eliminate KVMState arguments
Date: Mon, 10 Jan 2011 21:34:30 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.1.12) Gecko/20080226 SUSE/2.0.0.12-1.1 Thunderbird/2.0.0.12 Mnenhy/0.7.5.666

Am 10.01.2011 21:23, Anthony Liguori wrote:
> On 01/10/2011 02:12 PM, Jan Kiszka wrote:
>> Am 10.01.2011 20:59, Anthony Liguori wrote:
>>   
>>> On 01/08/2011 02:47 AM, Jan Kiszka wrote:
>>>     
>>>> Am 08.01.2011 00:27, Anthony Liguori wrote:
>>>>
>>>>       
>>>>> On 01/07/2011 03:03 AM, Jan Kiszka wrote:
>>>>>
>>>>>         
>>>>>> Am 06.01.2011 20:24, Anthony Liguori wrote:
>>>>>>
>>>>>>
>>>>>>           
>>>>>>> On 01/06/2011 11:56 AM, Marcelo Tosatti wrote:
>>>>>>>
>>>>>>>
>>>>>>>             
>>>>>>>> From: Jan Kiszka<address@hidden>
>>>>>>>>
>>>>>>>> QEMU supports only one VM, so there is only one kvm_state per process,
>>>>>>>> and we gain nothing passing a reference to it around. Eliminate any need
>>>>>>>> to refer to it outside of kvm-all.c.
>>>>>>>>
>>>>>>>> Signed-off-by: Jan Kiszka<address@hidden>
>>>>>>>> CC: Alexander Graf<address@hidden>
>>>>>>>> Signed-off-by: Marcelo Tosatti<address@hidden>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>                
>>>>>>> I think this is a big mistake.
>>>>>>>
>>>>>>>
>>>>>>>              
>>>>>> Obviously, I don't share your concerns. :)
>>>>>>
>>>>>>
>>>>>>
>>>>>>           
>>>>>>> Having to manage kvm_state keeps the abstraction lines well defined.
>>>>>>>
>>>>>>>
>>>>>>>              
>>>>>> How does it help?
>>>>>>
>>>>>>
>>>>>>
>>>>>>           
>>>>>>> Otherwise, it's far too easy for portions of code to call into KVM
>>>>>>> functions that really shouldn't.
>>>>>>>
>>>>>>>
>>>>>>>              
>>>>>> I can't imagine we gain anything from requiring kvm_check_extension
>>>>>> callers to hold a kvm_state "capability". Yes, it's now much easier to
>>>>>> call kvm_[vm_]ioctl, but that's the key point of this change:
>>>>>>
>>>>>> So far we primarily complicated the internal interface between generic
>>>>>> and arch-dependent kvm parts by requiring kvm_state juggling. But
>>>>>> external users already find interfaces without this restriction
>>>>>> (kvm_log_*, kvm_ioeventfd_*, ...). That's because it's at least
>>>>>> complicated to _cleanly_ pass kvm_state references to all users that
>>>>>> need them - e.g. sysbus devices like kvmclock or upcoming in-kernel
>>>>>> irqchips.
>>>>>>
>>>>>>
>>>>>>            
>>>>> I think you're basically making my point for me.
>>>>>
>>>>> ioeventfd is a broken interface.  It shouldn't be a VM ioctl but rather
>>>>> a VCPU ioctl because PIO events are dispatched on a per-VCPU basis.
>>>>>
>>>>>          
>>>> OK, but I don't want to argue about the ioeventfd API. So let's put this
>>>> case aside. :)
>>>>
>>>>
>>>>       
>>>>> kvm_state is available as part of CPU state so it's quite easy to get at
>>>>> if these interfaces just took a CPUState argument (and they should).
>>>>>
>>>>>          
>>>> My point is definitely NOT about cpu-bound devices. That case is clear
>>>> and is not touched at all by this patch.
>>>>
>>>> My point is about devices that have clear system scope like kvmclock,
>>>> ioapic, pit, pic,
>>>>        
>>> I don't see how ioapic, pit, or pic have a system scope.
>>>      
>> They are not bound to any CPU like the APIC which you may have in mind.
>>    
> 
> And none of the above interact with KVM.
> 
> They may be replaced by KVM but if you look at the PIT, this is done by
> having two distinct devices.  The KVM specific device can (and should)
> be instantiated with kvm_state.
> 
> The way the IOAPIC/APIC/PIC is handled in qemu-kvm is nasty.  The kernel
> devices are separate devices and that should be reflected in the device
> tree.

Whether it is a separate device or a hack to an existing one, both need
to sync their user space state with the kernel when QEMU asks them to.
That is how they have to interact with KVM all the time. The same goes
for kvmclock, if you want to look at a really trivial example.
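
To make the two options concrete, here is a minimal sketch in C of a
system-scope device syncing its state before a save, once through an
explicit handle and once through a core that keeps its single state
private. Everything in it is made up for illustration (FakeKVMState,
core_get_clock*, FakeClockDevice, the value 1234); it is not the kvmclock
code from this series and it issues no real KVM ioctl.

```c
/* Hypothetical sketch, not QEMU code: all names and values are invented. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the per-VM handle discussed in this thread. */
typedef struct FakeKVMState {
    uint64_t kernel_clock;   /* pretend value held by the kernel */
} FakeKVMState;

/* The one instance per process, private to the "core". */
static FakeKVMState the_state = { .kernel_clock = 1234 };

/* Variant 1: explicit handle, every caller has to obtain one first. */
int core_get_clock_explicit(FakeKVMState *s, uint64_t *out)
{
    assert(out != NULL);
    if (s == NULL) {
        return -1;           /* caller had no handle to pass */
    }
    *out = s->kernel_clock;
    return 0;
}

/* Variant 2: implicit handle, the core supplies its single instance. */
int core_get_clock(uint64_t *out)
{
    return core_get_clock_explicit(&the_state, out);
}

/* A system-scope device: it is not bound to any CPU. */
typedef struct FakeClockDevice {
    uint64_t clock;          /* user space copy of the kernel value */
} FakeClockDevice;

/* Called when the device is asked to sync before its state is saved. */
int fake_clock_pre_save(FakeClockDevice *dev)
{
    return core_get_clock(&dev->clock);
}
```

With variant 1 the device would additionally need a FakeKVMState pointer
handed to it at instantiation time, which is the plumbing in question.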

> 
>>> I don't know enough about kvmclock.
>>>      
>> It's just the same.
>>
>>   
>>>     
>>>>    whatever-the-future-will-bring. And about KVM services
>>>> that have global scope like capability checks and other feature
>>>> explorations or VM configurations done by the KVM arch code. You still
>>>> didn't explain what we gain in these concrete scenarios by handing the
>>>> technically redundant abstraction kvm_state around, especially _inside_
>>>> the KVM core.
>>>>
>>>>        
>>> If you have to pass around a KVMState pointer, you establish an explicit
>>> relationship and communication between subsystems.  Any place where the
>>> global KVMState is used is a red flag that something is wrong.
>>>      
>> It is and will be _only_ used inside kvm-all.c. Again: What is the
>> benefit of restricting access to kvm_check_extension this way?
>>    
> 
> The more places that need to deal with KVM compatibility code, the worse
> we will be because it's more opportunities to get it wrong.

That code belongs where the related logic is. IMHO, it would be a
needless abstraction to push in-kernel access services and workaround
definitions into the KVM core instead of keeping them in the KVM device
model code, provided there is only one user.

But this discussion is a bit abstract right now as we do not yet have
anything more complex than kvmclock on the table for QEMU.
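
As a rough illustration of what "where the related logic is" could look
like, assume a device that is the only user of some capability. The
sketch below is hypothetical throughout: FAKE_CAP_CLOCK,
FAKE_CAP_CLOCK_V2, fake_check_extension and FakeClockDevice are invented
names, and the capability numbers are not real KVM_CAP_* values.

```c
/* Hypothetical sketch, not QEMU code: all names and values are invented. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { FAKE_CAP_CLOCK = 1, FAKE_CAP_CLOCK_V2 = 2 };

/* Stand-in for the core's generic query: > 0 means "supported". */
int fake_check_extension(int cap)
{
    return cap == FAKE_CAP_CLOCK ? 1 : 0;   /* pretend the kernel lacks V2 */
}

typedef struct FakeClockDevice {
    bool use_v2;   /* which kernel interface the device will talk to */
} FakeClockDevice;

/* Device model code: the single user keeps both the probe and the fallback. */
int fake_clock_device_init(FakeClockDevice *dev)
{
    assert(dev != NULL);
    if (fake_check_extension(FAKE_CAP_CLOCK_V2) > 0) {
        dev->use_v2 = true;
    } else if (fake_check_extension(FAKE_CAP_CLOCK) > 0) {
        dev->use_v2 = false;   /* workaround path for older kernels */
    } else {
        return -1;             /* no in-kernel clock at all */
    }
    return 0;
}
```

The core only offers the generic query here; the knowledge about which
fallback to take stays next to the device that needs it.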

> 
>>> I don't see what the advantage is to making all of the KVMState global
>>> and implicit.  It seems like a big step backwards to me.  Can you give a
>>> very concrete example of where you think it results in easier to
>>> understand code?  I don't see how making relationships implicit ever
>>> makes code easier to understand.
>>>      
>> The best example does not yet exist (fortunately): Just look at patch 28
>> and then try to pass some kvm_state reference to the kvmclock device. Is
>> this handle worth changing the sysbus API?
>>    
> 
> Let me look at that patch and reply there.
> 

OK, great.

Jan


