Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to

From:	Andreas Färber
Subject:	Re: [Qemu-devel] [PATCH 0/2] target-i386: "custom" CPU model + script to dump existing CPU models
Date:	Tue, 23 Jun 2015 21:41:51 +0200
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0

On 23.06.2015 at 21:25, Eduardo Habkost wrote:
> On Tue, Jun 23, 2015 at 08:35:54PM +0200, Andreas Färber wrote:
>> On 23.06.2015 at 19:39, Eduardo Habkost wrote:
>>> On Tue, Jun 23, 2015 at 07:18:06PM +0200, Andreas Färber wrote:
>>>> On 23.06.2015 at 19:08, Eduardo Habkost wrote:
>>>>> On Tue, Jun 23, 2015 at 06:44:57PM +0200, Andreas Färber wrote:
>>>>>> On 23.06.2015 at 18:38, Eduardo Habkost wrote:
>>>>>>> On Tue, Jun 23, 2015 at 06:33:05PM +0200, Michael S. Tsirkin wrote:
>>>>>>>> On Tue, Jun 23, 2015 at 05:25:55PM +0100, Daniel P. Berrange wrote:
>>>>>>>>> Whether QEMU changed the CPU for existing machines, or only for new
>>>>>>>>> machines is actually not the core problem. Even if we only changed
>>>>>>>>> the CPU in new machines that would still be an unsatisfactory
>>>>>>>>> situation, because we want to be able to access different versions
>>>>>>>>> of the CPU without the machine type changing, and access different
>>>>>>>>> versions of the machine type without the CPU changing. IOW it is
>>>>>>>>> the fact that the changes in CPU are tied to changes in machine type
>>>>>>>>> that is the core problem.
>>>>>>>>
>>>>>>>> But that's because we are fixing bugs.  If CPU X used to work on
>>>>>>>> hardware Y in machine type A and stopped in machine type B, this is
>>>>>>>> because we have determined that it's the right thing to do for the
>>>>>>>> guests and the users. We don't break stuff just for fun.
>>>>>>>> Why do you want to bring back the bugs we fixed?
>>>>>>>
>>>>>>> I didn't take the time to count them, but I bet most of the commits I
>>>>>>> listed in my previous e-mail message are not bug fixes, but new
>>>>>>> features.
>>>>>>
>>>>>> Huh? Of course the latest machine model gets new features. The point is
>>>>>> that the previous ones don't and that's what we are providing them for -
>>>>>> libvirt is expected to choose one machine and the contract with QEMU is
>>>>>> that for that machine the CPU does *not* grow new features, and we're
>>>>>> going to great lengths to achieve that. So this thread feels more and
>>>>>> more weird...
>>>>>
>>>>> We are not talking about changes to existing machines. We are talking
>>>>> about having changes introduced in new machines (the ones we made on
>>>>> purpose) affecting the runnability of the VM.
>>>>
>>>> You are talking abstract!
>>>
>>> I am just talking about a different problem, and I don't know if you are
>>> purposely trying to ignore it, or are just denying that it is a problem.
>>
>> So, are you and Dan talking about the same problem or different ones?
> 
> The same one.
> 
>> I am not deliberately ignoring anything here, but I am denying there is
>> a problem until either of you explains what a concrete problem is. Seems
>> we are slowly getting there now.
> 
> I hope so. :)
> 
>>
>>>> Example 1:
>>>>
>>>> Point A: Machine pc-i440fx-2.3 exists
>>>>
>>>> Runs or runs not.
>>>>
>>>> Point B: Machine pc-i440fx-2.3 still exists
>>>>
>>>> Still runs or runs not due to guest ABI stability rules.
>>>
>>> If you didn't change the machine name, this is not the problem we are
>>> talking about.
>>
>> OK.
>>
>>>> Example 2:
>>>>
>>>> Point A: pc-i440fx-2.4 does not exist in 2.3
>>>>
>>>> Does not run because it doesn't exist.
>>>>
>>>> Point B: New pc-i440fx-2.4
>>>>
>>>> Runs or does not run, and if so has more features than pc-i440fx-2.3.
>>>
>>> If you didn't change the machine name, this is not the problem we are
>>> talking about.
>>>
>>>>
>>>> There is no runnability problem - either it runs or it doesn't, but
>>>> there's no change over time.
>>>>
>>>> This is what the machine -x.y versioning is all about.
>>>
>>> Let's try a concrete example:
>>>
>>> * User is running a kernel that can't emulate x2apic
>>> * User is running pc-i440fx-1.7
>>> * User wants the gigabyte alignment change implemented by commit
>>>   bb43d3839c29b17a2f5c122114cd4ca978065a18
>>> * User changes machine to pc-i440fx-2.0
>>> * x2apic is now enabled by default in all CPU models
>>> * VM with the same configuration (just the machine change) is not
>>>   runnable anymore in the same host
>>
>> Then let's take a step back: In order to change the machine type, the
>> user shuts the machine down (it does not run!), edits the XML and tries
>> to boot it up again. That's where I challenged your use of the term
>> changed "runnability" above. I acknowledge that it might happen that it
>> does not run. But that has nothing to do with compatibility of QEMU
>> versions v2.3.0 vs. v2.4.0 then, it is the user's active choice of
>> options that are incompatible with her system and that never before
>> worked there. That seems perfectly valid and unavoidable, just like
>> adding a non-existing command-line option or an unknown XML element to
>> the guest config.
> 
> If you add a non-existing command-line option or unknown XML element,
> you are providing bad input. The most obvious way to handle it is an
> error.
> 
> If you change -machine, you are providing good input, but the user won't
> have a good explanation why it can't run (because it is a perfectly
> valid machine name, reported as supported by QEMU). Or maybe they will
> see an explanation, but will have no idea what they need to change to
> make the new machine runnable.
> 
> CPU model runnability, on the other hand, is well documented in the
> libvirt API, and would even allow a management system to automatically
> find a solution (because it can tell exactly what's the feature
> preventing the VM from running), and tell the user which configurations
> are runnable.
> 
>>
>> The difference of opinion seems to be that when there is a bug in QEMU,
>> I require that the user updates QEMU (not necessarily to a new version),
>> whereas you are proposing that libvirt should be the one to work around
>> bugs in QEMU by tweaking command line parameters.
> 
> In the case of bugs related to CPU definitions, yes, because they are
> different kinds of changes: they affect runnability of the VM when they
> are enabled.
> 
>>
>> In order to get a virtio-scsi or gigabyte alignment fix that varies
>> across -x.y machines, that feature can just as well be enabled via
>> global properties on the old machine. New machines are primarily for new
>> features.
> 
> Replace "gigabyte alignment" with any possible reason a user 5 years
> from now may want to change to a newer machine.
> 
> If everything was configurable using globals, we wouldn't even need to
> introduce new machines, and we could let libvirt or the user configure
> everything they want, and they would never touch the machine name again
> in their configuration. But we don't live in that world yet.
> 
>>
>> If someone wants to use that new -2.0 machine, they need to pass the
>> correct options such as ",-x2apic" in your example or use a CPU model
>> that does not enable such options by default. (FWIW in that concrete
>> example I remember Paolo(?) saying that that feature had been supported
>> for a really long time already.)
>> The user, who actively edited the guest definition, gets an error
>> message and has to edit the guest again and then it starts.
> 
> That's what we are trying to avoid. The user may not be human, the user
> may have thousands of machines being upgraded, and even a human may take
> some time to figure out what's needed to make the VM runnable again.
> 
> libvirt has no API for checking if a machine name is runnable, or for
> checking if a machine+CPU combination is runnable. And I don't see a
> reason to force them to add that to their API if they could just
> decouple the sets of CPU features from the machine versions.

I am going to stop arguing here and suggest you put this on the agenda
for the next KVM call.

Given that we have had this versioning system for years and no problem
specifically with 2.4 has been raised, I see this as 2.5+ material at
this point.

Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)
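
The x2apic example quoted above (moving a guest from pc-i440fx-1.7 to
pc-i440fx-2.0 on a host that cannot emulate x2apic) can be sketched as a
small runnability check. This is only an illustrative model, not QEMU or
libvirt code: the feature sets, the helper names missing_features() and
cpu_argument(), and the "SandyBridge" model name are assumptions chosen
for the example. Only the machine names and the ",-x2apic" suffix come
from the thread.

```python
# Illustrative model only: the feature sets below are hypothetical
# placeholders, not QEMU's real CPU model tables.

# Default CPU features implied by each versioned machine type (assumed data).
MACHINE_CPU_DEFAULTS = {
    "pc-i440fx-1.7": frozenset({"sse2", "nx"}),
    "pc-i440fx-2.0": frozenset({"sse2", "nx", "x2apic"}),
}


def missing_features(machine, host_features):
    """Return the default CPU features of `machine` that the host lacks."""
    return sorted(MACHINE_CPU_DEFAULTS[machine] - set(host_features))


def cpu_argument(model, machine, host_features):
    """Build a -cpu value that disables whatever the host cannot provide.

    Uses the ",-feature" suffix form mentioned in the thread.
    """
    disabled = missing_features(machine, host_features)
    return model + "".join(",-" + name for name in disabled)


# A host whose kernel cannot emulate x2apic (hypothetical feature set).
host = {"sse2", "nx"}

print(missing_features("pc-i440fx-1.7", host))             # []
print(missing_features("pc-i440fx-2.0", host))             # ['x2apic']
print(cpu_argument("SandyBridge", "pc-i440fx-2.0", host))  # SandyBridge,-x2apic
```

In this model the only change between the two runs is the machine name,
yet the set of missing features changes, which is the coupling between
machine version and CPU defaults that the thread is arguing about.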
