qemu-devel

Re: [Qemu-devel] [RFC 0/4] Adding -cdrom, -hd[abcd] and -drive file=...


From: Markus Armbruster
Subject: Re: [Qemu-devel] [RFC 0/4] Adding -cdrom, -hd[abcd] and -drive file=... to Q35
Date: Wed, 20 Aug 2014 09:22:14 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux)

John Snow <address@hidden> writes:

> On 08/19/2014 02:08 PM, Markus Armbruster wrote:
>> John Snow <address@hidden> writes:
>>
>>> On 08/19/2014 04:05 AM, Markus Armbruster wrote:
>>>> John Snow <address@hidden> writes:
>>>>
>>>>> Currently, the drive definitions created by drive_new() when using
>>>>> the -drive file=...[,if=ide] or -cdrom or -hd[abcd] options are not
>>>>> picked up by the Q35 initialization routine.
>>>>>
>>>>> To fix this, we have to add hooks to search for these drives using
>>>>> something like pc_piix's ide_drive_get and then add them using
>>>>> something like pci_ide_create_devs.
>>>>
>>>> ide_drive_get() isn't pc_piix's, it's a helper function in the IDE core
>>>> which boards (not just pc_piix) use to find the if=ide drives.  It fills
>>>> in an array of DriveInfo.
>>>>
>>>> pci_ide_create_devs() is a helper function in the IDE PCI code which PCI
>>>> IDE controllers (not just piix3-ide) use to create IDE devices for an
>>>> array of DriveInfo.
>>>
>>> Yes. I meant to say pc_piix's /call to/ ide_drive_get. I would have to
>>> patch up the other boards if I changed this function! Only an RFC
>>> before I got too far down this path :]
>>>
>>>>> Where it gets slightly wonky is the fact that if=ide is specified
>>>>> to use two devices per bus, whereas AHCI does not utilize that
>>>>> same master/slave mechanic. Therefore, many places inside of
>>>>> blockdev.c where we add and define new drives use incorrect math
>>>>> for AHCI devices and try to place them on impossible buses.
>>>>> Notably -hdb and -hdd would become inaccessible.
>>>>
>>>> Yes.
>>>>
>>>>> To remedy this, I added a new interface type, IF_AHCI. Corresponding
>>>>> to this change, I modified the default machine properties for Q35
>>>>> to use this interface as a default.
>>>>>
>>>>> The changes appear to work well, but where I'd like some feedback
>>>>> is what should happen if people do something like:
>>>>>
>>>>> qemu -M q35 -drive if=ide,file=fedora.qcow2
>>>>>
>>>>> The code as presented here is not going to look for or attempt to
>>>>> connect IDE devices, because it is now looking for /AHCI/ devices.
>>>>>
>>>>> At worst, this may break a few existing scripts, but I actually think
>>>>> that since the if=ide,file=... shorthand never worked to begin with,
>>>>> the impact might actually be minimal.
>>>>>
>>>>> But since the legacy IDE interface of the ICH9 is as of yet unemulated,
>>>>> the if=ide drives don't have a reasonable place to go yet. I am also
>>>>> not sure what a reasonable way to handle people specifying BOTH
>>>>> if=ide and if=ahci drives would be.
>>>>
>>>> We've been through IF_AHCI before, more than once, but that was before
>>>> you got involved :)
>>>>
>>>> The problem at hand is that "-drive if=ide" and its sugared forms -hda,
>>>> -hdb, -cdrom, ... don't work with Q35.
>>>>
>>>> You provide a solution for the sugared forms, you leave the problem
>>>> unsolved for the unsugared form, and you add a new problem: -drive
>>>> if=ahci doesn't work with i440FX.  Progress, sort of.
>>>
>>> Adding a call to boards that support the AHCI device during their
>>> initialization should be easy enough, if we decide that "ide means
>>> ISA/PCI, ahci means the AHCI HBA." I could probably even write one
>>> generic routine between i440fx and q35 to do both IDE/AHCI.
>>>
>>> If we decide that IF_IDE and IF_AHCI mean different things, the
>>> problem of the unsugared form being unsolved depends on me (well, or
>>> someone) implementing the legacy IDE interface for Q35.
>>
>> Let me come back to this further down.

I didn't.  This time I will.

>>>> Let's take a step back, and recap previous discussion.  There are two
>>>> defensible points of view, in my opinion.
>>>>
>>>> One is that IDE and AHCI should be separate interface types, just like
>>>> IDE and SCSI are.
>>>>
>>>> Attempts to define an if=X drive with a board that doesn't provide a
>>>> controller for X should fail[*].  Only onboard controllers matter,
>>>> add-ons plugged in with -device don't.  An i440FX board provides only
>>>> IDE.  A Q35 board provides only AHCI, not IDE.  If we implement an
>>>> ich9-ahci legacy mode, and switch it on, then it provides only IDE, not
>>>> AHCI.  Or maybe both, depending on how we do it.
>>>
>>> I think I am leaning towards this viewpoint, but it depends on what
>>> "interface" means in QEMU. Currently, the number of units per bus is
>>> tied to the "interface" and clearly the AHCI SATA interface only
>>> supports one per bus, so semantically this makes sense.
>>
>> An index <-> (bus, unit) mapping doesn't make an interface!  Yes, it's
>> tightly coupled to the interface, but that became wrong way back when we
>> went beyond 8-bit SCSI HBAs, long before we added AHCI HBAs.
>
> "Interface" seems nebulous in QEMU. In the physical world, there
> definitely is a difference between IDE/EIDE/PATA and
> SATA/AHCI. Different physical and link layers -- the verbs stayed the
> same. What I am getting at is that I am not sure what an "interface"
> is /supposed/ to encompass here in QEMU. Underlying device type? If
> that's the case, then IDE/AHCI are definitely identical.

"Interface" in the sense of BlockInterfaceType isn't exactly a product
of careful design, weighing considerations from the physical world
against practical needs of the virtual world.  It's more like "for every
board, list ways to plug in block devices, then look at these lists out
of the corner of your eye to come up with an enumeration type that'll do
for all boards."

It's not a big user interface deal anyway.  It's just for -drive, and
-drive is getting less and less prominent.  If you want sugar, there's
-hda & friends.  If you want power, there's -device with -drive if=none
(soon to be superseded by -blockdev).

From a purely pragmatic point of view, I can accept a new
BlockInterfaceType member when a board has a way to plug in block
devices we want to offer with -drive that doesn't fit existing members.
I don't think that's the case for AHCI.

From a post hoc design point of view, the SCSI precedent of using the
same IF_SCSI for all the different SCSI HBAs makes me want to use the
same IF_IDE for all the different ATA HBAs.

>>> I think the real ICH9 AHCI device supports only fully AHCI or fully
>>> legacy, but the AHCI spec itself appears to allow you to run a
>>> mixed-mode device.
>>>
>>> I am not sure we have a usage case for mixed-mode, so enforcing
>>> either/or for the AHCI device makes sense for now, I think.
>>
>> I can't see a use for mixed mode, either.
>>
>>>> The other point of view is that IDE and AHCI are no more different than
>>>> the different kinds of SCSI HBAs.  This is certainly true from a qdev
>>>> point of view: just like SCSI devices can connect to any SCSI qbus,
>>>> regardless of the HBA providing it, so can IDE devices connect to any
>>>> IDE qbus, regardless of the controller providing it.
>>>
>>> Yes... Really the only difference are some mapping semantics. I don't
>>> think there's any /other/ reason I needed IF_AHCI, of course, I wasn't
>>> around for the previous discussions, so maybe there are other reasons
>>> I am not aware of.
>>
>> I've always been in the "we don't need or want if=ahci" camp :)
>>
>> The one argument for if=ahci I found convincing was a desire for Q35
>> with its ICH9 in legacy mode.  And that's just as easily done with a
>> machine option.  Personally, I find that more natural.
>
> So piix would always try to add to the PCI IDE bus, and Q35 would
> always try to add to the AHCI bus.

Yes.

> With a machine option for Q35, we could tell it to use the AHCI device
> in legacy mode and add to that device accordingly.

Yes.

> Do we care about the case for adding an AHCI device to piix? Do we
> change the behavior of what bus we use by a machine option (like a
> default_hba machine property and hba_type drive properties?), or by
> the presence of an AHCI device if someone adds one?

No.

-drive plugs into onboard devices.  If you want additional devices, use
-device.

There's one exception: pc machines automatically add N+1 lsi53c895a
HBAs, where N is the largest bus number used with -drive if=scsi,
explicitly or implicitly.  This HBA has become a pretty bad choice, and
it's getting worse.  I'd like us to kill the "create SCSI HBAs" feature
for new machine types.

> As it stands now, using if=ahci or if=ide is a pretty strong hint for
> what bus to look for and add to; though it is inflexible between
> command invocations that use different HBAs.

Yes, but

1. there is no board sporting both IDE and AHCI HBA at the same time
(like what we called "mixed mode" above), and

2. if there was, we still could route -drive to HBAs according to bus,
exactly like we do now with multiple homogeneous HBAs, albeit with a
more complex index <-> (bus, unit) mapping.

Besides, let me reiterate: why only IDE vs. AHCI?  Why not lsi53c895a
vs. virtio-scsi vs. megasas?  Where would that end?

>>>> There's a wrinkle: the mapping between index to (bus, unit).  This
>>>> mapping is ABI.  The current mapping makes sense for the first
>>>> generation of controllers: PATA (two devices per bus, thus
>>>> if_max_devs[IF_IDE] = 2), and 8-bit SCSI (seven per bus, thus
>>>> if_max_devs[IF_SCSI] = 7).
>>>>
>>>> The mapping is silly for newer SCSI HBAs.  Commit 622b520f tried to make
>>>> it less silly, but had to be reverted in 27d6bf4 because the silliness
>>>> was ABI.
>>>>
>>>> The mapping is also silly for ich9-ahci.  You side-step that silliness
>>>> only, by adding a new interface type for it.  But shouldn't we add a
>>>> number of SCSI interface types then, too?  Where does that end?
>>>>
>>>> Can we do better?  I think we can, by making this part of the ABI
>>>> board-specific.  The general form of the mapping remains
>>>>
>>>>       (bus, unit) = (index / N, index % N)
>>>>
>>>> but N now depends on board and interface type, not just the latter.
>>>>
>>>> If the board connects if=scsi to an lsi53c895a, then N = 7.
>>>>
>>>> If the board connects if=ide to a piix3-ide, then N = 2.
>>>>
>>>> If the board connects if=ide to an ich9-ahci, then N = 1.
>>>>
>>>> I trust you get the idea :)
>>>
>>> I suppose we could make it something like:
>>> if (HBA.max_units > 0) {
>>>    N := min(HBA.max_units, IF.max_units);
>>> } else {
>>>    N := IF.max_units;
>>> }
>>> (bus, unit) := (index / N, index % N);
>>>
>>> Which sets a default property for the interface but allows the device
>>> (not the board) to override. Does that make more sense? If we allow
>>> people to wire up an AHCI device to piix, we'll run back into the same
>>> problems of the bus/unit mappings unless we make this a device
>>> property.
>>
>> Yes, it is a property of the device (property not in the qdev sense).
>>
>> Why would HBA.max_units ever be greater than IF.max_units?
>>
>> If the answer is "only if somebody screwed up the HBA device model",
>> then the above can be simplified to just N = HBA.max_units.
>
> Oh, yeah. I did Web UI programming for a while during college. Not
> trusting plugin values is a side effect! I might pepper in an
> assertion to make it clear that you can't bump the number of units
> /up/, though. That's a smarter thing to do.

If we have a bus-specific limit, then asserting the device's limit is in
range makes lots of sense :)
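
To make that concrete, here is a minimal sketch in C of the mapping with
such an assertion.  All names (BusUnit, index_to_bus_unit, the two limit
parameters) are made up for illustration and are not existing QEMU code.
I'm assuming a bus-specific limit of 0 means "no limit".

```c
#include <assert.h>

/* Hypothetical type, for illustration only. */
typedef struct BusUnit {
    int bus;
    int unit;
} BusUnit;

/*
 * Map a -drive index to (bus, unit).
 * hba_max_units: units per bus of the HBA the board wires up (this is N).
 * if_max_units:  limit imposed by the bus type, 0 meaning "no limit"
 *                (an assumption of this sketch).
 */
BusUnit index_to_bus_unit(int index, int hba_max_units, int if_max_units)
{
    BusUnit bu;

    assert(index >= 0);
    assert(hba_max_units > 0);
    /* The device model may lower the limit, but never raise it. */
    assert(if_max_units == 0 || hba_max_units <= if_max_units);

    bu.bus = index / hba_max_units;
    bu.unit = index % hba_max_units;
    return bu;
}
```

With N = 2 (piix3-ide), index 3 lands on bus 1, unit 1.  With N = 1
(ich9-ahci), the same index lands on bus 3, unit 0.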

>>> I do feel like I'd rather just make it an interface property and have
>>> people specify which type of bus they want to wire it up to, but that
>>> does create a lot of disparity against the SCSI devices.
>>
>> What do you mean by "interface property"?
>
> A logical property of the interface specification; what QEMU and this
> patch does now.

I guess we can discuss that over patches.

I promised to talk about desugaring.  I feel right after discussing
index <-> (bus, unit) mapping is a good place.

There are three desugaring steps: 1. -hda, ... -> -drive, 2. -drive ->
DriveInfo + BlockDriverState, 3. DriveInfo -> device model.

1. -hda, ... -> -drive

   Straightforward macro expansion, done by main() with help from
   drive_add().

   Example: -hda FNAME -> -drive index=0,file=FNAME,media=disk
   Note: no if=..., this means "use board default".

   Example: -sd FNAME -> -drive if=sd,file=FNAME
   Note: no index=..., this means "use next available".

2. -drive -> DriveInfo + BlockDriverState

   Done by drive_new().  Takes care of supplying defaults, such as
   board's default interface type, mapping index to (bus, unit), picking
   next available bus unit, ...

3. DriveInfo -> device model

   Done by board code, separately for each interface type the board
   recognizes.  Picks a device model, where to plug and how to configure
   it, all according to DriveInfo.
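
Step 1 can be sketched roughly like this for -hda ... -hdd.  This is not
the actual code in main() or drive_add(); desugar_hd is a hypothetical
helper that only shows the shape of the expansion.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Expand -hdX FNAME into the equivalent -drive argument string.
 * No if=... is emitted, so step 2 falls back to the board's default
 * interface type.  Hypothetical helper, not QEMU code.  File names
 * containing commas would need escaping, which this sketch ignores.
 * Returns what snprintf() returns.
 */
int desugar_hd(char letter, const char *fname, char *buf, size_t len)
{
    assert(letter >= 'a' && letter <= 'd');
    assert(fname != NULL && buf != NULL);

    /* -hda is index 0, -hdb is index 1, and so on. */
    return snprintf(buf, len, "index=%d,file=%s,media=disk",
                    letter - 'a', fname);
}
```

For example, desugar_hd('b', "disk.img", ...) produces
index=1,file=disk.img,media=disk, which step 2 then maps to a (bus, unit)
pair.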

Making the index <-> (bus, unit) mapping depend on the board hopefully
affects just step 2: have boards export suitable parameters for the
mapping, just like they export their default interface type.

Obvious idea: move if_max_devs[] into struct QEMUMachine.  You may want to
use a pointer instead of an array there, so you can avoid duplication.
Maybe have NULL mean the traditional value of if_max_devs[]; then only new
boards have to bother setting up the pointer.
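
A rough sketch of the pointer-with-NULL-default idea, assuming a per-board
table of units per bus.  SketchMachine and the SK_IF_* constants are
hypothetical stand-ins for QEMUMachine and BlockInterfaceType, cut down to
what the example needs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for BlockInterfaceType. */
enum { SK_IF_IDE, SK_IF_SCSI, SK_IF_COUNT };

/* Traditional values: two units per IDE bus, seven per SCSI bus. */
static const int traditional_max_devs[SK_IF_COUNT] = {
    [SK_IF_IDE] = 2,
    [SK_IF_SCSI] = 7,
};

/* A board that connects if=ide to an AHCI HBA: one unit per bus. */
static const int ahci_board_max_devs[SK_IF_COUNT] = {
    [SK_IF_IDE] = 1,
    [SK_IF_SCSI] = 7,
};

/* Hypothetical stand-in for struct QEMUMachine. */
typedef struct SketchMachine {
    const char *name;
    const int *max_devs;    /* NULL means "use the traditional table" */
} SketchMachine;

/* Units per bus for the given interface type on the given board. */
int machine_max_devs(const SketchMachine *m, int type)
{
    assert(m != NULL);
    assert(type >= 0 && type < SK_IF_COUNT);
    return (m->max_devs ? m->max_devs : traditional_max_devs)[type];
}
```

Existing boards leave the pointer NULL and keep their ABI.  Boards that
need different values can share one table, which is the point of using a
pointer rather than an embedded array.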

>>>> [*] Currently, they're silently ignored with most boards for most X, but
>>>> I regard that as implementation defect.
>>>
>>> Yes. Is there a bool in the drive info array that we can set to say
>>> "this drive has been added as a device" and check for any that went
>>> unset?
>>
>> Not yet :)
>>
>>>         I can add one and a routine to check for it, which may help
>>> flush out more of the weird legacy sugar option bugs.
>>
>> We do something like that for -netdev:
>>
>>      $ qemu -nodefaults -display none -netdev user,id=foo
>>      Warning: netdev foo has no peer
>>
>> and -net:
>>
>>      $ qemu -nodefaults -display none -net user,id=foo
>>      Warning: vlan 0 with no nics
>>
>> I think as long as we leave picking up configuration to boards, having
>> the boards mark the pieces they pick up is the best we can do.
>>
>> An alternative to leaving it to boards is making the boards define
>> callbacks that get fed configuration.  But that's more surgery.
>>
>> There's more than just -netdev, -net and -drive, though.  Many command
>> line options to configure devices also merely create a piece of
>> configuration for boards to pick up:
>>
>> * Character devices (-serial, -parallel) end up in serial_hds[],
>>    parallel_hds[].
>>
>> * Graphics devices (-vga) end up in vga_interface_type.
>>
>> These all need the same "did the board pick it up?" check as -drive.
>>
>> Some old options have been converted to work independent of boards:
>>
>> * Virtio consoles (-virtioconsole) in commit 98b1925.
>>
>> * Sound devices (-soundhw) in commit b3e6d59.
>>
>> Newer convenience options should always work this way.  -watchdog
>> does.
>>
>> I may have missed options.
>
> That is indeed more than a few. I probably need a little bit more
> exposure to different boards and how configuration works before I
> attack the problem as a whole. I may just fix -drive for now ...

I'm not in the habit of rejecting incremental progress we can have now
in favor of a perfect solution we can't have now :)

>>> (Sugar attracts bugs. heh-heh-heh...)
>>
>> Indeed!
>>
>
> Your position is pretty clear. I will give other people the time to
> chime in before I do too much more work on this, though I will
> probably go ahead and add the unused drive check now, since we'll want
> that no matter which path we take.

Makes sense.  Have fun!
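
For what it's worth, the unused drive check could be about as simple as
the sketch below.  SketchDrive is a hypothetical, pared-down stand-in for
DriveInfo, and both function names are made up.  A real check would
presumably have to exempt if=none drives, since those are meant to be
picked up by -device rather than by the board.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, pared-down stand-in for DriveInfo. */
typedef struct SketchDrive {
    const char *id;
    bool claimed;   /* set once board code creates a device for it */
} SketchDrive;

/* Board code would call this for every drive it picks up. */
void drive_mark_claimed(SketchDrive *d)
{
    assert(d != NULL);
    d->claimed = true;
}

/* After board initialization: count the drives nobody picked up. */
size_t drive_count_orphans(const SketchDrive *drives, size_t n)
{
    size_t i, orphans = 0;

    assert(drives != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!drives[i].claimed) {
            orphans++;
        }
    }
    return orphans;
}
```

A nonzero result after board initialization would be the cue to warn, much
like the "has no peer" warning for -netdev.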


