qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 1/1] block: add missed BDRV_O_NOCACHE when block


From: Denis V. Lunev
Subject: Re: [Qemu-devel] [PATCH 1/1] block: add missed BDRV_O_NOCACHE when block device is opened without file
Date: Mon, 30 Jan 2017 11:31:16 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.7.0

On 01/28/2017 07:23 PM, Max Reitz wrote:
> On 25.01.2017 20:44, Denis V. Lunev wrote:
>> On 01/25/2017 08:59 PM, Max Reitz wrote:
>>> [CC-ing John]
>>>
>>> On 25.01.2017 17:42, Denis V. Lunev wrote:
>>>> Technically there is a problem when the guest DVD is created by libvirt
>>>> with AIO mode 'native' on Linux. Current QEMU is unable to start the
>>>> domain configured as follows:
>>>>     <disk type='file' device='cdrom'>
>>>>       <driver name='qemu' type='raw' cache='none' io='native'/>
>>>>       <target dev='sdb' bus='scsi'/>
>>>>       <readonly/>
>>>>     </disk>
>>>> The problem comes from the combination of 'cache' and 'io' options.
>>>>
>>>> 'io' option is common option and it is removed from block driver
>>>> specific options. 'cache' originally is not. The patch makes 'cache'
>>>> option common. This works fine as long as cdrom media insertion
>>>> later on.
>>>>
>>>> Signed-off-by: Denis V. Lunev <address@hidden>
>>>> CC: Kevin Wolf <address@hidden>
>>>> CC: Max Reitz <address@hidden>
>>>> ---
>>>>  blockdev.c | 18 +++++++++++++++---
>>>>  1 file changed, 15 insertions(+), 3 deletions(-)
>>> There was a Red Hat BZ for this:
>>>
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1342999
>>>
>>> There is now a corresponding BZ for libvirt:
>>>
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1377321
>>>
>>> The gist is that it was determined to be a problem with libvirt.
>>>
>>> RHEL has a downstream commit to work around this issue by ignoring the
>>> cache mode set for empty CD-ROMs -- but that is only because that was
>>> simpler than fixing libvirt.
>>>
>>> (Note that it's ignoring the cache mode instead of actually evaluating it.)
>>>
>> This is what I have exactly started from:
>> http://ftp.redhat.com/pub/redhat/linux/enterprise/7Server/en/RHEV/SRPMS/qemu-kvm-rhev-2.6.0-27.el7.src.rpm
>>
>> This package starts VM well for the above mentioned configuration:
>>
>>     <disk type='file' device='cdrom'>
>>       <driver name='qemu' type='raw' cache='none' io='native'/>
>>       <target dev='sdb' bus='scsi'/>
>>       <readonly/>
>>     </disk>
>>
>> but the problem comes later at 'change' moment. It reports
>>
>> 'Details: internal error: unable to execute QEMU command 'change':
>> aio=native was specified, but it requires cache.direct=on, which
>> was not specified.)'
>>
>>
>> Thus partial solution implemented by the RedHat is really
>> partial and does not work at the second stage.
> Yes, because it is not supposed to work for that. The only thing the
> downstream patch does is ignore the cache mode so you can still start
> the VM even if libvirt decides to specify a cache parameter for an empty
> drive (which it should not do).
>
>>                                                I have started from
>>
>> diff --git a/blockdev.c b/blockdev.c
>> index d280dc4..e2c9053 100644   
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -608,6 +608,13 @@ static BlockBackend *blockdev_init(const char
>> *file, QDict *bs_opts,
>>              goto early_err;
>>          }
>>  
>> +        if (qdict_haskey(bs_opts, BDRV_OPT_CACHE_DIRECT)) {
>> +            bdrv_flags |= BDRV_O_NOCACHE;
>> +        }
>> +        if (qdict_haskey(bs_opts, BDRV_OPT_CACHE_NO_FLUSH)) {
>> +            bdrv_flags |= BDRV_O_NO_FLUSH;
>> +        }
>> +
>>          blk_rs = blk_get_root_state(blk);
>>          blk_rs->open_flags    = bdrv_flags;
>>          blk_rs->read_only     = !(bdrv_flags & BDRV_O_RDWR);
>>
>> The problem for us is that we have such previously valid configurations
>> in the field.
>>
>>>> May be this has already discussed, but still. AIO=native for CDROM without
>>>> media seems important case.
>>> First, I personally don't find aio=native very important for CD-ROM
>>> drives. Yes, we shouldn't make IDE CD-ROM slower than necessary, but I
>>> don't think aio=native will make the difference between "Slow like
>>> CD-ROMs are in reality" and "As fast as virtio".
>> the problem for me is that each clone() call is costly and counts. That
>> is why we are trying to avoid it whenever possible. That is why 'native'
>> mode is MUCH better. Also it would be very nice not to use cached
>> IO, which is very good for memory overcommit situations.
>>
>> Here I am fighting not with the performance, which does not make
>> any real sense, but with a memory footprint.
> Fair enough. Still, specifying a cache mode for an empty drive simply
> doesn't make sense.
>
> Have you considered inserting a null-co:// file at startup as a
> workaround? If you use the old 'change' (or blockdev-change-medium)
> command, it should retain AIO and cache mode.
ok. Thanks for the suggestion.


>
>>> Second, all this patch does is revert some changes done by commit
>>> 91a097e7478940483e76d52217f05bc05b98d5a5, which was very deliberate.
>>>
>>> Third, you may then be asking for the recommended way to put an
>>> aio=native medium into a CD-ROM drive. Good thing you ask, because we
>>> have a way that we want to recommend but can't because it's still
>>> considered experimental:
>>>
>>> The BDS is added using blockdev-add, with all of the appropriate caching
>>> and aio options. Then it's inserted into the drive using the
>>> x-blockdev-insert-medium command, and the drive is closed using
>>> blockdev-close-tray.
>> the problem, again, is that with x-blockdev-insert-medium I can not
>> deal with block driver options, or may be I am missing something.
> x-blockdev-insert-medium just inserts a BDS as a medium. The BDS is
> added via blockdev-add which allows you to set all of these options.
>
>>> There are a couple of issues with this: First, blockdev-add and
>>> x-blockdev-insert-medium are still experimental. The good news are that
>>> I think the reason why the latter is experimental has actually
>>> disappeared and we can just drop the x- by now. The
>>> not-so-good-but-not-so-bad-either news are that we plan to get
>>> blockdev-add declared non-experimental for 2.9. Let's see how it goes.
>> Does this mean that x-blockdev-del would also lose x- prefix?
> As far as I'm aware, yes.

that sounds good!


>>> The other issue is that of course it's more cumbersome to use than a
>>> simple 'change' via HMP. I argue that if someone communicates with a VM
>>> by hand, they either have to deal with this added complexity or they
>>> cannot use aio=native for CD-ROM drives -- which, as I said, I don't
>>> think is too bad.
>> this is how 'virsh change-media' works at the moment, at least with
>> latest downstreams.
>> That is why this is not that clear.
> Well, that can be easily changed, the question is how fast that will be.
> You would still need to make libvirt aware of what cache mode to use for
> the new medium, and that may require deeper changes.
>
> (I supposed libvirt only stores a cache mode per drive while it should
> actually store a cache mode per image.)

fair enough. libvirt could do that.
Thank you for these clarifications.

Den




reply via email to

[Prev in Thread] Current Thread [Next in Thread]