[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-block] [Qemu-devel] [PATCH] qcow2: allocate cluster_cache/clus
From: |
Alexey Kardashevskiy |
Subject: |
Re: [Qemu-block] [Qemu-devel] [PATCH] qcow2: allocate cluster_cache/cluster_data on demand |
Date: |
Sat, 19 Aug 2017 12:46:11 +1000 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1 |
On 19/08/17 01:18, Eric Blake wrote:
> On 08/18/2017 08:31 AM, Stefan Hajnoczi wrote:
>> Most qcow2 files are uncompressed so it is wasteful to allocate (32 + 1)
>> * cluster_size + 512 bytes upfront. Allocate s->cluster_cache and
>> s->cluster_data when the first read operation is performance on a
>> compressed cluster.
>>
>> The buffers are freed in .bdrv_close(). .bdrv_open() no longer has any
>> code paths that can allocate these buffers, so remove the free functions
>> in the error code path.
>>
>> Reported-by: Alexey Kardashevskiy <address@hidden>
>> Cc: Kevin Wolf <address@hidden>
>> Signed-off-by: Stefan Hajnoczi <address@hidden>
>> ---
>> Alexey: Does this improve your memory profiling results?
>
> Is this a regression from earlier versions?
Hm, I have not thought about this.
So. I did bisect and this started happening from
9a4c0e220d8a4f82b5665d0ee95ef94d8e1509d5
"hw/virtio-pci: fix virtio behaviour"
Before that, the very same command line would take less than 1GB of
resident memory. That thing basically enforces virtio-1.0 for QEMU <=2.6
which means that upstream with "-machine pseries-2.6" works fine (less than
1GB), "-machine pseries-2.7" does not (close to 7GB, sometime even 9GB).
Then I tried bisecting again, with
"scsi=off,disable-modern=off,disable-legacy=on" on my 150 virtio-block
devices, started from
e266d421490e0 "virtio-pci: add flags to enable/disable legacy/modern" (it
added the disable-modern switch) which uses 2GB of memory.
I ended up with ada434cd0b44 "virtio-pci: implement cfg capability".
Then I removed proxy->modern_as on v2.10.0-rc3 (see below) and got 1.5GB of
used memory (yay!)
I do not really know how to reinterpret all of this, do you?
Note: 1GB..9GB numbers from below are the peak values from valgrind's
massif. This is pretty much resident memory used by QEMU process. In my
testing I did not enable KVM and I did not start the guest (i.e. used -S).
150 virtio-block devices, 2GB RAM for the guest.
Also, while bisecting, I only paid attention if it is 1..2GB or 6..9GB -
all tests did fit these 2 ranges, for any given sha1 the amount of memory
would be stable but among "good" commits it could change between 1GB and 2GB.
diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 5d14bd6..7ad447a 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1783,6 +1783,7 @@ static void virtio_pci_realize(PCIDevice *pci_dev,
Error **errp)
/* PCI BAR regions must be powers of 2 */
pow2ceil(proxy->notify.offset + proxy->notify.size));
+#if 0
memory_region_init_alias(&proxy->modern_cfg,
OBJECT(proxy),
"virtio-pci-cfg",
@@ -1791,7 +1792,7 @@ static void virtio_pci_realize(PCIDevice *pci_dev,
Error **errp)
memory_region_size(&proxy->modern_bar));
address_space_init(&proxy->modern_as, &proxy->modern_cfg,
"virtio-pci-cfg-as");
-
+#endif
if (proxy->disable_legacy == ON_OFF_AUTO_AUTO) {
proxy->disable_legacy = pcie_port ? ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
}
@@ -1860,10 +1861,10 @@ static void virtio_pci_realize(PCIDevice *pci_dev,
Error **errp)
static void virtio_pci_exit(PCIDevice *pci_dev)
{
- VirtIOPCIProxy *proxy = VIRTIO_PCI(pci_dev);
+ //VirtIOPCIProxy *proxy = VIRTIO_PCI(pci_dev);
msix_uninit_exclusive_bar(pci_dev);
- address_space_destroy(&proxy->modern_as);
+ //address_space_destroy(&proxy->modern_as);
}
static void virtio_pci_reset(DeviceState *qdev)
> Likely, it is NOT -rc4
> material, and thus can wait for 2.11; but it should be fine for -stable
> as part of 2.10.1 down the road.
>
>> +++ b/block/qcow2-cluster.c
>> @@ -1516,6 +1516,23 @@ int qcow2_decompress_cluster(BlockDriverState *bs,
>> uint64_t cluster_offset)
>> nb_csectors = ((cluster_offset >> s->csize_shift) & s->csize_mask)
>> + 1;
>> sector_offset = coffset & 511;
>> csize = nb_csectors * 512 - sector_offset;
>> +
>> + /* Allocate buffers on first decompress operation, most images are
>> + * uncompressed and the memory overhead can be avoided. The buffers
>> + * are freed in .bdrv_close().
>> + */
>> + if (!s->cluster_data) {
>> + /* one more sector for decompressed data alignment */
>> + s->cluster_data = qemu_try_blockalign(bs->file->bs,
>> + QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size + 512);
>> + if (!s->cluster_data) {
>> + return -EIO;
>
> Is -ENOMEM any better than -EIO here?
>
--
Alexey
signature.asc
Description: OpenPGP digital signature
Re: [Qemu-block] [Qemu-devel] [PATCH] qcow2: allocate cluster_cache/cluster_data on demand, Stefan Hajnoczi, 2017/08/21
Re: [Qemu-block] [PATCH] qcow2: allocate cluster_cache/cluster_data on demand, Alexey Kardashevskiy, 2017/08/19