|
From: | Eric Blake |
Subject: | Re: [Qemu-devel] [PATCH 2/2] qcow2: Avoid memory over-allocation on compressed images |
Date: | Wed, 21 Feb 2018 12:32:23 -0600 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 |
On 02/21/2018 11:39 AM, Kevin Wolf wrote:
See my commit message comment - we have other spots in the code base that blindly g_malloc(2 * s->cluster_size).Though is that a reason to do the same in new code or to phase out such allocations whenever you touch them?
Touché.
And I intended (but sent the email without amending my commit) to use g_malloc(). But as Berto has convinced me that an externally produced image can convince us to read up to 4M (even though we don't need that much to decompress), I suppose that the try_ variant plus checking is reasonable (and care in NULL'ing out if one but not both allocations succeed).Sounds good. Another thought I had is whether we should do per-request allocation for compressed clusters, too, instead of having per-BDS buffers.
The only benefit of a per-BDS buffer is that we cache things - multiple sub-cluster reads in a row all from the same compressed cluster benefit from decompressing only once. The drawbacks of a per-BDS buffer: we can't do things in parallel (everything else in qcow2 drops the lock around bdrv_co_pread[v]), so the initial read prevents anything else in the qcow2 layer from progressing.
I also wonder - since we ARE allowing multiple parallel readers in other parts of qcow2 (without a patch, decompression is not in this boat, but decryption and even bounce buffers due to lower-layer alignment constraints are), what sort of mechanisms do we have for using a pool of reusable buffers, rather than having each cluster access that requires a buffer malloc and free the buffer on a per-access basis? I don't know how much time the malloc/free per-transaction overhead adds, or if it is already much smaller than the actual I/O time.
But note that while reusable buffers from a pool would cut down on the per-I/O malloc/free overhead if we switch decompression away from per-BDS buffer, it would still not solve the fact that we only get the caching ability where multiple sub-cluster requests from the same compressed cluster require only one decompression, since that's only possible on a per-BDS caching level.
-- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
[Prev in Thread] | Current Thread | [Next in Thread] |