Re: [Qemu-devel] [PATCH v12 10/13] Add xbzrle_encode_buffer and xbzrle

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v12 10/13] Add xbzrle_encode_buffer and xbzrle_d

From:	Orit Wasserman
Subject:	Re: [Qemu-devel] [PATCH v12 10/13] Add xbzrle_encode_buffer and xbzrle_decode_buffer functions
Date:	Tue, 26 Jun 2012 15:23:00 +0300
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:12.0) Gecko/20120430 Thunderbird/12.0.1

On 06/19/2012 09:08 PM, Eric Blake wrote:
> On 06/19/2012 09:43 AM, Orit Wasserman wrote:
>> Signed-off-by: Benoit Hudzia <address@hidden>
>> Signed-off-by: Petter Svard <address@hidden>
>> Signed-off-by: Aidan Shribman <address@hidden>
>> Signed-off-by: Orit Wasserman <address@hidden>
>> ---
>>  savevm.c |   91 
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 files changed, 91 insertions(+), 0 deletions(-)
>>
> 
>> +int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
>> +                         uint8_t *dst, int dlen)
>> +{
>> +    uint32_t zrun_len = 0, nzrun_len = 0;
>> +    int d = 0 , i = 0;
>> +    uint8_t *nzrun_start = NULL;
>> +
>> +    while (i < slen) {
>> +        /* overflow */
>> +        if (d + 2 > dlen) {
>> +            return -1;
>> +        }
>> +
>> +        while (!(old_buf[i] ^ new_buf[i]) && ++i <= slen) {
>> +            zrun_len++;
>> +        }
> 
> Can we vectorize this to compare a word at a time for improved speed?
> It would be really cool if glibc had an interface like memcmp(), but
> which returned the offset of the first difference instead of a
> comparison of the overall memory regions.  But even without such an
> interface, pseudocode such as:
> 
> while (i < slen) {
>   check for overflow
>   if (not aligned)
>     up to sizeof(long) byte comparisons
>   while (long comparisons && not end)
>     increment i by sizeof(long)
>   if (end of buffer) {
>     either buffer unchanged, or we have trailing zrun
>     break;
>   }
>   // found a word that differed
>   do byte-wise comparisons on that word
>   then back to word-size xor, checking for no zero bytes in the xor (or
> anywhere that a zero byte occurs in isolation, it is more compact to
> pass that lone byte as part of the xor encoding instead of breaking out
> to a zrun)
> }
I will add it.
> 
>> +
>> +        zrun_len = 0;
>> +        nzrun_start = new_buf + i;
>> +        while ((old_buf[i] ^ new_buf[i]) != 0 && ++i <= slen) {
>> +            nzrun_len++;
>> +        }
> 
> Note that the encoding for a single unchanged byte with bytes changed on
> both sides is more compact to pass the unchanged byte as part of the
> nzrun, instead of breaking out into a zrun and starting a second nzrun.
>  If I'm analyzing things correctly, that's also true for a run of two
> bytes, and it's not until we get to a run of three or more unchanged
> bytes where we start to get compression (except for the corner case of
> ending on a 1- or 2-byte zrun).  Should we be altering this algorithm to
> take advantage of this?

I used to have this optimization but didn't see any noticeable performance 
improvement.
It made the code much more complicated and less readable so I chose to remove 
it.
I think adding SSE support will have better chance in improving the compression 
performance.

> 
>> +int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen)
>> +{
>> +    int i = 0, d = 0;
>> +    uint32_t count = 0;
>> +
>> +    while (i < slen) {
>> +
>> +        /* zrun */
>> +        i += uleb128_decode_small(src + i, &count);
>> +        d += count;
>> +
>> +        /* overflow */
>> +        g_assert(d <= dlen);
> 
> Again, is g_assert() appropriate for parsing user-supplied input, or are
> there more appropriate ways for gracefully failing an incoming migration
> attempt from corrupt data?
I will change it to return -1.
> 
>> +
>> +        /* completed decoding */
>> +        if (i == slen - 1) {
>> +            return d;
>> +        }
>> +
>> +        /* nzrun */
>> +        i += uleb128_decode_small(src + i, &count);
>> +
>> +        g_assert(d + count <= dlen);
>> +
>> +        memcpy(dst + d , src + i, count);
> 
> Am I correct that you are using XOR to compute the locations of nzrun,
> but then transferring the original data (and not the xor data) over the
> wire?  This was not made clear in the doc patch of 3/13.  I would
> suggest adding an example to the docs of a 4k page that consists only of
> a short string at the beginning of the page (and NUL bytes thereafter),
> then show both the before and after version of that page to demonstrate
> sparse memory changes, as well as the XBZRLE encoding of that page.
> 
ok.

Orit

[Prev in Thread]

Current Thread

[Next in Thread]

[Qemu-devel] [PATCH v12 11/13] Add XBZRLE to ram_save_block and ram_save_live, (continued)
- [Qemu-devel] [PATCH v12 11/13] Add XBZRLE to ram_save_block and ram_save_live, Orit Wasserman, 2012/06/19
  - Re: [Qemu-devel] [PATCH v12 11/13] Add XBZRLE to ram_save_block and ram_save_live, Eric Blake, 2012/06/19
- [Qemu-devel] [PATCH v12 13/13] Add XBZRLE statistics, Orit Wasserman, 2012/06/19
  - Re: [Qemu-devel] [PATCH v12 13/13] Add XBZRLE statistics, Eric Blake, 2012/06/19
- [Qemu-devel] [PATCH v12 12/13] Add set_cachesize command, Orit Wasserman, 2012/06/19
  - Re: [Qemu-devel] [PATCH v12 12/13] Add set_cachesize command, Eric Blake, 2012/06/19
- [Qemu-devel] [PATCH v12 09/13] Add migration_end function, Orit Wasserman, 2012/06/19
- [Qemu-devel] [PATCH v12 08/13] Change ram_save_block to return -1 if there are no more changes, Orit Wasserman, 2012/06/19
- [Qemu-devel] [PATCH v12 10/13] Add xbzrle_encode_buffer and xbzrle_decode_buffer functions, Orit Wasserman, 2012/06/19
  - Re: [Qemu-devel] [PATCH v12 10/13] Add xbzrle_encode_buffer and xbzrle_decode_buffer functions, Eric Blake, 2012/06/19
    - Re: [Qemu-devel] [PATCH v12 10/13] Add xbzrle_encode_buffer and xbzrle_decode_buffer functions, Orit Wasserman <=

Prev by Date: Re: [Qemu-devel] [PATCH v4 0/7] file descriptor passing using pass-fd
Next by Date: [Qemu-devel] [PATCH v2] ahci: add -drive support
Previous by thread: Re: [Qemu-devel] [PATCH v12 10/13] Add xbzrle_encode_buffer and xbzrle_decode_buffer functions
Next by thread: Re: [Qemu-devel] [PATCH qom-next v2 33/33] target-arm: Remove ARM_CPUID_* macros
Index(es):
- Date
- Thread