qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 4/9] hbitmap: store / restore


From: John Snow
Subject: Re: [Qemu-devel] [PATCH 4/9] hbitmap: store / restore
Date: Tue, 13 Jan 2015 12:08:40 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0



On 01/13/2015 07:59 AM, Vladimir Sementsov-Ogievskiy wrote:

Best regards,
Vladimir

On 09.01.2015 00:21, John Snow wrote:


On 12/11/2014 09:17 AM, Vladimir Sementsov-Ogievskiy wrote:
Functions to store / restore HBitmap. HBitmap should be saved to linear
bitmap format independently of endianess.

Because of restoring in several steps, every step writes only the last
level of the bitmap. All other levels are restored by
hbitmap_restore_finish as a last step of restoring. So, HBitmap is
inconsistent while restoring.
.....

I guess by nature of how bitmap migration has been implemented, we
cannot help but restore parts of the bitmap piece by piece, which
requires us to force the bitmap into an inconsistent state.

Yuck.

It might be "nice" to turn on a disable bit inside the hbitmap that
disallows its use until it is made consistent again, but I don't know
what the performance hit of the extra conditionals everywhere would be
like.


Another approach is to restore the bitmap after each piece set. It is a
bit slower of course but the bitmap will be consistent after
hbitmap_restore_data()


Yeah, that's not great either. The approach you have already taken is probably the best.

+/**
+ * hbitmap_restore_finish
+ * @hb: HBitmap to operate on.
+ *
+ * Repair HBitmap after calling hbitmap_restore_data. Actuall all
HBitmap
+ * layers are restore here.
+ */
+void hbitmap_restore_finish(HBitmap *hb);
+
+/**
   * hbitmap_free:
   * @hb: HBitmap to operate on.
   *

These are just biased opinions:

- It might be nice to name the store/restore functions "serialize" and
"deserialize," to describe exactly what they are doing.

- I might refer to "restore_finish" as "post_load" or "post_restore"
or something similar to mimic how device migration functions are
named. I think hbitmap_restore_data_finalize would also be fine; the
key part here is clearly naming it relative to "restore_data."


Hmm. Ok, what about the following set:
     hbitmap_serialize()
     hbitmap_deserialize_part()
     hbitmap_deserialize_finish()


Looks good to me!

diff --git a/util/hbitmap.c b/util/hbitmap.c
index 8aa7406..ac0323f 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -362,6 +362,90 @@ bool hbitmap_get(const HBitmap *hb, uint64_t item)
      return (hb->levels[HBITMAP_LEVELS - 1][pos >> BITS_PER_LEVEL] &
bit) != 0;
  }

+uint64_t hbitmap_data_size(const HBitmap *hb, uint64_t count)
+{
+    uint64_t size;
+
+    if (count == 0) {
+        return 0;
+    }
+
+    size = (((count - 1) >> hb->granularity) >> BITS_PER_LEVEL) + 1;
+
+    return size * sizeof(unsigned long);
+}
+

This seems flawed to me: number of bits without an offset can't be
mapped to a number of real bytes, because "two bits" may or may not
cross a granularity boundary, depending on *WHERE* you start counting.

e.g.

granularity = 1 (i.e. 2^1 = 2 virtual bits per 1 real bit)
virtual: 001100
real:     0 1 0

The amount of space required to hold "two bits" here could be as
little as one bit, if the offset is k={0,2,4} but it could be as much
as two bits if the offset is k={1,3}.

You may never use the function in this way, but mapping virtual bits
to an implementation byte-size seems like it is inviting an
inconsistency.
I dislike this function too.. But unfortunately we need the size in
bytes used for serialization.

Hmm. Ok, without loss of generality, let offset be less than
granularity. The precise formula should look like:

size = (((offset+count-1) >> hb->granularity) >> BITS_PER_LEVEL);

So,
size = ((((1 << hb->granularity) + count - 2) >> hb->granularity) >>
BITS_PER_LEVEL) + 1;
would be enough in any case. Ok?


I think so, as long as when you deserialize the object it does so correctly, regardless of whether or not you start on an even multiple of the granularity.




+void hbitmap_store_data(const HBitmap *hb, uint8_t *buf,
+                        uint64_t start, uint64_t count)
+{
+    uint64_t last = start + count - 1;
+    unsigned long *out = (unsigned long *)buf;
+
+    if (count == 0) {
+        return;
+    }
+
+    start = (start >> hb->granularity) >> BITS_PER_LEVEL;
+    last = (last >> hb->granularity) >> BITS_PER_LEVEL;
+    count = last - start + 1;
+
+#ifdef __BIG_ENDIAN_BITFIELD
+    for (i = start; i <= last; ++i) {
+        unsigned long el = hb->levels[HBITMAP_LEVELS - 1][i];
+        out[i] = (BITS_PER_LONG == 32 ? cpu_to_le32(el) :
cpu_to_le64(el));
+    }
+#else
+    memcpy(out, &hb->levels[HBITMAP_LEVELS - 1][start],
+           count * sizeof(unsigned long));
+#endif
+}
+

I suppose the ifdefs are trying to take an optimization by using
memcpy if at all possible, without a conversion.

Why are the conversions to little endian, though? Shouldn't we be
serializing to a Big Endian format?
As Paolo already explained it is for portability. LE bitmap
representation is independent of size of bitmap array elements, which
may be 32bit/64bit, or whatever else with LE format.

Yes, that makes sense. :)


+void hbitmap_restore_data(HBitmap *hb, uint8_t *buf,
+                          uint64_t start, uint64_t count)
+{
+    uint64_t last = start + count - 1;
+    unsigned long *in = (unsigned long *)buf;
+
+    if (count == 0) {
+        return;
+    }
+
+    start = (start >> hb->granularity) >> BITS_PER_LEVEL;
+    last = (last >> hb->granularity) >> BITS_PER_LEVEL;
+    count = last - start + 1;
+
+#ifdef __BIG_ENDIAN_BITFIELD
+    for (i = start; i <= last; ++i) {
+        hb->levels[HBITMAP_LEVELS - 1][i] =
+            (BITS_PER_LONG == 32 ? be32_to_cpu(in[i]) :
be64_to_cpu(in[i]));
+    }
+#else
+    memcpy(&hb->levels[HBITMAP_LEVELS - 1][start], in,
+           count * sizeof(unsigned long));
+#endif
+}
+

...? We're storing as LE but restoring from BE? I'm confused.
Oh, it's a mistake. Thanks. I've missed it when testing because of
{#ifdef __BIG_ENDIAN_BITFIELD } branch was never compiled.

Glad I am not crazy :)


I'm also not clear on the __BIG_ENDIAN_BITFIELD macro. Why do we want
to pack differently based on how C-bitfields are packed by the
compiler? I don't think that has any impact on how longs are stored
(always in the host native format.)

I would rather see the serialize/deserialize always convert to and
from BE explicitly with cpu_to_be[32|64] and be[32|64]_to_cpu. I think
trying to optimize the pack/unpack in the no-op condition with a #
define and a memcpy is inviting portability problems.
Ok. I'll leave only branch with for(...) in v2.

You can correct me if I am mistaken, but I seem to recall determining endianness in a platform, compiler and libc independent way is very difficult during the pre-processor stage.

If there _IS_ some way to do this, the optimization is fine, but I do not think it will work as written.

The un-optimized version is the safest.


+void hbitmap_restore_finish(HBitmap *bitmap)
+{
+    int64_t i, size;
+    int lev, j;
+
+    /* restore levels starting from penultimate to zero level, assuming
+     * that the last one is ok */
+    size = MAX((bitmap->size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL,
1);
+    for (lev = HBITMAP_LEVELS - 1; lev-- > 0; ) {
+        size = MAX((size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1);
+        for (i = 0; i < size; ++i) {
+            bitmap->levels[lev][i] = 0;
+            for (j = 0; j < BITS_PER_LONG; ++j) {
+                if (bitmap->levels[lev+1][(i << BITS_PER_LEVEL) + j]) {
+                    bitmap->levels[lev][i] |=  (1 << j);
+                }
+            }

Just a musing: Should we cache the size of each level? I know we can
keep recalculating it, but... why bother? We recalculate it in so many
places now.
Interesting point.

However, there is another real bug: there are may be no longs on the
next level for last bits of the current one. The new version of this
function will be in v2.

Great :)


+        }
+    }
+}
+
  void hbitmap_free(HBitmap *hb)
  {
      unsigned i;



Thank you!
--js



reply via email to

[Prev in Thread] Current Thread [Next in Thread]