
Re: [Qemu-block] [PATCH] qcow2: always initialize specific image info


From: Denis V. Lunev
Subject: Re: [Qemu-block] [PATCH] qcow2: always initialize specific image info
Date: Tue, 22 Dec 2015 00:33:21 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0

On 12/21/2015 04:18 PM, Denis V. Lunev wrote:
On 12/07/2015 08:07 PM, Roman Kagan wrote:
qcow2_get_specific_info() used to have a code path which would leave
pointer to ImageInfoSpecificQCow2 uninitialized.

We guess that it caused sporadic crashes on freeing an invalid pointer
in response to "query-block" QMP command in
visit_type_ImageInfoSpecificQCow2 with QapiDeallocVisitor.

Although we have neither a solid proof nor a reproduction scenario,
making sure the field is initialized appears to be a reasonable thing to do.

Signed-off-by: Roman Kagan <address@hidden>
---
  block/qcow2.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 88f56c8..67c9d3d 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2739,7 +2739,7 @@ static ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *bs)
        *spec_info = (ImageInfoSpecific){
          .type  = IMAGE_INFO_SPECIFIC_KIND_QCOW2,
-        .u.qcow2 = g_new(ImageInfoSpecificQCow2, 1),
+        .u.qcow2 = g_new0(ImageInfoSpecificQCow2, 1),
      };
      if (s->qcow_version == 2) {
          *spec_info->u.qcow2 = (ImageInfoSpecificQCow2){
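
The one-character fix swaps g_new() for g_new0(), which zero-fills the allocation. A minimal standalone sketch of why that matters, using plain malloc/calloc as stand-ins for the GLib wrappers (the struct here is a hypothetical stand-in, not the real ImageInfoSpecificQCow2):

```c
#include <stdlib.h>

/* hypothetical stand-in for ImageInfoSpecificQCow2: a struct whose
 * pointer members are only assigned on some code paths */
typedef struct {
    char *compat;       /* may never be set before the struct is freed */
    int lazy_refcounts;
} InfoSketch;

/* g_new(T, 1) behaves like malloc: contents are indeterminate, so
 * free(info->compat) on a never-assigned member is undefined behaviour */
InfoSketch *alloc_uninitialized(void) {
    return malloc(sizeof(InfoSketch));
}

/* g_new0(T, 1) behaves like calloc: contents are zeroed, so freeing
 * a never-assigned pointer member is a harmless free(NULL) */
InfoSketch *alloc_zeroed(void) {
    return calloc(1, sizeof(InfoSketch));
}
```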

It looks like I have caught this on QEMU 2.3 RHEV, obtained from here:
http://ftp.redhat.com/pub/redhat/linux/enterprise/7Server/en/RHEV/SRPMS/qemu-kvm-rhev-2.3.0-31.el7_2.3.src.rpm

Here is a dump


(gdb) p *blk_backends.tqh_first->bs
$12 = {total_sectors = 134217728, read_only = 0, open_flags = 24802,
  encrypted = 0, valid_key = 0, sg = 0, copy_on_read = 0, probed = false,
  drv = 0x7fe0c047ec40 <bdrv_qcow2>, opaque = 0x7fe0c23acdb0,
  blk = 0x7fe0c23a4640, aio_context = 0x7fe0c237f350, aio_notifiers = {
    lh_first = 0x0},
filename = "/vz/vmprivate/VM_a80bf3e5.pvm/harddisk.hdd", '\000' <repeats 4053 times>, backing_file = '\000' <repeats 4095 times>,
  backing_format = '\000' <repeats 15 times>,
  full_open_options = 0x7fe0c2513e30,
exact_filename = "/vz/vmprivate/VM_a80bf3e5.pvm/harddisk.hdd", '\000' <repeats 4053 times>, backing_hd = 0x0, file = 0x7fe0c23a9a50, close_notifiers = { notifiers = {lh_first = 0x0}}, before_write_notifiers = {notifiers = { lh_first = 0x0}}, serialising_in_flight = 0, throttle_state = {cfg = { buckets = {{avg = 0, max = 0, level = 0}, {avg = 0, max = 0, level = 0},
        {avg = 0, max = 0, level = 0}, {avg = 0, max = 0, level = 0}, {
          avg = 0, max = 0, level = 0}, {avg = 0, max = 0, level = 0}},
      op_size = 0}, previous_leak = 0, timers = {0x0, 0x0},
    clock_type = QEMU_CLOCK_REALTIME, read_timer_cb = 0x0,
write_timer_cb = 0x0, timer_opaque = 0x0}, throttled_reqs = {{entries = {
        tqh_first = 0x0, tqh_last = 0x7fe0c23a7810}}, {entries = {
        tqh_first = 0x0, tqh_last = 0x7fe0c23a7820}}},
io_limits_enabled = false, stats = {nr_bytes = {0, 0, 0}, nr_ops = {0, 0,
      0}, total_time_ns = {0, 0, 0}, merged = {0, 0, 0},
    wr_highest_sector = 0}, bl = {max_discard = 0, discard_alignment = 0,
    max_write_zeroes = 0, write_zeroes_alignment = 128,
    opt_transfer_length = 0, max_transfer_length = 0,
opt_mem_alignment = 512}, zero_beyond_eof = true, request_alignment = 512,
  guest_block_size = 512, enable_write_cache = 1,
  on_read_error = BLOCKDEV_ON_ERROR_REPORT,
  on_write_error = BLOCKDEV_ON_ERROR_ENOSPC, iostatus_enabled = true,
iostatus = BLOCK_DEVICE_IO_STATUS_OK, node_name = '\000' <repeats 31 times>,
  node_list = {tqe_next = 0x0, tqe_prev = 0x0}, device_list = {
    tqe_next = 0x7fe0c2514e50, tqe_prev = 0x7fe0c047cff0 <bdrv_states>},
  dirty_bitmaps = {lh_first = 0x0}, refcnt = 1, tracked_requests = {
    lh_first = 0x0}, op_blockers = {{lh_first = 0x0} <repeats 15 times>},
  job = 0x0, options = 0x7fe0c23a2ca0,
detect_zeroes = BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF, backing_blocker = 0x0,
  write_threshold_offset = 0, write_threshold_notifier = {notify = 0x0,
    node = {le_next = 0x0, le_prev = 0x0}}}


(gdb) p *(BDRVQcowState*)blk_backends.tqh_first->bs->opaque
$13 = {cluster_bits = 0, cluster_size = 0, cluster_sectors = 0, l2_bits = 0,
  l2_size = 0, l1_size = 0, cluster_offset_mask = 0, l1_table_offset = 0,
l1_table = 0x0, l2_cache = 0x0, l2_cache_offsets = {0 <repeats 16 times>},
  l2_cache_counts = {0 <repeats 16 times>}, cluster_cache = 0x0,
  cluster_data = 0x0, cluster_cache_offset = 0, crypt_method = 0,
crypt_method_header = 0, aes_encrypt_key = {rd_key = {0 <repeats 60 times>},
    rounds = 0}, aes_decrypt_key = {rd_key = {0 <repeats 60 times>},
rounds = 0}, lock = {locked = false, queue = {entries = {tqh_first = 0x0,
        tqh_last = 0x0}}}, migration_blocker = 0x0}
(gdb)


As you can see, all fields are zeroes.

Technically this could happen in the middle of qcow2_invalidate_cache(),
which in turn is called from the migration code, e.g. during resume from
suspend (which is how libvirt implements this).

The migration code is in fact running at that moment, as far as can be
seen in the coredump.
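
To illustrate the window (a hand-rolled sketch, not the actual QEMU code): qcow2_invalidate_cache() effectively wipes the driver state and only then repopulates it by reopening the image, and the reopen does I/O and can yield the coroutine, so other code can observe the all-zero state from the dump above:

```c
#include <string.h>

/* hypothetical stand-in for a couple of BDRVQcowState fields */
typedef struct {
    int cluster_bits;
    int cluster_size;
} StateSketch;

static int bits_seen_by_query = -1;

/* stands in for qmp_query_block arriving while the reopen is pending */
static void query_during_reopen(StateSketch *s) {
    bits_seen_by_query = s->cluster_bits;
}

/* shape of qcow2_invalidate_cache(): wipe the state, then repopulate.
 * In QEMU the repopulation (qcow2_open) does I/O and can yield the
 * coroutine, so other code can run while the state is still all zero. */
static void invalidate_cache_sketch(StateSketch *s,
                                    void (*during_reopen)(StateSketch *)) {
    memset(s, 0, sizeof(*s));   /* state wiped after close */
    during_reopen(s);           /* coroutine yield point: query sees zeroes */
    s->cluster_bits = 16;       /* reopen repopulates the state */
    s->cluster_size = 1 << s->cluster_bits;
}
```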

Any thoughts? :)

Den

The situation is even worse than I had feared.

1) The hypothesis is correct. We are in the middle of qcow2_invalidate_cache(),
    which is called from here:
static void process_incoming_migration_co(void *opaque)
{
     ...
    /* Make sure all file formats flush their mutable metadata */
    bdrv_invalidate_cache_all(&local_err);

2) This code is problematic: the coroutine context is switched inside it,
    and there are a lot of places doing I/O inside.

3) An incoming qmp_query_block request arrives and is handled from the
    monitor context. A fix along these lines:

diff --git a/block/qapi.c b/block/qapi.c
index 8a19aed..8a07d75 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -368,7 +368,12 @@ BlockInfoList *qmp_query_block(Error **errp)

     for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
         BlockInfoList *info = g_malloc0(sizeof(*info));
+        AioContext *ctx = blk_get_aio_context(blk);
+
+        aio_context_acquire(ctx);
         bdrv_query_info(blk, &info->value, &local_err);
+        aio_context_release(ctx);
+
         if (local_err) {
             error_propagate(errp, local_err);
             goto err;

will not help us, as the thread is the same! We will just increase the
nesting count in rfifolock_lock() and proceed inside to the crash.
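
The recursive-lock behaviour can be sketched as follows (a simplified model, not the real RFifoLock: thread identity is passed in explicitly and the FIFO queueing of other threads is omitted):

```c
#include <assert.h>

/* simplified model of a recursive FIFO lock: when the owning thread
 * acquires again, the nesting count increments and the call returns
 * immediately instead of waiting for the work already in progress */
typedef struct {
    int owner;          /* id of the owning thread, valid if nesting > 0 */
    unsigned nesting;
} RecLockSketch;

void rec_lock(RecLockSketch *l, int thread_id) {
    if (l->nesting > 0 && l->owner == thread_id) {
        l->nesting++;   /* same thread: nest, do not block */
        return;
    }
    /* (a real implementation would queue other threads here in FIFO order) */
    l->owner = thread_id;
    l->nesting = 1;
}

void rec_unlock(RecLockSketch *l) {
    assert(l->nesting > 0);
    l->nesting--;
}
```

So an aio_context_acquire() in the monitor thread succeeds immediately even while qcow2_invalidate_cache() is still in progress in that same thread, and the query walks the half-initialized state anyway.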

4) Locking with a coroutine lock in qcow2_get_specific_info()
    will not help either: the lock is initialized in qcow2_open(),
    inside the cache invalidation, and is taken/dropped several
    times there too.

:(((

Den


