
Re: [Qemu-block] [PATCH] qcow2: always initialize specific image info


From: Denis V. Lunev
Subject: Re: [Qemu-block] [PATCH] qcow2: always initialize specific image info
Date: Tue, 22 Dec 2015 00:33:21 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0

On 12/21/2015 04:18 PM, Denis V. Lunev wrote:
On 12/07/2015 08:07 PM, Roman Kagan wrote:
qcow2_get_specific_info() used to have a code path which would leave
pointer to ImageInfoSpecificQCow2 uninitialized.

We guess that it caused sporadic crashes on freeing an invalid pointer
in response to "query-block" QMP command in
visit_type_ImageInfoSpecificQCow2 with QapiDeallocVisitor.

Although we have neither a solid proof nor a reproduction scenario,
making sure the field is initialized appears to be a reasonable thing to do.

Signed-off-by: Roman Kagan <address@hidden>
---
  block/qcow2.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 88f56c8..67c9d3d 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2739,7 +2739,7 @@ static ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *bs)
        *spec_info = (ImageInfoSpecific){
          .type  = IMAGE_INFO_SPECIFIC_KIND_QCOW2,
-        .u.qcow2 = g_new(ImageInfoSpecificQCow2, 1),
+        .u.qcow2 = g_new0(ImageInfoSpecificQCow2, 1),
      };
      if (s->qcow_version == 2) {
          *spec_info->u.qcow2 = (ImageInfoSpecificQCow2){
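
The one-character fix swaps g_new() for g_new0(), which zero-fills the allocation. A minimal standalone sketch of why that matters, using plain malloc/calloc as stand-ins for the GLib wrappers (the struct here is a hypothetical stand-in, not the real ImageInfoSpecificQCow2):

```c
#include <stdlib.h>

/* hypothetical stand-in for ImageInfoSpecificQCow2: a struct whose
 * pointer members are only assigned on some code paths */
typedef struct {
    char *compat;       /* may never be set before the struct is freed */
    int lazy_refcounts;
} InfoSketch;

/* g_new(T, 1) behaves like malloc: contents are indeterminate, so
 * free(info->compat) on a never-assigned member is undefined behaviour */
InfoSketch *alloc_uninitialized(void) {
    return malloc(sizeof(InfoSketch));
}

/* g_new0(T, 1) behaves like calloc: contents are zeroed, so freeing
 * a never-assigned pointer member is a harmless free(NULL) */
InfoSketch *alloc_zeroed(void) {
    return calloc(1, sizeof(InfoSketch));
}
```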

It looks like I have caught this on QEMU 2.3 RHEV, obtained from here:
http://ftp.redhat.com/pub/redhat/linux/enterprise/7Server/en/RHEV/SRPMS/qemu-kvm-rhev-2.3.0-31.el7_2.3.src.rpm

Here is a dump


(gdb) p *blk_backends.tqh_first->bs
$12 = {total_sectors = 134217728, read_only = 0, open_flags = 24802,
  encrypted = 0, valid_key = 0, sg = 0, copy_on_read = 0, probed = false,
  drv = 0x7fe0c047ec40 <bdrv_qcow2>, opaque = 0x7fe0c23acdb0,
  blk = 0x7fe0c23a4640, aio_context = 0x7fe0c237f350, aio_notifiers = {
    lh_first = 0x0},
filename = "/vz/vmprivate/VM_a80bf3e5.pvm/harddisk.hdd", '\000' <repeats 4053 times>, backing_file = '\000' <repeats 4095 times>,
  backing_format = '\000' <repeats 15 times>,
  full_open_options = 0x7fe0c2513e30,
exact_filename = "/vz/vmprivate/VM_a80bf3e5.pvm/harddisk.hdd", '\000' <repeats 4053 times>, backing_hd = 0x0, file = 0x7fe0c23a9a50, close_notifiers = { notifiers = {lh_first = 0x0}}, before_write_notifiers = {notifiers = { lh_first = 0x0}}, serialising_in_flight = 0, throttle_state = {cfg = { buckets = {{avg = 0, max = 0, level = 0}, {avg = 0, max = 0, level = 0},
        {avg = 0, max = 0, level = 0}, {avg = 0, max = 0, level = 0}, {
          avg = 0, max = 0, level = 0}, {avg = 0, max = 0, level = 0}},
      op_size = 0}, previous_leak = 0, timers = {0x0, 0x0},
    clock_type = QEMU_CLOCK_REALTIME, read_timer_cb = 0x0,
write_timer_cb = 0x0, timer_opaque = 0x0}, throttled_reqs = {{entries = {
        tqh_first = 0x0, tqh_last = 0x7fe0c23a7810}}, {entries = {
        tqh_first = 0x0, tqh_last = 0x7fe0c23a7820}}},
io_limits_enabled = false, stats = {nr_bytes = {0, 0, 0}, nr_ops = {0, 0,
      0}, total_time_ns = {0, 0, 0}, merged = {0, 0, 0},
    wr_highest_sector = 0}, bl = {max_discard = 0, discard_alignment = 0,
    max_write_zeroes = 0, write_zeroes_alignment = 128,
    opt_transfer_length = 0, max_transfer_length = 0,
opt_mem_alignment = 512}, zero_beyond_eof = true, request_alignment = 512,
  guest_block_size = 512, enable_write_cache = 1,
  on_read_error = BLOCKDEV_ON_ERROR_REPORT,
  on_write_error = BLOCKDEV_ON_ERROR_ENOSPC, iostatus_enabled = true,
iostatus = BLOCK_DEVICE_IO_STATUS_OK, node_name = '\000' <repeats 31 times>,
  node_list = {tqe_next = 0x0, tqe_prev = 0x0}, device_list = {
    tqe_next = 0x7fe0c2514e50, tqe_prev = 0x7fe0c047cff0 <bdrv_states>},
  dirty_bitmaps = {lh_first = 0x0}, refcnt = 1, tracked_requests = {
    lh_first = 0x0}, op_blockers = {{lh_first = 0x0} <repeats 15 times>},
  job = 0x0, options = 0x7fe0c23a2ca0,
detect_zeroes = BLOCKDEV_DETECT_ZEROES_OPTIONS_OFF, backing_blocker = 0x0,
  write_threshold_offset = 0, write_threshold_notifier = {notify = 0x0,
    node = {le_next = 0x0, le_prev = 0x0}}}


(gdb) p *(BDRVQcowState*)blk_backends.tqh_first->bs->opaque
$13 = {cluster_bits = 0, cluster_size = 0, cluster_sectors = 0, l2_bits = 0,
  l2_size = 0, l1_size = 0, cluster_offset_mask = 0, l1_table_offset = 0,
l1_table = 0x0, l2_cache = 0x0, l2_cache_offsets = {0 <repeats 16 times>},
  l2_cache_counts = {0 <repeats 16 times>}, cluster_cache = 0x0,
  cluster_data = 0x0, cluster_cache_offset = 0, crypt_method = 0,
crypt_method_header = 0, aes_encrypt_key = {rd_key = {0 <repeats 60 times>},
    rounds = 0}, aes_decrypt_key = {rd_key = {0 <repeats 60 times>},
rounds = 0}, lock = {locked = false, queue = {entries = {tqh_first = 0x0,
        tqh_last = 0x0}}}, migration_blocker = 0x0}
(gdb)


As you can see, all fields are zeroes.

Technically this could happen in the middle of qcow2_invalidate_cache(),
which in turn is called from the migration code, e.g. during resume from
suspend (which is how libvirt implements this).

The migration code is in fact running at that moment, as far as can be
seen in the coredump.
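
To illustrate the window (a hand-rolled sketch, not the actual QEMU code): qcow2_invalidate_cache() effectively wipes the driver state and only then repopulates it by reopening the image, and the reopen does I/O and can yield the coroutine, so other code can observe the all-zero state from the dump above:

```c
#include <string.h>

/* hypothetical stand-in for a couple of BDRVQcowState fields */
typedef struct {
    int cluster_bits;
    int cluster_size;
} StateSketch;

static int bits_seen_by_query = -1;

/* stands in for qmp_query_block arriving while the reopen is pending */
static void query_during_reopen(StateSketch *s) {
    bits_seen_by_query = s->cluster_bits;
}

/* shape of qcow2_invalidate_cache(): wipe the state, then repopulate.
 * In QEMU the repopulation (qcow2_open) does I/O and can yield the
 * coroutine, so other code can run while the state is still all zero. */
static void invalidate_cache_sketch(StateSketch *s,
                                    void (*during_reopen)(StateSketch *)) {
    memset(s, 0, sizeof(*s));   /* state wiped after close */
    during_reopen(s);           /* coroutine yield point: query sees zeroes */
    s->cluster_bits = 16;       /* reopen repopulates the state */
    s->cluster_size = 1 << s->cluster_bits;
}
```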

Any thoughts? :)

Den

The situation is even worse than I had feared.

1) The hypothesis is correct. We are in the middle of qcow2_invalidate_cache(),
    which is called from here:
static void process_incoming_migration_co(void *opaque)
{
     ...
    /* Make sure all file formats flush their mutable metadata */
    bdrv_invalidate_cache_all(&local_err);

2) This code is problematic: the coroutine context is switched inside it,
    and there are a lot of places doing I/O inside.

3) An incoming qmp_query_block request arrives and is handled from the
    monitor context. A fix along these lines:

diff --git a/block/qapi.c b/block/qapi.c
index 8a19aed..8a07d75 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -368,7 +368,12 @@ BlockInfoList *qmp_query_block(Error **errp)

     for (blk = blk_next(NULL); blk; blk = blk_next(blk)) {
         BlockInfoList *info = g_malloc0(sizeof(*info));
+        AioContext *ctx = blk_get_aio_context(blk);
+
+        aio_context_acquire(ctx);
         bdrv_query_info(blk, &info->value, &local_err);
+        aio_context_release(ctx);
+
         if (local_err) {
             error_propagate(errp, local_err);
             goto err;

will not help us, as the thread is the same! We will just increase the
nesting count in rfifolock_lock() and proceed inside to the crash.
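
The recursive-lock behaviour can be sketched as follows (a simplified model, not the real RFifoLock: thread identity is passed in explicitly and the FIFO queueing of other threads is omitted):

```c
#include <assert.h>

/* simplified model of a recursive FIFO lock: when the owning thread
 * acquires again, the nesting count increments and the call returns
 * immediately instead of waiting for the work already in progress */
typedef struct {
    int owner;          /* id of the owning thread, valid if nesting > 0 */
    unsigned nesting;
} RecLockSketch;

void rec_lock(RecLockSketch *l, int thread_id) {
    if (l->nesting > 0 && l->owner == thread_id) {
        l->nesting++;   /* same thread: nest, do not block */
        return;
    }
    /* (a real implementation would queue other threads here in FIFO order) */
    l->owner = thread_id;
    l->nesting = 1;
}

void rec_unlock(RecLockSketch *l) {
    assert(l->nesting > 0);
    l->nesting--;
}
```

So an aio_context_acquire() in the monitor thread succeeds immediately even while qcow2_invalidate_cache() is still in progress in that same thread, and the query walks the half-initialized state anyway.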

4) Locking with a coroutine lock in qcow2_get_specific_info()
    will not help either: the lock is initialized in qcow2_open(),
    inside the cache invalidation, and is taken/dropped several
    times there too.

:(((

Den


