qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] migration: fix migration shutdown


From: Yury Kotov
Subject: Re: [Qemu-devel] [PATCH] migration: fix migration shutdown
Date: Mon, 08 Apr 2019 14:36:34 +0300

Hi,

I've sent another patch to fix this UAF:
"migration: Fix use-after-free during process exit"

It's more simple and fixes only the regression.

Regards,
Yury

05.04.2019, 12:07, "Dr. David Alan Gilbert" <address@hidden>:
> * Yury Kotov (address@hidden) wrote:
>>  03.04.2019, 22:06, "Dr. David Alan Gilbert" <address@hidden>:
>>  > * Yury Kotov (address@hidden) wrote:
>>  >>  It fixes heap-use-after-free which was found by clang's ASAN.
>>  >>
>>  >>  Control flow of this use-after-free:
>>  >>  main_thread:
>>  >>      * Got SIGTERM and completes main loop
>>  >>      * Calls migration_shutdown
>>  >>        - migrate_fd_cancel (so, migration_thread begins to complete)
>>  >>        - object_unref(OBJECT(current_migration));
>>  >>
>>  >>  migration_thread:
>>  >>      * migration_iteration_finish -> schedule cleanup bh
>>  >>      * object_unref(OBJECT(s)); (Now, current_migration is freed)
>>  >>      * exits
>>  >>
>>  >>  main_thread:
>>  >>      * Calls vm_shutdown -> drain bdrvs -> main loop
>>  >>        -> cleanup_bh -> use after free
>>  >>
>>  >>  If you want to reproduce, these couple of sleeps will help:
>>  >>  vl.c:4613:
>>  >>       migration_shutdown();
>>  >>  + sleep(2);
>>  >>  migration.c:3269:
>>  >>  + sleep(1);
>>  >>       trace_migration_thread_after_loop();
>>  >>       migration_iteration_finish(s);
>>  >
>>  > Fun; I hadn't realised you could get more main loop after exiting
>>  > the main loop.
>>  >
>>  > Is this fallout from my 892ae715b6b or is this just another case it
>>  > didn't catch?
>>  > I guess it moved the shutdown early, so it probably is fallout, and
>>  > so probably does need to go to 4.0 ?
>>  >
>>  I think this is closer to another case than fallout. Because without
>>  892ae715b6b, reproducing of UAF at exit was much easier.
>>  Another way to fix it - remove unref from migration_shutdown
>>  (but keep migrate_fd_cancel) and restore migration_object_finalize.
>
> OK, so in that case I'd rather leave this to 4.1 rather than rush it in
> 4.0rc's - I'd rather only take regressions at this point.
>
> Dave
>
>>  >>  Original output:
>>  >>  qemu-system-x86_64: terminating on signal 15 from pid 31980 (<unknown 
>> process>)
>>  >>  =================================================================
>>  >>  ==31958==ERROR: AddressSanitizer: heap-use-after-free on address 
>> 0x61900001d210
>>  >>    at pc 0x555558a535ca bp 0x7fffffffb190 sp 0x7fffffffb188
>>  >>  READ of size 8 at 0x61900001d210 thread T0 (qemu-vm-0)
>>  >>      #0 0x555558a535c9 in migrate_fd_cleanup 
>> migration/migration.c:1502:23
>>  >>      #1 0x5555594fde0a in aio_bh_call util/async.c:90:5
>>  >>      #2 0x5555594fe522 in aio_bh_poll util/async.c:118:13
>>  >>      #3 0x555559524783 in aio_poll util/aio-posix.c:725:17
>>  >>      #4 0x555559504fb3 in aio_wait_bh_oneshot util/aio-wait.c:71:5
>>  >>      #5 0x5555573bddf6 in virtio_blk_data_plane_stop
>>  >>        hw/block/dataplane/virtio-blk.c:282:5
>>  >>      #6 0x5555589d5c09 in virtio_bus_stop_ioeventfd 
>> hw/virtio/virtio-bus.c:246:9
>>  >>      #7 0x5555589e9917 in virtio_pci_stop_ioeventfd 
>> hw/virtio/virtio-pci.c:287:5
>>  >>      #8 0x5555589e22bf in virtio_pci_vmstate_change 
>> hw/virtio/virtio-pci.c:1072:9
>>  >>      #9 0x555557628931 in virtio_vmstate_change hw/virtio/virtio.c:2257:9
>>  >>      #10 0x555557c36713 in vm_state_notify vl.c:1605:9
>>  >>      #11 0x55555716ef53 in do_vm_stop cpus.c:1074:9
>>  >>      #12 0x55555716eeff in vm_shutdown cpus.c:1092:12
>>  >>      #13 0x555557c4283e in main vl.c:4617:5
>>  >>      #14 0x7fffdfdb482f in __libc_start_main
>>  >>        (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
>>  >>      #15 0x555556ecb118 in _start 
>> (x86_64-softmmu/qemu-system-x86_64+0x1977118)
>>  >>
>>  >>  0x61900001d210 is located 144 bytes inside of 952-byte region
>>  >>    [0x61900001d180,0x61900001d538)
>>  >>  freed by thread T6 (live_migration) here:
>>  >>      #0 0x555556f76782 in __interceptor_free
>>  >>        
>> /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:124:3
>>  >>      #1 0x555558d5fa94 in object_finalize qom/object.c:618:9
>>  >>      #2 0x555558d57651 in object_unref qom/object.c:1068:9
>>  >>      #3 0x555558a55588 in migration_thread migration/migration.c:3272:5
>>  >>      #4 0x5555595393f2 in qemu_thread_start 
>> util/qemu-thread-posix.c:502:9
>>  >>      #5 0x7fffe057f6b9 in start_thread 
>> (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9)
>>  >>
>>  >>  previously allocated by thread T0 (qemu-vm-0) here:
>>  >>      #0 0x555556f76b03 in __interceptor_malloc
>>  >>        
>> /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:146:3
>>  >>      #1 0x7ffff6ee37b8 in g_malloc 
>> (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4f7b8)
>>  >>      #2 0x555558d58031 in object_new qom/object.c:640:12
>>  >>      #3 0x555558a31f21 in migration_object_init 
>> migration/migration.c:139:25
>>  >>      #4 0x555557c41398 in main vl.c:4320:5
>>  >>      #5 0x7fffdfdb482f in __libc_start_main 
>> (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
>>  >>
>>  >>  Thread T6 (live_migration) created by T0 (qemu-vm-0) here:
>>  >>      #0 0x555556f5f0dd in pthread_create
>>  >>        
>> /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_interceptors.cc:210:3
>>  >>      #1 0x555559538cf9 in qemu_thread_create 
>> util/qemu-thread-posix.c:539:11
>>  >>      #2 0x555558a53304 in migrate_fd_connect migration/migration.c:3332:5
>>  >>      #3 0x555558a72bd8 in migration_channel_connect 
>> migration/channel.c:92:5
>>  >>      #4 0x555558a6ef87 in exec_start_outgoing_migration 
>> migration/exec.c:42:5
>>  >>      #5 0x555558a4f3c2 in qmp_migrate migration/migration.c:1922:9
>>  >>      #6 0x555558bb4f6a in qmp_marshal_migrate 
>> qapi/qapi-commands-migration.c:607:5
>>  >>      #7 0x555559363738 in do_qmp_dispatch qapi/qmp-dispatch.c:131:5
>>  >>      #8 0x555559362a15 in qmp_dispatch qapi/qmp-dispatch.c:174:11
>>  >>      #9 0x5555571bac15 in monitor_qmp_dispatch monitor.c:4124:11
>>  >>      #10 0x55555719a22d in monitor_qmp_bh_dispatcher monitor.c:4207:9
>>  >>      #11 0x5555594fde0a in aio_bh_call util/async.c:90:5
>>  >>      #12 0x5555594fe522 in aio_bh_poll util/async.c:118:13
>>  >>      #13 0x5555595201e0 in aio_dispatch util/aio-posix.c:460:5
>>  >>      #14 0x555559503553 in aio_ctx_dispatch util/async.c:261:5
>>  >>      #15 0x7ffff6ede196 in g_main_context_dispatch
>>  >>        (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4a196)
>>  >>
>>  >>  SUMMARY: AddressSanitizer: heap-use-after-free 
>> migration/migration.c:1502:23
>>  >>    in migrate_fd_cleanup
>>  >>  Shadow bytes around the buggy address:
>>  >>    0x0c327fffb9f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
>>  >>    0x0c327fffba00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
>>  >>    0x0c327fffba10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
>>  >>    0x0c327fffba20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
>>  >>    0x0c327fffba30: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
>>  >>  =>0x0c327fffba40: fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd
>>  >>    0x0c327fffba50: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
>>  >>    0x0c327fffba60: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
>>  >>    0x0c327fffba70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
>>  >>    0x0c327fffba80: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
>>  >>    0x0c327fffba90: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
>>  >>  Shadow byte legend (one shadow byte represents 8 application bytes):
>>  >>    Addressable: 00
>>  >>    Partially addressable: 01 02 03 04 05 06 07
>>  >>    Heap left redzone: fa
>>  >>    Freed heap region: fd
>>  >>    Stack left redzone: f1
>>  >>    Stack mid redzone: f2
>>  >>    Stack right redzone: f3
>>  >>    Stack after return: f5
>>  >>    Stack use after scope: f8
>>  >>    Global redzone: f9
>>  >>    Global init order: f6
>>  >>    Poisoned by user: f7
>>  >>    Container overflow: fc
>>  >>    Array cookie: ac
>>  >>    Intra object redzone: bb
>>  >>    ASan internal: fe
>>  >>    Left alloca redzone: ca
>>  >>    Right alloca redzone: cb
>>  >>    Shadow gap: cc
>>  >>  ==31958==ABORTING
>>  >>
>>  >>  Signed-off-by: Yury Kotov <address@hidden>
>>  >>  ---
>>  >>   migration/migration.c | 30 ++++++++++++++++++++++++------
>>  >>   1 file changed, 24 insertions(+), 6 deletions(-)
>>  >>
>>  >>  diff --git a/migration/migration.c b/migration/migration.c
>>  >>  index 609e0df5d0..54d82bb88b 100644
>>  >>  --- a/migration/migration.c
>>  >>  +++ b/migration/migration.c
>>  >>  @@ -128,6 +128,8 @@ static int migration_maybe_pause(MigrationState *s,
>>  >>                                    int *current_active_state,
>>  >>                                    int new_state);
>>  >>   static void migrate_fd_cancel(MigrationState *s);
>>  >>  +static void migrate_join_thread(MigrationState *s);
>>  >>  +static void migrate_fd_cleanup(void *opaque);
>>  >>
>>  >>   void migration_object_init(void)
>>  >>   {
>>  >>  @@ -176,6 +178,17 @@ void migration_shutdown(void)
>>  >>        * stop the migration using this structure
>>  >>        */
>>  >>       migrate_fd_cancel(current_migration);
>>  >>  + /*
>>  >>  + * migration_thread schedules cleanup_bh, but migration_shutdown is 
>> called
>>  >>  + * from the end of main, so cleanu_up may not be called because 
>> main-loop is
>>  >>  + * complete. So, wait for the complition of migration_thread, cancel 
>> bh and
>>  >>  + * call cleanup ourselves.
>>  >>  + */
>>  >
>>  > (Some typos)
>>  >
>>  Ok :)
>>
>>  >>  + migrate_join_thread(current_migration);
>>  >
>>  > I'll admit to being a bit nervous that something in here could be
>>  > hanging and thus we could hang here rather than die as SIGTERM is
>>  > expecting us to.
>>  >
>>  I think it's only way to be sure we havn't other races with 
>> migration_thread.
>>  And this joining is cheaper than vm_shutdown IMHO.
>>
>>  >>  + if (current_migration->cleanup_bh) {
>>  >>  + qemu_bh_cancel(current_migration->cleanup_bh);
>>  >
>>  > Is that needed? Doesn't migrate_fd_cleanup start with a qemu_bh_delete?
>>  >
>>  Agree, it's odd.
>>
>>  >>  + migrate_fd_cleanup(current_migration);
>>  >>  + }
>>  >>       object_unref(OBJECT(current_migration));
>>  >>   }
>>  >>
>>  >>  @@ -1495,6 +1508,16 @@ static void 
>> block_cleanup_parameters(MigrationState *s)
>>  >>       }
>>  >>   }
>>  >>
>>  >>  +static void migrate_join_thread(MigrationState *s)
>>  >>  +{
>>  >>  + qemu_mutex_unlock_iothread();
>>  >>  + if (s->migration_thread_running) {
>>  >>  + qemu_thread_join(&s->thread);
>>  >>  + s->migration_thread_running = false;
>>  >>  + }
>>  >>  + qemu_mutex_lock_iothread();
>>  >>  +}
>>  >>  +
>>  >>   static void migrate_fd_cleanup(void *opaque)
>>  >>   {
>>  >>       MigrationState *s = opaque;
>>  >>  @@ -1508,12 +1531,7 @@ static void migrate_fd_cleanup(void *opaque)
>>  >>           QEMUFile *tmp;
>>  >>
>>  >>           trace_migrate_fd_cleanup();
>>  >>  - qemu_mutex_unlock_iothread();
>>  >>  - if (s->migration_thread_running) {
>>  >>  - qemu_thread_join(&s->thread);
>>  >>  - s->migration_thread_running = false;
>>  >>  - }
>>  >>  - qemu_mutex_lock_iothread();
>>  >>  + migrate_join_thread(s);
>>  >>
>>  >>           multifd_save_cleanup();
>>  >>           qemu_mutex_lock(&s->qemu_file_lock);
>>  >
>>  > Dave
>>  >
>>  >>  --
>>  >>  2.21.0
>>  > --
>>  > Dr. David Alan Gilbert / address@hidden / Manchester, UK
> --
> Dr. David Alan Gilbert / address@hidden / Manchester, UK



reply via email to

[Prev in Thread] Current Thread [Next in Thread]