qemu-commits

[Qemu-commits] [qemu/qemu] cce804: migration: Allow migrate_fd_connect t


From: GitHub
Subject: [Qemu-commits] [qemu/qemu] cce804: migration: Allow migrate_fd_connect to take an Err...
Date: Wed, 07 Feb 2018 06:38:11 -0800

  Branch: refs/heads/master
  Home:   https://github.com/qemu/qemu
  Commit: cce8040bb0ea6ff56d8882aeb0a0435a61901d93
      
https://github.com/qemu/qemu/commit/cce8040bb0ea6ff56d8882aeb0a0435a61901d93
  Author: Dr. David Alan Gilbert <address@hidden>
  Date:   2018-02-06 (Tue, 06 Feb 2018)

  Changed paths:
    M migration/channel.c
    M migration/migration.c
    M migration/migration.h
    M migration/rdma.c

  Log Message:
  -----------
  migration: Allow migrate_fd_connect to take an Error *

Allow whatever is performing the connection to pass migrate_fd_connect
an error to indicate there was a problem during connection, and allow
us to clean up.

The caller must free the error.

Signed-off-by: Dr. David Alan Gilbert <address@hidden>
Reviewed-by: Juan Quintela <address@hidden>
Signed-off-by: Juan Quintela <address@hidden>
Signed-off-by: Dr. David Alan Gilbert <address@hidden>


  Commit: 688a3dcba980bf01344a1ae2bc37fea44c6014ac
      
https://github.com/qemu/qemu/commit/688a3dcba980bf01344a1ae2bc37fea44c6014ac
  Author: Dr. David Alan Gilbert <address@hidden>
  Date:   2018-02-06 (Tue, 06 Feb 2018)

  Changed paths:
    M migration/channel.c
    M migration/channel.h
    M migration/exec.c
    M migration/fd.c
    M migration/socket.c
    M migration/tls.c
    M migration/trace-events

  Log Message:
  -----------
  migration: Route errors down through migration_channel_connect

Route async errors (especially from sockets) down through
migration_channel_connect and on to migrate_fd_connect where they
can be cleaned up.

Signed-off-by: Dr. David Alan Gilbert <address@hidden>
Reviewed-by: Juan Quintela <address@hidden>
Signed-off-by: Juan Quintela <address@hidden>
Signed-off-by: Dr. David Alan Gilbert <address@hidden>


  Commit: ee555cdf4d495ddd83633406e3099c5d1ef22e0a
      
https://github.com/qemu/qemu/commit/ee555cdf4d495ddd83633406e3099c5d1ef22e0a
  Author: Daniel Henrique Barboza <address@hidden>
  Date:   2018-02-06 (Tue, 06 Feb 2018)

  Changed paths:
    M migration/savevm.c

  Log Message:
  -----------
  migration/savevm.c: set MAX_VM_CMD_PACKAGED_SIZE to 1ul << 32

MAX_VM_CMD_PACKAGED_SIZE is a constant used in qemu_savevm_send_packaged
and loadvm_handle_cmd_packaged to determine whether a package is too
big to be sent or received. qemu_savevm_send_packaged is called inside
postcopy_start (migration/migration.c) to send the MigrationState
in a single blob to the destination using the MIG_CMD_PACKAGED subcommand;
the destination reads it with loadvm_handle_cmd_packaged. If the blob is
larger than MAX_VM_CMD_PACKAGED_SIZE, an error is raised and the postcopy
migration is aborted. Both MAX_VM_CMD_PACKAGED_SIZE and MIG_CMD_PACKAGED
were introduced by commit 11cf1d984b ("MIG_CMD_PACKAGED: Send a packaged
chunk ..."). The constant still has its original value of 1ul << 24 (16MB).

The current MAX_VM_CMD_PACKAGED_SIZE value is not enough to support postcopy
migration of bigger pseries guests. The blob size for a postcopy migration of
a pseries guest with the following setup:

qemu-system-ppc64 --nographic -vga none -machine pseries,accel=kvm -m 64G \
-smp 1,maxcpus=32 -device virtio-blk-pci,drive=rootdisk \
-drive file=f27.qcow2,if=none,cache=none,format=qcow2,id=rootdisk \
-netdev user,id=u1 -net nic,netdev=u1

is around 12MB. Bumping the RAM to 128G makes the blob size go to 20MB.
With 256G the blob goes to 37MB, more than twice the current maximum size.
At the moment the pseries machine can handle guests with up to 1TB of RAM,
which makes this postcopy blob approximately 128MB in size.

Following the discussion in [1], there is a need to understand which
devices are consuming the blob so aggressively and see whether that
can be mitigated. Until then, we can set MAX_VM_CMD_PACKAGED_SIZE to the
maximum value allowed. Since the size is a 32-bit integer variable, we can
set it to 1ul << 32, giving a maximum blob size of 4G, which is enough to
support postcopy migration of 32TB RAM guests given the above constraints.

[1] https://lists.nongnu.org/archive/html/qemu-devel/2018-01/msg06313.html

Signed-off-by: Daniel Henrique Barboza <address@hidden>
Reported-by: Balamuruhan S <address@hidden>
Reviewed-by: Juan Quintela <address@hidden>
Signed-off-by: Juan Quintela <address@hidden>
Signed-off-by: Dr. David Alan Gilbert <address@hidden>


  Commit: 0781c1ed1cbe1361b45f8fddfc85d202a517a88c
      
https://github.com/qemu/qemu/commit/0781c1ed1cbe1361b45f8fddfc85d202a517a88c
  Author: Wei Wang <address@hidden>
  Date:   2018-02-06 (Tue, 06 Feb 2018)

  Changed paths:
    M migration/migration.c

  Log Message:
  -----------
  migration: use s->threshold_size inside migration_update_counters

Fixes: b15df1ae50 ("migration: cleanup stats update into function")
The threshold size is changed to be recorded in s->threshold_size.

Signed-off-by: Wei Wang <address@hidden>
Reviewed-by: Peter Xu <address@hidden>
Reviewed-by: Juan Quintela <address@hidden>
Signed-off-by: Juan Quintela <address@hidden>
Signed-off-by: Dr. David Alan Gilbert <address@hidden>


  Commit: 7faccdc3e761ce44d77f986065d4a0e1df5c8a01
      
https://github.com/qemu/qemu/commit/7faccdc3e761ce44d77f986065d4a0e1df5c8a01
  Author: Juan Quintela <address@hidden>
  Date:   2018-02-06 (Tue, 06 Feb 2018)

  Changed paths:
    M migration/ram.c

  Log Message:
  -----------
  migration: Drop current address parameter from save_zero_page()

It already has the RAMBlock and the offset, so it can calculate the
address itself.

Signed-off-by: Juan Quintela <address@hidden>
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
Signed-off-by: Dr. David Alan Gilbert <address@hidden>


  Commit: 1f90d797110a1e1800ba21cd79b6cd15318cb36a
      
https://github.com/qemu/qemu/commit/1f90d797110a1e1800ba21cd79b6cd15318cb36a
  Author: Juan Quintela <address@hidden>
  Date:   2018-02-06 (Tue, 06 Feb 2018)

  Changed paths:
    M tests/migration-test.c

  Log Message:
  -----------
  tests: Remove deprecated migration tests commands

We now use migration_set_parameter() for everything.

Signed-off-by: Juan Quintela <address@hidden>
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
Signed-off-by: Dr. David Alan Gilbert <address@hidden>


  Commit: 63b2d935f7f9b09c75380d1ffb37a8f1fa23fdcb
      
https://github.com/qemu/qemu/commit/63b2d935f7f9b09c75380d1ffb37a8f1fa23fdcb
  Author: Juan Quintela <address@hidden>
  Date:   2018-02-06 (Tue, 06 Feb 2018)

  Changed paths:
    M tests/migration-test.c

  Log Message:
  -----------
  tests: Consolidate accelerators declaration

Signed-off-by: Juan Quintela <address@hidden>
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
Reviewed-by: Peter Xu <address@hidden>
Signed-off-by: Dr. David Alan Gilbert <address@hidden>


  Commit: 31a6bb74fa5383192c010f87d079993c99dd5bf8
      
https://github.com/qemu/qemu/commit/31a6bb74fa5383192c010f87d079993c99dd5bf8
  Author: Juan Quintela <address@hidden>
  Date:   2018-02-06 (Tue, 06 Feb 2018)

  Changed paths:
    M tests/migration-test.c

  Log Message:
  -----------
  tests: Use consistent names for migration

Signed-off-by: Juan Quintela <address@hidden>
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
Reviewed-by: Peter Xu <address@hidden>
Signed-off-by: Dr. David Alan Gilbert <address@hidden>


  Commit: 4c27486dc75b803ba8fe9eb9375cc9c075d3f127
      
https://github.com/qemu/qemu/commit/4c27486dc75b803ba8fe9eb9375cc9c075d3f127
  Author: Juan Quintela <address@hidden>
  Date:   2018-02-06 (Tue, 06 Feb 2018)

  Changed paths:
    M tests/migration-test.c

  Log Message:
  -----------
  tests: Add deprecated commands migration test

We add the deprecated commands to a new test, so we don't have to add
them to the normal tests.

Signed-off-by: Juan Quintela <address@hidden>
Reviewed-by: Peter Xu <address@hidden>
Signed-off-by: Dr. David Alan Gilbert <address@hidden>


  Commit: eb665d7d92200d948238f67b827d604856ac061e
      
https://github.com/qemu/qemu/commit/eb665d7d92200d948238f67b827d604856ac061e
  Author: Juan Quintela <address@hidden>
  Date:   2018-02-06 (Tue, 06 Feb 2018)

  Changed paths:
    M tests/migration-test.c

  Log Message:
  -----------
  tests: Create migrate-start-postcopy command

This way, it is like the rest of the commands.

Signed-off-by: Juan Quintela <address@hidden>
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
Reviewed-by: Peter Xu <address@hidden>
Signed-off-by: Dr. David Alan Gilbert <address@hidden>


  Commit: 6a7724e9a239b5f1342df00deedab06f3d360083
      
https://github.com/qemu/qemu/commit/6a7724e9a239b5f1342df00deedab06f3d360083
  Author: Juan Quintela <address@hidden>
  Date:   2018-02-06 (Tue, 06 Feb 2018)

  Changed paths:
    M tests/migration-test.c

  Log Message:
  -----------
  tests: Adjust sleeps for migration test

Also reorder code to not sleep when event already happened.

Signed-off-by: Juan Quintela <address@hidden>
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
Reviewed-by: Peter Xu <address@hidden>
Signed-off-by: Dr. David Alan Gilbert <address@hidden>


  Commit: 6039dd5b1c45d76403b9dcadd2afd7efd8f42330
      
https://github.com/qemu/qemu/commit/6039dd5b1c45d76403b9dcadd2afd7efd8f42330
  Author: Dr. David Alan Gilbert <address@hidden>
  Date:   2018-02-06 (Tue, 06 Feb 2018)

  Changed paths:
    M migration/migration.c

  Log Message:
  -----------
  migration: Recover block devices if failure in device state

In e91d895 I added the new pause-before-switchover mechanism
to allow migration completion to be delayed; this changes the
last state prior to completion to MIGRATE_STATUS_DEVICE rather
than MIGRATE_STATUS_ACTIVE.

Fix the failure path in migration_completion to recover the block
devices if it fails in MIGRATE_STATUS_DEVICE, not just the
MIGRATE_STATUS_ACTIVE that it previously had.

This corresponds to rh bz:
  https://bugzilla.redhat.com/show_bug.cgi?id=1538494
whose symptom is an occasional source crash on a failed migration.

Fixes: e91d8951d59d483f085f
Signed-off-by: Dr. David Alan Gilbert <address@hidden>
Reviewed-by: Peter Xu <address@hidden>
Signed-off-by: Dr. David Alan Gilbert <address@hidden>


  Commit: 032b79f7173051e7f8742a43d106c7fc526856f9
      
https://github.com/qemu/qemu/commit/032b79f7173051e7f8742a43d106c7fc526856f9
  Author: Ross Lagerwall <address@hidden>
  Date:   2018-02-06 (Tue, 06 Feb 2018)

  Changed paths:
    M migration/savevm.c

  Log Message:
  -----------
  migration: Don't leak IO channels

Since qemu_fopen_channel_{in,out}put take references on the underlying
IO channels, make sure to release our references to them.

Signed-off-by: Ross Lagerwall <address@hidden>
Message-Id: <address@hidden>
Reviewed-by: Daniel P. Berrange <address@hidden>
Signed-off-by: Dr. David Alan Gilbert <address@hidden>


  Commit: 875fcd013ab68c64802998b22f54f0184479d21b
      
https://github.com/qemu/qemu/commit/875fcd013ab68c64802998b22f54f0184479d21b
  Author: Greg Kurz <address@hidden>
  Date:   2018-02-06 (Tue, 06 Feb 2018)

  Changed paths:
    M migration/savevm.c

  Log Message:
  -----------
  migration: incoming postcopy advise sanity checks

If postcopy-ram was set on the source but not on the destination,
migration doesn't occur, the destination prints an error and boots
the guest:

qemu-system-ppc64: Expected vmdescription section, but got 0

We end up with two running instances.

This behaviour was introduced in 2.11 by commit 58110f0acb1a "migration:
split common postcopy out of ram postcopy" to prepare ground for the
upcoming dirty bitmap postcopy support. It adds a new case where the
source may send an empty postcopy advise because dirty bitmap doesn't
need to check page sizes like RAM postcopy does.

If the source has enabled postcopy-ram, then it sends an advise with
the page size values. If the destination hasn't enabled postcopy-ram,
then loadvm_postcopy_handle_advise() leaves the page size values on
the stream and returns. This confuses qemu_loadvm_state() later on
and causes the destination to start execution.

As discussed several times, postcopy-ram should be enabled on both sides
to be functional. This patch changes the destination to perform some
extra checks on the advise length to ensure this is the case. Otherwise
an error is returned and migration is aborted.

Reported-by: Balamuruhan S <address@hidden>
Signed-off-by: Greg Kurz <address@hidden>
Reviewed-by: Daniel Henrique Barboza <address@hidden>
Reviewed-by: Vladimir Sementsov-Ogievskiy <address@hidden>
Reviewed-by: Dr. David Alan Gilbert <address@hidden>
Message-Id: <address@hidden>
Signed-off-by: Dr. David Alan Gilbert <address@hidden>


  Commit: 0833df03f4206a6cf416fbb3d380fa95c8e61fba
      
https://github.com/qemu/qemu/commit/0833df03f4206a6cf416fbb3d380fa95c8e61fba
  Author: Peter Maydell <address@hidden>
  Date:   2018-02-07 (Wed, 07 Feb 2018)

  Changed paths:
    M migration/channel.c
    M migration/channel.h
    M migration/exec.c
    M migration/fd.c
    M migration/migration.c
    M migration/migration.h
    M migration/ram.c
    M migration/rdma.c
    M migration/savevm.c
    M migration/socket.c
    M migration/tls.c
    M migration/trace-events
    M tests/migration-test.c

  Log Message:
  -----------
  Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20180206a' into staging

Migration pull 2018-02-06

This is based off Juan's last pull with a few extras, but
also removing:
   Add migration xbzrle test
   Add migration precopy test

As well as my normal test boxes, I also gave it a test
on a 32 bit ARM box and it seems happy (a Calxeda highbank)
and a big-endian power box.

Dave

# gpg: Signature made Tue 06 Feb 2018 15:33:31 GMT
# gpg:                using RSA key 0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <address@hidden>"
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-migration-20180206a:
  migration: incoming postcopy advise sanity checks
  migration: Don't leak IO channels
  migration: Recover block devices if failure in device state
  tests: Adjust sleeps for migration test
  tests: Create migrate-start-postcopy command
  tests: Add deprecated commands migration test
  tests: Use consistent names for migration
  tests: Consolidate accelerators declaration
  tests: Remove deprecated migration tests commands
  migration: Drop current address parameter from save_zero_page()
  migration: use s->threshold_size inside migration_update_counters
  migration/savevm.c: set MAX_VM_CMD_PACKAGED_SIZE to 1ul << 32
  migration: Route errors down through migration_channel_connect
  migration: Allow migrate_fd_connect to take an Error *

Signed-off-by: Peter Maydell <address@hidden>


Compare: https://github.com/qemu/qemu/compare/bc2943d6caf7...0833df03f420
