Re: [PULL 22/33] migration: remove the QEMUFileOps 'get_buffer' callback
From: Daniel P. Berrangé
Subject: Re: [PULL 22/33] migration: remove the QEMUFileOps 'get_buffer' callback
Date: Tue, 28 Jun 2022 08:40:52 +0100
User-agent: Mutt/2.2.6 (2022-06-05)

On Mon, Jun 27, 2022 at 04:32:00PM -0400, Peter Xu wrote:
> On Mon, Jun 27, 2022 at 04:03:09PM +0100, Daniel P. Berrangé wrote:
> > On Wed, Jun 22, 2022 at 03:34:52PM -0400, Peter Xu wrote:
> > > On Wed, Jun 22, 2022 at 07:39:06PM +0100, Dr. David Alan Gilbert (git) wrote:
> > > > diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> > > > index 74f919de67..e206b05550 100644
> > > > --- a/migration/qemu-file.c
> > > > +++ b/migration/qemu-file.c
> > > > @@ -377,8 +377,22 @@ static ssize_t qemu_fill_buffer(QEMUFile *f)
> > > > return 0;
> > > > }
> > > >
> > > > - len = f->ops->get_buffer(f->ioc, f->buf + pending, f->total_transferred,
> > > > - IO_BUF_SIZE - pending, &local_error);
> > > > + do {
> > > > + len = qio_channel_read(f->ioc,
> > > > + (char *)f->buf + pending,
> > > > + IO_BUF_SIZE - pending,
> > > > + &local_error);
> > > > + if (len == QIO_CHANNEL_ERR_BLOCK) {
> > > > + if (qemu_in_coroutine()) {
> > > > + qio_channel_yield(f->ioc, G_IO_IN);
> > > > + } else {
> > > > + qio_channel_wait(f->ioc, G_IO_IN);
> > > > + }
> > > > + } else if (len < 0) {
> > > > + len = EIO;
> > >
> > > This should be -EIO.
> > >
> > > > + }
> > > > + } while (len == QIO_CHANNEL_ERR_BLOCK);
> > >
> > > It's failing only with the new TLS test I added for postcopy somehow (at
> > > least /x86_64/migration/postcopy/recovery/tls).. I also verified after the
> > > change it'll work again.
> >
> > Assuming you can still reproduce the pre-existing flaw, can you capture
> > a stack trace when it hangs. I'm wondering if it is a sign that the
> > migration is not converging when using TLS under certain load conditions,
> > because the test waits forever for converge.
>
> Yes it is, and it reproduces here every time. It hangs at:
>
> if (!got_stop) {
> qtest_qmp_eventwait(from, "STOP");
> }
>
> >
> > Also what scenario are you running in ? Bare metal or a VM, and what
> > host arch ? Wondering if the machine is at all slow, or for example
> > missing AES hardware acceleration or some such thing.
>
> It's Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz, 40 cores.
>
> It'll pass after I modify the downtime:
>
> migrate_set_parameter_int(from, "downtime-limit", 100000);
>
> And with QTEST_LOG=1 I found that the bw is indeed low, ~700mbps.

Good, this all makes sense, and I've got pending patches I'm testing
that will fix this.

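To spell out why the numbers above are consistent with a hang waiting for STOP: migration only switches to the stop-and-copy phase once the data still pending can be sent within the downtime limit at the measured bandwidth. The following is a rough sketch of that arithmetic, not QEMU's actual code. The helper names and the 64 MiB pending figure are hypothetical; only the ~700 mbps bandwidth and the 100000 ms downtime-limit come from the report above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a QEMU function): estimate how long, in
 * milliseconds, it takes to send remaining_bytes at bandwidth_mbps
 * (megabits per second).
 */
static uint64_t sketch_expected_downtime_ms(uint64_t remaining_bytes,
                                            uint64_t bandwidth_mbps)
{
    assert(bandwidth_mbps > 0);
    /* bits / (Mbit/s * 1e6) is seconds, so bits / (Mbit/s * 1000) is ms */
    return (remaining_bytes * 8) / (bandwidth_mbps * 1000);
}

/*
 * Hypothetical helper: the source can stop the guest only when the
 * estimated transfer time fits inside the configured downtime limit.
 */
static bool sketch_can_converge(uint64_t remaining_bytes,
                                uint64_t bandwidth_mbps,
                                uint64_t downtime_limit_ms)
{
    return sketch_expected_downtime_ms(remaining_bytes, bandwidth_mbps)
           <= downtime_limit_ms;
}
```

With an assumed 64 MiB still pending at 700 mbps the estimate is roughly 766 ms, so a limit of a few hundred milliseconds is never met if the guest keeps dirtying memory at that rate, while a 100000 ms limit is met immediately.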
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
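
For readers following the quoted hunk, here is a minimal standalone sketch of the retry-on-would-block loop with the sign fix Peter pointed out (-EIO rather than EIO). It does not use the real QIOChannel API: SKETCH_ERR_BLOCK, the callback types, and the fake channel are all hypothetical stand-ins so that the loop can be compiled and exercised on its own.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h> /* ssize_t */

/* Stand-in for QIO_CHANNEL_ERR_BLOCK; the value here is an assumption. */
#define SKETCH_ERR_BLOCK (-2)

/* Hypothetical callbacks replacing qio_channel_read() and the wait/yield. */
typedef ssize_t (*sketch_read_fn)(void *opaque, char *buf, size_t len);
typedef void (*sketch_wait_fn)(void *opaque);

/*
 * Read once, retrying while the channel reports "would block".
 * Any other failure is normalised to a negative errno (-EIO), so a
 * caller testing for ret > 0 cannot mistake an error for data.
 */
static ssize_t sketch_fill(sketch_read_fn read_fn, sketch_wait_fn wait_fn,
                           void *opaque, char *buf, size_t len)
{
    ssize_t ret;

    assert(read_fn && wait_fn);
    do {
        ret = read_fn(opaque, buf, len);
        if (ret == SKETCH_ERR_BLOCK) {
            wait_fn(opaque);   /* wait until readable, then retry */
        } else if (ret < 0) {
            ret = -EIO;        /* negative, unlike the original hunk */
        }
    } while (ret == SKETCH_ERR_BLOCK);

    return ret;
}

/* Fake channel: blocks blocks_left times, then returns final_result. */
struct sketch_fake {
    int blocks_left;
    ssize_t final_result;
    int waits;
};

static ssize_t sketch_fake_read(void *opaque, char *buf, size_t len)
{
    struct sketch_fake *f = opaque;

    (void)buf;
    (void)len;
    if (f->blocks_left > 0) {
        f->blocks_left--;
        return SKETCH_ERR_BLOCK;
    }
    return f->final_result;
}

static void sketch_fake_wait(void *opaque)
{
    struct sketch_fake *f = opaque;

    f->waits++;
}
```

With a positive EIO, as in the hunk as posted, the error branch would hand back a small positive number that looks like a short successful read, which is why the sign matters here.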