From: Peter Xu
Subject: Re: [Qemu-devel] [RFC 29/29] migration: reset migrate thread vars when resumed
Date: Mon, 7 Aug 2017 14:57:41 +0800
User-agent: Mutt/1.5.24 (2015-08-30)

On Fri, Aug 04, 2017 at 10:52:27AM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (address@hidden) wrote:
> > On Thu, Aug 03, 2017 at 02:54:35PM +0100, Dr. David Alan Gilbert wrote:
[...]
> > > > @@ -2319,6 +2327,7 @@ static void *migration_thread(void *opaque)
> > > > /* The active state we expect to be in; ACTIVE or POSTCOPY_ACTIVE */
> > > > enum MigrationStatus current_active_state = MIGRATION_STATUS_ACTIVE;
> > > > bool enable_colo = migrate_colo_enabled();
> > > > + MigThrError thr_error;
> > > >
> > > > rcu_register_thread();
> > > >
> > > > @@ -2395,8 +2404,17 @@ static void *migration_thread(void *opaque)
> > > > * Try to detect any kind of failures, and see whether we
> > > > * should stop the migration now.
> > > > */
> > > > - if (migration_detect_error(s)) {
> > > > + thr_error = migration_detect_error(s);
> > > > + if (thr_error == MIG_THR_ERR_FATAL) {
> > > > + /* Stop migration */
> > > > break;
> > > > + } else if (thr_error == MIG_THR_ERR_RECOVERED) {
> > > > + /*
> > > > + * Just recovered from a e.g. network failure, reset all
> > > > + * the local variables.
> > > > + */
> > > > + initial_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> > > > + initial_bytes = 0;
> > >
> > > They don't seem that important to reset?
> >
> > The problem is that we have this in migration_thread():
> >
> > if (current_time >= initial_time + BUFFER_DELAY) {
> > uint64_t transferred_bytes = qemu_ftell(s->to_dst_file) -
> > initial_bytes;
> > uint64_t time_spent = current_time - initial_time;
> > double bandwidth = (double)transferred_bytes / time_spent;
> > threshold_size = bandwidth * s->parameters.downtime_limit;
> > ...
> > }
> >
> > Here qemu_ftell() would possibly be very small since we have just
> > resumed... and then transferred_bytes (which is unsigned) will be
> > extremely huge, since "qemu_ftell(s->to_dst_file) - initial_bytes"
> > is actually negative and wraps around...
> > Then, with luck, we'll get an extremely huge "bandwidth" as well.
>
> Ah yes that's a good reason to reset it then; add a comment like
> 'important to avoid breaking transferred_bytes and bandwidth
> calculation'
Will do.
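
For the record, a minimal standalone sketch of the wrap-around described
above. It is not QEMU code: estimate_bandwidth() is a hypothetical helper
that only mirrors the arithmetic quoted from migration_thread(), and the
numbers in the note below are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): same arithmetic as the
 * bandwidth estimate quoted above, with the file position passed in
 * instead of being read via qemu_ftell().
 */
static double estimate_bandwidth(uint64_t file_pos, uint64_t initial_bytes,
                                 uint64_t time_spent_ms)
{
    assert(time_spent_ms > 0);

    /*
     * Unsigned subtraction: if file_pos < initial_bytes (e.g. the stream
     * position restarted after a resume while initial_bytes is stale),
     * the result wraps modulo 2^64 to a huge value instead of going
     * negative.
     */
    uint64_t transferred_bytes = file_pos - initial_bytes;

    return (double)transferred_bytes / time_spent_ms;
}
```

With a stale initial_bytes of 1 GiB and a fresh position of 4096 bytes
over 100 ms, the helper returns something above 1e17 bytes/ms; with
initial_bytes reset to 0 it returns 40.96 bytes/ms, which is why the
reset matters for threshold_size.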
--
Peter Xu