
From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [PATCH v8 32/54] Postcopy: Maintain sentmap and calculate discard
Date: Fri, 30 Oct 2015 18:43:00 +0000
User-agent: Mutt/1.5.24 (2015-08-30)

* Juan Quintela (address@hidden) wrote:
> "Dr. David Alan Gilbert (git)" <address@hidden> wrote:
> > From: "Dr. David Alan Gilbert" <address@hidden>
> >
> > Where postcopy is preceded by a period of precopy, the destination will
> > have received pages that may have been dirtied on the source after the
> > page was sent.  The destination must throw these pages away before
> > starting its CPUs.
> >
> > Maintain a 'sentmap' of pages that have already been sent.
> > Calculate list of sent & dirty pages
> > Provide helpers on the destination side to discard these.
> >
> > Signed-off-by: Dr. David Alan Gilbert <address@hidden>
> > Reviewed-by: Amit Shah <address@hidden>
> 
> 
> Hi

(I'm going to reply to this mail in a few separate mails as I get
to them)

> >      /* Flag set once the migration has been asked to enter postcopy */
> >      bool start_postcopy;
> 
> 
> This is from a previous patch, but ....
> 
> Change the name of the variable or the comment?  From the comment it
> should be "in_postcopy", no?

We have to be careful to differentiate between two separate things:
  1) The user has issued 'migrate_start_postcopy'
     - that sets this 'start_postcopy' flag

  2) The non-postcopiable data has dropped below the limit and we've
     now been able to take notice of 'start_postcopy' and actually
     enter postcopy.

  I think 'in_postcopy' would imply (2); while 'start_postcopy'
  matches the command that's been issued.

> > +struct PostcopyDiscardState {
> > +    const char *name;
> 
> It is not obvious to me what "name" means here.  I assume it's the ram
> block name; change it to block_name or ramblock?

Now ramblock_name.

> 
> > + * returns: 0 on success.
> > + */
> > +int postcopy_ram_discard_range(MigrationIncomingState *mis, uint8_t *start,
> > +                               size_t length)
> > +{
> > +    trace_postcopy_ram_discard_range(start, length);
> > +    if (madvise(start, length, MADV_DONTNEED)) {
> > +        error_report("%s MADV_DONTNEED: %s", __func__, strerror(errno));
> > +        return -1;
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> >  #else
> >  /* No target OS support, stubs just fail */
> >  bool postcopy_ram_supported_by_host(void)
> > @@ -153,5 +192,95 @@ bool postcopy_ram_supported_by_host(void)
> >      return false;
> >  }
> >  
> > +int postcopy_ram_discard_range(MigrationIncomingState *mis, uint8_t *start,
> > +                               size_t length)
> > +{
> > +    assert(0);
> 
> I will assume that just returning -1 would work here.
> 
> But yes, I understand that this code shouldn't be reached ...

Yes, it really shouldn't happen if the previous code that says
postcopy isn't supported has been obeyed; I'm happy to change
it if you want.

> > +}
> >  #endif
> >  
> > +/* ------------------------------------------------------------------------- */
> > +
> > +/**
> > + * postcopy_discard_send_init: Called at the start of each RAMBlock before
> > + *   asking to discard individual ranges.
> > + *
> > + * @ms: The current migration state.
> > + * @offset: the bitmap offset of the named RAMBlock in the migration
> > + *   bitmap.
> > + * @name: RAMBlock that discards will operate on.
> > + *
> > + * returns: a new PDS.
> > + */
> > +PostcopyDiscardState *postcopy_discard_send_init(MigrationState *ms,
> > +                                                 unsigned long offset,
> > +                                                 const char *name)
> > +{
> > +    PostcopyDiscardState *res = g_try_malloc(sizeof(PostcopyDiscardState));
> 
> Why are we using g_try_malloc() here instead of g_malloc()?  Or even
> g_malloc0()?
>
> Especially since we don't check whether res is NULL on return.  Please change.

Eek yes; I've gone with malloc0.

> 
> 
> > +
> > +    if (res) {
> > +        res->name = name;
> > +        res->cur_entry = 0;
> > +        res->nsentwords = 0;
> > +        res->nsentcmds = 0;
> 
> With the zero variant, these three can be removed.

Done.

> 
> > +        res->offset = offset;
> > +    }
> > +
> > +    return res;
> > +}
> 
> > -/* Called with rcu_read_lock() to protect migration_bitmap */
> > +/* Called with rcu_read_lock() to protect migration_bitmap
> > + * mr: The region to search for dirty pages in
> 
> Haha, you forgot to update the comment when you moved the function to
> use ram blocks O:-)

Oops, fixed :-)

(Rest of the patch another time)

Dave

> > @@ -662,6 +672,24 @@ static int save_zero_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset,
> >  }
> >  
> >  /**
> > + * ram_find_block_by_id: Find a ramblock by name.
> > + *
> > + * Returns: The RAMBlock with matching ID, or NULL.
> > + */
> > +static RAMBlock *ram_find_block_by_id(const char *id)
> > +{
> > +    RAMBlock *block;
> > +
> > +    QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> > +        if (!strcmp(id, block->idstr)) {
> > +            return block;
> > +        }
> > +    }
> > +
> > +    return NULL;
> > +}
> 
> We didn't have this function already...
>
> Once here, could we split it into its own patch and use it in ram_load?
> 
> 
>                 QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
>                     if (!strncmp(id, block->idstr, sizeof(id))) {
>                         if (length != block->used_length) {
>                             Error *local_err = NULL;
> 
>                             ret = qemu_ram_resize(block->offset, length, &local_err);
>                             if (local_err) {
>                                 error_report_err(local_err);
>                             }
>                         }
>                         ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG,
>                                               block->idstr);
>                         break;
>                     }
>                 }
> 
>                 if (!block) {
>                     error_report("Unknown ramblock \"%s\", cannot "
>                                  "accept migration", id);
>                     ret = -EINVAL;
>                 }
> 
> 
> We could also use it in:
> 
> host_from_stream_offset
> 
> 
> > +/* **** functions for postcopy ***** */
> > +
> > +/*
> > + * Callback from postcopy_each_ram_send_discard for each RAMBlock
> > + * start,end: Indexes into the bitmap for the first and last bit
> > + *            representing the named block
> > + */
> > +static int postcopy_send_discard_bm_ram(MigrationState *ms,
> > +                                        PostcopyDiscardState *pds,
> > +                                        unsigned long start, unsigned long end)
> > +{
> > +    unsigned long current;
> > +
> > +    for (current = start; current <= end; ) {
> > +        unsigned long set = find_next_bit(ms->sentmap, end + 1, current);
> > +
> > +        if (set <= end) {
> > +            unsigned long zero = find_next_zero_bit(ms->sentmap,
> > +                                                    end + 1, set + 1);
> > +
> > +            if (zero > end) {
> > +                zero = end + 1;
> > +            }
> > +            postcopy_discard_send_range(ms, pds, set, zero - 1);
> > +            current = zero + 1;
> > +        } else {
> > +            current = set;
> > +        }
> > +    }
> 
> I think I understood the logic here in the end....
> 
> But if we change the meaning of postcopy_discard_send_range() from
> (begin, end), to (begin, length), I think everything goes clearer, no?
> 
>         if (set <= end) {
>             unsigned long zero = find_next_zero_bit(ms->sentmap,
>                                                     end + 1, set + 1);
>             unsigned long length;
> 
>             if (zero > end) {
>                 length = end + 1 - set;
>                 current = end + 1;
>             } else {
>                 length = zero - set;
>                 current = zero + 1;
>             }
>             postcopy_discard_send_range(ms, pds, set, length);
>         } else {
>             current = set;
>         }
>     }
> 
> I would claim that if we call one "zero", the other should be called
> "one".  Or change to set/unset, but that is just me.  Yes, I haven't
> tested, and it is possible that there is an off-by-one somewhere...
> 
> 
> Looking at postcopy_each_ram_send_discard, I even think that it would be
> a good idea to pass length to this function.
> 
> > +/*
> > + * Transmit the set of pages to be discarded after precopy to the target
> > + * these are pages that:
> > + *     a) Have been previously transmitted but are now dirty again
> > + *     b) Pages that have never been transmitted, this ensures that
> > + *        any pages on the destination that have been mapped by background
> > + *        tasks get discarded (transparent huge pages is the specific concern)
> > + * Hopefully this is pretty sparse
> > + */
> > +int ram_postcopy_send_discard_bitmap(MigrationState *ms)
> > +{
> > +    int ret;
> > +
> > +    rcu_read_lock();
> > +
> > +    /* This should be our last sync, the src is now paused */
> > +    migration_bitmap_sync();
> > +
> > +    /*
> > +     * Update the sentmap to be sentmap = ~sentmap | dirty
> > +     */
> > +    bitmap_complement(ms->sentmap, ms->sentmap,
> > +               last_ram_offset() >> TARGET_PAGE_BITS);
> > +
> > +    bitmap_or(ms->sentmap, ms->sentmap, migration_bitmap,
> > +               last_ram_offset() >> TARGET_PAGE_BITS);
> 
> These bitmaps are really big; I don't know how long these operations
> would take here, but I think that we can at least avoid the
> bitmap_complement.  We can rename the bitmap to notsentbitmap, init
> it to all ones, and clear a bit each time we send a page, no?
> 
> We can also do the bitmap_or() at migration_bitmap_sync() time; at that
> point we shouldn't be on the critical path, right?
> 
> Later, Juan.
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


