qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 12/16] translate-all: discard TB when tb_link_pa


From: Emilio G. Cota
Subject: Re: [Qemu-devel] [PATCH 12/16] translate-all: discard TB when tb_link_page returns an existing matching TB
Date: Thu, 5 Apr 2018 21:23:38 -0400
User-agent: Mutt/1.5.24 (2015-08-30)

On Thu, Mar 29, 2018 at 16:19:39 +0100, Alex Bennée wrote:
> 
> Emilio G. Cota <address@hidden> writes:
> 
> > Use the recently-gained QHT feature of returning the matching TB if it
> > already exists. This allows us to get rid of the lookup we perform
> > right after acquiring tb_lock.
> >
> > Suggested-by: Richard Henderson <address@hidden>
> > Signed-off-by: Emilio G. Cota <address@hidden>
> > ---
> >  accel/tcg/cpu-exec.c      | 14 ++------------
> >  accel/tcg/translate-all.c | 47 
> > ++++++++++++++++++++++++++++++++++++++---------
> >  2 files changed, 40 insertions(+), 21 deletions(-)
> >
> > diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
> > index 7c83887..8aed38c 100644
> > --- a/accel/tcg/cpu-exec.c
> > +++ b/accel/tcg/cpu-exec.c
> > @@ -243,10 +243,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
> >          if (tb == NULL) {
> >              mmap_lock();
> >              tb_lock();
> > -            tb = tb_htable_lookup(cpu, pc, cs_base, flags, cf_mask);
> > -            if (likely(tb == NULL)) {
> > -                tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
> > -            }
> > +            tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
> 
> tb_gen_code needs to be renamed to reflect it's semantics.
> tb_get_or_gen_code? Or maybe tb_get_code with a sub-helper to do the
> generation.

I think it can remain as tb_gen_code. The caller still gets
a TB, and whether that TB has been generated by this thread or
any other thread is irrelevant.

(snip)
> > --- a/accel/tcg/translate-all.c
> > +++ b/accel/tcg/translate-all.c
> > @@ -1503,12 +1503,16 @@ static inline void tb_page_add(PageDesc *p, 
> > TranslationBlock *tb,
> >   * (-1) to indicate that only one page contains the TB.
> >   *
> >   * Called with mmap_lock held for user-mode emulation.
> > + *
> > + * Returns @tb or an existing TB that matches @tb.
> 
> That's just confusing to read. So this returns a TB like the @tb we
> passed in but actually a different one matching the same conditions?

Good point. Here tb_link_page is not a great name, but instead
of adding a long name such as tb_link_page_or_get_existing, in
v2 I've expanded the above comment. It now looks as follows:

 * Returns a pointer @tb, or a pointer to an existing TB that matches @tb.
 * Note that in !user-mode, another thread might have already added a TB
 * for the same block of guest code that @tb corresponds to. In that case,
 * the caller should discard the original @tb, and use instead the returned TB.
 
> > @@ -1706,7 +1727,15 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
> >       * memory barrier is required before tb_link_page() makes the TB 
> > visible
> >       * through the physical hash table and physical page list.
> >       */
> > -    tb_link_page(tb, phys_pc, phys_page2);
> > +    existing_tb = tb_link_page(tb, phys_pc, phys_page2);
> > +    /* if the TB already exists, discard what we just translated */
> 
> So are we in the position now that we could potentially do a translation
> but be beaten by another thread generating the same code?

Exactly.

> I suspect we could
> do with a bit of explanatory commentary for the tb_gen_code functions.

As I said above I don't think tb_gen_code changes at all
to its callers, since the caller still gets a TB pointer that it
did not have before.

tb_link_page is the key here -- I hope the updated comment
I quoted above is enough to make things clear.

> Also I think the "Translation Blocks" section needs updating in the
> MTTCG design document to make this clear.

I've added a comment at the bottom of that section:

Parallel code generation is supported. QHT is used at insertion time
as the synchronization point across threads, thereby ensuring that we only
keep track of a single TranslationBlock for each guest code block.

> I'm curious if we should be counting unused translations somewhere in
> the JIT stats. I'm guessing you need to work at a pathalogical case to
> hit this much?

This should be extremely rare on most workloads. Given that and the
fact that we won't have unused translated code (we discard it by
resetting code_gen_ptr), I wouldn't worry too much about this.
In the unlikely case that it ever became a problem, TCG profiling
time would account for it, and on a perf profile we'd see the slow
path in tb_link_page being taken.

Thanks,

                Emilio



reply via email to

[Prev in Thread] Current Thread [Next in Thread]