From: Alex Bennée
Subject: Re: [Qemu-devel] Making all TB invalidation asynchronous (MTTCG safe_work)?
Date: Thu, 25 Feb 2016 16:24:24 +0000
User-agent: mu4e 0.9.17; emacs 25.0.91.1

Frederic Konrad <address@hidden> writes:

> Hi Alex,
>
> We decided in Seattle to make this flag per tb (e.g. move it into the tb
> struct).

Indeed, and looking at the state of Emilio's branch I see he did exactly
that and solved a lot of the lock contention problems.

>
> On 24/02/2016 18:30, Alex Bennée wrote:
>> Hi,
>>
>> So I've been working on reducing MTTCG tb_lock contention and currently
>> have a tb_lock around the following code (in my cpu_exec):
>>
>>     /* Note: we do it here to avoid a gcc bug on Mac OS X when
>>        doing it in tb_find_slow */
>>     tb_lock();
>>     if (tcg_ctx.tb_ctx.tb_invalidated_flag) {
>>         /* as some TB could have been invalidated because
>>            of memory exceptions while generating the code, we
>>            must recompute the hash index here */
>>         next_tb = 0;
>>         tcg_ctx.tb_ctx.tb_invalidated_flag = 0;
>>     }
>>     /* see if we can patch the calling TB. When the TB
>>        spans two pages, we cannot safely do a direct
>>        jump. */
>>     if (next_tb != 0 && tb->page_addr[1] == -1
>>         && !qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
>>         tb_add_jump((TranslationBlock *)(next_tb & ~TB_EXIT_MASK),
>>                     next_tb & TB_EXIT_MASK, tb);
>>     }
>>     tb_unlock();
>>
>> And this started me down the rabbit hole of the meaning of
>> tcg_ctx.tb_ctx.tb_invalidated_flag. As far as I can follow there are two
>> places where this is set:
>>
>>  * We've run out of translation memory and we are throwing everything
>>    away (tb_alloc == NULL)
>>  * We've invalidated the physical pages of some TranslationBlocks
>>
>> For the first case there is some slightly convoluted buffer overflow handling
>> code (in tb_gen_code):
>>
>>     if (unlikely(!tb)) {
>>  buffer_overflow:
>>         /* flush must be done */
>>         tb_flush_safe(cpu);
>>         /* Don't forget to invalidate previous TB info.  */
>>         tcg_ctx.tb_ctx.tb_invalidated_flag = 1;
>>         tb_unlock();
>>         cpu_loop_exit(cpu);
>>     }
>>
>> Which I'm sure could be handled more simply by just queuing the safe
>> tb_flush, returning a NULL tb, and letting the execution loop unwind
>> before the translation buffers are reset.
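(To make that concrete, the rough, untested shape I have in mind is below;
the caller of tb_gen_code() would need to cope with a NULL return and simply
drop back out to the main execution loop, which is an assumption on my part:)

    tb = tb_alloc(pc);
    if (unlikely(!tb)) {
        /* queue the flush as safe work; it runs once every vCPU has
         * left its translated block, so neither the buffer_overflow
         * label nor tb_invalidated_flag is needed here */
        tb_flush_safe(cpu);
        tb_unlock();
        return NULL;  /* caller unwinds to the execution loop */
    }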
>>
>> The second case has been partially asynced by Fred:
>>
>>     /* invalidate one TB */
>>     void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
>>     {
>>         CPUState *cpu;
>>         PageDesc *p;
>>         unsigned int h;
>>         tb_page_addr_t phys_pc;
>>         struct CPUDiscardTBParams *params;
>>
>>         assert_tb_lock(); /* added by me because of below */
>>
>>         /* remove the TB from the hash list */
>>         phys_pc = tb->page_addr[0] + (tb->pc & ~TARGET_PAGE_MASK);
>>         h = tb_phys_hash_func(phys_pc);
>>         tb_hash_remove(&tcg_ctx.tb_ctx.tb_phys_hash[h], tb);
>>
>>         /* remove the TB from the page list */
>>         if (tb->page_addr[0] != page_addr) {
>>             p = page_find(tb->page_addr[0] >> TARGET_PAGE_BITS);
>>             tb_page_remove(&p->first_tb, tb);
>>             invalidate_page_bitmap(p);
>>         }
>>         if (tb->page_addr[1] != -1 && tb->page_addr[1] != page_addr) {
>>             p = page_find(tb->page_addr[1] >> TARGET_PAGE_BITS);
>>             tb_page_remove(&p->first_tb, tb);
>>             invalidate_page_bitmap(p);
>>         }
>>
>>         tcg_ctx.tb_ctx.tb_invalidated_flag = 1;
>>
>>         CPU_FOREACH(cpu) {
>>             params = g_malloc(sizeof(struct CPUDiscardTBParams));
>>             params->cpu = cpu;
>>             params->tb = tb;
>>             async_run_on_cpu(cpu, cpu_discard_tb_from_jmp_cache, params);
>>         }
>>         async_run_safe_work_on_cpu(first_cpu, tb_invalidate_jmp_remove, tb);
>>
>>         tcg_ctx.tb_ctx.tb_phys_invalidate_count++;
>>     }
>>
>> But I'm wondering why we can't defer all the page invalidation to safe
>> work?
>>
>> I don't think it matters to the invalidating vCPU as it has to get
>> to the end of its block anyway. For other vCPUs, as there is no strict
>> synchronisation, can we not pretend that whatever operation
>> triggered the invalidation happened just as their block ended?
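(Again, a purely illustrative and untested sketch of the shape I mean;
tb_phys_invalidate_safe_work() is a made-up name and page_addr would have to
travel alongside tb in a params struct much like CPUDiscardTBParams does now:)

    /* hypothetical helper: the current body of tb_phys_invalidate()
     * (hash removal, page list removal, jmp unchaining) moved here so
     * it only ever runs as safe work, with all vCPUs out of the way */
    static void tb_phys_invalidate_safe_work(void *opaque)
    {
        TranslationBlock *tb = opaque;
        /* ... existing hash/page/jmp clean-up for tb ... */
    }

    void tb_phys_invalidate(TranslationBlock *tb, tb_page_addr_t page_addr)
    {
        /* defer everything; page_addr would need wrapping up with tb */
        async_run_safe_work_on_cpu(first_cpu, tb_phys_invalidate_safe_work, tb);
    }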
>>
>> The final case I don't quite follow is the avoidance of invalidating
>> orig_tb in cpu_exec_nocache() if we have already had a tb
>> invalidation event:
>>
>>     tb->orig_tb = tcg_ctx.tb_ctx.tb_invalidated_flag ? NULL : orig_tb;
>>
>> Which is then used later (in cpu_io_recompile):
>>
>>     if (tb->cflags & CF_NOCACHE) {
>>         if (tb->orig_tb) {
>>             /* Invalidate original TB if this TB was generated in
>>              * cpu_exec_nocache() */
>>             tb_phys_invalidate(tb->orig_tb, -1);
>>         }
>>         tb_free(tb);
>>     }
>>
>> My aim in all of this is to see if we can remove another flag from
>> tb_ctx (one less thing to guard with a mutex) and make the code flow easier
>> to follow. So the remaining question is:
>>
>> * Are there cases where not immediately invalidating the tb_page
>>   structures would cause problems for the emulation?
>
> Is that the same issue we might have with the memory barriers?
>
> Fred
>>
>> Thanks in advance for any elucidation ;-)
>>
>> --
>> Alex Bennée
>>


--
Alex Bennée


