qemu-devel

From: Richard Henderson
Subject: Re: [Qemu-devel] [PATCH v3 0/6] trace: [tcg] Optimize per-vCPU tracing states with separate TB caches
Date: Fri, 23 Dec 2016 12:09:24 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1

On 12/23/2016 10:51 AM, Lluís Vilanova wrote:
>> On 12/22/2016 10:35 AM, Lluís Vilanova wrote:
>>> To handle both issues, this series replicates the shared physical TB cache,
>>> creating a separate physical TB cache for every combination of event states
>>> (those with the 'vcpu' and 'tcg' properties). Then, all vCPUs tracing the
>>> same events will use the same physical TB cache.
> 
>> Why do we need to "split the physical TB cache" as opposed to simply
>> including the trace state into the TB hash function?
> 
> Mmmm, that's an interesting alternative I did not consider. Are you aiming at
> minimizing the changes, or do you also think it would be more efficient?

I suspect that it will be more efficient.

> The dynamic tracing state would then be an arbitrarily long bitmap (defined by
> the number of events with the 'vcpu' property), so I'm not sure how to fit it
> into the hashing function with minimal collisions (the bitmap is now limited
> to an unsigned long to use it as an index to the TB cache "matrix").

You could consider that index a unique identifier for the tracing state, and
then only compare and hash that integer.
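
A minimal sketch of that idea, for illustration only; it is not the attached
patch and not QEMU's actual TB hash. The struct tb_key, its field names, and
the helpers mix64(), tb_key_hash() and tb_key_equal() are all hypothetical.
The tracing state is assumed to be already reduced to one small integer
(trace_id), which is simply folded into the hash and compared like any other
key field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lookup key; field names are illustrative, not QEMU's. */
struct tb_key {
    uint64_t phys_pc;   /* guest physical address of the block */
    uint64_t pc;        /* guest virtual address of the block */
    uint32_t flags;     /* CPU state flags that affect translation */
    uint32_t trace_id;  /* assumed: small integer naming one tracing state */
};

/* Generic 64-bit mixing step (constants from MurmurHash3's fmix64). */
static inline uint64_t mix64(uint64_t x)
{
    x ^= x >> 33;
    x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33;
    x *= 0xc4ceb9fe1a85ec53ULL;
    x ^= x >> 33;
    return x;
}

/* Hash all key fields; the tracing state costs one extra word of input. */
static inline uint32_t tb_key_hash(const struct tb_key *k)
{
    uint64_t h = mix64(k->phys_pc);
    h = mix64(h ^ k->pc);
    h = mix64(h ^ (((uint64_t)k->flags << 32) | k->trace_id));
    return (uint32_t)(h ^ (h >> 32));
}

/* Two keys match only if the tracing state matches as well. */
static inline bool tb_key_equal(const struct tb_key *a, const struct tb_key *b)
{
    return a->phys_pc == b->phys_pc && a->pc == b->pc &&
           a->flags == b->flags && a->trace_id == b->trace_id;
}
```

With a key like this, a single hash table can hold the blocks for every
tracing state: two vCPUs with the same trace_id find each other's blocks, and
a vCPU with a different trace_id fails the comparison and translates its own.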

> The other drawback I see is that then it would also take longer to compute the
> hashing function, instead of the simpler array indexing. As a benefit,
> workloads with a high frequency of TB-flushing operations might be a bit
> faster (there would be a single QHT).

I don't see adding one more integer to the hashing function as significant
at all, and certainly not the 15% that you describe in your cover letter.

> If someone can provide me the code for the modified hash lookup function to
> account for the trace dstate bitmap contents, I will integrate it and measure
> if there is any important change in performance.

Something like the following should do it.  There are two /* cpu->??? */
markers that would need to be filled in.

If you can reduce the tracing identifier to 8 bits, that would be excellent.
I've been wanting to make some other changes to TB hashing, and that would fit
in well with a second "flags" value.
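
To illustrate the 8-bit idea (again a sketch, not the attached patch): the
layout below, with the tracing identifier in the low 8 bits of a 32-bit
second flags word and the remaining 24 bits left for other state, is purely
an assumption, as are the names flags2_pack(), flags2_trace_id() and
flags2_other().

```c
#include <assert.h>
#include <stdint.h>

#define TRACE_ID_BITS  8
#define TRACE_ID_MASK  ((1u << TRACE_ID_BITS) - 1)

/* Assumed layout: bits 0-7 hold the trace id, bits 8-31 hold other flags. */
static inline uint32_t flags2_pack(uint32_t other_flags, uint32_t trace_id)
{
    assert(trace_id <= TRACE_ID_MASK);   /* must fit in 8 bits */
    assert(other_flags <= 0xffffffu);    /* must fit in the remaining 24 */
    return (other_flags << TRACE_ID_BITS) | trace_id;
}

/* Extract the tracing identifier from the packed word. */
static inline uint32_t flags2_trace_id(uint32_t flags2)
{
    return flags2 & TRACE_ID_MASK;
}

/* Extract the non-tracing flags from the packed word. */
static inline uint32_t flags2_other(uint32_t flags2)
{
    return flags2 >> TRACE_ID_BITS;
}
```

Packed this way, the tracing state adds no extra argument to the hash or the
comparison beyond the second flags word itself.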


r~

Attachment: z
Description: Text document