
Re: [Qemu-devel] MTTCG Tasks (kvmforum summary)


From: dovgaluk
Subject: Re: [Qemu-devel] MTTCG Tasks (kvmforum summary)
Date: Fri, 04 Sep 2015 16:10:05 +0300
User-agent: Roundcube Webmail/1.1.2

Lluís Vilanova wrote, 2015-09-04 16:00:
Mark Burton writes:
[...]
* What to do about icount?

What is the impact of multi-thread on icount? Do we need to disable it for MTTCG or can it be correct per-cpu? Can it be updated lock-step?

We need some input from the guys that use icount the most.

That means Edgar. :)

Hi!

IMO it would be nice if we could run the cores in some kind of lock-step
with a configurable amount of instructions that they can run ahead
of time (X).

For example, if X is 10000, every thread/core would checkpoint at
10000 insn boundaries and wait for other cores. Between these
checkpoints, the cores will not be in sync. We might need to
consider synchronizing at I/O accesses as well, to avoid weird
timing issues when reading counter registers, for example.

Of course the devil will be in the details but an approach roughly
like that sounds useful to me.
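The scheme above could be sketched roughly like this (a hedged illustration only; QUANTUM plays the role of X, and VCPU/run_lockstep are invented names, not QEMU APIs): each vCPU thread runs up to one quantum of instructions, then blocks at a barrier until every core has reached the same checkpoint.

```python
import threading

# Sketch of lockstep execution with a configurable quantum (X above).
# Invented names for illustration -- this is not QEMU code.
QUANTUM = 10000

class VCPU(threading.Thread):
    def __init__(self, cpu_id, barrier, total_insns):
        super().__init__()
        self.cpu_id = cpu_id
        self.barrier = barrier
        self.total = total_insns
        self.icount = 0          # instructions retired so far
        self.checkpoints = []    # icount values at each checkpoint

    def run(self):
        while self.icount < self.total:
            # "Execute" up to one quantum of guest instructions (stubbed).
            step = min(QUANTUM, self.total - self.icount)
            self.icount += step
            self.checkpoints.append(self.icount)
            # Checkpoint: wait until every other core reaches this boundary.
            self.barrier.wait()

def run_lockstep(n_cpus, total_insns):
    barrier = threading.Barrier(n_cpus)
    cpus = [VCPU(i, barrier, total_insns) for i in range(n_cpus)]
    for c in cpus:
        c.start()
    for c in cpus:
        c.join()
    return cpus
```

Between two barrier waits the cores are free to drift relative to each other by up to one quantum; only the checkpoints are globally ordered.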

And it "works" in other domains.
Theoretically we don't need to sync at IO (dynamic quanta); for most systems that have 'normal' IO it's normally less efficient, I believe. However, the trouble is that the user typically doesn't know, and mucking about with quantum lengths, dynamic quantum switches etc. is probably a royal pain in the butt. And if you don't set your quantum right, the thing will run really slowly (or will
break)…

The choices are a rock or a hard place. Dynamic quanta risk being slow (you'll be forcing an expensive 'sync', all CPUs will have to exit, etc.) on each IO access from each core… not great. Syncing with host time (e.g. each CPU tries to sync with the host clock as best it can) will fail when one or another CPU can't keep up… In the end you end up leaving the user with a nice long bit of
string and a message saying "hang yourself here".

That price would not be paid when icount is disabled. Well, the code-complexity
price is always paid... I meant the runtime price :)

Then, I think this depends on what type of guarantees you require from
icount. I see two possible semantics:

* All CPUs are *exactly* synchronized at icount granularity

  This means that every icount instructions everyone has to stop and
  synchronize.

* All CPUs are *loosely* synchronized at icount granularity

  You can implement it in a way that ensures that every CPU has *at least*
  reached a certain timestamp. So CPUs can keep on running nonetheless.


Does this third possibility look sane?

* All CPUs synchronize at shared memory operations.

  When somebody tries to read or write shared memory, it should wait until all
  the others reach the same icount.
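A self-contained sketch of that option (SharedMemSync and its methods are hypothetical names for illustration, not QEMU code): tick() is called as a CPU retires instructions, and shared_access() blocks until every CPU has reached at least the caller's current icount.

```python
import threading

# Hypothetical helper: synchronize CPUs only at shared-memory accesses.
class SharedMemSync:
    def __init__(self, n_cpus):
        self.cond = threading.Condition()
        self.icounts = [0] * n_cpus

    def tick(self, cpu_id, insns):
        # Called as a CPU retires instructions.
        with self.cond:
            self.icounts[cpu_id] += insns
            self.cond.notify_all()  # a waiter may now be able to proceed

    def shared_access(self, cpu_id):
        # Block until every CPU has reached (at least) our icount.
        with self.cond:
            target = self.icounts[cpu_id]
            while min(self.icounts) < target:
                self.cond.wait()
```

CPUs that never touch shared memory never pay any synchronization cost here; the price is concentrated on the accesses that can actually be observed cross-core.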

The downside is that the latter loses the ability to make reproducible runs, which IMHO are useful. A more complex option is to merge both: icount sets the "synchronization granularity" and another parameter sets the maximum delta between CPUs (i.e., set it to 0 to get the first option, and infinity for the
second).
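The merged option could be sketched roughly like this (again with invented names, not QEMU APIs): max_delta bounds how far any CPU may run ahead of the slowest one, so max_delta = 0 degenerates to strict lockstep at every advance, while a very large max_delta approximates free running.

```python
import threading

# Hypothetical sketch: bound the icount gap between the fastest and
# slowest CPU by max_delta. Not QEMU code.
class IcountSync:
    def __init__(self, n_cpus, max_delta):
        self.cond = threading.Condition()
        self.icounts = [0] * n_cpus
        self.max_delta = max_delta

    def advance(self, cpu_id, insns):
        with self.cond:
            self.icounts[cpu_id] += insns
            # Block while we are more than max_delta ahead of the slowest CPU.
            while self.icounts[cpu_id] - min(self.icounts) > self.max_delta:
                self.cond.wait()
            self.cond.notify_all()  # our progress may unblock others
```

Note the slowest CPU never blocks (its delta to the minimum is zero), so there is always at least one CPU making progress and waking the others.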


Pavel Dovgalyuk


