qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Self-modifying test case for mttcg


From: Andrew Jones
Subject: Re: [Qemu-devel] Self-modifying test case for mttcg
Date: Thu, 23 Jul 2015 12:04:51 +0200
User-agent: Mutt/1.5.23.1 (2014-03-12)

On Thu, Jul 23, 2015 at 01:12:30AM +0200, Alexander Spyridakis wrote:
> Hello Andrew,
> 
> First, thanks for the comments.
> 
> On 22 July 2015 at 14:38, Andrew Jones <address@hidden> wrote:
> > I took a quick look at this and see issues with the test code. First,
> > you're spinning on a stack variable with this,
> >
> >     /* Wait for our turn */
> >     while(next_cpu != cpu);
> >
> > next_cpu needs to be global, and incremented atomically. I haven't gotten
> > around to adding atomic_add/inc yet, but it would easy, and I'm happy to
> > do it, even yet this week.
> 
> I tend to agree with what you are saying, but isn't a static local
> declaration forcing all CPUs to see the same variable (and not their
> instance)? Otherwise the test would hang indefinitely with multiple
> CPUs in this check.
> 
> About atomicity I thought of using a lock, but then I realized that
> there is no way to have a race condition. Let me explain in more
> detail:
> Variable 'next_cpu', as being static will be initialized to 0, making
> all CPUs, except CPU0, to spin. Since only CPU0 will update next_cpu
> (from 0 to 1), while all others are spinning, incrementing it
> atomically shouldn't be mandatory. Of course I might have missed
> something very obvious, so please correct me if I am wrong.

I didn't read the unit test closely enough, looked over the static.
You're right about it. I would actually make the test more smp-ish
though. Rather than force cpus to run one at a time, I would use atomic
ops, allowing the cpus to run in parallel.

> 
> > And, as for the MMU, I see from the comment in your test code that you're
> > hitting an exception when trying to modify code. This is because the code
> > is mapped readonly in order to use it from usermode. I suggest you modify
> > the page tables (see below for how) to map the code writeable. Do this
> > before kicking your secondary cpus, so they'll come up ready.
> 
> Indeed it is as you say. When I was experimenting with the MMU, by
> removing PTE_RDONLY in lib/arm/mmu.c:113 I was able to run the test
> without the need to disable it. At the time, I was uncertain how to do
> it properly without this hack (which was outside the test case), so
> instead I opted to disable the MMU. Your suggested way, to set it
> inside the test case, makes sense, so I will probably do it like that.
> 
> > There are other issues you'll need to fix as well though in the test code;
> > count should be initialized, result should be volatile, others? I suggest
> > you make sure it works for one vcpu first.
> 
> Again, 'count' is declared as static, which initializes it to 0. I
> played around with 'result' being volatile or not, but I didn't see
> any difference. I guess it is better to play it on the safe side,
> since this one is sensitive for the test.

Actually, now that I'm looking closer. You can scratch this suggestion.
As you only use result in an asm where you have an explicit store, it
doesn't need to be declared volatile.

> 
> > Just thought of another issue with the unit test. There's no isb()
> > following the code modification.
> 
> Good catch, I played around with dsb and isb, as well as
> local_flush_tlb_all() at the time, but I wasn't sure about if I am
> paranoid or really needed. Since you also mention this I will test
> more the use of an isb after modifying the opcode.

Right, the lack of isb() is the reason you have a printf in there as a
hack to make things work.

> 
> > I look forward to seeing your fixed up kvm-unit-test test posted. Please
> > CC me on it.
> 
> Thanks again for your input, I will try to update by early next week.
> 
> Best regards.
> 

One more observation; to make this work for aarch64 you'll need to modify
your instruction modification. The add immediate can't be changed with
+-1 like it can for arm. Please make sure the code works for both, even
if you need to do

#ifdef __arm__
asm volatile(arm asm)
#else
asm volatile(aarch64 asm)
#endif

Also, if you need a flip-flopping flag, why not use xor instead of
a counter mod 2?

Thanks,
drew



reply via email to

[Prev in Thread] Current Thread [Next in Thread]