Re: [Qemu-devel] qemu vs gcc4


From: Rob Landley
Subject: Re: [Qemu-devel] qemu vs gcc4
Date: Tue, 31 Oct 2006 19:00:15 -0500
User-agent: KMail/1.9.1

On Tuesday 31 October 2006 5:08 pm, Paul Brook wrote:
> On Tuesday 31 October 2006 20:41, Rob Landley wrote:
> > Welcome to Stupid Question Theatre!  With your host, Paul Brook.  Today's
> > contestant is: Rob Landley.  How dumb will it get?

Bonus round!

> > I thought what you were doing was replacing the pregenerated blocks with
> > hand-coded assembly statements, but your description here seems to be about
> > changing the disassembly routines that figure out which qops to string
> > together in part 2.
> 
> Replacing the pregenerated blocks with hand written assembly isn't feasible. 
> Each target has its own set of ops, and each host would need its own
> assembly implementation of those ops. Multiply 11 targets by 11 hosts and
> you get an unmaintainable mess :-)

Actually it sounds additive rather than multiplicative.  Does each target have 
an entirely unrelated set of ops, or is there a shared set of primitive ops 
plus some oddballs?

But backing up and just accepting that for a moment, in theory what you need 
is some way to compile a C function to machine code, and then unwrap that 
function into a .raw file containing just the machine code.  So the only 
per-compiler thing would be this unwrapper thingy.  But I already know that 
doesn't work because it doesn't explain the "unable to find spill register" 
problem.  Presumably, just beating the right .raw contents out of the 
compiler is nontrivial, let alone unwrapping it...

> It corresponds to "T0" in dyngen. In addition to the actual CPU state, dyngen
> uses 3 fixed registers as scratch workspace. For qop purposes these are part
> of the guest CPU state. They're only there to aid conversion of the
> translation code; they'll go away eventually.

Presumably the m68k target is pure qop, and hasn't got this sort of thing?

> > My brain hurts a lot now.  I'm just letting you know.  What is all this
> > complication actually trying to accomplish?
> 
> Generation of 3 different things (QREG_* constants, the target_reginfo 
> structure, and qreg_names) from a single source. This avoids having to keep 3 
> big hairy arrays in sync with each other.
> It's also used to implement 64-bit qregs as a pair of 32-bit qregs on 32-bit 
> hosts.

Ok, the QREG_* constants are for the intermediate code the decompiler stuff 
generates.  I have no idea what target_reginfo and qreg_names are for, but 
maybe it'll come to me as I read the code...
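
The "three things from a single source" trick is commonly done in C with an
X-macro list.  A minimal sketch of that technique follows; every DEMO_* name,
the register list and the modes are made up for illustration and are not the
definitions in the actual patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical register list: names and modes are illustrative only,
   not QEMU's actual definitions. */
enum demo_mode { DEMO_MODE_I32, DEMO_MODE_F64 };

#define DEMO_QREG_LIST(X)      \
    X(T0,  DEMO_MODE_I32)      \
    X(T1,  DEMO_MODE_I32)      \
    X(FP0, DEMO_MODE_F64)

/* 1) Constants: DEMO_QREG_T0, DEMO_QREG_T1, DEMO_QREG_FP0. */
#define DEMO_AS_ENUM(name, mode) DEMO_QREG_##name,
enum demo_qreg { DEMO_QREG_LIST(DEMO_AS_ENUM) DEMO_QREG_COUNT };

/* 2) Per-register info table, indexed by the constants above. */
struct demo_reginfo { enum demo_mode mode; };
#define DEMO_AS_INFO(name, mode) { mode },
static const struct demo_reginfo demo_reginfo[] = {
    DEMO_QREG_LIST(DEMO_AS_INFO)
};

/* 3) Printable names, generated in the same order. */
#define DEMO_AS_NAME(name, mode) #name,
static const char *const demo_qreg_names[] = {
    DEMO_QREG_LIST(DEMO_AS_NAME)
};

/* Look a register up by name; returns DEMO_QREG_COUNT if not found. */
static int demo_find_qreg(const char *name)
{
    int i;

    assert(name != NULL);
    for (i = 0; i < DEMO_QREG_COUNT; i++) {
        if (strcmp(demo_qreg_names[i], name) == 0) {
            return i;
        }
    }
    return DEMO_QREG_COUNT;
}
```

Adding a register means touching one line of the list, and the constants,
the info table and the name table cannot drift out of sync.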

> > Or the value currently in a qreg has a type associated with it, but 
> > the next value stored in that qreg may have a different type?
> 
> A qreg has a fixed type. The value stored in that qreg has that type. To 
> convert it to a different type you need to use an explicit conversion qop.

So values don't have types, the qregs the values are _in_ have types.  But I 
thought there were an unlimited number of them (well, 1024 or so), and 
they're dynamically allocated (at least some of the time).  How does it keep 
track of the type of a given qreg?  (When you convert, you copy values from 
one qreg into another?)
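
One plausible way to keep track of it, sketched here purely as a guess and not
checked against the patch: record the mode in a table when the qreg is
allocated, and have a conversion hand back a different qreg of the wanted
mode.  All demo_* names and the 1024 limit are assumptions for illustration.

```c
#include <assert.h>

/* All names here are hypothetical; this is not QEMU's data structure. */
enum demo_qmode { DEMO_QMODE_I32, DEMO_QMODE_I64, DEMO_QMODE_F32 };

#define DEMO_MAX_QREGS 1024

struct demo_qreg_pool {
    enum demo_qmode mode[DEMO_MAX_QREGS]; /* fixed type of each qreg */
    int used;                             /* number allocated so far */
};

/* Allocate a fresh qreg; its mode is recorded once and never changes. */
static int demo_new_qreg(struct demo_qreg_pool *pool, enum demo_qmode mode)
{
    assert(pool->used < DEMO_MAX_QREGS);
    pool->mode[pool->used] = mode;
    return pool->used++;
}

/* "Conversion" allocates a destination qreg of the wanted mode.  A real
   translator would also emit a conversion op from src to the result;
   that part is left out of this sketch. */
static int demo_convert(struct demo_qreg_pool *pool, int src,
                        enum demo_qmode to)
{
    assert(src >= 0 && src < pool->used);
    if (pool->mode[src] == to) {
        return src;               /* already the right type */
    }
    return demo_new_qreg(pool, to);
}
```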

> > Possible translation: you can feed a qreg containing an I64 value to a qop
> > taking an I32 argument, and it'll typecast the sucker down intelligently,
> > but if you produce an I32 result and expect to use that qreg's value as an
> > I64 argument later, you have to call a sign-extending qop on it first?
> 
> Exactly.
> If you mix I32, F32 and/or F64 in this way Bad Things will happen.

Presumably just the same kinds of Bad Things as "float f; *(int *)&f;"?
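
For comparison, a small sketch of the difference between reinterpreting a
float's bits and converting its value.  It assumes the host uses IEEE 754
single precision for float; the demo_* names are made up.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit integer.  memcpy is used
   instead of *(int *)&f, which violates C's aliasing rules. */
static uint32_t demo_float_bits(float f)
{
    uint32_t bits;

    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Convert the value of a float to an integer (truncates toward zero). */
static int32_t demo_float_to_int(float f)
{
    return (int32_t)f;
}
```

On such a host demo_float_bits(1.0f) yields 0x3f800000 while
demo_float_to_int(1.0f) yields 1: the same register contents mean very
different things depending on which type the consumer assumes.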

> > seeing end with _im which I presume means "immediate".  The alternative is
> > _cc, but what does that mean?  (Presumably not "closed captioned".)
> 
> _cc are variants that set the condition codes. I may have got T0 and T1 
> backwards in the first 3 lines.

Ah!

Is this written down anywhere?  I've read Fabrice's paper and the design 
documentation, and I'm not remembering this.  It's quite possible I missed it 
when my brain filled up, though.

> > Um, is my earlier characterization of "unwrapping stuff" at all close?
> 
> Not entirely. I'm also replacing fixed locations (T2) with dynamically 
> allocated qregs.

The dynamic allocation buys you what?  (Less spilling?)

> > Ok, now I'm really lost.
> 
> Most x86 instructions set the condition code flags. However most of the time 
> these flags are ignored. e.g. if you have two consecutive add instructions the 
> first will set the flags, and the second will immediately overwrite them.
> 
> qemu contains a back-propagation pass that will remove the code to set the 
> flags after the first instruction. Currently this is implemented by changing 
> an addl_cc op into a plain addl op.

I actually understood that.  Yay!
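
A minimal sketch of such a backward pass, under stated assumptions: the
three-op instruction set below is invented for illustration, and the flags
are conservatively treated as needed at the end of the block.  It is not the
code qemu actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical op set for illustration; not QEMU's real opcodes. */
enum demo_op {
    DEMO_ADDL,      /* add, leaves the flags alone */
    DEMO_ADDL_CC,   /* add, also sets the condition codes */
    DEMO_JCC        /* conditional jump, reads the condition codes */
};

/* Walk the block backwards, tracking whether the flags are still needed
   by a later op.  Returns the number of ops that were downgraded. */
static int demo_eliminate_flags(enum demo_op *ops, size_t n)
{
    int flags_live = 1;   /* assume flags are needed at block exit */
    int changed = 0;
    size_t i;

    assert(ops != NULL || n == 0);
    for (i = n; i-- > 0; ) {
        switch (ops[i]) {
        case DEMO_JCC:
            flags_live = 1;            /* a reader makes the flags live */
            break;
        case DEMO_ADDL_CC:
            if (!flags_live) {
                ops[i] = DEMO_ADDL;    /* nobody reads this op's flags */
                changed++;
            }
            flags_live = 0;            /* flags are not needed before it */
            break;
        case DEMO_ADDL:
            break;                     /* neither reads nor writes flags */
        }
    }
    return changed;
}
```

Given addl_cc, addl_cc, jcc, only the first addl_cc is turned into a plain
addl; the second is kept because the jump reads its flags.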

> The flag-setting code would most likely require several qops to implement,
> so it would be much harder to prove it is not needed and remove it. So there
> is a mechanism for adding extra target qops, doing the flag elimination pass,
> then expanding those to generic qops.

Um, wouldn't the flag-setting code be fairly straightforward as a qop that 
comes right _before_ the other op, as in "set the flags for doing this with 
these registers"?  It would do nothing but set the flags (i.e. it wouldn't 
modify the contents of any of the registers, so it could be immediately 
followed by the appropriate add or shift and so on), and then the flag 
elimination pass could just turn all the ones that weren't needed into 
QOP_NULL.

Or is that what's happening now?  (Do QOPs ever modify their input registers, 
or only the output one?)

> > Ah, hang on.  There's target_reginfo in translate-all.c, that's using some
> > of the other values.  So what the heck does translate-all.c do?  (Shared
> > code called by all the platform-dependent translate functions?)
> 
> There are three fairly independent stages:
> 1) target-*/translate.c converts guest code into qops.
> 2) translate-all.c messes about with those qops a bit (allocates host 
> registers, etc).
> 3) translate-op.c,translate-qop.c and target-*/ turns those qops into host 
> code.

Is pass 2 where the flag elimination pass goes (and presumably any other 
optimizations that might get added)?  No, that can't be the case or the m68k 
code wouldn't need its own implementation of the flag elimination pass...

> > > For converting targets you can probably ignore most of the translate-all
> > > and host-*/ changes. These implement generating code from the qops.
> >
> > Ok, this implies that qops are a new thing.  Which looking at the code sort
> > of supports.  Which means I don't understand what's going on at all.
> 
> qops and dyngen ops are both small "functions" that are represented in a 
> similar way. The difference is that dyngen ops are target specific fixed 
> functions, whereas qops are generic parameterized functions.

So the 11x11 multiplicative complexity of qemu producing its own assembly 
output might not be as much of a problem after switching to qops?

Possibly some of the common qops can have an asm block for 'em, and the rest 
can go through the contortions target-*/op.c is currently doing with 
(glue(glue(blah))) and so on.

> While they are really separate things, the details have been chosen so it 
> should be possible to adapt the existing translate.c code rather than having 
> to rewrite it from scratch. Decoding x86 instruction semantics is 
> complicated :-)

Yay iterative transformation with regression testing.  (And nothing says 
regression testing like booting a Linux distro under the sucker.)

> Many of the simpler dyngen ops can be replaced with a single qop. Others can 
> be replaced with a sequence of a few qops. Some of the more complicated ones 
> may need to be moved into helper functions.

At some point, I hope to understand helper functions.  But I'm not there yet.

> > I need to re-read this later.  My brain's full and I'm deeply confused.
> 
> I started off by saying qops were effectively instructions for an imaginary 
> machine. translate-all.c rearranges them so they match up very closely with 
> the instructions available on the host. Once this has been done turning them 
> into binary code is relatively simple.

I sort of thought this is what it was already doing, but apparently not...

> > The implementation calls the appropriate host functions to handle the
> > floating point, using soft-float if necessary?  (Under the old dyngen thing
> > outputting blocks of gcc-produced code, I could understand how that works.
> > But if you're outputting assembly directly...  I'm back in the "totally
> > lost" area again, I think.)
> 
> Err, sort of. There's a couple of different layers.
> 
> In translate.c you'll do something like
> 
>   tmp = gen_new_qreg(QMODE_F32);
>   gen_op_addf32(tmp, QREG_FOO, QREG_BAR);
> 
> If the host implements the floating point qops 'natively' then this will work
> exactly the same as the integer qops and end up as host floating point
> instructions. Currently this is not implemented for any hosts.

Ok.

> If native host FP is not available qemu will include appropriate bits so that
> after macro expansion and inlining you end up with:
> 
>   tmp = gen_new_qreg(QMODE_I32);
>   gen_op_helper(HELPER_addf32, tmp, QREG_FOO, QREG_BAR);
> 
> and the addf32 helper does the floating point addition using the "softfloat" 
> library. The qemu softfloat library implementation may actually use hardware 
> floating point rather than doing everything manually.

No reason (except speed) the code output into a translation block can't do 
function calls.  I think.
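
A rough sketch of what such a helper call could look like, assuming the
operands travel as raw 32-bit patterns in integer registers (which would
match the QMODE_I32 temporary above) and that host float is IEEE 754 single
precision.  The demo_* names and the function-pointer dispatch are
assumptions for illustration, not the real helper mechanism.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper in the spirit of HELPER_addf32: operands and
   result are raw 32-bit patterns held in integer registers. */
static uint32_t demo_helper_addf32(uint32_t a_bits, uint32_t b_bits)
{
    float a, b, sum;
    uint32_t out;

    memcpy(&a, &a_bits, sizeof a);
    memcpy(&b, &b_bits, sizeof b);
    sum = a + b;   /* host FP here; a softfloat routine could go instead */
    memcpy(&out, &sum, sizeof out);
    return out;
}

/* Generated code only needs the helper's address and its arguments. */
typedef uint32_t (*demo_helper_fn)(uint32_t, uint32_t);

static uint32_t demo_call_helper(demo_helper_fn fn, uint32_t x, uint32_t y)
{
    assert(fn != NULL);
    return fn(x, y);
}
```

With this layout, calling the helper on the bit patterns for 1.0 and 2.0
(0x3f800000 and 0x40000000) gives the pattern for 3.0 (0x40400000).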

> Likewise if the host doesn't have 64-bit operations gen_op_and64 will actually
> expand to a pair of and32 operations.

Ok.
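
The and64 case is easy to sketch because the two halves don't interact (no
carry, unlike add).  The demo_pair type and function names below are made up
for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves, as on a 32-bit host. */
struct demo_pair { uint32_t lo, hi; };

static struct demo_pair demo_split(uint64_t v)
{
    struct demo_pair p;

    p.lo = (uint32_t)v;
    p.hi = (uint32_t)(v >> 32);
    return p;
}

static uint64_t demo_join(struct demo_pair p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}

/* and64 built from two independent and32 operations. */
static struct demo_pair demo_and64(struct demo_pair a, struct demo_pair b)
{
    struct demo_pair r;

    r.lo = a.lo & b.lo;   /* and32 on the low halves */
    r.hi = a.hi & b.hi;   /* and32 on the high halves */
    return r;
}
```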

I'm still trying to follow a translation all the way from source to target.  
Just getting application emulation to do "hello world" is pretty darn 
complicated.  Your dump mode earlier sounded highly interesting.  (It's on my 
todo list.)

Rob
-- 
"Perfection is reached, not when there is no longer anything to add, but
when there is no longer anything to take away." - Antoine de Saint-Exupery



