
Re: [Qemu-devel] [Qemu-ppc] [PATCH v3 11/14] ioport: Switch dispatching


From: Peter Maydell
Subject: Re: [Qemu-devel] [Qemu-ppc] [PATCH v3 11/14] ioport: Switch dispatching to memory core layer
Date: Sat, 13 Jul 2013 00:10:56 +0100

On 12 July 2013 23:50, Benjamin Herrenschmidt <address@hidden> wrote:
> On Fri, 2013-07-12 at 19:26 +0100, Peter Maydell wrote:
>> It's not wrong when the hardware actually does a byteswap at
>> some level in the memory hierarchy. You can see this for instance
>> on ARMv7M systems, where byteswapping for bigendian happens at
>> an intermediate level that not all accesses go through:
>>
>>  [CPU] ---->  [byteswap here] --> [memory and ext. devices]
>>          |
>>          -->  [internal memory mapped devices]
>>
>> so some things see always little endian regardless.
>
> Ugh ? That's so completely fucked up, if that's indeed what the HW is
> doing this is a piece of trash and the designers are in urgent need of
> being turned into fertilizer.
>
> Unless again you are talking about "lane swapping" which allows to
> preserve the byte address invariance when the CPU decides to flip its
> bus around, but I would have thought that modern CPUs do not do that
> sort of shit anymore.

The block marked "byteswap here" does "byte invariant bigendian",
so byte accesses are unchanged, 16 bit accesses have the two bytes
of data flipped, and 32 bit accesses have the four bytes flipped.
This happens as the data passes through; addresses are unchanged.
It only happens if the CPU is configured by the guest to operate
in big-endian mode, obviously.
(Contrast 'word invariant bigendian', which is what ARM used to do,
where the addresses are changed but the data is not. That would be
pretty painful to implement in the memory region API though it is
of course trivial in hardware since it is just XORing of the low
address bits according to the access size...)
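
To make the contrast concrete, here is a minimal sketch in C of the two
schemes. The function names (be8_swap_data, be32_fix_addr) are made up for
illustration and are not part of QEMU; the sketch assumes accesses of 1, 2
or 4 bytes on a 32 bit bus.

```c
#include <stdint.h>

/* Byte-invariant big-endian: the address is left alone and the data
 * bytes are reversed according to the access size.
 * Hypothetical helper, not a QEMU function. */
static uint32_t be8_swap_data(uint32_t data, unsigned size)
{
    switch (size) {
    case 1:
        return data & 0xffu;                    /* bytes are unchanged */
    case 2:
        return ((data & 0xffu) << 8) | ((data >> 8) & 0xffu);
    case 4:
        return ((data & 0x000000ffu) << 24) |
               ((data & 0x0000ff00u) << 8)  |
               ((data & 0x00ff0000u) >> 8)  |
               ((data >> 24) & 0xffu);
    default:
        return data;                            /* other sizes not modelled */
    }
}

/* Word-invariant big-endian: the data is left alone and the low
 * address bits are XORed according to the access size.
 * Hypothetical helper, not a QEMU function. */
static uint32_t be32_fix_addr(uint32_t addr, unsigned size)
{
    switch (size) {
    case 1:
        return addr ^ 3u;                       /* byte lane within the word */
    case 2:
        return addr ^ 2u;                       /* halfword lane within the word */
    default:
        return addr;                            /* word accesses are unchanged */
    }
}
```

In the first scheme a 16 bit value 0x1234 comes out as 0x3412 at the same
address; in the second the value is untouched but a byte access to address
0x1000 is routed to 0x1003.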

> Again, the only endian
> attribute that exists are the byte order of the original access (which
> byte has the lowest address, regardless of significance of those bytes
> in the target, ie, purely from a qemu standpoint, in the variable that
> carries the access around inside qemu, which byte has the lowest
> address)

What does this even mean? At the point where a memory access leaves
the CPU (emulation or real hardware) it has (a) an address and
(b) a width -- a 16 bit access is neither big nor little endian,
it's just a request for 16 bits of data (on real hardware it's
typically a bus transaction on a bunch of data lines with some
control lines indicating transaction width). Now the CPU emulation
may internally be intending to put that data into its emulated
register one way round or the other, but that's an internal detail
of the CPU emulation. (Similarly for stores.)

>, and the same on the target device (at which point a concept of
> significance does apply, but it's a guest driver business to get it
> right, qemu just need to make sure byte 0 goes to byte 0).

Similarly, at the target device end there is no concept
of a "big endian access" -- we make a request for 16
bits of data at a particular address (via the MemoryRegion
API) and the device returns 16 bits of data. It's entirely
possible to design hardware so that byte access to address
X, halfword access to address X and word access to address
X all return entirely different data (though that would be
a bit perverse). (As an implementation convenience we may
choose to provide helper infrastructure so you don't have
to actually implement all of byte/halfword/word access by hand.)
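
As an illustration of that point, a sketch of a deliberately perverse
device whose read handler returns unrelated values depending only on the
access width. The handler name and signature are hypothetical and only
loosely modelled on an (address, size) style read callback; the returned
constants are arbitrary.

```c
#include <stdint.h>

/* Hypothetical device read handler: the request is just "size bytes at
 * addr", with no notion of big or little endian attached to it. */
static uint64_t demo_dev_read(uint64_t addr, unsigned size)
{
    (void)addr;                 /* this toy device ignores the address */

    switch (size) {
    case 1:
        return 0xABu;           /* byte access */
    case 2:
        return 0x1234u;         /* halfword access: unrelated value */
    case 4:
        return 0xDEADBEEFu;     /* word access: unrelated again */
    default:
        return 0;               /* other widths not modelled */
    }
}
```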

> If a bridge flips things around in a way that breaks the
> model,

That breaks what model?

> then add some property describing the flipping
> properties but don't call it "big
> endian" or "little endian" at the bridge level, that has no meaning,
> confuses things and introduces breakage like we have seen.

I'm happy to call the property "byteswap", yes, because
that's what it does. If you did two of these in a row you'd
get a no-op.
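
A minimal sketch of what such a property could look like, assuming a
made-up Bridge struct with a boolean "byteswap" field (none of these names
exist in QEMU). Passing data through two swapping bridges gives back the
original value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bridge model with a single "byteswap" property. */
typedef struct Bridge {
    bool byteswap;
} Bridge;

/* Reverse the bytes of a 2 or 4 byte access; 1 byte accesses and any
 * other sizes pass through unchanged. */
static uint32_t swap_bytes(uint32_t data, unsigned size)
{
    switch (size) {
    case 2:
        return ((data & 0xffu) << 8) | ((data >> 8) & 0xffu);
    case 4:
        return ((data & 0x000000ffu) << 24) |
               ((data & 0x0000ff00u) << 8)  |
               ((data & 0x00ff0000u) >> 8)  |
               ((data >> 24) & 0xffu);
    default:
        return data;
    }
}

/* Data passing through a bridge is swapped only if the property is set. */
static uint32_t bridge_pass(const Bridge *b, uint32_t data, unsigned size)
{
    return b->byteswap ? swap_bytes(data, size) : data;
}
```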

>> (Our other serious endianness problem is that we don't really
>> do very well at supporting a TCG CPU arbitrarily flipping
>> endianness -- TARGET_WORDS_BIGENDIAN is a compile time setting
>> and ideally it should not be.)
>
> Our experience is that it actually works fine for almost everything
> except virtio :-) ie mostly TARGET_WORDS_BIGENDIAN is irrelevant (and
> should be).

I agree that TARGET_WORDS_BIGENDIAN *should* go away, but
it exists currently. Do you actually implement a CPU which
does dynamic endianness flipping? Is it at all efficient
in the config which is the opposite of whatever
TARGET_WORDS_BIGENDIAN says?

thanks
-- PMM


