
Re: [Qemu-devel] KVM and variable-endianness guest CPUs


From: Avi Kivity
Subject: Re: [Qemu-devel] KVM and variable-endianness guest CPUs
Date: Tue, 28 Jan 2014 11:04:54 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0

On 01/22/2014 12:22 PM, Peter Maydell wrote:
On 22 January 2014 05:39, Victor Kamensky <address@hidden> wrote:
Hi Guys,

Christoffer and I had a bit of a heated chat :) on
this subject last night. Christoffer, I really
appreciate your time! We did not really reach
agreement during the chat, and Christoffer asked me
to follow up on this thread.
Here it goes. Sorry, it is a very long email.

I don't believe we can assign any endianness to
the mmio.data[] byte array. I believe mmio.data[]
and mmio.len act just like memcpy, and that is all.
As memcpy does not imply any endianness for the
underlying data, mmio.data[] should not either.
This email is about five times too long to be actually
useful, but the major issue here is that the data being
transferred is not just a bag of bytes. The data[]
array plus the size field are being (mis)used to indicate
that the memory transaction is one of:
  * an 8-bit access
  * a 16-bit access of some uint16_t value
  * a 32-bit access of some uint32_t value
  * a 64-bit access of some uint64_t value

exactly as a CPU hardware bus would do. It's
because the API is defined in this awkward way with
a uint8_t[] array that we need to specify how both
sides should go from the actual properties of the
memory transaction (value and size) to filling in the
array.
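
For concreteness, a minimal sketch (hypothetical helpers, not actual KVM or QEMU code) of the kind of rule in question: going from the transaction's (value, size) pair to the mmio.data[] bytes and back, assuming purely for illustration that data[0] holds the least significant byte:

#include <stddef.h>
#include <stdint.h>

/* Illustration only: fill mmio.data[] from a memory transaction,
 * with byte 0 as the least significant byte. */
static void mmio_fill_data(uint8_t data[8], uint64_t value, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        data[i] = (uint8_t)(value >> (8 * i));
    }
}

/* The reverse direction: reassemble the value on the other side. */
static uint64_t mmio_extract_data(const uint8_t data[8], size_t len)
{
    uint64_t value = 0;
    for (size_t i = 0; i < len; i++) {
        value |= (uint64_t)data[i] << (8 * i);
    }
    return value;
}

Whether byte 0 is the least or the most significant byte of the value is exactly the convention the two sides have to agree on.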

That is not how x86 hardware works. Back when there was a bus, there were no address lines A0-A2; instead we had eight byte enables, BE0-BE7. A memory transaction placed the qword address on the address lines and asserted the byte enables for the appropriate byte, word, dword, or qword, shifted according to the low-order bits of the address.
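
A rough sketch of that scheme (hypothetical types and names, only meant to illustrate the byte-enable idea):

#include <stdint.h>

struct bus_cycle {
    uint64_t qword_addr;   /* address with the low 3 bits dropped */
    uint8_t  byte_enables; /* bit i set => BEi asserted */
};

/* Only valid while the access fits within one qword, i.e.
 * (addr & 7) + size <= 8; anything wider has to be split (see below). */
static struct bus_cycle encode_cycle(uint64_t addr, unsigned size)
{
    struct bus_cycle c;
    c.qword_addr   = addr & ~7ULL;
    c.byte_enables = (uint8_t)(((1u << size) - 1) << (addr & 7));
    return c;
}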

If you generated an unaligned access, the transaction was split into two, so an 8-byte write might appear as a 5-byte write followed by a 3-byte write. In fact, the two halves of the transaction might go to different devices, or one might go to a device and the other to memory.
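
Continuing the hypothetical sketch above: an access that crosses a qword boundary becomes two cycles, e.g. an 8-byte write at an address with addr % 8 == 3 comes out as a 5-byte cycle followed by a 3-byte cycle.

/* Split an access crossing a qword boundary into two bus cycles
 * (still using the hypothetical encode_cycle() from above). */
static unsigned split_access(uint64_t addr, unsigned size,
                             struct bus_cycle out[2])
{
    unsigned first = 8 - (unsigned)(addr & 7);  /* bytes left in this qword */
    if (first >= size) {                        /* no boundary crossed */
        out[0] = encode_cycle(addr, size);
        return 1;
    }
    out[0] = encode_cycle(addr, first);
    out[1] = encode_cycle(addr + first, size - first);
    return 2;
}

The two resulting cycles are independent at the bus level, which is why they can end up at different targets.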

PCI works the same way.




Furthermore, device endianness is entirely irrelevant
for deciding the properties of mmio.data[], because the
thing we're modelling here is essentially the CPU->bus
interface. In real hardware, the properties of individual
devices on the bus are irrelevant to how the CPU's
interface to the bus behaves, and similarly here the
properties of emulated devices don't affect how KVM's
interface to QEMU userspace needs to work.

MemoryRegion's 'endianness' field, incidentally, is
a dreadful mess that we should get rid of. It is attempting
to model the property that some buses/bridges have of
doing byte-lane-swaps on data that passes through as
a property of the device itself. It would be better if we
modelled it properly, with container regions having possible
byte-swapping and devices just being devices.
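
For concreteness, a rough sketch of that alternative model (purely hypothetical, not existing QEMU code): a bridge/container that swaps byte lanes on data passing through it, while the devices underneath just see bytes:

#include <stdint.h>

/* Hypothetical bridge that reverses the byte lanes of every access
 * passing through it; devices behind it never byte-swap themselves. */
static void bridge_swap_lanes(uint8_t *data, unsigned len)
{
    for (unsigned i = 0; i < len / 2; i++) {
        uint8_t tmp = data[i];
        data[i] = data[len - 1 - i];
        data[len - 1 - i] = tmp;
    }
}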


No, that is not what it is modelling.

Suppose a little-endian CPU writes a dword 0x12345678 to address 0 of a device, and then reads back a byte from address 0. What value do you read back?

Some (most) devices will return 0x78, others will return 0x12. Other devices don't support mixed sizes at all, but many do. PCI configuration space is an example; it is common to read both the Device ID and the Vendor ID with a single 32-bit transaction, but you can also read them separately with two 16-bit transactions. Because PCI is little-endian, the Vendor ID at address 0 will be returned as the low word of the 32-bit read on a little-endian processor.
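
A small worked example of that (example register values only, not tied to any particular device):

#include <assert.h>
#include <stdint.h>

int main(void)
{
    /* The four bytes as they sit in little-endian PCI config space:
     * Vendor ID 0x1af4 at offset 0, Device ID 0x1000 at offset 2. */
    const uint8_t cfg[4] = { 0xf4, 0x1a, 0x00, 0x10 };

    /* Two 16-bit reads at offsets 0 and 2 ... */
    uint16_t vendor_id = (uint16_t)(cfg[0] | (cfg[1] << 8));   /* 0x1af4 */
    uint16_t device_id = (uint16_t)(cfg[2] | (cfg[3] << 8));   /* 0x1000 */

    /* ... or one 32-bit read at offset 0: same bytes, with the Vendor ID
     * in the low word and the Device ID in the high word. */
    uint32_t both = (uint32_t)cfg[0] | ((uint32_t)cfg[1] << 8) |
                    ((uint32_t)cfg[2] << 16) | ((uint32_t)cfg[3] << 24);

    assert((both & 0xffff) == vendor_id);
    assert((both >> 16) == device_id);
    return 0;
}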

If you remove device endianness from memory regions, you have to pass the data as arrays of bytes (like the KVM interface does) and let the device assemble words from those bytes itself, taking its own endianness into consideration. What MemoryRegion's endianness does is let the device declare its endianness to the API and let the API do all of that work.
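
To make the contrast concrete (hypothetical callbacks, not the real MemoryRegionOps API): with bytes-only plumbing the device assembles the value itself; with a declared endianness the core hands it an already-assembled value.

#include <stdint.h>

/* Bytes-only interface: the device assembles the value itself, using
 * its own knowledge that it is (say) a little-endian device. */
static uint64_t mydev_read_bytes(const uint8_t *buf, unsigned len)
{
    uint64_t val = 0;
    for (unsigned i = 0; i < len; i++) {
        val |= (uint64_t)buf[i] << (8 * i);
    }
    return val;
}

static uint64_t mydev_regs[4];

/* Declared-endianness interface: the core has already done any byte
 * swapping, so the device just returns its register contents. */
static uint64_t mydev_read_value(void *opaque, uint64_t addr, unsigned len)
{
    (void)opaque;
    (void)len;
    return mydev_regs[addr / 8];
}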


