[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] KVM and variable-endianness guest CPUs
From: |
Christoffer Dall |
Subject: |
Re: [Qemu-devel] KVM and variable-endianness guest CPUs |
Date: |
Thu, 23 Jan 2014 18:14:25 -0800 |
User-agent: |
Mutt/1.5.21 (2010-09-15) |
On Thu, Jan 23, 2014 at 04:50:18PM -0800, Victor Kamensky wrote:
> On 23 January 2014 12:45, Christoffer Dall <address@hidden> wrote:
> > On Thu, Jan 23, 2014 at 08:25:35AM -0800, Victor Kamensky wrote:
> >> On 23 January 2014 07:33, Peter Maydell <address@hidden> wrote:
> >> > On 23 January 2014 15:06, Victor Kamensky <address@hidden> wrote:
> >> >> In [1] I wrote
> >> >>
> >> >> "I don't see why you so attached to desire to describe
> >> >> data part of memory transaction as just one of int
> >> >> types. If we are talking about bunch of hypothetical
> >> >> cases imagine such bus that allow transaction with
> >> >> size of 6 bytes. How do you describe such data in
> >> >> your ints speak? What endianity you can assign to
> >> >> sequence of 6 bytes? While note that description of
> >> >> such transaction as set of 6 byte values at address
> >> >> $whatever makes perfect sense."
> >> >>
> >> >> But notice that in your next reply [2] you just dropped it
> >> >
> >> > Yes. This is because it was one of the places where
> >> > I would have just had to repeat "no, I'm afraid you're wrong
> >> > about how hardware works". I think in general it's going
> >> > to be better if I don't try to reply point by point to this
> >> > email; I think you should go back and reread the emails I've
> >> > sent. Key points:
> >> > (1) hardware is not doing anything involving arrays
> >> > of bytes
> >>
> >> Array of bytes or integers is just a way to describe data lines
> >> on the bus. Did you look at this document?
> >>
> >> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0290g/ch06s05s01.html
> >>
> >> A0, A1, ,,, A7 byte values are the same for both LE and BE-8
> >> case (first two columns in the table) and they unambiguously
> >> describe data bus signals
> >>
> >
> > The point is simple, and Peter has made it over and over:
> > Any consumer of a memory operation sees "value, len, address".
>
> and "endianess" of operation.
no, value is value, is value. By a consumer I mean whatever sits and
the end of the memory bus. There is no endianness.
>
> here is memory operation
>
> *(int *) (0x1000) = 0x01020304;
this is from the CPU's perspective and involves specifics of a
programming language and a compiler. You cannot compare to the above.
>
> can you tell how memory will look like at 0x1000 address - you can't
> in LE it will look one way in BE byteswapped.
>
> > This is what KVM_EXIT_MMIO emulates. So just by knowing the ABI
> > definition and having a pointer to the structure you need to be able to
> > tell me "value, len, address".
> >
> >> > (2) the API between kernel and userspace needs to define
> >> > the semantics of mmio.data, ie how to map between
> >> > "x byte wide transaction with value v" and the array,
> >> > and that is primarily what this conversation is about
> >> > (3) the only choice which is both (a) sensible and (b)
> >> > not breaking existing usage is to say "the array is
> >> > in host-kernel-byte-order"
> >> > (4) PPC CPUs in BE mode and ARM CPUs in BE mode are not
> >> > the same, because in the ARM case it is doing an
> >> > internal-to-CPU byteswap, and in the PPC case it is not
> >>
> >> That is one of the key disconnects. I'll go find real examples
> >> in ARM LE, ARM BE, and PPC BE Linux kernel. Just for
> >> everybody sake's here is summary of the disconnect:
> >>
> >> If we have the same h/w connected to memory bus in ARM
> >> and PPC systems and we have the following three pieces
> >> of code that work with r0 having same device same
> >> register address:
> >>
> >> 1. ARM LE word write of 0x04030201:
> >> setend le
> >> mov r1, #0x04030201
> >> str r1, [r0]
> >>
> >> 2. ARM BE word write of 0x01020304:
> >> setend be
> >> mov r1, #0x01020304
> >> str r1, [r0]
> >>
> >> 3. PPC BE word write of 0x01020304:
> >> lis r1,0x102
> >> ori r1,r1,0x304
> >> stw r1,0(r0)
> >>
> >> I claim that h/w will see the same data on bus lines in all
> >> three cases, and h/w would acts the same in all three
> >> cases. Peter says that ARM BE and PPC BE case h/w
> >> would act differently.
> >>
> >> If anyone else can offer opinion on that while I am looking
> >> for real examples that would be great.
> >>
> >
> > I really don't think listing all these examples help.
>
> I think Peter is wrong in his understanding how real
> BE PPC kernel drivers work with h/w mapped devices. Going
> with such misunderstanding to suggest how it should hold
> info in emulated mmio case is quite strange.
>
> > You need to focus
> > on the key points that Peter listed in his previous mail.
> >
> > I tried in our chat to ask you this questions:
> >
> > vcpu_data_host_to_guest() is handling a read from an emulated device.
> > All the info you have is:
> > (1) len of memory access
> > (2) mmio.data pointer
> > (3) destination register
> > (4) host CPU endianness
> > (5) guest CPU endianness
> >
> > Based on this information alone, you need to decide whether you do a
> > byteswap or not before loading the hardware register upon returning to
> > the guest.
> >
> > You will find it impossible to answer, because you don't know the layout
> > of mmio.data, and that is the thing we are trying to solve.
>
> Actually I am not arguing with above. I agree that
> meaning of mmio.data should be better clarified.
Progress, \o/
>
> I propose my clarification as array of bytes at
> phys_addr address on BE-8,
> byte invariant, memory bus.
> That unambiguously
> describes data bus signals in case of BE-8 memory
> bus. Please look at
>
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0290g/ch06s05s01.html
>
> first two columns LE and BE-8, If one will specify all
> values for A0, A1, ... A7 it will define all bus signals.
> Note that is the only endian agnostic way to describe data
> bus signals. If one would try to describe them in
> half-word[s], word[s], double word one need to tell what
> endianity of those integers (other columns in document
> table).
This is a reference to an arm1136 processor specification, which does
not support virtualization. We are discussing a generic kernel
interface documentation, I'm afraid it's not useful.
>
> Peter claims that "I don't understand how h/w bus works".
> I disagree with that. I gave pointer on document that describes
> how BE-8, byte invariant, memory bus works. I would
> appreciate pointer to document, section and page that
> describes Peter's memory bus operation understanding.
I choose to trust that Peter understands very well how a h/w bus works.
I am not sure the documentation you request exists in public.
I, however, understand how KVM works, and I understand how the
kernel<->user ABI works, and you still haven't been able to express your
proposal in a concise, understandable, generic maner that works for a
KVM ABI specification, unfortunately.
>
> I pointed that Peter's proposal would have the following issue:
> BE qemu will have to act differently depending on CPU
> type while emulating the same device. If Peter's proposal is
> accepted n qemu code should do something like:
>
> #ifdef WORD_BIGENDIAN
> #ifdef __PPC_
> do one thing
> #else
> do another
> #endif
> #endif
No, your device has an operation: write32(u32 val)
That's it, Peter suggests having proper device containers in QEMU that
translate from bus native endianness to the device native endianness.
>
> there reason for that because the same device write in mmio
> will look like this:
>
> ARM LE mmio.data[] = { 0x01, 0x02, 0x03, 0x04 }
> ARM BE mmio.data[] = { 0x04, 0x03, 0x02, 0x01 }
> PPC LE mmio.data[] = { 0x04, 0x03, 0x02, 0x01 }
> PPC BE mmio.data[] = { 0x01, 0x02, 0x03, 0x04 }
>
> for ARM BE and PPC BE arrays are different even
> it is just BE case, so code would need to '#if ARM'
> thing
I don't understand this, sorry.
>
> If you follow my proposal to clarify mmio.data[] meaning
> mmio.data[] array will look the same in all 4 cases,
> compatible with current usage.
I still don't know what your proposal is, "array will look the same in
all 4 cases" is not a definition that I can use to interpret the data
written.
The semantics you need to be able to describe is that of the memory
operation: for example, store a word, not store an array of bytes.
>
> If Peter's proposal is adopted ARM BE and PPC LE cases
> would be penalized with excessive back and forth
> byteswaps. That is possible to avoid with my proposal.
>
I would take 50 byteswaps with a clear ABI any day over an obscure
standard that can avoid a single hardware-on-register instruction. This
is about designing a clean software interface, not about building an
optimized integrated stack.
Unfortunately, this is going nowhere, so I think we need to stop this
thread. As you can see I have sent a patch as a clarification to the
ABI, if it's merged we can move on with more important tasks.
Thanks,
-Christoffer
- Re: [Qemu-devel] KVM and variable-endianness guest CPUs, (continued)
- Re: [Qemu-devel] KVM and variable-endianness guest CPUs, Peter Maydell, 2014/01/22
- Re: [Qemu-devel] KVM and variable-endianness guest CPUs, Victor Kamensky, 2014/01/22
- Re: [Qemu-devel] KVM and variable-endianness guest CPUs, Peter Maydell, 2014/01/22
- Re: [Qemu-devel] KVM and variable-endianness guest CPUs, Victor Kamensky, 2014/01/22
- Re: [Qemu-devel] KVM and variable-endianness guest CPUs, Peter Maydell, 2014/01/23
- Re: [Qemu-devel] KVM and variable-endianness guest CPUs, Victor Kamensky, 2014/01/23
- Re: [Qemu-devel] KVM and variable-endianness guest CPUs, Peter Maydell, 2014/01/23
- Re: [Qemu-devel] KVM and variable-endianness guest CPUs, Victor Kamensky, 2014/01/23
- Re: [Qemu-devel] KVM and variable-endianness guest CPUs, Christoffer Dall, 2014/01/23
- Re: [Qemu-devel] KVM and variable-endianness guest CPUs, Victor Kamensky, 2014/01/23
- Re: [Qemu-devel] KVM and variable-endianness guest CPUs,
Christoffer Dall <=
- Re: [Qemu-devel] KVM and variable-endianness guest CPUs, Victor Kamensky, 2014/01/23
- Re: [Qemu-devel] [Qemu-ppc] KVM and variable-endianness guest CPUs, Benjamin Herrenschmidt, 2014/01/27
- Re: [Qemu-devel] [Qemu-ppc] KVM and variable-endianness guest CPUs, Christoffer Dall, 2014/01/27
- Re: [Qemu-devel] [Qemu-ppc] KVM and variable-endianness guest CPUs, Benjamin Herrenschmidt, 2014/01/27
- Re: [Qemu-devel] KVM and variable-endianness guest CPUs, Victor Kamensky, 2014/01/23
- Re: [Qemu-devel] [Qemu-ppc] KVM and variable-endianness guest CPUs, Benjamin Herrenschmidt, 2014/01/27
- Re: [Qemu-devel] [Qemu-ppc] KVM and variable-endianness guest CPUs, Benjamin Herrenschmidt, 2014/01/27
- Re: [Qemu-devel] [Qemu-ppc] KVM and variable-endianness guest CPUs, Benjamin Herrenschmidt, 2014/01/27
- Re: [Qemu-devel] [Qemu-ppc] KVM and variable-endianness guest CPUs, Peter Maydell, 2014/01/27
- Re: [Qemu-devel] [Qemu-ppc] KVM and variable-endianness guest CPUs, Benjamin Herrenschmidt, 2014/01/27