Re: [Qemu-devel] How to implement different endian accesses per MMU page


From: Richard Henderson
Subject: Re: [Qemu-devel] How to implement different endian accesses per MMU page?
Date: Tue, 15 Aug 2017 11:10:18 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1

[CC Peter re MemTxAttrs below]

On 08/15/2017 09:38 AM, Mark Cave-Ayland wrote:
> Working through an incorrect endian issue on qemu-system-sparc64, it has
> become apparent that at least one OS makes use of the IE (Invert Endian)
> bit in the SPARCv9 MMU TTE to map PCI memory space without the
> programmer having to manually endian-swap accesses.
> 
> In other words, to quote the UltraSPARC specification: "if this bit is
> set, accesses to the associated page are processed with inverse
> endianness from what is specified by the instruction (big-for-little and
> little-for-big)".
> 
> Looking through various bits of code, I'm trying to get a feel for the
> best way to implement this in an efficient manner. From what I can see
> this could be solved using an additional MMU index, however I'm not
> overly familiar with the memory and softmmu subsystems.

No, it can't be solved with an MMU index.

> Can anyone point me in the right direction as to what would be the best
> way to implement this feature within QEMU?

It's definitely tricky.

We need some bit within TLB_FLAGS_MASK set for the page, so that accesses to it
are forced through the memory slow path.  There is no other way to override the
endianness that we've already encoded from the target instruction.
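
To illustrate why a flag bit forces the slow path, here is a minimal
standalone model, not the actual softmmu code.  The names ModelTLBEntry,
MODEL_PAGE_BITS and MODEL_TLB_FLAG_SLOW, and their values, are made up for the
example.  The idea it shows is that the flag bits live in the low bits of the
address tag stored in the TLB entry, so a page-aligned compare can never match
an entry that has one set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values for illustration only; not QEMU's definitions. */
#define MODEL_PAGE_BITS      12
#define MODEL_PAGE_MASK      (~(((uint64_t)1 << MODEL_PAGE_BITS) - 1))
#define MODEL_TLB_FLAG_SLOW  ((uint64_t)1 << 0)  /* stand-in for a TLB_FLAGS_MASK bit */

/* Simplified TLB entry: page address with flag bits folded into the low bits. */
typedef struct {
    uint64_t addr_read;
} ModelTLBEntry;

/* Build an entry for the page containing vaddr, optionally flagged as slow. */
static ModelTLBEntry model_make_entry(uint64_t vaddr, bool force_slow)
{
    ModelTLBEntry e = { .addr_read = vaddr & MODEL_PAGE_MASK };

    /* The flag must sit below the page bits, or it would corrupt the tag. */
    assert((MODEL_TLB_FLAG_SLOW & MODEL_PAGE_MASK) == 0);
    if (force_slow) {
        e.addr_read |= MODEL_TLB_FLAG_SLOW;
    }
    return e;
}

/*
 * Fast-path check: compare the page-aligned access address with the tag.
 * Any flag bit set in the tag makes the comparison fail, so the access
 * falls through to the slow path, where the flags can be inspected.
 */
static bool model_fast_path_hit(const ModelTLBEntry *e, uint64_t vaddr)
{
    return (vaddr & MODEL_PAGE_MASK) == e->addr_read;
}
```

With this model, an access to 0x4abc hits an unflagged entry for page 0x4000
but misses the same entry once MODEL_TLB_FLAG_SLOW is set.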

Given the tlb_set_page_with_attrs interface, I would think that we need a new
bit in MemTxAttrs, so that the target/sparc tlb_fill (and subroutines) can pass
along the TTE bit for the given page.

We have an existing problem in softmmu_template.h,

    /* ??? Note that the io helpers always read data in the target
       byte ordering.  We should push the LE/BE request down into io.  */
    res = glue(io_read, SUFFIX)(env, mmu_idx, index, addr, retaddr);
    res = TGT_BE(res);

We do not want to add a third(!) byte swap along the i/o path.  We need to
collapse the two that we already have (the one in the template above and the
one in adjust_endianness) before considering this one.
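
As a rough sketch of what "collapsing" could mean: every layer that wants to
invert the byte order toggles one swap bit, and a single swap is done at the
end if the bit is still set.  The M_* encoding and the functions model_bswap
and model_finish_load below are hypothetical, loosely modelled on TCGMemOp;
they are not existing QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding: low two bits are log2(size), bit 3 requests a swap. */
enum {
    M_8 = 0, M_16 = 1, M_32 = 2, M_64 = 3,
    M_SIZE = 3,
    M_BSWAP = 8,
};

/* Reverse the low (1 << size_log2) bytes of v. */
static uint64_t model_bswap(uint64_t v, unsigned size_log2)
{
    uint64_t r = 0;
    unsigned n;

    assert(size_log2 <= M_64);
    n = 1u << size_log2;
    for (unsigned i = 0; i < n; i++) {
        r = (r << 8) | (v & 0xff);
        v >>= 8;
    }
    return r;
}

/*
 * Each inverting layer XORs the swap bit instead of swapping the data
 * itself.  Two inversions cancel, so at most one swap is ever performed.
 */
static uint64_t model_finish_load(uint64_t raw, unsigned memop,
                                  bool device_inverted, bool page_inverted)
{
    if (device_inverted) {
        memop ^= M_BSWAP;
    }
    if (page_inverted) {
        memop ^= M_BSWAP;
    }
    if (memop & M_BSWAP) {
        raw = model_bswap(raw, memop & M_SIZE);
    }
    return raw;
}
```

The point of the XOR is that the cost stays at zero or one swap no matter how
many layers have an opinion about endianness.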

This probably takes the form of:

(1) Replacing the "int size" argument with "TCGMemOp memop" for
      a) io_{read,write}x in accel/tcg/cputlb.c,
      b) memory_region_dispatch_{read,write} in memory.c,
      c) adjust_endianness in memory.c.
    This carries size+sign+endianness down to the next level.

(2) In memory.c, adjust_endianness,

     if (memory_region_wrong_endianness(mr)) {
-        switch (size) {
+        memop ^= MO_BSWAP;
+    }
+    if (memop & MO_BSWAP) {

    For extra credit, re-arrange memory_region_wrong_endianness
    to something more explicit -- "wrong" isn't helpful.

(3) In tlb_set_page_with_attrs, notice attrs.byte_swap and set
    a new TLB_FORCE_SLOW bit within TLB_FLAGS_MASK.

(4) In io_{read,write}x, if iotlbentry->attrs.byte_swap is set,
    then memop ^= MO_BSWAP.
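
Steps (3) and (4) might look roughly like the standalone model below.  This is
only a sketch: the Model* types, the MODEL_* constants and all three functions
are invented for the example, byte_swap is the proposed (not yet existing)
attribute, and the IE bit position (bit 59 of the TTE data) is my assumption
and should be checked against the UltraSPARC manual.

```c
#include <assert.h>
#include <stdint.h>

/* All names and values here are hypothetical stand-ins. */
#define MODEL_TTE_IE          ((uint64_t)1 << 59)  /* assumed IE bit position */
#define MODEL_TLB_FORCE_SLOW  ((uint64_t)1 << 0)   /* proposed new TLB flag */
#define MODEL_MO_BSWAP        8u                   /* swap request in memop */

/* Stand-in for MemTxAttrs carrying the proposed byte_swap bit. */
typedef struct {
    unsigned byte_swap : 1;
} ModelAttrs;

/* Stand-in for a TLB entry plus its iotlb attributes. */
typedef struct {
    uint64_t addr_tag;
    ModelAttrs attrs;
} ModelEntry;

/* Target side: pass the TTE's IE bit along as an attribute. */
static ModelAttrs model_attrs_from_tte(uint64_t tte_data)
{
    ModelAttrs a = { .byte_swap = (tte_data & MODEL_TTE_IE) != 0 };
    return a;
}

/* Step (3): notice attrs.byte_swap and flag the entry for the slow path. */
static ModelEntry model_tlb_set_page(uint64_t page_addr, ModelAttrs attrs)
{
    ModelEntry e = { .addr_tag = page_addr, .attrs = attrs };

    assert((page_addr & 0xfff) == 0);  /* model assumes 4K-aligned pages */
    if (attrs.byte_swap) {
        e.addr_tag |= MODEL_TLB_FORCE_SLOW;
    }
    return e;
}

/* Step (4): on the i/o path, invert the endianness requested by the insn. */
static unsigned model_io_memop(const ModelEntry *e, unsigned memop)
{
    if (e->attrs.byte_swap) {
        memop ^= MODEL_MO_BSWAP;
    }
    return memop;
}
```

A page mapped with IE set thus ends up both flagged for the slow path and with
its memop inverted there, while an ordinary page is left untouched.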



r~


