[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] How to implement different endian accesses per MMU page
From: |
Mark Cave-Ayland |
Subject: |
Re: [Qemu-devel] How to implement different endian accesses per MMU page? |
Date: |
Wed, 1 Nov 2017 19:15:39 +0000 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0 |
On 15/08/17 19:10, Richard Henderson wrote:
> [CC Peter re MemTxAttrs below]
>
> On 08/15/2017 09:38 AM, Mark Cave-Ayland wrote:
>> Working through an incorrect endian issue on qemu-system-sparc64, it has
>> become apparent that at least one OS makes use of the IE (Invert Endian)
>> bit in the SPARCv9 MMU TTE to map PCI memory space without the
>> programmer having to manually endian-swap accesses.
>>
>> In other words, to quote the UltraSPARC specification: "if this bit is
>> set, accesses to the associated page are processed with inverse
>> endianness from what is specified by the instruction (big-for-little and
>> little-for-big)".
>>
>> Looking through various bits of code, I'm trying to get a feel for the
>> best way to implement this in an efficient manner. From what I can see
>> this could be solved using an additional MMU index, however I'm not
>> overly familiar with the memory and softmmu subsystems.
>
> No, it can't be solved with an MMU index.
>
>> Can anyone point me in the right direction as to what would be the best
>> way to implement this feature within QEMU?
>
> It's definitely tricky.
>
> We definitely need some TLB_FLAGS_MASK bit set so that we're forced through
> the
> memory slow path. There is no other way to bypass the endianness that we've
> already encoded from the target instruction.
>
> Given the tlb_set_page_with_attrs interface, I would think that we need a new
> bit in MemTxAttrs, so that the target/sparc tlb_fill (and subroutines) can
> pass
> along the TTE bit for the given page.
>
> We have an existing problem in softmmu_template.h,
>
> /* ??? Note that the io helpers always read data in the target
> byte ordering. We should push the LE/BE request down into io. */
> res = glue(io_read, SUFFIX)(env, mmu_idx, index, addr, retaddr);
> res = TGT_BE(res);
>
> We do not want to add a third(!) byte swap along the i/o path. We need to
> collapse the two that we have already before considering this one.
>
> This probably takes the form of:
>
> (1) Replacing the "int size" argument with "TCGMemOp memop" for
> a) io_{read,write}x in accel/tcg/cputlb.c,
> b) memory_region_dispatch_{read,write} in memory.c,
> c) adjust_endianness in memory.c.
> This carries size+sign+endianness down to the next level.
>
> (2) In memory.c, adjust_endianness,
>
> if (memory_region_wrong_endianness(mr)) {
> - switch (size) {
> + memop ^= MO_BSWAP;
> + }
> + if (memop & MO_BSWAP) {
>
> For extra credit, re-arrange memory_region_wrong_endianness
> to something more explicit -- "wrong" isn't helpful.
Finally I've had a bit of spare time to experiment with this approach,
and from what I can see there are currently 2 issues:
1) Using TCGMemOp in memory.c means it is no longer accelerator agnostic
For the moment I've defined a separate MemOp in memory.h and provided a
mapping function in io_{read,write}x to map from TCGMemOp to MemOp and
then pass that into memory_region_dispatch_{read,write}.
Other than not referencing TCGMemOp in the memory API, another reason
for doing this was that I wasn't convinced that all the MO_ attributes
were valid outside of TCG. I do, of course, strongly defer to other
people's knowledge in this area though.
2) The above changes to adjust_endianness() fail when
memory_region_dispatch_{read,write} are called recursively
Whilst booting qemu-system-sparc64 I see that
memory_region_dispatch_{read,write} get called recursively - once via
io_{read,write}x and then again via flatview_read_continue() in exec.c.
The net effect of this is that we perform the bswap correctly at the
tail of the recursion, but then as we travel back up the stack we hit
memory_region_dispatch_{read,write} once again causing a second bswap
which means the value is returned with the incorrect endian again.
My understanding from your softmmu_template.h comment above is that the
memory API should do the endian swapping internally allowing the removal
of the final TGT_BE/TGT_LE applied to the result, or did I get this wrong?
> (3) In tlb_set_page_with_attrs, notice attrs.byte_swap and set
> a new TLB_FORCE_SLOW bit within TLB_FLAGS_MASK.
>
> (4) In io_{read,write}x, if iotlbentry->attrs.byte_swap is set,
> then memop ^= MO_BSWAP.
ATB,
Mark.
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- Re: [Qemu-devel] How to implement different endian accesses per MMU page?,
Mark Cave-Ayland <=