qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 2/3] s390: implement pci instructions


From: Alexander Graf
Subject: Re: [Qemu-devel] [PATCH 2/3] s390: implement pci instructions
Date: Wed, 12 Nov 2014 10:22:48 +0100
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:31.0) Gecko/20100101 Thunderbird/31.2.0


On 12.11.14 10:19, Frank Blaschka wrote:
> On Wed, Nov 12, 2014 at 10:08:19AM +0100, Alexander Graf wrote:
>>
>>
>> On 12.11.14 09:49, Frank Blaschka wrote:
>>> On Tue, Nov 11, 2014 at 04:24:24PM +0100, Alexander Graf wrote:
>>>>
>>>>
>>>> On 11.11.14 15:08, Frank Blaschka wrote:
>>>>> On Tue, Nov 11, 2014 at 01:51:25PM +0100, Alexander Graf wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Am 11.11.2014 um 13:39 schrieb Frank Blaschka <address@hidden>:
>>>>>>>
>>>>>>>> On Tue, Nov 11, 2014 at 01:16:04PM +0100, Alexander Graf wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> On 11.11.14 13:10, Frank Blaschka wrote:
>>>>>>>>>> On Mon, Nov 10, 2014 at 04:56:21PM +0100, Alexander Graf wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> On 10.11.14 15:20, Frank Blaschka wrote:
>>>>>>>>>>> From: Frank Blaschka <address@hidden>
>>>>>>>>>>>
>>>>>>>>>>> This patch implements the s390 pci instructions in qemu. It allows
>>>>>>>>>>> to access and drive pci devices attached to the s390 pci bus.
>>>>>>>>>>> Because of platform constrains devices using IO BARs are not
>>>>>>>>>>> supported. Also a device has to support MSI/MSI-X to run on s390.
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Frank Blaschka <address@hidden>
>>>>>>>>>>> ---
>>>>>>>>>>> target-s390x/Makefile.objs |   2 +-
>>>>>>>>>>> target-s390x/kvm.c         |  52 ++++
>>>>>>>>>>> target-s390x/pci_ic.c      | 753 
>>>>>>>>>>> +++++++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>>> target-s390x/pci_ic.h      | 335 ++++++++++++++++++++
>>>>>>>>>>> 4 files changed, 1141 insertions(+), 1 deletion(-)
>>>>>>>>>>> create mode 100644 target-s390x/pci_ic.c
>>>>>>>>>>> create mode 100644 target-s390x/pci_ic.h
>>>>>>>>>>>
>>>>>>>>
>>>>>>>> [...]
>>>>>>>>
>>>>>>>>>>> +int kvm_pcilg_service_call(S390CPU *cpu, struct kvm_run *run)
>>>>>>>>>>> +{
>>>>>>>>>>> +    CPUS390XState *env = &cpu->env;
>>>>>>>>>>> +    S390PCIBusDevice *pbdev;
>>>>>>>>>>> +    uint8_t r1 = (run->s390_sieic.ipb & 0x00f00000) >> 20;
>>>>>>>>>>> +    uint8_t r2 = (run->s390_sieic.ipb & 0x000f0000) >> 16;
>>>>>>>>>>> +    PciLgStg *rp;
>>>>>>>>>>> +    uint64_t offset;
>>>>>>>>>>> +    uint64_t data;
>>>>>>>>>>> +    uint8_t len;
>>>>>>>>>>> +
>>>>>>>>>>> +    cpu_synchronize_state(CPU(cpu));
>>>>>>>>>>> +
>>>>>>>>>>> +    if (env->psw.mask & PSW_MASK_PSTATE) {
>>>>>>>>>>> +        program_interrupt(env, PGM_PRIVILEGED, 4);
>>>>>>>>>>> +        return 0;
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +    if (r2 & 0x1) {
>>>>>>>>>>> +        program_interrupt(env, PGM_SPECIFICATION, 4);
>>>>>>>>>>> +        return 0;
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +    rp = (PciLgStg *)&env->regs[r2];
>>>>>>>>>>> +    offset = env->regs[r2 + 1];
>>>>>>>>>>> +
>>>>>>>>>>> +    pbdev = s390_pci_find_dev_by_fh(rp->fh);
>>>>>>>>>>> +    if (!pbdev) {
>>>>>>>>>>> +        DPRINTF("pcilg no pci dev\n");
>>>>>>>>>>> +        setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE);
>>>>>>>>>>> +        return 0;
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +    len = rp->len & 0xF;
>>>>>>>>>>> +    if (rp->pcias < 6) {
>>>>>>>>>>> +        if ((8 - (offset & 0x7)) < len) {
>>>>>>>>>>> +            program_interrupt(env, PGM_OPERAND, 4);
>>>>>>>>>>> +            return 0;
>>>>>>>>>>> +        }
>>>>>>>>>>> +        MemoryRegion *mr = 
>>>>>>>>>>> pbdev->pdev->io_regions[rp->pcias].memory;
>>>>>>>>>>> +        io_mem_read(mr, offset, &data, len);
>>>>>>>>>>> +    } else if (rp->pcias == 15) {
>>>>>>>>>>> +        if ((4 - (offset & 0x3)) < len) {
>>>>>>>>>>> +            program_interrupt(env, PGM_OPERAND, 4);
>>>>>>>>>>> +            return 0;
>>>>>>>>>>> +        }
>>>>>>>>>>> +        data =  pci_host_config_read_common(
>>>>>>>>>>> +                   pbdev->pdev, offset, 
>>>>>>>>>>> pci_config_size(pbdev->pdev), len);
>>>>>>>>>>> +
>>>>>>>>>>> +        switch (len) {
>>>>>>>>>>> +        case 1:
>>>>>>>>>>> +            break;
>>>>>>>>>>> +        case 2:
>>>>>>>>>>> +            data = cpu_to_le16(data);
>>>>>>>>>>> +            break;
>>>>>>>>>>> +        case 4:
>>>>>>>>>>> +            data = cpu_to_le32(data);
>>>>>>>>>>> +            break;
>>>>>>>>>>> +        case 8:
>>>>>>>>>>> +            data = cpu_to_le64(data);
>>>>>>>>>>> +            break;
>>>>>>>>>>
>>>>>>>>>> Why? Also, this is wrong. cpu_to_le64 convert between host endianness
>>>>>>>>>> and LE. So if you're running this on an LE host, you won't swap the
>>>>>>>>>> value and get a broken result.
>>>>>>>>>>
>>>>>>>>>> If you know that the value is always swapped, use bswapxx().
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Actually the code is right and required for a big endian host :-)
>>>>>>>>> pcilg/pcistg provide access to the PCI config space which is defined
>>>>>>>>> as PCI byte order (little endian). Since pci_host_config_read_common 
>>>>>>>>> does
>>>>>>>>> already a le to cpu conversion we have to convert back to PCI byte 
>>>>>>>>> order.
>>>>>>>>> Doing an unconditional swap would be a bug on a little endian host.
>>>>>>>>
>>>>>>>> Why would it be a bug? The value you end up writing is contents of a
>>>>>>>> register and thus doesn't have endianness. So if QEMU was an LE 
>>>>>>>> process,
>>>>>>>
>>>>>>> No, the s390 guest executing pcilg instruction expects to receive 
>>>>>>> config space data
>>>>>>> in PCI byte order.
>>>>>>>
>>>>>>>> the value of data would be identical as on a BE QEMU before your swab.
>>>>>>>> After the swab, it would be bswap'ed on BE, but not LE. So LE hosts 
>>>>>>>> break.
>>>>>>>>
>>>>>>>
>>>>>>> Again on BE endian host we do the swap because of 
>>>>>>> pci_host_config_read_common does
>>>>>>> read the value and do a byte swap for that value, but we need PCI byte 
>>>>>>> order not BE here.
>>>>>>>
>>>>>>> On LE host pci_host_config_read_common does not do a byte swap so we do 
>>>>>>> not have to
>>>>>>> convert back to PCI byte order.
>>>>>>
>>>>>> We maintain the PCI config space always in LE byte order in memory, 
>>>>>> that's why there is a bwap in its read function. The return result of 
>>>>>> the read function however is always the same, regardless of LE or BE 
>>>>>> host. If I do a read of size 4, I will always get 0x1, not 0x01000000 
>>>>>> returned.
>>>>>>
>>>>>> So now you need to convert that 0x1 into a 0x01000000 manually here 
>>>>>> because some architect thought that registers have endianness (which 
>>>>>> they don't). But you need to do it always, even on an LE host, because 
>>>>>> the pci config space return value is identical on LE and BE.
>>>>>>
>>>>> so you tell me pci_host_config_read_common does not end up in 
>>>>> pci_default_read_config?
>>>>>
>>>>> uint32_t pci_default_read_config(PCIDevice *d,
>>>>>                                  uint32_t address, int len)
>>>>> {
>>>>>     uint32_t val = 0;
>>>>>
>>>>>     memcpy(&val, d->config + address, len);
>>>>>     return le32_to_cpu(val);
>>>>> }
>>>>>
>>>>> What did I miss?
>>>>
>>>> That's exactly where you end up in - and it's there to convert from the
>>>> PCI config space backing storage to a native number.
>>>>
>>>> Imagine you write 0x12345678 at offset 0. Because PCI config space is
>>>> defined to be LE, in the PCI config space memory this gets stored as
>>>>
>>>> 78 56 34 12
>>>>
>>>> The reason we do the internal storage of the config space that way is
>>>> that it's (in some PCI implementations) legal to access with single byte
>>>> granularities. So you could do a pci_config_read(offset = 1) which
>>>> should return 0x56.
>>>>
>>>> However, that means we completely nullify any effect of host endianness
>>>> in the PCI config layer already. So if you do pci_config_write(offset =
>>>> 0, size = 4, value = 0x12345678), the contents of d->config will always
>>>> be identical, regardless of host endianness. The same holds true for
>>>> pci_config_read(offset = 0, size = 4). It will always return 0x12345678.
>>>>
>>>
>>> I understood this from the beginning and I completely agree to this.
>>>  
>>>> In your code, you swab that value again. I assume there's a reason
>>>> you're swapping it and that it's the way the architecture expects it
>>>
>>> Yes, s390 pcilg architecture states:
>>> Data in the PCI configuration space are treated
>>> as being in little-endian byte ordering
>>>
>>>> (mind to point me to the respective spec so I can verify?). But if the
>>>> architecture expects it, then it expects it regardless of host
>>>> endianness. The contents of regs[r1] should always be 0x78563412, no
>>>> matter whether we're in an LE or a BE environment.
>>>>
>>>> Does that make sense now?
>>>>
>>> Absolutely lets make an example for qemu running on BE and LE
>>>
>>> byte order    config space backing   pci_default_read_config   pcilg (with 
>>> cpu_to_le)
>>> BE            0x78563412             0x12345678                0x78563412
>>> LE            0x78563412             0x78563412                0x78563412
>>
>> No, pci_default_read_config() always returns 0x12345678 because it
>> returns a register, not memory.
>>
> 
> You mean implementation of pci_default_read_config is broken?
> If it should return a register it should not do "return le32_to_cpu(val);"

It has to, to convert from memory (after memcpy) to an actual register
value. Look at the value list in Paolo's email - I really have no idea
how to explain it any better.


Alex



reply via email to

[Prev in Thread] Current Thread [Next in Thread]