[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] msi/msix: added API to set MSI message address
From: |
Alexey Kardashevskiy |
Subject: |
Re: [Qemu-devel] [PATCH] msi/msix: added API to set MSI message address and data |
Date: |
Thu, 19 Jul 2012 10:32:40 +1000 |
User-agent: |
Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120614 Thunderbird/13.0.1 |
On 19/07/12 01:23, Michael S. Tsirkin wrote:
> On Wed, Jul 18, 2012 at 11:17:12PM +1000, Alexey Kardashevskiy wrote:
>> On 18/07/12 22:43, Michael S. Tsirkin wrote:
>>> On Thu, Jun 21, 2012 at 09:39:10PM +1000, Alexey Kardashevskiy wrote:
>>>> Added (msi|msix)_set_message() functions.
>>>>
>>>> Currently msi_notify()/msix_notify() write to these vectors to
>>>> signal the guest about an interrupt so the correct values have to
>>>> written there by the guest or QEMU.
>>>>
>>>> For example, POWER guest never initializes MSI/MSIX vectors, instead
>>>> it uses RTAS hypercalls. So in order to support MSIX for virtio-pci on
>>>> POWER we have to initialize MSI/MSIX message from QEMU.
>>>>
>>>> Signed-off-by: Alexey Kardashevskiy <address@hidden>
>>>
>>> So guests do enable MSI through config space, but do
>>> not fill in vectors?
>>
>> Yes. msix_capability_init() calls arch_setup_msi_irqs() which does
>> everything it needs to do (i.e. calls hypervisor) before
>> msix_capability_init() writes PCI_MSIX_FLAGS_ENABLE to the PCI_MSIX_FLAGS
>> register.
>>
>> These vectors are the PCI bus addresses, the way they are set is specific
>> for a PCI host controller, I do not see why the current scheme is a bug.
>
> I won't work with any real PCI device, will it? Real pci devices expect
> vectors to be written into their memory.
Yes. And the hypervisor does this. On POWER (at least book3s - server powerpc,
the whole config space kitchen is hidden behind RTAS (kind of bios). For the
guest, this RTAS is implemented in hypervisor, for the host - in the system
firmware. So powerpc linux does not have to have PHB drivers. Kinda cool.
Usual powerpc server is running without the host linux at all, it is running a
hypervisor called pHyp. And every guest knows that it is a guest, there is no
full machine emulation, it is para-virtualization. In power-kvm, we replace
that pHyp with the host linux and now QEMU plays a hypervisor role. Some day We
will move the hypervisor to the host kernel completely (?) but now it is in
QEMU.
>>> Very strange. Are you sure it's not
>>> just a guest bug? How does it work for other PCI devices?
>>
>> Did not get the question. It works the same for every PCI device under POWER
>> guest.
>
> I mean for real PCI devices.
>
>>> Can't we just fix guest drivers to program the vectors properly?
>>>
>>> Also pls address the comment below.
>>
>> Comment below.
>>
>>> Thanks!
>>>
>>>> ---
>>>> hw/msi.c | 13 +++++++++++++
>>>> hw/msi.h | 1 +
>>>> hw/msix.c | 9 +++++++++
>>>> hw/msix.h | 2 ++
>>>> 4 files changed, 25 insertions(+)
>>>>
>>>> diff --git a/hw/msi.c b/hw/msi.c
>>>> index 5233204..cc6102f 100644
>>>> --- a/hw/msi.c
>>>> +++ b/hw/msi.c
>>>> @@ -105,6 +105,19 @@ static inline uint8_t msi_pending_off(const
>>>> PCIDevice* dev, bool msi64bit)
>>>> return dev->msi_cap + (msi64bit ? PCI_MSI_PENDING_64 :
>>>> PCI_MSI_PENDING_32);
>>>> }
>>>>
>>>> +void msi_set_message(PCIDevice *dev, MSIMessage msg)
>>>> +{
>>>> + uint16_t flags = pci_get_word(dev->config + msi_flags_off(dev));
>>>> + bool msi64bit = flags & PCI_MSI_FLAGS_64BIT;
>>>> +
>>>> + if (msi64bit) {
>>>> + pci_set_quad(dev->config + msi_address_lo_off(dev), msg.address);
>>>> + } else {
>>>> + pci_set_long(dev->config + msi_address_lo_off(dev), msg.address);
>>>> + }
>>>> + pci_set_word(dev->config + msi_data_off(dev, msi64bit), msg.data);
>>>> +}
>>>> +
>>>
>>> Please add documentation. Something like
>>>
>>> /*
>>> * Special API for POWER to configure the vectors through
>>> * a side channel. Should never be used by devices.
>>> */
>>
>>
>> It is useful for any para-virtualized environment I believe, is not it?
>> For s390 as well. Of course, if it supports PCI, for example, what I am not
>> sure it does though :)
>
> I expect the normal guest to program the address into MSI register using
> config accesses, same way that it enables MSI/MSIX.
> Why POWER does it differently I did not yet figure out but I hope
> this weirdness is not so widespread.
In para-virt I would expect the guest not to touch config space at all. At
least it should use one interface rather than two but this is how it is.
>>>> bool msi_enabled(const PCIDevice *dev)
>>>> {
>>>> return msi_present(dev) &&
>>>> diff --git a/hw/msi.h b/hw/msi.h
>>>> index 75747ab..6ec1f99 100644
>>>> --- a/hw/msi.h
>>>> +++ b/hw/msi.h
>>>> @@ -31,6 +31,7 @@ struct MSIMessage {
>>>>
>>>> extern bool msi_supported;
>>>>
>>>> +void msi_set_message(PCIDevice *dev, MSIMessage msg);
>>>> bool msi_enabled(const PCIDevice *dev);
>>>> int msi_init(struct PCIDevice *dev, uint8_t offset,
>>>> unsigned int nr_vectors, bool msi64bit, bool
>>>> msi_per_vector_mask);
>>>> diff --git a/hw/msix.c b/hw/msix.c
>>>> index ded3c55..5f7d6d3 100644
>>>> --- a/hw/msix.c
>>>> +++ b/hw/msix.c
>>>> @@ -45,6 +45,15 @@ static MSIMessage msix_get_message(PCIDevice *dev,
>>>> unsigned vector)
>>>> return msg;
>>>> }
>>>>
>>>> +void msix_set_message(PCIDevice *dev, int vector, struct MSIMessage msg)
>>>> +{
>>>> + uint8_t *table_entry = dev->msix_table_page + vector *
>>>> PCI_MSIX_ENTRY_SIZE;
>>>> +
>>>> + pci_set_quad(table_entry + PCI_MSIX_ENTRY_LOWER_ADDR, msg.address);
>>>> + pci_set_long(table_entry + PCI_MSIX_ENTRY_DATA, msg.data);
>>>> + table_entry[PCI_MSIX_ENTRY_VECTOR_CTRL] &=
>>>> ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
>>>> +}
>>>> +
>>>> /* Add MSI-X capability to the config space for the device. */
>>>> /* Given a bar and its size, add MSI-X table on top of it
>>>> * and fill MSI-X capability in the config space.
>>>> diff --git a/hw/msix.h b/hw/msix.h
>>>> index 50aee82..26a437e 100644
>>>> --- a/hw/msix.h
>>>> +++ b/hw/msix.h
>>>> @@ -4,6 +4,8 @@
>>>> #include "qemu-common.h"
>>>> #include "pci.h"
>>>>
>>>> +void msix_set_message(PCIDevice *dev, int vector, MSIMessage msg);
>>>> +
>>>> int msix_init(PCIDevice *pdev, unsigned short nentries,
>>>> MemoryRegion *bar,
>>>> unsigned bar_nr, unsigned bar_size);
>>>> --
>>>> 1.7.10
>>>>
>>>> ps. double '-' and git version is an end-of-patch scissor as I read
>>>> somewhere, cannot recall where exactly :)
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 21/06/12 20:56, Jan Kiszka wrote:
>>>>> On 2012-06-21 12:50, Alexey Kardashevskiy wrote:
>>>>>> On 21/06/12 20:38, Jan Kiszka wrote:
>>>>>>> On 2012-06-21 12:28, Alexey Kardashevskiy wrote:
>>>>>>>> On 21/06/12 17:39, Jan Kiszka wrote:
>>>>>>>>> On 2012-06-21 09:18, Alexey Kardashevskiy wrote:
>>>>>>>>>>
>>>>>>>>>> agrhhh. sha1 of the patch changed after rebasing :)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Added (msi|msix)_(set|get)_message() function for whoever might
>>>>>>>>>> want to use them.
>>>>>>>>>>
>>>>>>>>>> Currently msi_notify()/msix_notify() write to these vectors to
>>>>>>>>>> signal the guest about an interrupt so the correct values have to
>>>>>>>>>> written there by the guest or QEMU.
>>>>>>>>>>
>>>>>>>>>> For example, POWER guest never initializes MSI/MSIX vectors, instead
>>>>>>>>>> it uses RTAS hypercalls. So in order to support MSIX for virtio-pci
>>>>>>>>>> on
>>>>>>>>>> POWER we have to initialize MSI/MSIX message from QEMU.
>>>>>>>>>>
>>>>>>>>>> As only set* function are required by now, the "get" functions were
>>>>>>>>>> added
>>>>>>>>>> or made public for a symmetry.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Alexey Kardashevskiy <address@hidden>
>>>>>>>>>> ---
>>>>>>>>>> hw/msi.c | 29 +++++++++++++++++++++++++++++
>>>>>>>>>> hw/msi.h | 2 ++
>>>>>>>>>> hw/msix.c | 11 ++++++++++-
>>>>>>>>>> hw/msix.h | 3 +++
>>>>>>>>>> 4 files changed, 44 insertions(+), 1 deletion(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/hw/msi.c b/hw/msi.c
>>>>>>>>>> index 5233204..9ad84a4 100644
>>>>>>>>>> --- a/hw/msi.c
>>>>>>>>>> +++ b/hw/msi.c
>>>>>>>>>> @@ -105,6 +105,35 @@ static inline uint8_t msi_pending_off(const
>>>>>>>>>> PCIDevice* dev, bool msi64bit)
>>>>>>>>>> return dev->msi_cap + (msi64bit ? PCI_MSI_PENDING_64 :
>>>>>>>>>> PCI_MSI_PENDING_32);
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> +MSIMessage msi_get_message(PCIDevice *dev)
>>>>>>>>>
>>>>>>>>> MSIMessage msi_get_message(PCIDevice *dev, unsigned vector)
>>>>>>>>
>>>>>>>>
>>>>>>>> Who/how/why is going to calculate the vector here?
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> +{
>>>>>>>>>> + uint16_t flags = pci_get_word(dev->config + msi_flags_off(dev));
>>>>>>>>>> + bool msi64bit = flags & PCI_MSI_FLAGS_64BIT;
>>>>>>>>>> + MSIMessage msg;
>>>>>>>>>> +
>>>>>>>>>> + if (msi64bit) {
>>>>>>>>>> + msg.address = pci_get_quad(dev->config +
>>>>>>>>>> msi_address_lo_off(dev));
>>>>>>>>>> + } else {
>>>>>>>>>> + msg.address = pci_get_long(dev->config +
>>>>>>>>>> msi_address_lo_off(dev));
>>>>>>>>>> + }
>>>>>>>>>> + msg.data = pci_get_word(dev->config + msi_data_off(dev,
>>>>>>>>>> msi64bit));
>>>>>>>>>
>>>>>>>>> And I have this here in addition:
>>>>>>>>>
>>>>>>>>> unsigned int nr_vectors = msi_nr_vectors(flags);
>>>>>>>>> ...
>>>>>>>>>
>>>>>>>>> if (nr_vectors > 1) {
>>>>>>>>> msg.data &= ~(nr_vectors - 1);
>>>>>>>>> msg.data |= vector;
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> See PCI spec and existing code.
>>>>>>>>
>>>>>>>>
>>>>>>>> What for? I really do not get it why someone might want to read
>>>>>>>> something but not real value.
>>>>>>>> What PCI code should I look?
>>>>>>>
>>>>>>> I'm not sure what your use case for reading the message is. For KVM
>>>>>>> device assignment it is preparing an alternative message delivery path
>>>>>>> for MSI vectors. And for this we will need vector notifier support for
>>>>>>> MSI as well. You can check the MSI-X code for corresponding use cases of
>>>>>>> msix_get_message.
>>>>>>
>>>>>>> And when we already have msi_get_message, another logical use case is
>>>>>>> msi_notify. See msix.c again.
>>>>>>
>>>>>> Aaaa.
>>>>>>
>>>>>> I have no case for reading the message. All I need is writing. And I
>>>>>> want it public as I want to use
>>>>>> it from hw/spapr_pci.c. You suggested to add reading, I added "get" to
>>>>>> be _symmetric_ to "set"
>>>>>> ("get" returns what "set" wrote). You want a different thing which I can
>>>>>> do but it is not
>>>>>> msi_get_message(), it is something like msi_prepare_message(MSImessage
>>>>>> msg) or
>>>>>> msi_set_vector(uint16_t data) or simply internal kitchen of msi_notify().
>>>>>>
>>>>>> Still can do what you suggested, it just does not seem right.
>>>>>
>>>>> It is right - when looking at it from a different angle. ;)
>>>>>
>>>>> I don't mind if you add msi_get_message now or leave this to me. Likely
>>>>> the latter is better as you have no use case for msi_get_message (and
>>>>> also msix_get_message!) outside of their modules, thus we should not
>>>>> export those functions anyway.
--
Alexey
- Re: [Qemu-devel] [PATCH] msi/msix: added API to set MSI message address and data, Michael S. Tsirkin, 2012/07/18
- Re: [Qemu-devel] [PATCH] msi/msix: added API to set MSI message address and data, Alexey Kardashevskiy, 2012/07/18
- Re: [Qemu-devel] [PATCH] msi/msix: added API to set MSI message address and data, Michael S. Tsirkin, 2012/07/18
- Re: [Qemu-devel] [PATCH] msi/msix: added API to set MSI message address and data,
Alexey Kardashevskiy <=
- Re: [Qemu-devel] [PATCH] msi/msix: added API to set MSI message address and data, Michael S. Tsirkin, 2012/07/19
- Re: [Qemu-devel] [PATCH] msi/msix: added API to set MSI message address and data, Alexey Kardashevskiy, 2012/07/19
- Re: [Qemu-devel] [PATCH] msi/msix: added API to set MSI message address and data, Michael S. Tsirkin, 2012/07/19
- Re: [Qemu-devel] [PATCH] msi/msix: added API to set MSI message address and data, Alexey Kardashevskiy, 2012/07/19
- Re: [Qemu-devel] [PATCH] msi/msix: added API to set MSI message address and data, Michael S. Tsirkin, 2012/07/19
- [Qemu-devel] [PATCH] msi/msix: added API to set MSI message address and data, Alexey Kardashevskiy, 2012/07/18
- Re: [Qemu-devel] [PATCH] msi/msix: added API to set MSI message address and data, Michael S. Tsirkin, 2012/07/19