[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2] pci: Assert that capabilities never overlap
From: |
Markus Armbruster |
Subject: |
Re: [PATCH v2] pci: Assert that capabilities never overlap |
Date: |
Thu, 29 Sep 2022 12:55:22 +0200 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux) |
Akihiko Odaki <akihiko.odaki@daynix.com> writes:
> On Mon, Sep 5, 2022 at 6:26 PM Markus Armbruster <armbru@redhat.com> wrote:
>>
>> Akihiko Odaki <akihiko.odaki@daynix.com> writes:
>>
>> > On Fri, Sep 2, 2022 at 7:23 PM Markus Armbruster <armbru@redhat.com> wrote:
>> >>
>> >> Akihiko Odaki <akihiko.odaki@daynix.com> writes:
>> >>
>> >> > pci_add_capability appears most PCI devices. Its error handling required
>> >> > lots of code, and led to inconsistent behaviors such as:
>> >> > - passing error_abort
>> >> > - passing error_fatal
>> >> > - asserting the returned value
>> >> > - propagating the error to the caller
>> >> > - skipping the rest of the function
>> >> > - just ignoring
>> >> >
>> >> > The code generating errors in pci_add_capability had a comment which
>> >> > says:
>> >> >> Verify that capabilities don't overlap. Note: device assignment
>> >> >> depends on this check to verify that the device is not broken.
>> >> >> Should never trigger for emulated devices, but it's helpful for
>> >> >> debugging these.
>> >> >
>> >> > Indeed vfio has some code that passes capability offsets and sizes from
>> >> > a physical device, but it explicitly pays attention so that the
>> >> > capabilities never overlap.
>> >>
>> >> I can't see that at a glance. Can you give me a clue?
>> >>
>> >> > Therefore, we can always assert that
>> >> > capabilities never overlap when pci_add_capability is called, resolving
>> >> > these inconsistencies.
>> >> >
>> >> > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>> >>
>> >
>> > Looking at vfio_add_std_cap(), and vfio_add_ext_cap() it seems that
>> > they are clipping the size of capabilities so that they do not
>> > overlap, if I read it correctly.
>>
>> If we want to deal gracefully with buggy physical devices, we need to
>> treat pdev->config[] as untrusted input.
>>
>> As far as I can tell:
>>
>> * vfio_add_capabilities() replicates the physical device's capabilities
>> (starting at pdev->config[PCI_CAPABILITY_LIST]) in the virtual device.
>>
>> * vfio_add_std_cap() is a helper to add the tail starting at
>> pdev->config[pos].
>>
>> Could the physical device's capabilities overlap? If yes, what would
>> happen before and after your series?
>>
>
> When the capabilities overlap, vfio_std_cap_max_size() and
> vfio_ext_cap_max_size(), called by vfio_add_std_cap(),
> vfio_add_ext_cap() should clip the size of capabilities. Comments in
> vfio_add_std_cap() and vfio_add_ext_cap() say: "Since QEMU doesn't
> actually handle many of the config accesses, exact size doesn't seem
> worthwhile."
Weird :)
Back to your patch. Its core is dumbing down of pci_add_capability() so
it can't fail anymore. Instead it aborts on overlapping capabilities.
Note that it already aborts when running out of PCI config space.
The remainder of the patch simplifies callers accordingly. Some callers
ignore the error before the patch, some abort on error, and some pass it
on to their callers. Too much for me to review in detail. We can talk
about ways to split it up if that's desired.
My one concern is device assignment, which should treat the device's
capabilities as untrusted input. If you're right in that such
capabilities get truncated before they reach pci_add_capability(), then
your change from failure to abort is okay.
I'd like to leave checking your correctness argument to actual PCI
experts.