qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PULL for-2.4 0/7] update ipxe roms, fix efi support


From: Laszlo Ersek
Subject: Re: [Qemu-devel] [PULL for-2.4 0/7] update ipxe roms, fix efi support
Date: Wed, 22 Jul 2015 00:58:59 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.1.0

On 07/21/15 18:10, Stefan Hajnoczi wrote:
> On Tue, Jul 21, 2015 at 3:28 PM, Paolo Bonzini <address@hidden> wrote:
>> On 21/07/2015 16:25, Peter Maydell wrote:
>>>>>
>>>>> Well, it did say "This pull finally fixes the efi boot support.  ipxe is
>>>>> updated to the latest master, two non-upstream commits needed to make
>>>>> efi work are added on top, and the build process is tweaked a bit".
>>> Right, but if you want us to switch from "we just mirror
>>> upstream ipxe" to "we have our own ipxe" then it's probably
>>> better to start with that, rather than by submitting a pullreq
>>> that can't be applied until we switch our ipxe workflow.
>>>
>>> Do we really need this for 2.4, or can we wait and sort
>>> this out afterwards? It feels a bit late in the release
>>> cycle to start doing this kind of thing to me.
>>
>> Well, it is a bugfix.  shim.efi doesn't work with upstream iPXE, and
>> hence doesn't work with the ROMs currently distributed by QEMU (Fedora
>> applies the patches already).
>>
>> The patches have missed 2.3 and 2.4 because Gerd has been sending them
>> again upstream every month.
>>
>> That said, I see your point.  It's probably not of utmost importance as
>> long as OVMF remains non-free and hence not shipped by most distributions.
> 
> With regards to git.qemu.org mirroring:
> I could update the ipxe.git mirror URL on git.qemu.org to Gerd's
> public repo and change the description to indicate that this includes
> out-of-tree patches.  Please let me know if you'd like to go ahead.
> 
> I'd prefer it if we don't ship a patched ipxe since Paolo mentions
> Fedora already has a fix in place.

RHEL doesn't (yet). In fact these QEMU patches are the result of a long
(and mostly fruitless) effort to get the ipxe patches upstream -- for
RHEL we prefer *something* to point at as "upstream".

> Instead of propagating that fix
> into QEMU, let's focus on ipxe upstream to solve the problem.

How's this for focus:

(1) [PATCH] efi_snp: improve compliance with the
            EFI_SIMPLE_NETWORK_PROTOCOL spec
    http://thread.gmane.org/gmane.network.ipxe.devel/3799
    Date: 2015-Jan-27
    feedback: none

(2) [PATCH] efi_snp: improve compliance with the
            EFI_SIMPLE_NETWORK_PROTOCOL spec
    http://thread.gmane.org/gmane.network.ipxe.devel/3955
    Date: 2015-Feb-10
    feedback: zero

(3) [PATCH] [efi] make load file protocol optional
    http://thread.gmane.org/gmane.network.ipxe.devel/3815
    Date: 2015-Feb-11
    feedback: the patch destroys the user's ability to use
              most features of iPXE
    our point: we don't care about most features of iPXE, we just need
               it for EFI drivers (Simple Network Protocol instances)
    result: nothing

(4) [RESENT PATCH 0/2] efi boot fixes.
    http://thread.gmane.org/gmane.network.ipxe.devel/3854
    Date: 2015-Mar-10
    feedback: zilch

(5) [RESEND PATCH] efi_snp: improve compliance with the
                   EFI_SIMPLE_NETWORK_PROTOCOL spec
    http://thread.gmane.org/gmane.network.ipxe.devel/3934
    Date: 2015-Apr-14
    feedback: nada

(6) [PATCH] efi_snp: improve compliance with the
            EFI_SIMPLE_NETWORK_PROTOCOL spec
    http://thread.gmane.org/gmane.network.ipxe.devel/4096
    Date: 2015-Jun-10
    feedback: null

So, let us *not* focus on getting these into upstream ipxe. The mailing
list is a black hole. The upstream maintainer behaves completely
unpredictably. For example, look at this one example:

  [PATCH 0/2] virtio-net shutdown fix for ExitBootServices()
  http://thread.gmane.org/gmane.network.ipxe.devel/3918
  Date: 2015-Apr-10

For this submission, the maintainer provided good feedback, then decided
to rewrite the patches (which was a good call, no doubt about it), then
went ahead and committed that patch, and even gave credit:

commit 755d2b8f6be681a2e620534b237471b75f28ed8c
Author: Michael Brown <address@hidden>
Date:   Mon Apr 13 12:06:59 2015 +0100

    [efi] Ensure drivers are disconnected when ExitBootServices() is
    called
    [...]
    Originally-fixed-by: Laszlo Ersek <address@hidden>
    Tested-by: Laszlo Ersek <address@hidden>
    Signed-off-by: Michael Brown <address@hidden>

(Therefore this patch is actually covered by the upstream rebase in
[PULL 1/7] here.)

And then, *just one day* after this commit, Gerd submits (5) -- and
complete silence.

> I've reviewed the ipxe-devel archives and see that patches have been
> on the list for weeks/months without replies.  I didn't see ping
> emails though so maybe it just requires a bit of poking via email or
> IRC.

Rather than pings, Gerd kept resending the actual patches. That's a
strictly stronger kind of "reminder". So no, pings are useless.

Regarding IRC, let me quote the following from freenode | #ipxe, date
2015-Jan-23 (ie. four days before posting (1)). The first quote is about
the patch in (3) -- the implementation looked differently at that point,
but the behavior was identical:

<mcb30>  Patches to modify our LOAD_FILE_PROTOCOL to support loading
         arbitrary files would be fine
<lersek> mcb30, thanks. :)
<mcb30>  but there's no way I'm turning UEFI iPXE back into being a dumb
         SNP driver with the appalling UEFI UX
<mcb30>  Consider what happens when a user attempts normal network boot
         and something goes wrong (e.g. the file doesn't exist)
<mcb30>  iPXE displays a UI with a meaningful error message and a shell
         allowing you to troubleshoot
<lersek> the UEFI boot option fails and the boot option processing
         continues.
<mcb30>  UEFI simply displays a dot on an otherwise blank screen and
         then returns to the boot menu
<lersek> yes
<lersek> if there are no more UEFI boot options to try
<lersek> I understand your point fully
<mcb30>  A dot on a blank screen is not a meaningful error message
<lersek> we have different goals here
<mcb30>  Understood
<lersek> the dot is not meant as an error message, it's progress info
         AFAIR
<mcb30>  In which case, a total absence of an error message is not a
         meaningful error message
<lersek> correct
<lersek> but it's consistent with the handling of other (not necessarily
         network related) boot options when they fail
<lersek> your main goal is to provide a nice iPXE experience, while we
         need some UEFI NIC drivers (for OVMF) in qemu-kvm guests
<lersek> so I thought I'd try to check with upstream developers before
         keeping these patches downstream only
<lersek> thanks!
<mcb30>  I would suggest adding code to support downloading arbitrary
         files via LOAD_FILE_PROTOCOL
<mcb30>  which would avoid a UX downgrade when network-booting a
         qemu-kvm guest
<mcb30>  (i.e. when not going through grub)

So, this is consistent with the feedback we'd actually get later for
(3). It makes no sense to struggle more with this patch.

Then, regarding the other patch (the one in (1)):

<lersek> mcb30, what about the second patch?
<mcb30>  lersek: sorry; I mssed the second patch
<mcb30>  Looking...
<lersek> mcb30, thanks -- my first msg in this channel was quite long &
         crowded
<mcb30>  lersek: looks okish.  I need to check the code again; it's been
         a long time since I looked at that.  In particular, is there
         any functional difference between the existing
         tx_count_interrupts (used I think to determine how many TX
         completions to return) and the producer/consumer counters that
         you have added?  If tx_count_interrupts is now redundant
         because the producer/consumer already include that information
         then it should probably be
<mcb30>  removed
<lersek> mcb30, that's a good question. The UEFI spec emphasizes that
         clearing / retrieving a recycled txbuf does not clear interrupt
         status if the caller doesn't ask for that too, and vice versa
         -- if the caller only asks about interrupt status, then no
         recycled txbuf is returned
<lersek> so they should remain independent
<lersek> but how exactly you calculate the interrupt status when the
         caller *does* ask for it, that's anyone's best guess
<lersek> the spec is unclear about it
<lersek> in OVMF's builtin virtio-net driver I think I just check if
         there are any pending outgoing & incoming packets, and set the
         corresponding bits
<lersek> I don't use a counter
<lersek> but it's very obscure indeed and no caller I know of seems to
         care about interrupt status
<lersek> which is why I didn't touch that part of the code
<lersek> I just kept those two things independent, because their
         independence *is* specified
<mcb30>  ok
<mcb30>  That makes sense
<lersek> (more precisely, re OVMF's builtin virtio-net driver, the RX
         intr bit is reported when more stuff is available for
         reception, and the TX intr bit is reported if we have
         transmitted at least one buffer that the caller queued with
         Transmit)
<lersek> mcb30, so can you pick up the 2nd patch from bugzilla, or
         should I send it to the list? (Actually if you deem the patch
         appropriate I can only send it to the list, because current
         master iPXE looks a bit different from our fork in RHEL, and
         the 2nd patch takes different forms accordingly.)

I got no further replies here. Based on the apparently positive feedback
on IRC, four days later I posted (1). No feedback.

> We either need to figure out how to get attention upstream

Good luck with that. In my educated experience: completely unpredictable
maintainer behavior, not a real community project.

> or work
> with others to add upstream maintainers.

When we can't get the maintainer's attention for our patches, and when
the maintainer tends to rewrite even those patches he more or less
likes, how do you propose we convince him to give *push access* to
random people?

> I see that Hannes Reinecke
> also has patches on ipxe-devel that look ignored, so Gred and Laszlo
> are not the only ones struggling to get patches upstream into ipxe.

I've said it several times (on other lists too), and I'll say it again:
ipxe is not an "open process" community project at this point. The last
half year, as Paolo indicated, and as I proved above, has been ample
experience.

--*--

On the technical level, let me summarize: we needed three patches in
total to get UEFI boot working:

#1 efi_snp: improve compliance with the EFI_SIMPLE_NETWORK_PROTOCOL spec
#2 [efi] make load file protocol optional
#3 [efi] Ensure drivers are disconnected when ExitBootServices() is
   called

Wrt. #1, the maintainer expressed agreement on IRC, but never replied to
patch emails.

Wrt. #2, the maintainer expressed strong disagreement (due to "user
interface" concerns) on both IRC and later on the mailing list.
Therefore the idea of upstreaming this patch is dead in the water. The
maintainer would accept an alternative that would take a huge
development effort, and would be useless for virtualization purposes
(ie. PXE-booting with OVMF in QEMU).

Wrt. #3, the maintainer decided to look at the patch, rewrite it, and
commit it. For some unfathomable reason. Maybe because he was Cc'd
directly on the patch email. I don't know. (The ipxe-devel list has
absolutely minimal traffic, why wouldn't he read it without explicit Cc?!)

This PULL is about getting #3 via rebase, and #1 and #2 as downstream
patches.

Thanks
Laszlo



reply via email to

[Prev in Thread] Current Thread [Next in Thread]