Re: [Qemu-devel] [PATCH for 2.5 1/1] e1000: fix hang of win2k12 shutdown
From: Jason Wang
Subject: Re: [Qemu-devel] [PATCH for 2.5 1/1] e1000: fix hang of win2k12 shutdown with flood ping
Date: Wed, 2 Dec 2015 13:06:22 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 12/01/2015 05:38 PM, Denis V. Lunev wrote:
> On 12/01/2015 06:31 AM, Jason Wang wrote:
>>
>> On 11/30/2015 02:22 PM, Denis V. Lunev wrote:
>>> On 11/30/2015 08:58 AM, Jason Wang wrote:
>>>> On 11/27/2015 07:42 PM, Denis V. Lunev wrote:
>>>>> On 11/27/2015 09:50 AM, Denis V. Lunev wrote:
>>>>>> On 11/27/2015 09:48 AM, Denis V. Lunev wrote:
>>>>>>> The e1000 driver in Win2k12 is badly broken. It hangs 100% of the
>>>>>>> time on shutdown of a UP VM under flood ping. The guest checks the
>>>>>>> card state and re-injects the interrupt to itself in a loop. This
>>>>>>> is fatal for a UP machine.
>>>>>>>
>>>>>>> There is no good way to fix this misbehavior other than a kludge.
>>>>>>> The emulation has an interrupt throttling register (ITR), which
>>>>>>> limits the interrupt rate and lets the guest get past this phase.
>>>>>>> This kludge causes no problem for Linux guests: they adjust the
>>>>>>> ITR value themselves.
>>>>>>>
>>>>>>> On the other hand according to the initial research in
>>>>>>> commit e9845f0985f088dd01790f4821026df0afba5795
>>>>>>> Author: Vincenzo Maffione <address@hidden>
>>>>>>> Date: Fri Aug 2 18:30:52 2013 +0200
>>>>>>>
>>>>>>> e1000: add interrupt mitigation support
>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>> Interrupt mitigation boosts performance when the guest suffers from
>>>>>>> an high interrupt rate (i.e. receiving short UDP packets at high packet
>>>>>>> rate). For some numerical results see the following link
>>>>>>> http://info.iet.unipi.it/~luigi/papers/20130520-rizzo-vm.pdf
>>>>>>>
>>>>>>> this should also boost performance a bit.
>>>>>>>
>>>>>>> See https://bugzilla.redhat.com/show_bug.cgi?id=874406 for
>>>>>>> additional details.
>>>>>>>
>>>>>>> Signed-off-by: Denis V. Lunev <address@hidden>
>>>>>>> CC: Vincenzo Maffione <address@hidden>
>>>>>>> CC: Stefan Hajnoczi <address@hidden>
>>>>>>> ---
>>>>>>> hw/net/e1000.c | 3 +++
>>>>>>> 1 file changed, 3 insertions(+)
>>>>>>>
>>>>>>> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
>>>>>>> index c877e06..0af528f 100644
>>>>>>> --- a/hw/net/e1000.c
>>>>>>> +++ b/hw/net/e1000.c
>>>>>>> @@ -447,6 +447,9 @@ static void e1000_reset(void *opaque)
>>>>>>> e1000_link_down(d);
>>>>>>> }
>>>>>>> + /* Throttle interrupts to allow poor Win 2012 to shutdown */
>>>>>>> + d->mac_reg[ITR] = 250;
>>>>>>> +
>>>>>>> /* Some guests expect pre-initialized RAH/RAL (AddrValid flag + MACaddr) */
>>>>>>> d->mac_reg[RA] = 0;
>>>>>>> d->mac_reg[RA + 1] = E1000_RAH_AV;
>>>>>> The Intel manual says of ITR: "A initial suggested range is
>>>>>> 651-5580 (28Bh - 15CCh)."
>>>>>> Should we use something other than 250? :)
>>>>>>
>>>>>> http://www.intel.com/content/www/us/en/embedded/products/networking/pci-pci-x-family-gbe-controllers-software-dev-manual.html
>>>>>>
>>>>>> Den
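
For reference, the arithmetic behind the 250 versus 651-5580 question can be sketched as below. This is only an illustration, not code from hw/net/e1000.c: the helper names are hypothetical, and it assumes (as I read the Intel manual) that the ITR interval counts in 256 ns units and that a value of 0 disables throttling.

```c
#include <stdint.h>

/* Assumption: one ITR count equals 256 ns, per the Intel 8254x manual. */
#define ITR_UNIT_NS 256u

/* Hypothetical helper: minimum gap between two interrupts, in ns. */
static uint64_t itr_min_gap_ns(uint32_t itr)
{
    return (uint64_t)itr * ITR_UNIT_NS;
}

/*
 * Hypothetical helper: upper bound on interrupts per second.
 * Returns 0 when itr == 0, meaning "throttling off, no bound".
 */
static uint64_t itr_max_irq_per_sec(uint32_t itr)
{
    if (itr == 0) {
        return 0;
    }
    return UINT64_C(1000000000) / itr_min_gap_ns(itr);
}
```

Under these assumptions, ITR = 250 gives a 64 us gap, i.e. at most 15625 interrupts per second, while the suggested range 651-5580 corresponds to roughly 6000 down to 700 interrupts per second. So 250 throttles less aggressively than anything in the suggested range.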
>>>>> Jason, can you take a look at this?
>>>>>
>>>>> I rechecked the MAINTAINERS file and found that
>>>>> I had missed you here. Sorry :(
>>>>>
>>>>> Den
>>>>>
>>>> No problem.
>>>>
>>>> But I have a question. What if ITR is disabled?
>>>>
>>> From the guest's side, I do not think that is really the case.
>>> For ITR to be disabled, the guest would have to set it to a real
>>> value and then clear it. That does not happen here: my patch
>>> applies on reset only, so the guest does not touch the register
>>> at all and the value stays as is. I think a real card
>>> behaves in a similar way: it cannot generate interrupts
>>> as fast as a hypervisor can, so either there is a natural
>>> limit that avoids this problem or there
>>> is a default value.
>>>
>>> From the QEMU side the question still stands. Fortunately
>>> the knob (the mitigation flag) is on by default. I think
>>> it exists to preserve compatibility with QEMU 1.6.
>>> In real life nobody will turn it off unless they
>>> know what they are doing ;)
>>>
>>> Den
>> Ok, applied to my -net with minor tweaks and a TODO added in the comment.
>>
>> We've hit several similar issues in the past, so we need to consider a
>> complete solution; otherwise we may still hit something
>> like this in the future.
>>
>> Thanks
> Thank you.
>
> Can you please clarify: will it go into 2.5 or not?
>
> Den
It will go into 2.5. I plan to include this in my last pull request for 2.5.
Thanks