qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] Fix buffer run out in eepro100.


From: Bo Yang
Subject: Re: [Qemu-devel] [PATCH] Fix buffer run out in eepro100.
Date: Wed, 29 Aug 2012 08:55:13 +0800
User-agent: Mozilla/5.0 (X11; Linux i686 on x86_64; rv:14.0) Gecko/20120713 Thunderbird/14.0

On 08/28/2012 06:59 PM, Stefan Hajnoczi wrote:
> On Tue, Aug 28, 2012 at 7:23 AM, Bo Yang <address@hidden> wrote:
>> According
>> to liunux driver's implementation, the descriptor with EL bit set
>> must not be touched by hardware, usually, the buffer size of this
>> descriptor is set to 0.
> 
> Please describe the bug you are seeing and how to reproduce it.  

Unfortunately, I cannot reproduce it. It is reported by QA, I used the
QA's test environment to debug here.

It's
> not clear to me that your patch implements the behavior specified in
> the datasheet.  I have included relevant quotes from the datasheet:
> 
> http://www.intel.com/content/dam/doc/manual/8255x-10-100-mbps-ethernet-controller-software-dev-manual.pdf
> 
> "Table 55. CU Activities Performed at the End of Execution" shows that
> the EL bit takes effect at end of command execution.  I think this
> means the descriptor *will* be processed by the hardware.
> 
> The following documents the behavior when the descriptor's buffer size is 0:
> 
> "6.4.3.3.1 Configuring the Next RFD
> [...]
> 3. Initiates a receive DMA if the size of the data field in the RFD is
> greater than zero. The receive
> DMA is initiated with the address of the first byte of the destination
> address to the byte count
> specified by the RFD.
> 4. Forces a receive DMA completion if the size of the data field in
> the RFD is zero."
> 
>> diff --git a/hw/eepro100.c b/hw/eepro100.c
>> index 50d117e..e0efd96 100644
>> --- a/hw/eepro100.c
>> +++ b/hw/eepro100.c
>> @@ -1619,8 +1619,13 @@ static const MemoryRegionOps eepro100_ops = {
>>  static int nic_can_receive(NetClientState *nc)
>>  {
>>      EEPRO100State *s = DO_UPCAST(NICState, nc, nc)->opaque;
>> +    ru_state_t state;
>>      TRACE(RXTX, logout("%p\n", s));
>> -    return get_ru_state(s) == ru_ready;
>> +    state = get_ru_state(s);
>> +    if (state == ru_no_resources) {
>> +       eepro100_rnr_interrupt(s);
>> +    }
>> +    return state == ru_ready;
>>  #if 0
>>      return !eepro100_buffer_full(s);
>>  #endif
> 
> This is a boolean function, it shouldn't have side-effects.
> 
> Why did you add the eepro100_rnr_interrupt() call here when it's also
> called below from nic_receive()?

This is needed. Let's consider tap with e100.
When rx descriptor is EL with size 0. e100 enter ru_no_resources state
and rnr interrupt is raised. Then, driver in guest handles the
interrupt, however, there can be out of memory condition in guest, so
this time the rx descriptor filling fails, so no rx descriptor is ready.

Have a look at tap, tap_can_send is ioh->fd_read_poll, tap_send is
ioh->fd_read, tap_can_send calls nic_can_receive(), tap_send() call
nic_receive(), if tap_can_send returns 0, the tap fd will not be added
to read set, and tap_send() will never be called, so nic_receive is
never called, no more rnr interrupt will be raised. then the network
just stall. So we need interrupt to notify the guest to prepare rx
descriptor again.

> 
>> @@ -1732,6 +1737,15 @@ static ssize_t nic_receive(NetClientState *nc, const 
>> uint8_t * buf, size_t size)
>>                   &rx, sizeof(eepro100_rx_t));
>>      uint16_t rfd_command = le16_to_cpu(rx.command);
>>      uint16_t rfd_size = le16_to_cpu(rx.size);
>> +    /* don't touch the rx descriptor with EL set. */
>> +    if (rfd_command & COMMAND_EL) {
>> +        /* EL bit is set, so this was the last frame. */
>> +        logout("receive: Running out of frames\n");
>> +        set_ru_state(s, ru_no_resources);
>> +        s->statistics.rx_resource_errors++;
>> +       eepro100_rnr_interrupt(s);
>> +       return -1;
>> +    }
>>      if (size > rfd_size) {
>>          logout("Receive buffer (%" PRId16 " bytes) too small for data "
>> @@ -1767,11 +1781,6 @@ static ssize_t nic_receive(NetClientState *nc, const 
>> uint8_t * buf, size_t size)
>>      s->statistics.rx_good_frames++;
>>      eepro100_fr_interrupt(s);
>>      s->ru_offset = le32_to_cpu(rx.link);
>> -    if (rfd_command & COMMAND_EL) {
>> -        /* EL bit is set, so this was the last frame. */
>> -        logout("receive: Running out of frames\n");
>> -        set_ru_state(s, ru_suspended);
>> -    }
> 
> "6.4.3.3.3 Completion of Reception" describes how this should work.  I
> think the rnr interrupt should be raised here and we should enter the
> no resources state.

I think you're right here. To comply to the specification of e100. We
should handle the rx descriptor with EL, and then enter ru_no_resources
state, raises rnr interrupt.

therefore,

set_ru_state(s, ru_no_resources);
eepro100_rnr_interrupt(s);

after handle rx descriptor with EL bit.

> 
> Stefan
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]