[lwip-devel] [bug #24212] Deadlocked tcp_retransmit due to exceeded pcb->cwnd


From: Tamas Somogyi
Subject: [lwip-devel] [bug #24212] Deadlocked tcp_retransmit due to exceeded pcb->cwnd
Date: Fri, 21 Nov 2008 10:42:17 +0000
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.9.0.4) Gecko/2008102920 Firefox/3.0.4

Follow-up Comment #4, bug #24212 (project lwip):

Hi,

I have a similar problem with retransmission.

The scenario is the following:
I have a socket over which my server device sends a message to a client every
200 ms, and the client acknowledges the received packets. See the attached
log ‘9700-9746.1.3.0.pcap’. Sometimes, however, I get [TCP Previous segment
lost] (frame 22), which quickly leads to several [TCP Retransmission] and
[TCP Dup ACK] frames, and finally the socket shuts down (around frame 43).
Note also that there is always a [TCP Window update] close to the point where
the failure occurs.

Symptoms:
While the packets are being retransmitted, snd_queuelen of the affected pcb
climbs step by step from its normal level (0-7 in my application) to 70-80
over 5-6 seconds, until the socket closes. I had to double TCP_SND_QUEUELEN
from its default (=64), otherwise my application together with LwIP would not
survive, but this only treats the symptom. Moreover, lwip_send blocks the
calling thread even if I set MSG_DONTWAIT, because tcpip_apimsg is locked
until the packet has been sent.
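
For anyone who wants to reproduce the workaround, it is a one-line override in
the port's lwipopts.h. The fragment below is only a sketch of that setting:
the value 128 is simply twice the 64 quoted above, and the note about
MEMP_NUM_TCP_SEG is an assumption to check against opt.h, not something I
have verified on every version.

```c
/* lwipopts.h (fragment): workaround only, it does not fix the root cause. */

/* Twice the 64 segments mentioned above, so the send queue can absorb the
 * growth seen during the retransmission burst. */
#define TCP_SND_QUEUELEN    128

/* Assumption: the TCP segment pool (MEMP_NUM_TCP_SEG) may need to be at
 * least this large as well; check the defaults in opt.h for your version. */
```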

Remarks:
1) Sometimes the connection survives the [TCP Previous segment lost] event:
after 5-10 retransmissions it runs normally again, see the log
‘750-850.1.3.0.pcap’. Perhaps that is because there is no TCP window update
at that point.
2) The error sometimes appears within 5-10 minutes, sometimes only after 2-3
hours. On a slower 10 Mbit/s network it usually appears within 5-10 minutes.
3) My platform is based on the Texas Instruments TMS320C6711 DSP and Critical
Link’s MityDSP board, running LwIP 1.3.0 stable, but the problem also occurs
with the latest code. Interestingly, an earlier version of LwIP does not fail
this way: after some retransmissions it runs well again without closing the
socket. So it looks like a regression in LwIP.

Unfortunately, applying Hans Jörg’s and/or Oleg’s fixes, which reorganize the
pcb->unsent and pcb->unacked lists, still produces the same error. I also
tried some modifications of my own: fully sorting in tcp_reorder_segments
instead of just bringing the smallest seq forward, or sorting the unacked
list in tcp_rexmit. With those I get the error even faster.
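
To make clear what "fully sorting" means here: each segment is placed at its
sequence-number position, using a comparison that tolerates 32-bit
wraparound. The following is a minimal sketch of that idea only. The type
struct seg and the helpers seq_lt and seg_insert_sorted are hypothetical
stand-ins, not LwIP's struct tcp_seg or any LwIP function, and this is not
the exact patch I tested.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a queued TCP segment. */
struct seg {
    uint32_t seqno;      /* sequence number of the first byte */
    struct seg *next;    /* next segment in the queue */
};

/* Wraparound-safe test: is sequence number a before b? */
int seq_lt(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}

/* Insert s into the list at *head, keeping ascending sequence order. */
void seg_insert_sorted(struct seg **head, struct seg *s)
{
    struct seg **pp = head;

    /* Walk past every segment that lies before s in sequence space. */
    while (*pp != NULL && seq_lt((*pp)->seqno, s->seqno)) {
        pp = &(*pp)->next;
    }
    s->next = *pp;
    *pp = s;
}
```

The signed difference in seq_lt is what keeps the order correct when the
sequence numbers wrap from 0xFFFFFFFF to 0; a plain unsigned comparison would
put the wrapped segments at the front of the list.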

Questions:
1. What causes the [TCP Previous segment lost] event? I can imagine a
transient network failure, or the packet sniffer missing one or more packets,
but it would be more interesting if LwIP generated a wrong sequence number in
the TCP header…
2. It seems to me that during retransmission the “handshake” of acknowledging
the retransmitted packets cannot be established properly with the peer. Could
you please explain how it should work and why the order of the unsent and
unacknowledged packets is important?
3. Can you please give me some advice on how to fix or work around this bug?
In my application a transient problem or a few delayed or lost packets are
not a big problem, unlike closing the sockets or freezing the stack for
seconds.
4. Can you please let me know whether somebody is working on this bug? If so,
can you give an estimated time for the fix?
As this is a showstopper for my project, I am going to investigate, but I am
not familiar with the internals of TCP, so I would appreciate any help or
suggestion.
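
Regarding question 1: a capture analyzer typically raises this flag when a
segment starts beyond the next expected sequence number for that direction.
The check itself cannot tell whether the packet was lost on the wire, missed
by the sniffer, or numbered wrongly by the sender. A rough sketch of such a
check, under that assumption, is below. The struct cap_seg and the function
first_gap are hypothetical names for illustration, not part of LwIP or of any
analyzer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one captured data segment in one direction. */
struct cap_seg {
    uint32_t seq;   /* sequence number of the first payload byte */
    uint32_t len;   /* payload length in bytes */
};

/* Return the index of the first segment that starts beyond the highest
 * sequence number seen so far (a gap), or -1 if there is none. */
int first_gap(const struct cap_seg *segs, size_t n)
{
    uint32_t next_expected;
    size_t i;

    if (n == 0) {
        return -1;
    }
    next_expected = segs[0].seq + segs[0].len;
    for (i = 1; i < n; i++) {
        uint32_t end = segs[i].seq + segs[i].len;

        /* Signed difference keeps the test valid across wraparound. */
        if ((int32_t)(segs[i].seq - next_expected) > 0) {
            return (int)i;
        }
        /* Retransmissions do not move the expected number backwards. */
        if ((int32_t)(end - next_expected) > 0) {
            next_expected = end;
        }
    }
    return -1;
}
```

Running a check like this over the sequence numbers and lengths taken from
the sender side and from the capture side separately would show whether the
gap already exists in what LwIP transmits or only in what the sniffer saw.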

Best regards, Tamas



(file #16889, file #16890)
    _______________________________________________________

Additional Item Attachment:

File name: 750-850.1.3.0.pcap             Size:15 KB
File name: 9700-9746.1.3.0.pcap           Size:6 KB


    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?24212>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.nongnu.org/




