
[lwip-users] Re: TCP_SEG Leak ...


From: Thomas Catalino
Subject: [lwip-users] Re: TCP_SEG Leak ...
Date: Sat, 1 Dec 2007 13:17:37 -0500

We have now verified that there is a TCP_SEG leak in lwip in the scenario described below.

Briefly, the problem occurs as follows on a TCP socket:

- lwip misses a packet with data (lost in the network, no FIN in the packet)
- lwip receives a packet with no data and the FIN flag set
- sender retries the data packet -- this time with a FIN flag set

In this scenario lwip puts the segment with no data and only the FIN on the ooseq queue. When the data packet is received (with the FIN), the connection closes down normally, but when tcp_pcb_purge() is called to flush the segs, nothing is done, because this routine assumes that these queues are already clean for pcbs passed to it in the CLOSED or TIME_WAIT state. I think our pcb is CLOSED at this point.

Interestingly, if the scenario is slightly different, everything is fine: if we get the FIN and then the data packet arrives (out of order), all is well. The difference is that in the scenario above the sender retries the data packet with the FIN flag set this time.

Questions:

- Is this scenario (by the sender) legal? It seems to be, based on the RFC -- it states that segments can be repackaged.
- What is the right fix? The easy thing to do is to flush the ooseq queue in tcp_pcb_purge() for a CLOSED socket, but I think the flaw is somewhere up in the ooseq processing.
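
To make the "easy" fix concrete, here is a toy model of the two purge behaviors. It is a sketch, not lwip source: toy_pcb, toy_seg, toy_purge_guarded() and toy_purge_always() are hypothetical names, the state guard is modeled on the behavior described in this message rather than copied from the stack, and toy_live_segs stands in for the TCP_SEG pool count.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only: these types and functions are hypothetical and are
 * not lwip's definitions. */
enum toy_state { TOY_CLOSED, TOY_ESTABLISHED, TOY_TIME_WAIT };

struct toy_seg {
    struct toy_seg *next;
};

struct toy_pcb {
    enum toy_state state;
    struct toy_seg *ooseq;      /* out-of-sequence queue */
};

int toy_live_segs = 0;          /* stands in for the TCP_SEG pool count */

/* Allocate one segment and count it as live. */
struct toy_seg *toy_seg_alloc(void)
{
    struct toy_seg *seg = malloc(sizeof *seg);
    assert(seg != NULL);
    seg->next = NULL;
    toy_live_segs++;
    return seg;
}

/* Free a whole chain of segments, decrementing the live count. */
void toy_segs_free(struct toy_seg *seg)
{
    while (seg != NULL) {
        struct toy_seg *next = seg->next;
        free(seg);
        toy_live_segs--;
        seg = next;
    }
}

/* Purge as described in the report: it assumes the queues are already
 * empty for CLOSED and TIME_WAIT pcbs, so it skips them. */
void toy_purge_guarded(struct toy_pcb *pcb)
{
    if (pcb->state != TOY_CLOSED && pcb->state != TOY_TIME_WAIT) {
        toy_segs_free(pcb->ooseq);
        pcb->ooseq = NULL;
    }
}

/* The "easy" fix: flush the ooseq queue regardless of state. */
void toy_purge_always(struct toy_pcb *pcb)
{
    toy_segs_free(pcb->ooseq);
    pcb->ooseq = NULL;
}
```

With a CLOSED pcb that still holds one segment on ooseq, toy_purge_guarded() leaves toy_live_segs at 1 and toy_purge_always() brings it back to 0. This only treats the symptom; it does not explain why the stale segment was still queued when the connection closed.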

I don't see anything in the CVS head that would seem to address this problem, so it looks to me like it's still an issue in the current revision of lwip.

Help / advice is greatly appreciated as we will be upgrading to 1.3 when it's available -- we would like to make sure this issue is addressed there too -- assuming we're on the right track.

Thanks,
Tom



On Nov 20, 2007 10:26 PM, Thomas Catalino < address@hidden> wrote:

We seem to be losing a TCP_SEG every so often when communicating over a noisy wireless link. Our lwip was taken at some point between stable releases 1.1.0 and 1.1.1. Since that time we have installed some of the important fixes to the stack, and we are currently working to restructure our port so we can easily update to 1.2 (and 1.3 when it becomes available). We have examined the current baseline source for fixes that might resolve our issue, but we have found none.

Here is what is going on. This issue shows up in our implementation of an HTTP client, but we believe it to be a generic TCP issue. Our lwip is on the client, performing a GET operation to a remote server on the other side of a wireless network interface with a high bit-error rate.

At some point during this scenario we lose a TCP_SEG ...

Client < ---- > Server

SYN Seq 0, Ack 0, Len 0 ------->
<-------- SYN Seq 0, Ack 1, Len 0
PSH,ACK Seq 1, Ack 1, Len 148 (GET) ------->
<-------- ACK Seq 1, Ack 149, Len 0
<-------- FIN,ACK Seq 479, Ack 149, Len 0   (OOSEQ)
<-------- PSH,ACK Seq 1, Ack 149, Len 478   (OOSEQ AND LOST -- NEVER RECEIVED BY CLIENT)
ACK Seq 149, Ack 1, Len 0 ------->          (DUP ACK)

-- 3 second delay as Server times out --

<-------- FIN,PSH,ACK Seq 1, Ack 149, Len 478   (RESEND BY SERVER)
ACK Seq 149, Ack 480, Len 0 ------->         
ACK Seq 149, Ack 480, Win 2919, Len 0 ------->  (WINDOW UPDATE)
FIN, ACK Seq 149, Ack 480, Len 0 ------->         
<-------- ACK Seq 480, Ack 150, Len 0


The significant events that seem to be required in order for us to lose a TCP_SEG are listed below. Could the following occur?
  1. When the FIN,ACK is received by the client a TCP_SEG is allocated and stored as ooseq
  2. When the FIN,PSH,ACK is received by the client (resend of the data after missing the data in the PSH,ACK from the server, but this time the server added the FIN), lwip pushes the data up to the application and sees the FIN in the packet, but does not dequeue the seg stored with the previous packet containing the FIN. 
  3. The socket closes down and the seg is lost (the length 1 seg containing the first FIN attempt).
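
The sequence arithmetic behind these steps can be checked with a short sketch. The names below (toy_tcp_seg, toy_seg_tcplen(), toy_seg_is_stale(), and so on) are hypothetical helpers, not lwip APIs; the numbers are the relative sequence numbers from the trace above, and the only protocol fact assumed is that a FIN consumes one sequence number.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a received segment (not lwip's tcp_seg). */
struct toy_tcp_seg {
    uint32_t seqno;   /* first sequence number covered */
    uint16_t len;     /* payload bytes */
    int      fin;     /* nonzero if the FIN flag is set */
};

/* Sequence space consumed: payload bytes plus one for a FIN. */
uint32_t toy_seg_tcplen(const struct toy_tcp_seg *s)
{
    return (uint32_t)s->len + (s->fin ? 1u : 0u);
}

/* Wraparound-safe "a <= b" for 32-bit sequence numbers. */
int toy_seq_leq(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) <= 0;
}

/* Accept a segment that starts exactly at rcv_nxt; return the new rcv_nxt. */
uint32_t toy_accept_in_order(uint32_t rcv_nxt, const struct toy_tcp_seg *s)
{
    assert(s->seqno == rcv_nxt);
    return rcv_nxt + toy_seg_tcplen(s);
}

/* A queued segment is stale once it ends at or before rcv_nxt. */
int toy_seg_is_stale(uint32_t rcv_nxt, const struct toy_tcp_seg *s)
{
    return toy_seq_leq(s->seqno + toy_seg_tcplen(s), rcv_nxt);
}
```

Taking the queued FIN as { 479, 0, 1 } and the resend as { 1, 478, 1 }: before the resend arrives (rcv_nxt = 1) the queued FIN is not stale. Accepting the resend moves rcv_nxt to 480, which matches the Ack 480 in the trace, and the queued FIN now lies entirely below rcv_nxt. At that point nothing will ever consume it in order, so it has to be freed explicitly by whatever code advances rcv_nxt or tears down the pcb.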
Thoughts?

Thanks,
Tom







