[lwip-devel] Out-of-order segments in half-closed connections


From: Ben Hastings
Subject: [lwip-devel] Out-of-order segments in half-closed connections
Date: Tue, 21 Apr 2009 16:49:51 -0400

I’ve found what I think is an issue with the TCP state machine, but I was hoping to get some feedback before declaring it a bug in lwIP.  The fix appears to make lwIP handle out-of-order segments in half-closed connections correctly, but I may well be overlooking some other scenario that it would break.

 

After an active close, the receive callback is usually called with pcb->state == TIME_WAIT once the remote FIN has been received.  Because the other end of a TCP connection can still send data after tcp_close is called, I do not free the tcp_arg data or set the receive callback to NULL until I receive the FIN.  The problem occurs under heavy network traffic when I have closed the connection and the segment before the remote FIN is lost.  In this case the receive callback is never called, because of the missing data, yet the connection closes anyway.

 

So it looks like lwIP transitions to TIME_WAIT upon receiving a FIN from the other side, regardless of whether the FIN segment is in order.  From that point on, any “lost” data that the remote side resends gets ACKed but is never delivered to the application.

 

The issue appears to be resolved by copying the same transition criteria for the ESTABLISHED state to FIN_WAIT_1 and FIN_WAIT_2.  A patch for the 1.3.0-STABLE version of tcp_in.c is below.

 

@@ -640,8 +640,8 @@
     }
     break;
   case FIN_WAIT_1:
-    tcp_receive(pcb);
-    if (flags & TCP_FIN) {
+    accepted_inseq = tcp_receive(pcb);
+    if ((flags & TCP_FIN) && accepted_inseq) {
       if (flags & TCP_ACK && ackno == pcb->snd_nxt) {
         LWIP_DEBUGF(TCP_DEBUG,
           ("TCP connection closed %"U16_F" -> %"U16_F".\n", inseg.tcphdr->src, inseg.tcphdr->dest));
@@ -659,8 +659,8 @@
     }
     break;
   case FIN_WAIT_2:
-    tcp_receive(pcb);
-    if (flags & TCP_FIN) {
+    accepted_inseq = tcp_receive(pcb);
+    if ((flags & TCP_FIN) && accepted_inseq) {
       LWIP_DEBUGF(TCP_DEBUG, ("TCP connection closed %"U16_F" -> %"U16_F".\n", inseg.tcphdr->src, inseg.tcphdr->dest));
       tcp_ack_now(pcb);
       tcp_pcb_purge(pcb);
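To illustrate what the changed condition does, here is a minimal, self-contained sketch. It is not lwIP code: every name in it (model_conn, model_seg, model_receive, model_input, model_replay) is hypothetical, it does not model an out-of-order queue, and it assumes the connection is already in FIN_WAIT_2 when the peer's FIN arrives. The sequence numbers are the relative ones from the capture in this message (lwIP expects 257, the FIN arrives at 341, and the 84 missing bytes are retransmitted afterwards).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of the receive side; NOT lwIP code. */
enum model_state { MODEL_FIN_WAIT_2, MODEL_TIME_WAIT };

struct model_conn {
    enum model_state state;
    uint32_t rcv_nxt;    /* next sequence number expected from the peer */
    uint32_t delivered;  /* payload bytes handed to the application */
};

struct model_seg {
    uint32_t seqno;
    uint16_t len;
    bool fin;
};

/* Accept a segment only if it starts exactly at rcv_nxt.
 * Returns true when the segment was accepted in sequence. */
static bool model_receive(struct model_conn *c, const struct model_seg *s)
{
    if (s->seqno != c->rcv_nxt) {
        return false;                      /* out of order: nothing delivered */
    }
    c->delivered += s->len;
    c->rcv_nxt += (uint32_t)s->len + (s->fin ? 1u : 0u); /* a FIN uses one sequence number */
    return true;
}

/* require_inseq == false mimics the unpatched condition (any FIN closes);
 * require_inseq == true mimics the patched one (only an in-sequence FIN closes). */
static void model_input(struct model_conn *c, const struct model_seg *s,
                        bool require_inseq)
{
    assert(c != NULL && s != NULL);
    if (c->state != MODEL_FIN_WAIT_2) {
        return;                            /* in TIME_WAIT nothing reaches the application */
    }
    bool inseq = model_receive(c, s);
    if (s->fin && (inseq || !require_inseq)) {
        c->state = MODEL_TIME_WAIT;
    }
}

/* Replay the three segments the peer sends in the capture. */
static struct model_conn model_replay(bool require_inseq)
{
    struct model_conn c = { MODEL_FIN_WAIT_2, 257, 0 };
    const struct model_seg segs[] = {
        { 341, 0, true },    /* FIN arrives while bytes 257..340 are missing */
        { 257, 84, false },  /* retransmitted data segment */
        { 341, 0, true },    /* retransmitted FIN */
    };
    for (size_t i = 0; i < sizeof segs / sizeof segs[0]; i++) {
        model_input(&c, &segs[i], require_inseq);
    }
    return c;
}
```

In this model, model_replay(false) ends in TIME_WAIT with 0 bytes delivered, while model_replay(true) delivers the 84 retransmitted bytes and only then enters TIME_WAIT, with rcv_nxt at 342.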

 

 

Here’s a packet capture (taken on 10.0.1.10) showing the problem.  lwIP is running on 10.0.0.98.

   8220 255.720272  10.0.1.10             10.0.0.98             TCP      3839 > 80 [SYN] Seq=0 Win=65535 Len=0 MSS=1460
   8226 255.721360  10.0.0.98             10.0.1.10             TCP      80 > 3839 [SYN, ACK] Seq=0 Ack=1 Win=2048 Len=0 MSS=256
   8227 255.721367  10.0.1.10             10.0.0.98             TCP      3839 > 80 [ACK] Seq=1 Ack=1 Win=65535 [TCP CHECKSUM INCORRECT] Len=0
   8228 255.721403  10.0.1.10             10.0.0.98             HTTP     GET /style.css HTTP/1.1
   8229 255.721408  10.0.1.10             10.0.0.98             HTTP     Continuation or non-HTTP traffic
   8234 255.722043  10.0.0.98             10.0.1.10             TCP      80 > 3839 [ACK] Seq=1 Ack=257 Win=2048 Len=0
   8235 255.722579  10.0.0.98             10.0.1.10             HTTP     Continuation or non-HTTP traffic
   8236 255.722583  10.0.0.98             10.0.1.10             HTTP     Continuation or non-HTTP traffic
   8237 255.722589  10.0.1.10             10.0.0.98             TCP      3839 > 80 [ACK] Seq=341 Ack=513 Win=65535 [TCP CHECKSUM INCORRECT] Len=0
   8239 255.723393  10.0.0.98             10.0.1.10             HTTP     Continuation or non-HTTP traffic
   8240 255.723398  10.0.0.98             10.0.1.10             HTTP     Continuation or non-HTTP traffic
   8241 255.723399  10.0.0.98             10.0.1.10             HTTP     Continuation or non-HTTP traffic
   8242 255.723405  10.0.1.10             10.0.0.98             TCP      3839 > 80 [ACK] Seq=341 Ack=1281 Win=65535 [TCP CHECKSUM INCORRECT] Len=0
   8243 255.724113  10.0.0.98             10.0.1.10             HTTP     Continuation or non-HTTP traffic
   8244 255.724117  10.0.0.98             10.0.1.10             HTTP     Continuation or non-HTTP traffic
   8245 255.724118  10.0.0.98             10.0.1.10             HTTP     Continuation or non-HTTP traffic
   8246 255.724124  10.0.1.10             10.0.0.98             TCP      3839 > 80 [ACK] Seq=341 Ack=2013 Win=65535 [TCP CHECKSUM INCORRECT] Len=0
   8247 255.724454  10.0.0.98             10.0.1.10             HTTP     Continuation or non-HTTP traffic
   8248 255.724898  10.0.0.98             10.0.1.10             HTTP     Continuation or non-HTTP traffic
   8249 255.724902  10.0.0.98             10.0.1.10             HTTP     Continuation or non-HTTP traffic
   8250 255.724904  10.0.0.98             10.0.1.10             HTTP     Continuation or non-HTTP traffic
   8251 255.724909  10.0.1.10             10.0.0.98             TCP      3839 > 80 [ACK] Seq=341 Ack=3037 Win=65535 [TCP CHECKSUM INCORRECT] Len=0
   8252 255.725587  10.0.0.98             10.0.1.10             HTTP     Continuation or non-HTTP traffic
   8253 255.725591  10.0.0.98             10.0.1.10             TCP      80 > 3839 [FIN, ACK] Seq=3249 Ack=257 Win=2048 Len=0
   8254 255.725597  10.0.1.10             10.0.0.98             TCP      3839 > 80 [ACK] Seq=341 Ack=3250 Win=65323 [TCP CHECKSUM INCORRECT] Len=0
   8255 255.725636  10.0.1.10             10.0.0.98             TCP      3839 > 80 [FIN, ACK] Seq=341 Ack=3250 Win=65323 [TCP CHECKSUM INCORRECT] Len=0
   8256 255.725930  10.0.0.98             10.0.1.10             TCP      [TCP Dup ACK 8253#1] 80 > 3839 [ACK] Seq=3250 Ack=257 Win=2048 Len=0
   8257 255.725935  10.0.1.10             10.0.0.98             HTTP     [TCP Out-Of-Order] Continuation or non-HTTP traffic
   8258 255.726268  10.0.0.98             10.0.1.10             TCP      80 > 3839 [ACK] Seq=3250 Ack=341 Win=2048 Len=0
   8259 258.080107  10.0.1.10             10.0.0.98             TCP      3839 > 80 [FIN, ACK] Seq=341 Ack=3250 Win=65323 [TCP CHECKSUM INCORRECT] Len=0
   8260 258.080419  10.0.0.98             10.0.1.10             TCP      80 > 3839 [ACK] Seq=3250 Ack=342 Win=2048 Len=0

 

The TCP debug output looks like this (lines in triple parentheses are my annotations):

TCP connection request 3839 -> 80.
TCP connection established 3839 -> 80.
tcp_recved: recveived 256 bytes, wnd 2048 (0).
(((the next segment is lost)))
tcp_close: closing in State: ESTABLISHED
TCP connection closed 3839 -> 80.
tcp_pcb_purge
tcp_pcb_purge: data left on ->ooseq
(((receive callback never called)))

 

Thanks

Ben Hastings

 

