monit-dev
[monit-dev] [monit] r288 committed - Remove the experimental usleep - it


From: monit
Subject: [monit-dev] [monit] r288 committed - Remove the experimental usleep - it seems the echo request is external...
Date: Sun, 26 Sep 2010 22:55:07 +0000

Revision: 288
Author: martin2812
Date: Sun Sep 26 15:54:17 2010
Log: Remove the experimental usleep - it seems the echo request is external.

Theory no. 2: The testing farm is set up for bidirectional ping (both hosts ping each other). It is possible that the ICMP request received instead of the expected response originates from the other host: since we have a raw socket open and are awaiting a response, we receive the other host's requests too. In other words, this is a race condition in which both hosts send an echo request at once and expect a reply, but the application receives the other host's request instead of the reply.

Setting the ICMP id to the pid (which the standalone ping program uses too) and printing the id of the unexpected echo request ... if it comes from the other host's monit, we'll see its pid in the log.
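
To illustrate the theory, here is a minimal sketch of how an incoming packet could be classified once the id is derived from the pid alone. It is not the code in net.c: every name prefixed with sketch_ is hypothetical, the struct is a simplified stand-in for struct icmp, and the sequence-range check is an assumption, since the diff does not show the actual condition. The type values 0 (echo reply) and 8 (echo request) are the ones defined in RFC 792.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ICMP type values from RFC 792. */
enum { SKETCH_ICMP_ECHOREPLY = 0, SKETCH_ICMP_ECHO = 8 };

/* Hypothetical, simplified view of the echo header fields used below. */
struct sketch_icmp {
  uint8_t  type;
  uint16_t id;
  uint16_t seq;
};

enum sketch_verdict {
  SKETCH_REPLY_OK,         /* echo reply carrying our id and a valid sequence */
  SKETCH_REPLY_MISMATCH,   /* echo reply, but id or sequence is not ours */
  SKETCH_FOREIGN_REQUEST,  /* echo request, e.g. sent by the peer pinging us */
  SKETCH_OTHER_TYPE        /* any other ICMP type */
};

/* Derive the 16-bit echo id from a pid, mirroring getpid() & 0xFFFF. */
static uint16_t sketch_echo_id(long pid) {
  return (uint16_t)(pid & 0xFFFF);
}

/* Classify a received packet against the id we sent.
 * Assumption: valid sequence numbers are 0 .. count - 1. */
static enum sketch_verdict sketch_classify(const struct sketch_icmp *in,
                                           uint16_t id_out, int count) {
  assert(in != NULL);
  if (in->type == SKETCH_ICMP_ECHOREPLY) {
    if (in->id == id_out && (int)in->seq < count)
      return SKETCH_REPLY_OK;
    return SKETCH_REPLY_MISMATCH;
  }
  if (in->type == SKETCH_ICMP_ECHO)
    return SKETCH_FOREIGN_REQUEST;  /* log in->id: it may be the peer's pid */
  return SKETCH_OTHER_TYPE;
}
```

With the id tied to the pid only, an id logged in the SKETCH_FOREIGN_REQUEST case can be compared directly against the pid of the monit process on the other host.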



http://code.google.com/p/monit/source/detail?r=288

Modified:
 /trunk/net.c

=======================================
--- /trunk/net.c        Sun Sep 26 11:45:11 2010
+++ /trunk/net.c        Sun Sep 26 15:54:17 2010
@@ -702,7 +702,7 @@
   }
 #endif

-  id_out = (getpid() + time(NULL)) & 0xFFFF;
+  id_out = getpid() & 0xFFFF;
   icmpout = (struct icmp *)buf;
   for (i = 0; i < count; i++) {
     int j;
@@ -738,13 +738,6 @@
       continue;
     }

-    /* Experimental: it seems monit sporadically reads its own ICMP echo request if the request is sent to the
-     * same host's network interface (such as for virtual hosts running on the same machine), whereas the raw
-     * socket seems to read it before target host gets the request. Need to investiagte it more, trying to delay
-     * the read to see if there will be difference (we see the transient 1/3 attempt failure sporadically on testing
-     * farm */
-    usleep(100);
-
     if (can_read(s, timeout)) {
       socklen_t size = sizeof(struct sockaddr_in);

@@ -770,7 +763,9 @@
           DEBUG("ICMP echo response %d/%d succeeded -- received id=%d sequence=%d response_time=%fs\n", i + 1, count, icmpin->icmp_id, icmpin->icmp_seq, response);
           break; // Wait for one response only
         } else
-          LogError("ICMP echo response %d/%d error -- received id=%d (expected id=%d), received sequence=%d (expected sequence 0-%d)\n", i + 1, count, icmpin->icmp_id, id_out, icmpin->icmp_seq, count - 1);
+          LogError("ICMP echo response %d/%d error -- received id=%d (expected id=%d), received sequence=%d (expected sequence=%d)\n", i + 1, count, icmpin->icmp_id, id_out, icmpin->icmp_seq, count - 1);
+      } else if (icmpin->icmp_type == ICMP_ECHO) {
+        LogError("ICMP echo response %d/%d failed -- received echo request instead of expected response, source id=%d (mine id=%d) sequence=%d (mine sequence=%d)\n", i + 1, count, icmpin->icmp_id, id_out, icmpin->icmp_seq, count - 1);
       } else
         LogError("ICMP echo response %d/%d failed -- invalid ICMP response type: %x (%s)\n", i + 1, count, icmpin->icmp_type, icmpin->icmp_type < 19 ? icmpnames[icmpin->icmp_type] : "unknown");
     } else
