From: Timothy Brownawell
Subject: Re: [Monotone-devel] Stalled server
Date: Wed, 06 Jan 2010 08:17:03 -0600
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.5) Gecko/20091204 Thunderbird/3.0
On 1/6/2010 7:21 AM, Richard Levitte wrote:
Since about a week, I've run a debug variant of the server on monotone.ca... It stalled today with the following backtrace:

(gdb) bt
#0  0xb7fe1424 in __kernel_vsyscall ()
#1  0xb7a055b1 in select () from /lib/i686/cmov/libc.so.6
#2  0x08396185 in Netxx::Probe_impl::probe (this=0xac85c50, timeout=..., rt=2) at netxx/probe_select.cxx:221
#3  0x0839df72 in Netxx::Socket::writable (this=0xac85c48, timeout=...) at netxx/socket.cxx:301
#4  0x0839e282 in Netxx::Socket::write (this=0xac85c48, buffer=0xcbd799c, length=451, timeout=...) at netxx/socket.cxx:178
#5  0x08398e4f in Netxx::Stream::write (this=0xb4bf720, buffer=0xcbd799c, length=451) at netxx/stream.cxx:132
When we get a new connection, it does get set non-blocking, but we also tell netxx to give it a long timeout, which netxx implements by select()ing on the socket before it writes (frames #3 and #4 above). So apparently we get a spurious 'writable' from the main select() loop, and then the select() inside the write call blocks until it times out.
I think it just needs something other than 'timeout' below, but I need to run to work right now.
$ grep -n -A1 -B6 "new Netxx::Stream" network/listener.cc
65-      // 'false' here means not to revert changes when the SockOpt
66-      // goes out of scope.
67-      Netxx::SockOpt socket_options(client.get_socketfd(), false);
68-      socket_options.set_non_blocking();
69-
70-      shared_ptr<Netxx::Stream> str =
71:        shared_ptr<Netxx::Stream>(new Netxx::Stream(client.get_socketfd(),
72-                                                    timeout));
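
To make the failure mode concrete, here is a minimal sketch in plain POSIX calls, not netxx. The helper names (set_non_blocking, wait_writable, try_write) are made up for illustration and are not part of netxx or monotone. wait_writable mirrors what the backtrace shows: a select() that sits there for the whole timeout if the socket is not actually writable. try_write is one possible alternative, assuming the caller already has an event loop to fall back to: attempt the write once and treat EAGAIN as "try again later" instead of waiting. Whether this is the right fix for listener.cc is an assumption, not something confirmed above.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: put a descriptor into non-blocking mode.
inline bool set_non_blocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    return flags != -1 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Hypothetical helper: wait up to timeout_ms for fd to become writable.
// With a long timeout this blocks the calling thread for the whole
// period if the socket never becomes writable, even though the socket
// itself is non-blocking.
inline bool wait_writable(int fd, long timeout_ms) {
    fd_set wfds;
    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);
    timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return select(fd + 1, nullptr, &wfds, nullptr, &tv) > 0;
}

// Hypothetical helper: a single write attempt that never waits.
// Returns the number of bytes written, 0 if the write would block
// (the caller should go back to its main loop), or -1 on a real error.
// Note that a zero-length write also returns 0.
inline ssize_t try_write(int fd, const void* buf, size_t len) {
    ssize_t n = write(fd, buf, len);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        return 0;
    }
    return n;
}
```

With a zero timeout, wait_writable degenerates into a plain poll, so a spurious 'writable' from the main loop costs one wasted call instead of a stall.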