Re: [Monotone-devel] monotone server hangs


From: Hugo Cornelis
Subject: Re: [Monotone-devel] monotone server hangs
Date: Tue, 29 Sep 2009 12:07:00 -0500

Overnight we upgraded our server to a Debian system with kernel
version 2.6.26.  This seems to have solved the problem.

Thanks for all your help.


Hugo


On Thu, Sep 24, 2009 at 1:15 PM, Hugo Cornelis <address@hidden> wrote:
> On Thu, Sep 24, 2009 at 1:33 AM, Stephen Leake
> <address@hidden> wrote:
>> Hugo Cornelis <address@hidden> writes:
>>
>>> From time to time, we reinstall a developer PC from scratch, recreate
>>> the appropriate directory layout and pull the monotone repositories
>>> from scratch.  Unfortunately, pulling a large monotone repository for
>>> the first time often hangs the monotone server.  The hang occurs in
>>> roughly 90% of attempts, though not every time.
>>> There are no error messages from either side.  Restarting the monotone
>>> server and interrupting the client allows one to retry the pull.
>>>
>>> We would appreciate any help to solve this problem.
>>
>> Not a direct solution to the problem, but a good workaround is to
>> simply use scp to copy the monotone database on a "from scratch" setup.
>>
>
> We just had the same problem on a partial pull.  When 'it' hangs, the
> server is in a select() system call and the client is idle (I don't know
> which system call it is in).
>
> Here is the strace output on the server:
>
> address@hidden monotone-0.45]# strace -p 1871
> Process 1871 attached - interrupt to quit
> [ Process PID=1871 runs in 32 bit mode. ]
> select(11, [9 10], [9 10], [9 10], {20945, 988000}) = 2 (in [10], out [10], left {20671, 308000})
> select(11, [10], NULL, NULL, {21600, 0}) = 1 (in [10], left {21600, 0})
> recv(10, 0xffd733e9, 262143, 0)         = -1 ETIMEDOUT (Connection timed out)
> write(3, "mtn-ns-sli: peer 129.111.247.96:"..., 88) = 88
> select(11, NULL, [10], NULL, {21600, 0}
>
> The first select() system call is where it hangs for some time, and
> then it continues with the second select() system call.  I am not an
> expert on select() programming, but it seems to say that data is ready
> to be read from the socket, resulting in the call to recv(), but this
> call times out.  The write() system call is presumably for logging?
>
> So after some time, the connection on the server side seems to time out
> (but the server is not ready, see below).  On the client side, the
> connection still hangs.  The client was a Mac in this case.
>
> I did a new pull on a different (linux) machine.  This connection
> hangs at the client side with the following output:
>
>
> [12:54] (0,8) ~ $ mtn --db /local_home/local_home/hugo/neurospaces_project/MTN/ns-sli.mtn pull --ticker=count repo-genesis3.cbi.utsa.edu:4692 "*"
> mtn: doing anonymous pull; use -kKEYNAME if you need authentication
> mtn: connecting to repo-genesis3.cbi.utsa.edu:4692
>
> For 'netstat -nap' I get the following output for this process:
>
>
> tcp        0      0 129.111.247.65:56980    129.115.117.89:4692     ESTABLISHED 814/mtn
>
> and for strace:
>
> [12:58] (0,9) ~ $ strace -p 814
> Process 814 attached - interrupt to quit
> select(7, [6], [], [6], {21388, 352000}
>
> This client has now been 'hanging' for about 20 minutes.
>
> The strace output on the server side did not change during these
> events, which I assume means that the first client is still blocking
> the server.
>
> Does anyone know how to continue from here?
>
>
> --
>
> Hugo
>
>
> --
>
>                    Hugo Cornelis Ph.D.
>
>              Neurospaces Project Architect
>                http://www.neurospaces.org/
>
>                  Research Imaging Center
>   University of Texas Health Science Center at San Antonio
>                    7703 Floyd Curl Drive
>                 San Antonio, TX  78284-6240
>
>                    Phone: 210 567 8112
>                      Fax: 210 567 8152
>
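
The select()/recv() sequence in the quoted server strace can look odd: select() reports the socket as readable, yet recv() returns no data. A socket is also reported readable when the peer has closed the connection or when an error such as ETIMEDOUT is pending on it, and recv() is the call that then surfaces that condition. The following is only a minimal sketch of that behaviour in Python, not monotone's actual netsync code; the helper name readable_then_recv is made up for illustration, and a local socket pair stands in for the real TCP connection (so it shows the end-of-file case rather than a real timeout).

```python
import select
import socket


def readable_then_recv(sock, timeout=1.0):
    """Wait until sock is reported readable, then show what recv() yields.

    Hypothetical helper for illustration only.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return "not readable"
    try:
        data = sock.recv(4096)
    except OSError as exc:
        # A pending socket error (for example ETIMEDOUT) surfaces here,
        # even though select() reported the socket as readable.
        return "error: " + (exc.strerror or str(exc))
    # An empty result means the peer closed the connection.
    return "eof" if data == b"" else "data: %d bytes" % len(data)


if __name__ == "__main__":
    a, b = socket.socketpair()
    b.sendall(b"ping")
    print(readable_then_recv(a))  # data: 4 bytes
    b.close()
    print(readable_then_recv(a))  # eof
    a.close()
```

In both calls select() reports the socket as readable; only the result of recv() tells real data apart from a closed or failed connection.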
