[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] Possible bug in monitor code
From: |
Luiz Capitulino |
Subject: |
Re: [Qemu-devel] Possible bug in monitor code |
Date: |
Fri, 24 Jan 2014 09:14:39 -0500 |
On Fri, 24 Jan 2014 12:14:12 +0200
Stratos Psomadakis <address@hidden> wrote:
> On 01/23/2014 08:28 PM, Luiz Capitulino wrote:
> > On Thu, 23 Jan 2014 17:33:33 +0200
> > Stratos Psomadakis <address@hidden> wrote:
> >
> >> On 01/23/2014 03:54 PM, Luiz Capitulino wrote:
> >>> On Thu, 23 Jan 2014 08:44:02 -0500
> >>> Luiz Capitulino <address@hidden> wrote:
> >>>
> >>>> On Thu, 23 Jan 2014 19:23:51 +0800
> >>>> Fam Zheng <address@hidden> wrote:
> >>>>
> >>>>> Bcc:
> >>>>> Subject: Re: [Qemu-devel] Possible bug in monitor code
> >>>>> Reply-To:
> >>>>> In-Reply-To: <address@hidden>
> >>>>>
> >>>>> On Thu, 01/23 12:17, Stratos Psomadakis wrote:
> >>>>>> On 01/23/2014 05:07 AM, Fam Zheng wrote:
> >>>>>>> On Wed, 01/22 17:53, Stratos Psomadakis wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> we've encountered a weird issue regarding monitor (qmp and hmp)
> >>>>>>>> behavior
> >>>>>>>> with qemu-1.7 (and qemu-1.5). The following steps will reproduce the
> >>>>>>>> issue:
> >>>>>>>>
> >>>>>>>> 1) Client A connects to qmp socket with socat
> >>>>>>>> 2) Client A gets greeting message {"QMP": {"version": ..}
> >>>>>>>> 3) Client A waits (select on the socket's fd)
> >>>>>>>> 4) Client B tries to connect to the *same* qmp socket with socat
> >>>> QMP/HMP can only handle a single client per connection, so this is
> >>>> not supported. What you could do is to create multiple QMP/HMP instances
> >>>> on the command-line, but this has never been fully tested so we don't
> >>>> officially support this either (although it may work).
> >>>>
> >>>> Now, not gracefully failing on step 4 is a real bug here. I think the
> >>>> best thing to do would be to close client's B connection. Patches are
> >>>> welcome :)
> >>> Thinking about this again, I think this should already be the case
> >>> (as I don't believe chardev code is sitting on accept()). So maybe
> >>> you triggered a race on chardev code or broke something there.
> >> Yes, the chardev code doesn't accept any more connections until the
> >> currently active connection is closed.
> >>
> >> However, when a new client tries to connect (while there's another qmp /
> >> hmp connection active) the kernel, as long as there's room in the socket
> >> backlog, will return to the client that the connection has been
> >> successfully established and will queue the connection to be accepted
> >> later, when qmp actually finishes with its active connection and
> >> re-executes accept().
> >>
> >> The problem arises if the new client closes the connection before the
> >> older one is finished. When qmp runs accept() again, it will get a
> >> socket fd for a client who has now closed the connection. At this point,
> >> the monitor control event handler will get triggered and try to write /
> >> flush the greeting message to the client. The client however has closed
> >> its socket so the write will fail with SIGPIPE / EPIPE. Neither the
> >> qemu-char nor the monitor code seem to handle this situation.
> >>
> >> Btw, have you tried to reproduce it?
> > Not yet, I may have some time tomorrow. How reproducible is it for you?
>
> We can trigger it (by following the steps described in the first mail)
> consistently.
>
> > Another question: have you tried to reproduce with an old qemu version
> > (say v1.0) to see if this bug always existed? If the bug was introduced
> > in some recent QEMU version you could try to bisect it.
>
> v1.1 is not affected. I checked the code and it seems the monitor code
> has been refactored since v1.1.
>
> > Maybe you could try to reproduce with a different subsystem so that we
> > can rule out or confirm monitor's involvement? Like -serial?
>
> It's actually a fault of the monitor_flush() function. As far as I can
> understand, monitor_flush() calls qemu_chr_fe_write() and doesn't handle
> all of the return codes / error cases properly (as I described in a
> previous mail). If you check the function, you'll see that the final
> case (where it set ups a watch / callback) always assumes an EAGAIN /
> EWOULDBLOCK error.
>
> If you can verify / confirm that this is the case and that the patch
> sent resolves the issue in a sane / correct way, I'll resubmit it
> properly (with git-format-patch, a git log msg etc).
I'll check that later today.
- [Qemu-devel] Possible bug in monitor code, Stratos Psomadakis, 2014/01/22
- Re: [Qemu-devel] Possible bug in monitor code, Fam Zheng, 2014/01/23
- Re: [Qemu-devel] Possible bug in monitor code, Luiz Capitulino, 2014/01/23
- Re: [Qemu-devel] Possible bug in monitor code, Luiz Capitulino, 2014/01/23
- Re: [Qemu-devel] Possible bug in monitor code, Stratos Psomadakis, 2014/01/23
- Re: [Qemu-devel] Possible bug in monitor code, Luiz Capitulino, 2014/01/23
- Re: [Qemu-devel] Possible bug in monitor code, Stratos Psomadakis, 2014/01/24
- Re: [Qemu-devel] Possible bug in monitor code,
Luiz Capitulino <=
- Re: [Qemu-devel] Possible bug in monitor code, Apollon Oikonomopoulos, 2014/01/24