[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] Possible bug in monitor code
From: |
Luiz Capitulino |
Subject: |
Re: [Qemu-devel] Possible bug in monitor code |
Date: |
Fri, 24 Jan 2014 18:48:22 -0500 |
On Wed, 22 Jan 2014 17:53:52 +0200
Stratos Psomadakis <address@hidden> wrote:
> Hi,
>
> we've encountered a weird issue regarding monitor (qmp and hmp) behavior
> with qemu-1.7 (and qemu-1.5). The following steps will reproduce the issue:
>
> 1) Client A connects to qmp socket with socat
> 2) Client A gets greeting message {"QMP": {"version": ..}
> 3) Client A waits (select on the socket's fd)
> 4) Client B tries to connect to the *same* qmp socket with socat
> 5) Client B does *NOT* get any greating message
> 6) Client B waits (select on the socket's fd)
> 7) Client B closes connection (kill socat)
> 8) Client A quits too
> 9) Client C connects to qmp socket
> 10) Client C gets *two* greeting messages!!!
>
> After some investigation, we traced it down to the monitor_flush()
> function in monitor.c. Specifically, when a second client connects to
> the qmp (client B), while another one is already using it (client A), we
> get the following from stracing the second client (client B):
>
> connect(3, {sa_family=AF_FILE, path="foo.mon"}, 9) = 0
> getsockname(3, {sa_family=AF_FILE, NULL}, [2]) = 0
> select(4, [0 3], [1 3], [], NULL) = 2 (out [1 3])
> select(4, [0 3], [], [], NULL
>
> So, the connect() syscall from client B succeeds, although client B
> connection has not yet been accepted by the qmp server (it's still in
> the backlog of the qmp listening socket).
>
> After killing client B and then client A, we see the following when
> stracing the qemu proc:
>
> 22363 accept4(6, {sa_family=AF_FILE, NULL}, [2], SOCK_CLOEXEC) = 9
> 22363 fcntl(9, F_GETFL) = 0x2 (flags O_RDWR)
> 22363 fcntl(9, F_SETFL, O_RDWR|O_NONBLOCK) = 0
> 22363 fstat(9, {st_mode=S_IFSOCK|0777, st_size=0, ...}) = 0
> 22363 fcntl(9, F_GETFL) = 0x802 (flags
> O_RDWR|O_NONBLOCK)
> 22363 write(9, "{\"QMP\": {\"version\": {\"qemu\": {\"m"..., 127) =
> -1 EPIPE (Broken pipe)
> 22363 --- SIGPIPE (Broken pipe) @ 0 (0) ---
>
> The qmp server / qemu accepts the connection from client B (who has now
> closed the connection) and tries to write the greeting message to the
> socket fd. This results in write returning an error (EPIPE).
>
> The monitor_flush() function doesn't seem to handle this case (write
> error). Instead, it adds a watch / handler to retry the write operation.
> Thus, mon->outbuf is not cleaned up properly, which results in duplicate
> greeting messages for the next client to connect.
>
> The following seems to do the trick.
>
> diff --git a/monitor.c b/monitor.c
> index 845f608..5622f20 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -288,8 +288,8 @@ void monitor_flush(Monitor *mon)
>
> if (len && !mon->mux_out) {
> rc = qemu_chr_fe_write(mon->chr, (const uint8_t *) buf, len);
> - if (rc == len) {
> - /* all flushed */
> + if ((rc < 0 && errno != EAGAIN) || (rc == len)) {
> + /* all flushed or error */
> QDECREF(mon->outbuf);
> mon->outbuf = qstring_new();
> return;
>
> Comments?
I can reproduce the problem very easily and I can't think of a better
way to fix it.
The right thing to do, I guess, would be to move the error up and kill
the connection if there's any. But last time I checked the chardev
layer did not have an API to kill a connection...
- Re: [Qemu-devel] Possible bug in monitor code, Fam Zheng, 2014/01/23
- Re: [Qemu-devel] Possible bug in monitor code, Luiz Capitulino, 2014/01/23
- Re: [Qemu-devel] Possible bug in monitor code, Luiz Capitulino, 2014/01/23
- Re: [Qemu-devel] Possible bug in monitor code, Stratos Psomadakis, 2014/01/23
- Re: [Qemu-devel] Possible bug in monitor code, Luiz Capitulino, 2014/01/23
- Re: [Qemu-devel] Possible bug in monitor code, Stratos Psomadakis, 2014/01/24
- Re: [Qemu-devel] Possible bug in monitor code, Luiz Capitulino, 2014/01/24