[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v3 02/10] main-loop: switch to g_poll() on POSIX
From: |
Laszlo Ersek |
Subject: |
Re: [Qemu-devel] [PATCH v3 02/10] main-loop: switch to g_poll() on POSIX hosts |
Date: |
Mon, 04 Feb 2013 15:37:39 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:10.0.12) Gecko/20130108 Thunderbird/10.0.12 |
Comments in-line.
On 02/04/13 13:12, Stefan Hajnoczi wrote:
> Use g_poll(3) instead of select(2). Well, this is kind of a cheat.
> It's true that we're now using g_poll(3) on POSIX hosts but the *_fill()
> and *_poll() functions are still using rfds/wfds/xfds.
>
> We've set the scene to start converting *_fill() and *_poll() functions
> step-by-step until no more rfds/wfds/xfds users remain. Then we'll drop
> the temporary gpollfds_from_select() and gpollfds_to_select() functions
> and be left with native g_poll(2).
>
> On Windows things are a little crazy: convert from rfds/wfds/xfds to
> GPollFDs, back to rfds/wfds/xfds, call select(2), rfds/wfds/xfds back to
> GPollFDs, and finally back to rfds/wfds/xfds again. This is only
> temporary and keeps the Windows build working through the following
> patches. We'll drop this excessive conversion later and be left with a
> single GPollFDs -> select(2) -> GPollFDs sequence that allows Windows to
> use select(2) while the rest of QEMU only knows about GPollFD.
>
> Signed-off-by: Stefan Hajnoczi <address@hidden>
> ---
> main-loop.c | 135
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
> 1 file changed, 127 insertions(+), 8 deletions(-)
I assume this is complex because the pre-patch situation is overly
complex as well:
slirp -> fd_set
iohandler -> fd_set
unix windows
--------------------------------- --------------------------------------------
before after before after
-------------- ----------------- --------------------- ---------------------
glib -> fd_set glib -> fd_set glib -> GPollFD glib -> GPollFD
fd_set -> GPollFD WaitObj -> GPollFD WaitObj -> GPollFD
SELECT G_POLL G_POLL (glib/WaitObj) G_POLL (glib/WaitObj)
fd_set -> GPollFD
GPollFD -> fd_set
SELECT (slirp/ioh.) SELECT (slirp/ioh.)
(I'm ignoring the after-select / after-poll operations; they (should)
mirror the pre-select / pre-poll ones.)
No idea why the windows version has been mixing g_poll() and select().
I'd hope that this series kills select() for uniformity's sake, but the
00/10 subject and the commit msg for this patch indicate otherwise.
"main-loop.c" is full of global variables which makes it a *pain* to read.
>
> diff --git a/main-loop.c b/main-loop.c
> index d0d8fe4..f1dcd14 100644
> --- a/main-loop.c
> +++ b/main-loop.c
> @@ -117,6 +117,8 @@ void qemu_notify_event(void)
> aio_notify(qemu_aio_context);
> }
>
> +static GArray *gpollfds;
> +
> int qemu_init_main_loop(void)
> {
> int ret;
> @@ -133,6 +135,7 @@ int qemu_init_main_loop(void)
> return ret;
> }
>
> + gpollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD));
> qemu_aio_context = aio_context_new();
> src = aio_get_g_source(qemu_aio_context);
> g_source_attach(src, NULL);
> @@ -146,6 +149,62 @@ static GPollFD poll_fds[1024 * 2]; /* this is probably
> overkill */
> static int n_poll_fds;
> static int max_priority;
>
> +/* Load rfds/wfds/xfds into gpollfds. Will be removed a few commits later.
> */
> +static void gpollfds_from_select(void)
> +{
> + int fd;
> + for (fd = 0; fd <= nfds; fd++) {
"nfds" is an example of a global variable being global ("file scope")
for no good reason. Anyway it does seem to have inclusive meaning
("highest file descriptor in all sets").
> + int events = 0;
> + if (FD_ISSET(fd, &rfds)) {
> + events |= G_IO_IN | G_IO_HUP | G_IO_ERR;
> + }
("=" would have sufficed)
> + if (FD_ISSET(fd, &wfds)) {
> + events |= G_IO_OUT | G_IO_ERR;
> + }
> + if (FD_ISSET(fd, &xfds)) {
> + events |= G_IO_PRI;
> + }
> + if (events) {
> + GPollFD pfd = {
> + .fd = fd,
> + .events = events,
> + };
> + g_array_append_val(gpollfds, pfd);
> + }
> + }
> +}
(I don't like "gpollfds" being global, but) this seems OK. It matches
the glib docs and also How the Mapping Should Be Done (TM).
This function corresponds to the "unix / after / fd_set -> GPollFD"
entry of the diagram, and as such it is composed with "glib -> fd_set"
(glib_select_fill()). I'll note that glib_select_fill() sets "xfds" for
G_IO_ERR but not G_IO_PRI. (According to SUSv4, in our case xfds is the
logical-or of "error pending" and "OOB/urgent pending".)
We can assume that each entry stored by g_main_context_query() will have
at least one of IN and OUT set in "events". Further assuming that glib
follows its own documentation, that implies that G_IO_ERR will be set
unconditionally (because it always accompanies each of IN and OUT).
glib_select_fill() will map that to xfds, which we then map here to
G_IO_PRI. The total effect is that G_IO_PRI is set on all file
descriptors, always, even if we only try to write.
I think ultimately support for OOB data should be killed throughout, as
a policy, including xfds & G_IO_PRI. Pending errors are returned by
read()/write()/etc just fine; see eg. libvirt commit d19149dd.
> +
> +/* Store gpollfds revents into rfds/wfds/xfds. Will be removed a few commits
> + * later.
> + */
> +static void gpollfds_to_select(int ret)
> +{
> + int i;
> +
> + FD_ZERO(&rfds);
> + FD_ZERO(&wfds);
> + FD_ZERO(&xfds);
> +
> + if (ret <= 0) {
> + return;
> + }
ie. for g_poll() timeouts, errors & signal interrupts, we clear the sets
and that's it. OK.
> +
> + for (i = 0; i < gpollfds->len; i++) {
> + int fd = g_array_index(gpollfds, GPollFD, i).fd;
> + int revents = g_array_index(gpollfds, GPollFD, i).revents;
> +
> + if (revents & (G_IO_IN | G_IO_HUP | G_IO_ERR)) {
> + FD_SET(fd, &rfds);
> + }
> + if (revents & (G_IO_OUT | G_IO_ERR)) {
> + FD_SET(fd, &wfds);
> + }
> + if (revents & G_IO_PRI) {
> + FD_SET(fd, &xfds);
> + }
> + }
> +}
Looks good.
> +
> #ifndef _WIN32
> static void glib_select_fill(int *max_fd, fd_set *rfds, fd_set *wfds,
> fd_set *xfds, uint32_t *cur_timeout)
> @@ -212,22 +271,22 @@ static void glib_select_poll(fd_set *rfds, fd_set
> *wfds, fd_set *xfds,
>
> static int os_host_main_loop_wait(uint32_t timeout)
> {
> - struct timeval tv, *tvarg = NULL;
> int ret;
>
> glib_select_fill(&nfds, &rfds, &wfds, &xfds, &timeout);
>
> - if (timeout < UINT32_MAX) {
> - tvarg = &tv;
> - tv.tv_sec = timeout / 1000;
> - tv.tv_usec = (timeout % 1000) * 1000;
> - }
> -
> if (timeout > 0) {
> qemu_mutex_unlock_iothread();
> }
>
> - ret = select(nfds + 1, &rfds, &wfds, &xfds, tvarg);
> + /* We'll eventually drop fd_set completely. But for now we still have
> + * *_fill() and *_poll() functions that use rfds/wfds/xfds.
> + */
> + gpollfds_from_select();
> +
> + ret = g_poll((GPollFD *)gpollfds->data, gpollfds->len, timeout);
This exploits for the timeout parameter that (int)UINT32_MAX equals -1
on all supported platforms.
Timeout values in [ INT_MAX+1, UINT_MAX32-1 ] lose their meaning, but I
guess that's no problem in practice.
> +
> + gpollfds_to_select(ret);
>
> if (timeout > 0) {
> qemu_mutex_lock_iothread();
> @@ -327,6 +386,55 @@ void qemu_fd_register(int fd)
> FD_CONNECT | FD_WRITE | FD_OOB);
> }
>
> +static int pollfds_fill(GArray *pollfds, fd_set *rfds, fd_set *wfds,
> + fd_set *xfds)
> +{
> + int nfds = -1;
"nfds" shadows the global "nfds". The return value of pollfds_fill() is
then assigned to the global "nfds". Not very elegant (but it's all
rooted in the existense of the global nfds).
> + int i;
> +
> + for (i = 0; i < pollfds->len; i++) {
> + GPollFD *pfd = &g_array_index(pollfds, GPollFD, i);
> + int fd = pfd->fd;
> + int events = pfd->events;
> + if (events & (G_IO_IN | G_IO_HUP | G_IO_ERR)) {
> + FD_SET(fd, rfds);
> + nfds = MAX(nfds, fd);
> + }
> + if (events & (G_IO_OUT | G_IO_ERR)) {
> + FD_SET(fd, wfds);
> + nfds = MAX(nfds, fd);
> + }
> + if (events & G_IO_PRI) {
> + FD_SET(fd, xfds);
> + nfds = MAX(nfds, fd);
> + }
> + }
> + return nfds;
> +}
> +
> +static void pollfds_poll(GArray *pollfds, int nfds, fd_set *rfds,
> + fd_set *wfds, fd_set *xfds)
> +{
The "nfds" parameter both shadows the global variable and is unused in
the function.
> + int i;
> +
> + for (i = 0; i < pollfds->len; i++) {
> + GPollFD *pfd = &g_array_index(pollfds, GPollFD, i);
> + int fd = pfd->fd;
> + int revents = 0;
> +
> + if (FD_ISSET(fd, rfds)) {
> + revents |= G_IO_IN | G_IO_HUP | G_IO_ERR;
> + }
("=" would have sufficed)
> + if (FD_ISSET(fd, wfds)) {
> + revents |= G_IO_OUT | G_IO_ERR;
> + }
> + if (FD_ISSET(fd, xfds)) {
> + revents |= G_IO_PRI;
> + }
> + pfd->revents |= revents & pfd->events;
I find this |= vs. = more confusing than the other two. "pfd->revents"
is zero here.
> + }
> +}
> +
> static int os_host_main_loop_wait(uint32_t timeout)
> {
> GMainContext *context = g_main_context_default();
> @@ -382,12 +490,22 @@ static int os_host_main_loop_wait(uint32_t timeout)
> * improve socket latency.
> */
>
> + /* This back-and-forth between GPollFDs and select(2) is temporary.
> We'll
> + * drop it in a couple of patches, I promise :).
> + */
> + gpollfds_from_select();
> + FD_ZERO(&rfds);
> + FD_ZERO(&wfds);
> + FD_ZERO(&xfds);
> + nfds = pollfds_fill(gpollfds, &rfds, &wfds, &xfds);
> if (nfds >= 0) {
> select_ret = select(nfds + 1, &rfds, &wfds, &xfds, &tv0);
> if (select_ret != 0) {
> timeout = 0;
> + pollfds_poll(gpollfds, nfds, &rfds, &wfds, &xfds);
This function shouldn't be invoked for "select_ret == -1" (in case
that's possible here); "revents" will end up a copy of "events":
"On failure, the objects pointed to by the readfds, writefds, and
errorfds arguments shall not be modified."
> }
> }
> + gpollfds_to_select(select_ret || g_poll_ret);
I can see this logical-or mimics the return value below; however I think
it's not robust. If one of the operands is -1, and the other is
non-positive, then gpollfds_to_select() should not traverse the
GPollFDs, but it will.
... Actually, "g_poll_ret" should have absolutely no effect on
gpollfds_to_select(): "g_poll_ret" corresponds to the "poll_fds" array,
which covers the glib-internal file descriptors and the WaitObjects.
Here (= in the windows case), "gpollfds" covers only slirp & iohandler,
thus only "select_ret" should be passed in.
>
> return select_ret || g_poll_ret;
I find this return value unfathomable (how is (-1||0) equivalent to
(1||1)?), but it's out of scope for now.
> }
> @@ -403,6 +521,7 @@ int main_loop_wait(int nonblocking)
> }
>
> /* poll any events */
> + g_array_set_size(gpollfds, 0); /* reset for new iteration */
> /* XXX: separate device handlers from system ones */
> nfds = -1;
> FD_ZERO(&rfds);
I can't decide (yet) if any of the above is important; probably not.
>From a quick look at the tree at the series' end, most of it seems to be
gone by then. I'll let you decide.
Reviewed-by: Laszlo Ersek <address@hidden>
- [Qemu-devel] [PATCH v3 00/10] main-loop: switch to g_poll(3) on POSIX hosts, Stefan Hajnoczi, 2013/02/04
- [Qemu-devel] [PATCH v3 01/10] main-loop: fix select_ret uninitialized variable warning, Stefan Hajnoczi, 2013/02/04
- [Qemu-devel] [PATCH v3 03/10] main-loop: switch POSIX glib integration to GPollFD, Stefan Hajnoczi, 2013/02/04
- [Qemu-devel] [PATCH v3 02/10] main-loop: switch to g_poll() on POSIX hosts, Stefan Hajnoczi, 2013/02/04
- Re: [Qemu-devel] [PATCH v3 02/10] main-loop: switch to g_poll() on POSIX hosts,
Laszlo Ersek <=
- Re: [Qemu-devel] [PATCH v3 02/10] main-loop: switch to g_poll() on POSIX hosts, Stefan Hajnoczi, 2013/02/04
- Re: [Qemu-devel] [PATCH v3 02/10] main-loop: switch to g_poll() on POSIX hosts, Laszlo Ersek, 2013/02/04
- Re: [Qemu-devel] [PATCH v3 02/10] main-loop: switch to g_poll() on POSIX hosts, Paolo Bonzini, 2013/02/04
- Re: [Qemu-devel] [PATCH v3 02/10] main-loop: switch to g_poll() on POSIX hosts, Fabien Chouteau, 2013/02/04
- Re: [Qemu-devel] [PATCH v3 02/10] main-loop: switch to g_poll() on POSIX hosts, Paolo Bonzini, 2013/02/04
[Qemu-devel] [PATCH v3 06/10] iohandler: switch to GPollFD, Stefan Hajnoczi, 2013/02/04
[Qemu-devel] [PATCH v3 05/10] slirp: switch to GPollFD, Stefan Hajnoczi, 2013/02/04