Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work wi

From:	Corey Bryant
Subject:	Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets
Date:	Wed, 25 Jul 2012 23:57:37 -0400
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120615 Thunderbird/13.0.1



On 07/25/2012 03:43 PM, Eric Blake wrote:

On 07/23/2012 07:08 AM, Corey Bryant wrote:

When qemu_open is passed a filename of the "/dev/fdset/nnn"
format (where nnn is the fdset ID), an fd with matching access
mode flags will be searched for within the specified monitor
fd set.  If the fd is found, a dup of the fd will be returned
from qemu_open.

Each fd set has a reference count.  The purpose of the reference
count is to determine if an fd set contains file descriptors that
have open dup() references that have not yet been closed.  It is
incremented on qemu_open and decremented on qemu_close.  It is
not until the refcount is zero that file desriptors in an fd set
can be closed.  If an fd set has dup() references open, then we
must keep the other fds in the fd set open in case a reopen
of the file occurs that requires an fd with a different access
mode.

+++ b/monitor.c
@@ -2551,6 +2551,91 @@ static void monitor_fdsets_set_in_use(Monitor *mon, bool 
in_use)
      }
  }

+void monitor_fdset_increment_refcount(Monitor *mon, int64_t fdset_id)
+{
+    mon_fdset_t *mon_fdset;
+
+    if (!mon) {
+        return;
+    }


Am I reading this code right by stating that 'if there is no monitor, we
don't increment the refcount'?  How does a monitor reattach affect
things?  Or am I missing something fundamental about the cases when
'mon==NULL' will exist?


Yes you're reading this correctly.

I'm pretty sure that mon will only be NULL if QEMU is started without amonitor.

If QEMU has a monitor, and libvirt disconnects it's connection to theqemu monitor, then I believe mon will remain non-NULL.

I'll plan on testing this out to verify though. (I'm out most of thisweek and will be back full time starting next Tues.)

+int monitor_fdset_get_fd(Monitor *mon, int64_t fdset_id, int flags)
+{
+    mon_fdset_t *mon_fdset;
+    mon_fdset_fd_t *mon_fdset_fd;
+    int mon_fd_flags;
+
+    if (!mon) {
+        errno = ENOENT;
+        return -1;
+    }
+
+    QLIST_FOREACH(mon_fdset, &mon->fdsets, next) {
+        if (mon_fdset->id != fdset_id) {
+            continue;
+        }
+        QLIST_FOREACH(mon_fdset_fd, &mon_fdset->fds, next) {
+            if (mon_fdset_fd->removed) {
+                continue;
+            }
+
+            mon_fd_flags = fcntl(mon_fdset_fd->fd, F_GETFL);
+            if (mon_fd_flags == -1) {
+                return -1;


This says we fail on the first fcntl() failure, instead of trying other
fds in the set.  Granted, an fcntl() failure is probably the sign of a
bigger bug (such as closing an fd at the wrong point in time), so I
guess trying to go on doesn't make much sense once we already know we
are hosed.

I think I'll stick with it the way it is. If fcntl() fails we mighthave a tainted fd set so I think we should fail.

+            }
+
+            switch (flags & O_ACCMODE) {
+            case O_RDWR:
+                if ((mon_fd_flags & O_ACCMODE) == O_RDWR) {
+                    return mon_fdset_fd->fd;
+                }
+                break;
+            case O_RDONLY:
+                if ((mon_fd_flags & O_ACCMODE) == O_RDONLY) {
+                    return mon_fdset_fd->fd;
+                }
+                break;


Do we want to allow the case where the caller asked for O_RDONLY, but
the set only has O_RDWR?  After all, the caller is getting a compatible
subset of what the set offers.


I don't see a problem with it.

+            case O_WRONLY:
+                if ((mon_fd_flags & O_ACCMODE) == O_WRONLY) {
+                    return mon_fdset_fd->fd;
+                }
+                break;


Likewise, should we allow a caller asking for O_WRONLY when the set
provides only O_RDWR?


I don't see a problem with it.


+/*
+ * Dups an fd and sets the flags
+ */
+static int qemu_dup(int fd, int flags)
+{
+    int i;
+    int ret;
+    int serrno;
+    int dup_flags;
+    int setfl_flags[] = { O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME,
+                          O_NONBLOCK, 0 };
+
+    if (flags & O_CLOEXEC) {
+        ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);


F_DUPFD_CLOEXEC is required by POSIX but not implemented on all modern
OS yet; you probably need some #ifdef and/or configure guards.

Ok

+        if (ret == -1 && errno == EINVAL) {
+            ret = dup(fd);
+            if (ret != -1 && fcntl_setfl(ret, O_CLOEXEC) == -1) {


You _can't_ call F_SETFL with O_CLOEXEC.  O_CLOEXEC only causes open()
to set FD_CLOEXEC; thereafter, including in the case of this dup, what
you want to do is instead set FD_CLOEXEC via F_SETFD (aka call
qemu_set_cloexec, not fcntl_setfl).

I know, this is a mistake. I'm planning to replace fcntl_setfl() herewith qemu_set_cloexec().

+                goto fail;
+            }
+        }
+    } else {
+        ret = dup(fd);
+    }
+
+    if (ret == -1) {
+        goto fail;
+    }
+
+    dup_flags = fcntl(ret, F_GETFL);
+    if (dup_flags == -1) {
+        goto fail;
+    }
+
+    if ((flags & O_SYNC) != (dup_flags & O_SYNC)) {
+        errno = EINVAL;
+        goto fail;
+    }
+
+    /* Set/unset flags that we can with fcntl */
+    i = 0;
+    while (setfl_flags[i] != 0) {
+        if (flags & setfl_flags[i]) {
+            dup_flags = (dup_flags | setfl_flags[i]);
+        } else {
+            dup_flags = (dup_flags & ~setfl_flags[i]);
+        }
+        i++;
+    }


Rather than looping one bit at a time, you should do this as a mask
operation.


I agree.

+
+    if (fcntl(ret, F_SETFL, dup_flags) == -1) {
+        goto fail;
+    }
+
+    /* Truncate the file in the cases that open() would truncate it */
+    if (flags & O_TRUNC ||
+            ((flags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL))) {
+        if (ftruncate(ret, 0) == -1) {
+            goto fail;
+        }
+    }
+
+    qemu_set_cloexec(ret);


If we're going to blindly set FD_CLOEXEC at the end of the day, rather
than try to honor O_CLOEXEC, then why not simplify the beginning of this
function:


This call to qemu_set_cloexec() was a mistake.  I'm planning on removing it.


     ret = fcntl(fd, F_DUPFD_CLOEXEC, 0);
     if (ret == -1 && errno == EINVAL) {
         ret = dup(fd);
         if (ret != -1) {
             qemu_set_cloexec(ret);
         }
     }
     if (ret == -1) {
         goto fail;
     }

I'll plan on sticking with the existing code in the beginning of thisfunction with the modifications mentioned above.


--
Regards,
Corey

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets, (continued)
- [Qemu-devel] [PATCH v5 5/6] block: Convert close calls to qemu_close, Corey Bryant, 2012/07/23
- [Qemu-devel] [PATCH v5 1/6] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg, Corey Bryant, 2012/07/23
  - Re: [Qemu-devel] [PATCH v5 1/6] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg, Eric Blake, 2012/07/23
    - Re: [Qemu-devel] [PATCH v5 1/6] qemu-char: Add MSG_CMSG_CLOEXEC flag to recvmsg, Corey Bryant, 2012/07/23
- [Qemu-devel] [PATCH v5 4/6] block: Convert open calls to qemu_open, Corey Bryant, 2012/07/23
  - Re: [Qemu-devel] [PATCH v5 4/6] block: Convert open calls to qemu_open, Eric Blake, 2012/07/25
    - Re: [Qemu-devel] [PATCH v5 4/6] block: Convert open calls to qemu_open, Corey Bryant, 2012/07/25

Prev by Date: [Qemu-devel] [PATCH V5] VMWare PVSCSI paravirtual device implementation
Next by Date: [Qemu-devel] [PATCH] eventfd: making it thread safe
Previous by thread: Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets
Next by thread: Re: [Qemu-devel] [PATCH v5 6/6] block: Enable qemu_open/close to work with fd sets
Index(es):
- Date
- Thread