Re: [Qemu-devel] [RFC] Replace posix-aio with custom thread pool


From: Gerd Hoffmann
Subject: Re: [Qemu-devel] [RFC] Replace posix-aio with custom thread pool
Date: Thu, 11 Dec 2008 17:11:08 +0100
User-agent: Thunderbird 2.0.0.18 (X11/20081119)

Andrea Arcangeli wrote:
>>   * It can't handle block allocation.  Kernel handles that by doing
>>     such writes synchronously via VFS layer (instead of the separate
>>     aio code paths).  Leads to horrible performance and bug reports
>>     such as "installs on sparse files are very slow".
> 
> I think here you mean O_DIRECT regardless of aio/sync API,

Yes.  But kernel aio requires O_DIRECT, so aio users are affected
nevertheless.

> So in kernels that don't support IOCB_CMD_READV/WRITEV, we simply have
> to pass an array of iocb through io_submit (i.e. to convert the iov into a
> vector of iocb, instead of a single iocb pointing to the
> iov). Internally to io_submit a single dma command should be generated
> and the same sg list should be built the same as if we used
> READV/WRITEV. In theory READV/WRITEV should be just a cpu saving
> feature, it shouldn't influence disk bandwidth. If it does, it means
> the bio layer is broken and needs fixing.

Haven't tested that.  Could be it isn't a big problem, extra code size
for the two modes aside.
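
For illustration, the fallback described in the quoted text (one iocb per
iovec element instead of a single vectored iocb) could look roughly like the
sketch below.  The helper name iov_to_iocbs() and its signature are
hypothetical, not taken from QEMU or the kernel; only struct iocb and the
IOCB_CMD_PREAD/IOCB_CMD_PWRITE opcodes come from <linux/aio_abi.h>.  It only
fills in the request array and does not call io_submit itself.

```c
#include <linux/aio_abi.h>
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: expand one vectored request into one iocb per
 * iovec element, each covering the next consecutive range of the file.
 * cbs[] and ptrs[] must each have room for iovcnt entries; ptrs[] is
 * the pointer array that io_submit expects.  Returns the iocb count.
 */
static int iov_to_iocbs(int fd, int is_write, const struct iovec *iov,
                        int iovcnt, long long offset,
                        struct iocb *cbs, struct iocb **ptrs)
{
    for (int i = 0; i < iovcnt; i++) {
        memset(&cbs[i], 0, sizeof(cbs[i]));
        cbs[i].aio_fildes = fd;
        cbs[i].aio_lio_opcode = is_write ? IOCB_CMD_PWRITE : IOCB_CMD_PREAD;
        cbs[i].aio_buf = (uint64_t)(uintptr_t)iov[i].iov_base;
        cbs[i].aio_nbytes = iov[i].iov_len;
        cbs[i].aio_offset = offset;      /* file position of this segment */
        ptrs[i] = &cbs[i];
        offset += (long long)iov[i].iov_len;
    }
    return iovcnt;
}
```

The resulting ptrs[] array would then be handed to io_submit in one call.
With O_DIRECT, every segment's buffer, length and offset still has to meet
the usual alignment rules individually.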

>> ahem: http://www.daemon-systems.org/man/preadv.2.html
> 
> Too bad nobody implemented it yet...

Kernel side looks easy, attached patch + syscall table wire-up in all
archs ...

cheers,
  Gerd
diff --git a/fs/read_write.c b/fs/read_write.c
index 969a6d9..d1ea2fd 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -701,6 +701,54 @@ sys_writev(unsigned long fd, const struct iovec __user *vec, unsigned long vlen)
        return ret;
 }
 
+asmlinkage ssize_t sys_preadv(unsigned int fd, const struct iovec __user *vec,
+                              unsigned long vlen, loff_t pos)
+{
+       struct file *file;
+       ssize_t ret = -EBADF;
+       int fput_needed;
+
+       if (pos < 0)
+               return -EINVAL;
+
+       file = fget_light(fd, &fput_needed);
+       if (file) {
+               ret = -ESPIPE;
+               if (file->f_mode & FMODE_PREAD)
+                       ret = vfs_readv(file, vec, vlen, &pos);
+               fput_light(file, fput_needed);
+       }
+
+       if (ret > 0)
+               add_rchar(current, ret);
+       inc_syscr(current);
+       return ret;
+}
+
+asmlinkage ssize_t sys_pwritev(unsigned int fd, const struct iovec __user *vec,
+                              unsigned long vlen, loff_t pos)
+{
+       struct file *file;
+       ssize_t ret = -EBADF;
+       int fput_needed;
+
+       if (pos < 0)
+               return -EINVAL;
+
+       file = fget_light(fd, &fput_needed);
+       if (file) {
+               ret = -ESPIPE;
+               if (file->f_mode & FMODE_PWRITE)
+                       ret = vfs_writev(file, vec, vlen, &pos);
+               fput_light(file, fput_needed);
+       }
+
+       if (ret > 0)
+               add_wchar(current, ret);
+       inc_syscw(current);
+       return ret;
+}
+
 static ssize_t do_sendfile(int out_fd, int in_fd, loff_t *ppos,
                           size_t count, loff_t max)
 {
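
As a point of comparison, on a system without such a syscall the same
positional behaviour can be approximated in userspace with one pread() per
iovec element.  The sketch below is only an illustration: the name
preadv_emulated() is hypothetical, and unlike a real syscall the read is not
atomic across segments.  Like the patch, it rejects a negative offset and
leaves the file position untouched.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical userspace stand-in for preadv: fill each iovec segment
 * in order, starting at the given file offset, using pread().
 * Returns the number of bytes read, or -1 with errno set if nothing
 * could be read.
 */
static ssize_t preadv_emulated(int fd, const struct iovec *iov,
                               int iovcnt, off_t offset)
{
    ssize_t total = 0;

    if (offset < 0) {
        errno = EINVAL;
        return -1;
    }

    for (int i = 0; i < iovcnt; i++) {
        size_t done = 0;

        while (done < iov[i].iov_len) {
            ssize_t n = pread(fd, (char *)iov[i].iov_base + done,
                              iov[i].iov_len - done, offset + total);
            if (n < 0)
                return total > 0 ? total : -1;  /* report partial progress */
            if (n == 0)
                return total;                   /* end of file */
            done += (size_t)n;
            total += n;
        }
    }
    return total;
}
```

Because pread() fails with ESPIPE on pipes and sockets, this matches the
-ESPIPE case in the patch for files that cannot seek.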
