Re: [Qemu-devel] RFC: guest-side retrieval of fw_cfg file
From: Gabriel L. Somlo
Subject: Re: [Qemu-devel] RFC: guest-side retrieval of fw_cfg file
Date: Tue, 14 Jul 2015 15:24:39 -0400
User-agent: Mutt/1.5.23 (2014-03-12)

On Tue, Jul 14, 2015 at 07:48:30PM +0100, Richard W.M. Jones wrote:
> > > > /* read chunk of given fw_cfg blob (caller responsible for sanity-check) */
> > > > static inline void fw_cfg_read_blob(uint16_t select,
> > > >                                     void *buf, loff_t pos, size_t count)
> > > > {
> > > >         mutex_lock(&fw_cfg_dev_lock);
> > > >         outw(select, FW_CFG_PORT_CTL);
> > > >         while (pos-- > 0)
> > > >                 inb(FW_CFG_PORT_DATA);
> > > >         insb(FW_CFG_PORT_DATA, buf, count);
> > > >         mutex_unlock(&fw_cfg_dev_lock);
> > > > }
> > >
> > > How slow is this?
> >
> > Well, I think each outw() and inb() will result in a vmexit, with
> > userspace handling emulation, so much slower comparatively than
> > inserting into a list (hence mutex here, vs. spinlock there).
>
> I wonder if using a string instruction (ie. rep insb etc) would be
> faster. On x86, qemu specifically optimizes these. Maybe GCC turns
> the above into a string instruction?
After some digging...

The insb() call is indeed implemented as a "rep ins" in the kernel, and
rep appears to be optimized on the host/kvm side, so we might be in
luck.

The "while (pos-- > 0) inb(FW_CFG_PORT_DATA);" portion only matters
when pos is nonzero. Most of the time pos==0, so we don't need to skip
any bytes from the given fw_cfg blob before getting to the optimized
insb.

I guess partial interleaved raw reads of different blobs are
*theoretically* possible, but I expect in practice they'll be
rather unlikely...
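
To make the cost difference concrete, here is a rough user-space
sketch of the same select / skip / bulk-read pattern against a mock
device. Everything in it (struct mock_fw_cfg, the mock_* helpers, the
exit counter) is hypothetical and only for illustration. In particular,
counting the string read as a single exit is an assumption based on the
observation above, not a measurement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the fw_cfg device: one blob, a read offset,
 * and a counter of simulated guest exits. Not the real QEMU device model. */
struct mock_fw_cfg {
	const uint8_t *blob;
	size_t len;
	size_t offset;
	unsigned long exits;
};

/* Models outw(select, FW_CFG_PORT_CTL): selecting rewinds the offset. */
static void mock_select(struct mock_fw_cfg *dev)
{
	dev->offset = 0;
	dev->exits++;
}

/* Models inb(FW_CFG_PORT_DATA): one byte, one exit. */
static uint8_t mock_inb(struct mock_fw_cfg *dev)
{
	dev->exits++;
	return dev->offset < dev->len ? dev->blob[dev->offset++] : 0;
}

/* Models insb(FW_CFG_PORT_DATA, buf, count).
 * ASSUMPTION: the whole string transfer costs a single exit. */
static void mock_insb(struct mock_fw_cfg *dev, void *buf, size_t count)
{
	size_t avail = dev->len - dev->offset;
	size_t n = count < avail ? count : avail;

	memcpy(buf, dev->blob + dev->offset, n);
	memset((uint8_t *)buf + n, 0, count - n); /* pad reads past the end */
	dev->offset += n;
	dev->exits++;
}

/* Same shape as fw_cfg_read_blob(), minus the mutex. */
static void mock_read_blob(struct mock_fw_cfg *dev, void *buf,
			   size_t pos, size_t count)
{
	/* the sanity check the real caller is responsible for */
	assert(pos <= dev->len && count <= dev->len - pos);

	mock_select(dev);
	while (pos-- > 0)
		(void)mock_inb(dev);	/* skip bytes one at a time */
	mock_insb(dev, buf, count);
}
```

Under those assumptions a read at pos==0 costs two exits regardless of
count, while a read at a nonzero pos adds one exit per skipped byte, so
only the skip loop scales with the offset.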
Thanks,
--Gabriel
> The reason I note all this is because there has been an ongoing
> discussion about the slowness of fw_cfg. Starting in 2010 in fact:
>
> https://lists.gnu.org/archive/html/qemu-devel/2010-07/msg00962.html
> https://lists.gnu.org/archive/html/qemu-devel/2011-10/msg00996.html
>
> On aarch64 kernel loading is really slow because it can only transfer
> (IIRC) 8 bytes at a time, and there are no string instructions we can
> use to speed it up.
>
> A long time ago I wrote a memcpy and a "pseudo-DMA" interface for
> fw_cfg, but they were both roundly rejected as you can find in the
> archives.