[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [libvirt] IO accounting overhaul
From: |
Benoît Canet |
Subject: |
Re: [Qemu-devel] [libvirt] IO accounting overhaul |
Date: |
Fri, 5 Sep 2014 16:56:34 +0200 |
User-agent: |
Mutt/1.5.23 (2014-03-12) |
The Friday 05 Sep 2014 à 16:30:31 (+0200), Kevin Wolf wrote :
> Am 01.09.2014 um 13:41 hat Markus Armbruster geschrieben:
> > Benoît Canet <address@hidden> writes:
> >
> > > The Monday 01 Sep 2014 à 11:52:00 (+0200), Markus Armbruster wrote :
> > >> Cc'ing libvirt following Stefan's lead.
> > >>
> > >> Benoît Canet <address@hidden> writes:
> > >> > /* the following would compute latecies for slices of 1 seconds then
> > >> > toss the
> > >> > * result and start a new slice. A weighted sumation of the instant
> > >> > latencies
> > >> > * could help to implement this.
> > >> > */
> > >> > 1s_read_average_latency
> > >> > 1s_write_average_latency
> > >> > 1s_flush_average_latency
> > >> >
> > >> > /* the former three numbers could be used to further compute a 1
> > >> > minute slice value */
> > >> > 1m_read_average_latency
> > >> > 1m_write_average_latency
> > >> > 1m_flush_average_latency
> > >> >
> > >> > /* the former three numbers could be used to further compute a 1 hours
> > >> > slice value */
> > >> > 1h_read_average_latency
> > >> > 1h_write_average_latency
> > >> > 1h_flush_average_latency
> > >>
> > >> This is something like "what we added to total_FOO_time in the last
> > >> completed 1s / 1m / 1h time slice divided by the number of additions".
> > >> Just another way to accumulate the same raw data, thus no worries.
> > >>
> > >> > /* 1 second average number of requests in flight */
> > >> > 1s_read_queue_depth
> > >> > 1s_write_queue_depth
> > >> >
> > >> > /* 1 minute average number of requests in flight */
> > >> > 1m_read_queue_depth
> > >> > 1m_write_queue_depth
> > >> >
> > >> > /* 1 hours average number of requests in flight */
> > >> > 1h_read_queue_depth
> > >> > 1h_write_queue_depth
>
> I don't think I agree with putting fixed time periods like 1 s/min/h
> into qemu. What you need there is policy and we should probably make
> it configurable.
>
> Do we need accounting for multiple time periods at the same time or
> would it be enough to have one and make its duration an option?
>
> > > Optionally collecting the same data for each BDS of the graph.
> >
> > If that's the case, keeping the shared infrastructure in the block layer
> > makes sense.
> >
> > BDS member acct then holds I/O stats for the BDS. We currently use it
> > for something else: I/O stats of the device model backed by this BDS.
> > That needs to move elsewhere. Two places come to mind:
> >
> > 1. BlockBackend, when it's available (I resumed working on it last week
> > for a bit). Superficially attractive, because it's close to what we
> > have now, but then we have to deal with what to do when the backend
> > gets disconnected from its device model, then connected to another
> > one.
> >
> > 2. The device models that actually implement I/O accounting. Since
> > query-blockstats names a backend rather than a device model, we need
> > a BlockDevOps callback to fetch the stats. Fetch fails when the
> > callback is null. Lets us distinguish "no stats yet" and "device
> > model can't do stats", thus permits a QMP interface that doesn't lie.
> >
> > Right now, I like (2) better.
>
> So let's say I have some block device, which is attached to a guest
> device for a while, but then I detach it and continue using it in a
> different place (maybe another guest device or a block job). Should we
> really reset all counters in query-blockstats to 0?
>
> I think as I user I would be surprised about this, because I still refer
> to it by the same name (the device_name, which will be in the BB), so
> it's the same thing for me and the total requests include everything
> that was ever issued against it.
It all depends on where you are in the user food chain.
If you are a cloud end user you are interested in the stats associated with
the hardware of your /dev/vdx because that's what iostat allow you to see.
>
> > > -API wize I think about adding
> > > bdrv_acct_invalid() and
> > > bdrv_acct_failed() and systematically issuing a bdrv_acct_start() asap.
> >
> > Complication: partial success. Example:
> >
> > 1. Guest requests a read of N sectors.
> >
> > 2. Device model calls
> > bdrv_acct_start(s->bs, &req->acct, N * BDRV_SECTOR_SIZE, BDRV_ACCT_READ)
> >
> > 3. Device model examines the request, and deems it valid.
> >
> > 4. Device model passes it to the block layer.
> >
> > 5. Block layer does its thing, but for some reason only M < N sectors
> > can be read. Block layer returns M.
>
> No, it returns -errno.
>
> > 6. What's the device model to do now? Both bdrv_acct_failed() and
> > bdrv_acct_done() would be wrong.
> >
> > Should the device model account for a read of size M? This ignores
> > the partial failure.
> >
> > Should it split the read into a successful and a failed part for
> > accounting purposes? This miscounts the number of reads.
>
> I think we should simply account it as a failed request because this is
> what it will look like for the guest. If you want the partial data that
> was internally issued, you need to look at different statistics
> (probably those of bs->file).
>
> Kevin
- Re: [Qemu-devel] IO accounting overhaul, Markus Armbruster, 2014/09/01
- Re: [Qemu-devel] [libvirt] IO accounting overhaul, Benoît Canet, 2014/09/01
- Re: [Qemu-devel] [libvirt] IO accounting overhaul, Markus Armbruster, 2014/09/01
- Re: [Qemu-devel] [libvirt] IO accounting overhaul, Benoît Canet, 2014/09/01
- Re: [Qemu-devel] [libvirt] IO accounting overhaul, Markus Armbruster, 2014/09/02
- Re: [Qemu-devel] [libvirt] IO accounting overhaul, Kevin Wolf, 2014/09/05
- Re: [Qemu-devel] [libvirt] IO accounting overhaul,
Benoît Canet <=
- Re: [Qemu-devel] [libvirt] IO accounting overhaul, Benoît Canet, 2014/09/05
- Re: [Qemu-devel] [libvirt] IO accounting overhaul, Benoît Canet, 2014/09/05
- Re: [Qemu-devel] [libvirt] IO accounting overhaul, Markus Armbruster, 2014/09/08
- Re: [Qemu-devel] [libvirt] IO accounting overhaul, Kevin Wolf, 2014/09/08