qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-block] Jobs 2.0 QAPI [RFC]


From: John Snow
Subject: [Qemu-block] Jobs 2.0 QAPI [RFC]
Date: Wed, 16 Dec 2015 19:50:08 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

In working through a prototype to enable multiple block jobs. A few
problem spots in our API compatibility become apparent.

In a nutshell, old Blockjobs rely on the "device" to identify the job,
which implies:

1) A job is always attached to a device / the root node of a device
2) There can only be one job per device
3) A job can only affect one device at a time
4) Once created, a job will always be associated with that device.

All four of these are wrong and something we need to change, so
principally the "device" addressing scheme for Jobs needs to go and we
need a new ID addressing scheme instead.

While we're at it, though, Jeff Cody is cooking up a new permissions
system for the block layer and we are considering the idea of a generic
job system for QEMU.

So we need to retool our QMP commands. Now's the time to design a good
clean break.

==== qmp_query_block_jobs ====

Currently, this loops through every BDS we have and sees if it has a job
attached or not. If yes, it appends its info into a flat list and
reports the list back.

This is easily extendable to an arbitrary number of jobs; we just append
more jobs into the list.

Instead of identifying a job by its device, we will now need to identify
a job based on its ID.

e.g.

[{"device": "drive0", ...}, {"device": "drive1", ...}, ...]

becomes:

[{"device": "drive0", "id": "#job102", ...},
 {"device": "drive1", "id": "#job529", ...}, ...]

However, if Jeff Cody's work on Block Job permissions progresses, device
may not always be reportable anymore. This job may not be attached to
any one device, not attached to a root node, or associated with multiple
graphs entirely.

Let's consider using a new command. I'm not sure if this is strictly
possible with current QAPI, but Eric will yell at me if it isn't --
forgive the wiggly pseudo-specification.

query-jobs

Will return a list of all jobs.

query-jobs sys="block"

Will return a list of all block jobs. This will be the only valid
subsystem to start with -- we don't have non-block jobs yet and it may
be some time before we do.

Each subsystem will have a sys= enumeration, and the jobs for that
particular subsystem can subclass the default return type. The QAPI
commands can be defined to accept a union of the subclasses so we don't
have to specify in advance which type each command accepts, but we can
allow the responsible subsystem to interpret it on demand dynamically.

The base type can be something along the lines of:

{ 'struct': 'JobInfo',
  'data': {
    'type': JobTypeEnum,
    'sys': JobSysEnum,
    'id': str,
    'busy': bool,
    'paused': bool,
    'ready': bool
  }
}

And the BlockJobInfo can inherit/extend, looking like this:

{ 'struct': 'BlockJobInfo',
  'base': JobInfo
  'data': {
    'len': int,
    'offset': int,
    'speed': 'int',
    'io-status': BlockDeviceIoStatus,
    'devices': <see below>,
  }
}

'devices' will now report a more nuanced view of this Job's associated
nodes in our graph, as in the following:

{ "devices": [
  { "name": "drive0",
    "nodes": [
      { "id": "#node123",
        "permissions": "+RW-X"
      },
      { "id": "#node456",
        "permissions": "+R"
      }
    ]
  }
}

This lets us show exactly what nodes we are touching, what BlockBackends
they belong to, and what permissions are involved. Unambiguously. You
can use your own imagination for a permissions string that will make
sense -- This depends on Jeff Cody's work primarily.

The new command gives a chance to make a clean break and any users of
this new command won't be confused about the information they receive.

The legacy qmp-query-block-jobs command can continue to operate
providing a best-effort approximation of the data it finds. The legacy
command will *abort* and return error if it detects any job that was
launched by a new-style command, e.g. if it detects there is more than
one job attached to any node under a single BlockBackend -- clearly the
user has been using new features -- and should abort announcing this
command is deprecated.

If the graph looks reasonably plain, it can report back the information
it always has, except that it will now also report the "id" field in
addition to be perfectly unambiguous about how to issue commands to this
job, like I outlined in the first paragraph of this section.


==== block-job-cancel ====

This command can remain as a legacy command. Provided the "device"
provided has precisely one job attached, we can allow this command to
cancel it. If more complicated configurations are found, abort and
encourage the user to use a new API.

We can refuse to augment block-job-cancel; forcing users who want to
specify IDs to cancel to use the new API.

We can add a new "job-cancel" command which removes the block specificity.

We can introduce sub-classed arguments as needed for e.g. BlockJob
cancellations:

{ 'command': 'job-cancel',
  'data': { 'id': str,
            '*force': bool,
            'options': JobCancelOpts
  }
}

The idea is that JobCancelOpts would be a subclassable type that can be
passed directly to the subsystem's implementation of the cancel
operation for further interpretation, so different subsystems can get
relevant arguments as needed. We don't currently have any such need for
BlockJobCancelOpts, but it can't hurt to be prepared for it.

The type can be interpreted through the lens of whichever type the job
we've identified expects to see. (Block Jobs expect to see
BlockJobCancelOpts, etc.)


==== block-job-pause, block-job-resume, block-job-complete ====

Same story as above. Introduce job-pause, job-resume and job-complete
with subclassible job structures we can use for the block subsystem.

Legacy commands continue to operate only as long as the "device"
argument is unambiguous to decipher.


==== block-job-set-speed ====

This one is interesting since it definitely does apply only to Block Jobs.

We can augment it and limp it along, allowing e.g.

{ 'data':
  { '*device': 'str',
    '*id': 'str',
    'speed': 'int'
  }
}

Where we follow the "One of ID or Device but not both nor either"
pattern that we've applied in the past.

Or, we can say "If there isn't one job per device, reject the command
and use the new API"

where the new API is:

job-set-option :
  { 'id': str,
    'options': {
      '*etc': ...,
      '*etc2': ...,
    }
  }

Similar to the other commands, the idea would be to take the subclassed
options structure that belongs to that Job's Subclass (e.g.
BlockJobOptions) but allow any field inside to be absent.

Then we could have commands like this:

job-set-option \
arguments={ 'id': '#job789',
            'options': {
              'speed': 10000
            }
}

And so on. These would remain a flexible way of setting any various
post-launch options a job would need, and the relevant job-subsystem can
be responsible for rejecting certain options, etc.



I believe that covers all the existing jobs interface in the QMP, and
that should be sufficiently vague and open-ended enough to behave well
with a generic jobs system if we roll one out in the future.

Eric, Markus: Please inform me of all the myriad ways my fever dreams
for QAPI are wrong. :~)


Thanks,
--js






reply via email to

[Prev in Thread] Current Thread [Next in Thread]