Re: [PATCH 4/4] scsi-disk: Always report RESERVATION_CONFLICT to guest
From: Kevin Wolf
Subject: Re: [PATCH 4/4] scsi-disk: Always report RESERVATION_CONFLICT to guest
Date: Mon, 29 Jul 2024 14:20:01 +0200
On 29.07.2024 at 13:55, Paolo Bonzini wrote:
> On Mon, Jul 29, 2024 at 11:47 AM Kevin Wolf <kwolf@redhat.com> wrote:
> > RESERVATION_CONFLICT is not a backend error, but indicates that the
> > guest tried to make a request that it isn't allowed to execute. Pass the
> > error to the guest so that it can decide what to do with it.
>
> This is only true of scsi-block (though your patch is okay here -
> scsi-disk would see an EBADE and go down the ret < 0 path).
Right, in the scsi-disk case, we probably do want to consider it a
host-side error because the guest can't see or influence what happens on
the backend.
I can change the commit message accordingly.
> In general, for scsi-block I'd expect people to use report instead of
> stop. I agree that this is the best behavior for the case where you
> have a pr-manager, but it may also be better to stop the VM if a
> pr-manager has not been set up. That's probably a bit hackish, so I
> guess it's okay to add a FIXME or TODO comment instead?
Apparently both oVirt and KubeVirt unconditionally use the stop policy,
so I'm afraid in this case we must acknowledge that our expectations
don't match reality.
If I understand correctly, not having a pr-manager could mean that QEMU
itself is sufficiently privileged and then the same logic would apply.
But even if it means that we can't change any persistent reservations
from the VM, what use would stopping the VM be? You would run into the
exact case I'm describing in the commit message: You try to resume the
VM and it immediately stops again because the request still doesn't get
through. Or do you expect the host admin to take some manual action
then?
And what would you do about the Windows cluster validation case that
intentionally sends a request which reservations don't and shouldn't
allow? There is nothing on the host side to fix there. The guest is only
happy when it gets an error back.
> > - if (status == CHECK_CONDITION) {
> > + switch (status) {
> > + case CHECK_CONDITION:
> > req_has_sense = true;
> > error = scsi_sense_buf_to_errno(r->req.sense,
> > sizeof(r->req.sense));
> > - } else {
> > + break;
> > + case RESERVATION_CONFLICT:
> > + /* Don't apply the error policy, always report to the guest */
>
> This is the only case where you get error == 0. Maybe remove it from
> the initializer, and set it here?
Not sure why the initialiser was added in the first place, but yes, I
can do that.
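Roughly, as a standalone sketch rather than the actual patch: the sketch_* names and the reduced result struct below are made up for illustration, the sense-to-errno conversion is replaced by a plain parameter, and the status values are the standard SAM codes. The point is only that every branch of the switch assigns error itself, so no initialiser is needed, and error == 0 is what bypasses the error policy.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* SCSI status codes as defined by SAM. */
enum {
    GOOD                 = 0x00,
    CHECK_CONDITION      = 0x02,
    RESERVATION_CONFLICT = 0x18,
};

/* Hypothetical, reduced view of the state scsi_handle_rw_error() tracks. */
typedef struct {
    int error;           /* errno-style value; 0 means "skip the error policy" */
    bool req_has_sense;  /* true if sense data accompanies the status */
} SketchResult;

/*
 * Classify a nonzero passthrough status. sense_errno stands in for the
 * result of converting the sense buffer to an errno value.
 */
static SketchResult sketch_classify_status(int status, int sense_errno)
{
    SketchResult res;

    assert(status != GOOD); /* only called for failed passthrough requests */
    res.req_has_sense = false;

    switch (status) {
    case CHECK_CONDITION:
        res.req_has_sense = true;
        res.error = sense_errno;
        break;
    case RESERVATION_CONFLICT:
        /* Don't apply the error policy, always report to the guest */
        res.error = 0;
        break;
    default:
        res.error = EINVAL;
        break;
    }
    return res;
}

/* Mirrors the condition that selects BLOCK_ERROR_ACTION_REPORT. */
static bool sketch_always_report(SketchResult res, bool guest_recoverable)
{
    return !res.error || (res.req_has_sense && guest_recoverable);
}
```

With that shape, a RESERVATION_CONFLICT is reported to the guest regardless of the configured policy, while any other non-CHECK_CONDITION status still goes through the policy with EINVAL.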
Kevin
> On Mon, Jul 29, 2024 at 11:47 AM Kevin Wolf <kwolf@redhat.com> wrote:
> >
> > RESERVATION_CONFLICT is not a backend error, but indicates that the
> > guest tried to make a request that it isn't allowed to execute. Pass the
> > error to the guest so that it can decide what to do with it.
> >
> > Without this, if we stop the VM in response to a RESERVATION_CONFLICT,
> > it can happen that the VM cannot be resumed any more because every
> > attempt to resume it immediately runs into the same error and stops the
> > VM again.
> >
> > One case that expects RESERVATION_CONFLICT errors to be visible in the
> > guest is running the validation tests in Windows 2019's Failover Cluster
> > Manager, which intentionally tries to execute invalid requests to see if
> > they are properly rejected.
> >
> > Buglink: https://issues.redhat.com/browse/RHEL-50000
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> > hw/scsi/scsi-disk.c | 15 +++++++++++----
> > 1 file changed, 11 insertions(+), 4 deletions(-)
> >
> > diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> > index 69a195177e..e173b238de 100644
> > --- a/hw/scsi/scsi-disk.c
> > +++ b/hw/scsi/scsi-disk.c
> > @@ -235,11 +235,17 @@ static bool scsi_handle_rw_error(SCSIDiskReq *r, int ret, bool acct_failed)
> > } else {
> > /* A passthrough command has completed with nonzero status. */
> > status = ret;
> > - if (status == CHECK_CONDITION) {
> > + switch (status) {
> > + case CHECK_CONDITION:
> > req_has_sense = true;
> > error = scsi_sense_buf_to_errno(r->req.sense,
> > sizeof(r->req.sense));
> > - } else {
> > + break;
> > + case RESERVATION_CONFLICT:
> > + /* Don't apply the error policy, always report to the guest */
> > + break;
> > + default:
> > error = EINVAL;
> > + break;
> > }
> > }
> >
> > @@ -249,8 +255,9 @@ static bool scsi_handle_rw_error(SCSIDiskReq *r, int ret, bool acct_failed)
> > * are usually retried immediately, so do not post them to QMP and
> > * do not account them as failed I/O.
> > */
> > - if (req_has_sense &&
> > - scsi_sense_buf_is_guest_recoverable(r->req.sense, sizeof(r->req.sense))) {
> > + if (!error || (req_has_sense &&
> > + scsi_sense_buf_is_guest_recoverable(r->req.sense,
> > + sizeof(r->req.sense)))) {
> > action = BLOCK_ERROR_ACTION_REPORT;
> > acct_failed = false;
> > } else {
> > --
> > 2.45.2
> >
>