[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH] i386/kvm: Set return value after handling KVM_EXIT_HYPERCALL
From: |
Sean Christopherson |
Subject: |
Re: [PATCH] i386/kvm: Set return value after handling KVM_EXIT_HYPERCALL |
Date: |
Thu, 12 Dec 2024 14:11:16 -0800 |
On Thu, Dec 12, 2024, Paolo Bonzini wrote:
> On 12/12/24 20:13, Sean Christopherson wrote:
> > On Thu, Dec 12, 2024, Paolo Bonzini wrote:
> > > If ret is less than zero, will stop the VM anyway as
> > > RUN_STATE_INTERNAL_ERROR.
> > >
> > > If this has to be fixed in QEMU, I think there's no need to set anything
> > > if ret != 0; also because kvm_convert_memory() returns -1 on error and
> > > that's not how the error would be passed to the guest.
> > >
> > > However, I think the right fix should simply be this in KVM:
> > >
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index 83fe0a78146f..e2118ba93ef6 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -10066,6 +10066,7 @@ unsigned long __kvm_emulate_hypercall(struct
> > > kvm_vcpu *vcpu, unsigned long nr,
> > > }
> > > vcpu->run->exit_reason = KVM_EXIT_HYPERCALL;
> > > + vcpu->run->ret = 0;
> >
> > vcpu->run->hypercall.ret
> >
> > > vcpu->run->hypercall.nr = KVM_HC_MAP_GPA_RANGE;
> > > vcpu->run->hypercall.args[0] = gpa;
> > > vcpu->run->hypercall.args[1] = npages;
> > >
> > > While there is arguably a change in behavior of the kernel both with
> > > the patches in kvm-coco-queue and with the above one, _in practice_
> > > the above change is one that userspace will not notice.
> >
> > I agree that KVM should initialize "ret", but I don't think '0' is the right
> > value. KVM shouldn't assume userspace will successfully handle the
> > hypercall.
> > What happens if KVM sets vcpu->run->hypercall.ret to a non-zero value, e.g.
> > -KVM_ENOSYS?
>
> Unfortunately QEMU is never writing vcpu->run->hypercall.ret, so the guest
> sees -KVM_ENOSYS; this is basically the same bug that Binbin is fixing, just
> with a different value passed to the guest.
>
> In other words, the above one-liner is pulling the "don't break userspace"
> card.
But how is anything breaking userspace? QEMU needs to opt-in to intercepting
KVM_HC_MAP_GPA_RANGE, and this has been KVM's behavior since commit 0dbb11230437
("KVM: X86: Introduce KVM_HC_MAP_GPA_RANGE hypercall").
Ah, "ret" happens to be deep in the union and KVM zero allocates vcpu->run, so
QEMU gets lucky and "ret" happens to be zero because no other non-fatal
userspace
exit on x86 happens to need as many bytes. Hilarious.
FWIW, if TDX marshalls hypercall state into KVM's "normal" registers, then KVM's
shenanigans with vcpu->run->hypercall.ret might go away? Though regardless of
what happens on that front, I think it makes to explicitly initialize "ret" to
*something*.
I checked our VMM, and it does the right thing, so I don't have any objection
to explicitly zeroing "ret". Though it needs a comment explaining that it's a
terrible hack for broken userspace ;-)
- [PATCH] i386/kvm: Set return value after handling KVM_EXIT_HYPERCALL, Binbin Wu, 2024/12/11
- Re: [PATCH] i386/kvm: Set return value after handling KVM_EXIT_HYPERCALL, Yao Yuan, 2024/12/12
- Re: [PATCH] i386/kvm: Set return value after handling KVM_EXIT_HYPERCALL, Zhao Liu, 2024/12/12
- Re: [PATCH] i386/kvm: Set return value after handling KVM_EXIT_HYPERCALL, Paolo Bonzini, 2024/12/12
- Re: [PATCH] i386/kvm: Set return value after handling KVM_EXIT_HYPERCALL, Sean Christopherson, 2024/12/12
- Re: [PATCH] i386/kvm: Set return value after handling KVM_EXIT_HYPERCALL, Paolo Bonzini, 2024/12/12
- Re: [PATCH] i386/kvm: Set return value after handling KVM_EXIT_HYPERCALL,
Sean Christopherson <=
- Re: [PATCH] i386/kvm: Set return value after handling KVM_EXIT_HYPERCALL, Binbin Wu, 2024/12/12
- Re: [PATCH] i386/kvm: Set return value after handling KVM_EXIT_HYPERCALL, Binbin Wu, 2024/12/12
- Re: [PATCH] i386/kvm: Set return value after handling KVM_EXIT_HYPERCALL, Binbin Wu, 2024/12/12