[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] qga: ignore non present cpus when handling qmp_
From: |
Igor Mammedov |
Subject: |
Re: [Qemu-devel] [PATCH] qga: ignore non present cpus when handling qmp_guest_get_vcpus() |
Date: |
Thu, 6 Sep 2018 12:50:54 +0200 |
On Thu, 6 Sep 2018 12:26:12 +0200
Laszlo Ersek <address@hidden> wrote:
> On 09/06/18 11:49, Igor Mammedov wrote:
> > On Thu, 30 Aug 2018 17:51:13 +0200
> > Laszlo Ersek <address@hidden> wrote:
> >
> >> +Drew
> >>
> >> On 08/30/18 14:08, Igor Mammedov wrote:
> >>> If VM has VCPUs plugged sparselly (for example a VM started with
> >>> 3 VCPUs (cpu0, cpu1 and cpu2) and then cpu1 was hotunplugged so
> >>> only cpu0 and cpu2 are present), QGA will rise a error
> >>> error: internal error: unable to execute QEMU agent command
> >>> 'guest-get-vcpus':
> >>> open("/sys/devices/system/cpu/cpu1/"): No such file or directory
> >>> when
> >>> virsh vcpucount FOO --guest
> >>> is executed.
> >>> Fix it by ignoring non present CPUs when fetching CPUs status from sysfs.
> >>>
> >>> Signed-off-by: Igor Mammedov <address@hidden>
> >>> ---
> >>> qga/commands-posix.c | 4 +++-
> >>> 1 file changed, 3 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/qga/commands-posix.c b/qga/commands-posix.c
> >>> index 37e8a2d..2929872 100644
> >>> --- a/qga/commands-posix.c
> >>> +++ b/qga/commands-posix.c
> >>> @@ -2044,7 +2044,9 @@ static void transfer_vcpu(GuestLogicalProcessor
> >>> *vcpu, bool sys2vcpu,
> >>> vcpu->logical_id);
> >>> dirfd = open(dirpath, O_RDONLY | O_DIRECTORY);
> >>> if (dirfd == -1) {
> >>> - error_setg_errno(errp, errno, "open(\"%s\")", dirpath);
> >>> + if (!(sys2vcpu && errno == ENOENT)) {
> >>> + error_setg_errno(errp, errno, "open(\"%s\")", dirpath);
> >>> + }
> >>> } else {
> >>> static const char fn[] = "online";
> >>> int fd;
> >>>
> > [...]
> >
> >> I wonder if, instead of this patch, we should rework
> >> qmp_guest_get_vcpus(), to silently skip processors for which this
> >> dirpath ENOENT condition arises (i.e., return a shorter list of
> >> GuestLogicalProcessor objects).
> > would something like this on top of this patch do?
> >
> > diff --git a/qga/commands-posix.c b/qga/commands-posix.c
> > index 2929872..990bb80 100644
> > --- a/qga/commands-posix.c
> > +++ b/qga/commands-posix.c
> > @@ -2114,12 +2114,14 @@ GuestLogicalProcessorList
> > *qmp_guest_get_vcpus(Error **errp)
> > vcpu->logical_id = current++;
> > vcpu->has_can_offline = true; /* lolspeak ftw */
> > transfer_vcpu(vcpu, true, &local_err);
> > -
> > - entry = g_malloc0(sizeof *entry);
> > - entry->value = vcpu;
> > -
> > - *link = entry;
> > - link = &entry->next;
> > + if (errno == ENOENT) {
> > + g_free(vcpu);
> > + } else {
> > + entry = g_malloc0(sizeof *entry);
> > + entry->value = vcpu;
> > + *link = entry;
> > + link = &entry->next;
> > + }
> > }
> >
> > if (local_err == NULL) {
> >
> > [...]
> >
>
> To me that looks like the right approach, but the details should be
> polished a bit:
>
> - After we drop the vcpu object, "local_err" is still set, and would
> terminate the loop in the next iteration.
local_error is not set due 'if (!(sys2vcpu && errno == ENOENT))'
condition in the in the transfer_vcpu().
the thing is that in case of vcpu2sys direction ENOENT is hard error.
> - It seems like ENOENT can indeed only come from openat(), in
> transfer_vcpu(), however, it would be nice if we could grab the error
> code from the error object somehow, and not from the "errno" variable. I
> vaguely recall this is what error classes were originally invented for,
> but now we just use ERROR_CLASS_GENERIC_ERROR...
I've checked it and errno is preserved during error_setg_errno() call but
not saved in Error, so I've dropped that idea.
> How about this: we could add a boolean output param to transfer_vcpu(),
> called "fatal". Ignored when the function succeeds. When the function
> fails (seen from "local_err"), the loop consults "fatal". If the error
> is fatal, we act as before; otherwise, we drop the vcpu object, release
> -- and zero out -- "local_err" as well, and continue. I think this is
> more generic / safer than trying to infer the failure location from the
> outside.
It looked more uglier to me, so I've turned to libc style of reporting
(assuming that g_free() doesn't touch errno ever)
But if you prefer using extra parameter, I'll respin patch with it.
> I'm not quite up to date on structured error propagation in QEMU, so
> take the above with a grain of salt...
>
> Thanks,
> Laszlo