qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v5 2/5] guest agent: qemu-ga daemon


From: Luiz Capitulino
Subject: Re: [Qemu-devel] [PATCH v5 2/5] guest agent: qemu-ga daemon
Date: Tue, 21 Jun 2011 10:38:40 -0300

On Mon, 20 Jun 2011 18:40:38 -0500
Michael Roth <address@hidden> wrote:

> On 06/20/2011 09:16 AM, Luiz Capitulino wrote:
> > On Sun, 19 Jun 2011 14:00:30 -0500
> > Michael Roth<address@hidden>  wrote:
> >
> >> On 06/17/2011 10:25 PM, Luiz Capitulino wrote:
> >>> On Fri, 17 Jun 2011 16:25:32 -0500
> >>> Michael Roth<address@hidden>   wrote:
> >>>
> >>>> On 06/17/2011 03:13 PM, Luiz Capitulino wrote:
> >>>>> On Fri, 17 Jun 2011 14:21:31 -0500
> >>>>> Michael Roth<address@hidden>    wrote:
> >>>>>
> >>>>>> On 06/16/2011 01:42 PM, Luiz Capitulino wrote:
> >>>>>>> On Tue, 14 Jun 2011 15:06:22 -0500
> >>>>>>> Michael Roth<address@hidden>     wrote:
> >>>>>>>
> >>>>>>>> This is the actual guest daemon, it listens for requests over a
> >>>>>>>> virtio-serial/isa-serial/unix socket channel and routes them through
> >>>>>>>> to dispatch routines, and writes the results back to the channel in
> >>>>>>>> a manner similar to QMP.
> >>>>>>>>
> >>>>>>>> A shorthand invocation:
> >>>>>>>>
> >>>>>>>>       qemu-ga -d
> >>>>>>>>
> >>>>>>>> Is equivalent to:
> >>>>>>>>
> >>>>>>>>       qemu-ga -c virtio-serial -p 
> >>>>>>>> /dev/virtio-ports/org.qemu.guest_agent \
> >>>>>>>>               -p /var/run/qemu-guest-agent.pid -d
> >>>>>>>>
> >>>>>>>> Signed-off-by: Michael Roth<address@hidden>
> >>>>>>>
> >>>>>>> Would be nice to have a more complete description, like explaining 
> >>>>>>> how to
> >>>>>>> do a simple test.
> >>>>>>>
> >>>>>>> And this can't be built...
> >>>>>>>
> >>>>>>>> ---
> >>>>>>>>      qemu-ga.c              |  631 
> >>>>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>>>>      qga/guest-agent-core.h |    4 +
> >>>>>>>>      2 files changed, 635 insertions(+), 0 deletions(-)
> >>>>>>>>      create mode 100644 qemu-ga.c
> >>>>>>>>
> >>>>>>>> diff --git a/qemu-ga.c b/qemu-ga.c
> >>>>>>>> new file mode 100644
> >>>>>>>> index 0000000..df08d8c
> >>>>>>>> --- /dev/null
> >>>>>>>> +++ b/qemu-ga.c
> >>>>>>>> @@ -0,0 +1,631 @@
> >>>>>>>> +/*
> >>>>>>>> + * QEMU Guest Agent
> >>>>>>>> + *
> >>>>>>>> + * Copyright IBM Corp. 2011
> >>>>>>>> + *
> >>>>>>>> + * Authors:
> >>>>>>>> + *  Adam Litke<address@hidden>
> >>>>>>>> + *  Michael Roth<address@hidden>
> >>>>>>>> + *
> >>>>>>>> + * This work is licensed under the terms of the GNU GPL, version 2 
> >>>>>>>> or later.
> >>>>>>>> + * See the COPYING file in the top-level directory.
> >>>>>>>> + */
> >>>>>>>> +#include<stdlib.h>
> >>>>>>>> +#include<stdio.h>
> >>>>>>>> +#include<stdbool.h>
> >>>>>>>> +#include<glib.h>
> >>>>>>>> +#include<gio/gio.h>
> >>>>>>>> +#include<getopt.h>
> >>>>>>>> +#include<termios.h>
> >>>>>>>> +#include<syslog.h>
> >>>>>>>> +#include "qemu_socket.h"
> >>>>>>>> +#include "json-streamer.h"
> >>>>>>>> +#include "json-parser.h"
> >>>>>>>> +#include "qint.h"
> >>>>>>>> +#include "qjson.h"
> >>>>>>>> +#include "qga/guest-agent-core.h"
> >>>>>>>> +#include "qga-qmp-commands.h"
> >>>>>>>> +#include "module.h"
> >>>>>>>> +
> >>>>>>>> +#define QGA_VIRTIO_PATH_DEFAULT 
> >>>>>>>> "/dev/virtio-ports/org.qemu.guest_agent"
> >>>>>>>> +#define QGA_PIDFILE_DEFAULT "/var/run/qemu-va.pid"
> >>>>>>>> +#define QGA_BAUDRATE_DEFAULT B38400 /* for isa-serial channels */
> >>>>>>>> +#define QGA_TIMEOUT_DEFAULT 30*1000 /* ms */
> >>>>>>>> +
> >>>>>>>> +struct GAState {
> >>>>>>>> +    const char *proxy_path;
> >>>>>>>
> >>>>>>> Where is this used?
> >>>>>>>
> >>>>>>
> >>>>>> Nowhere actually. Will remove.
> >>>>>>
> >>>>>>>> +    JSONMessageParser parser;
> >>>>>>>> +    GMainLoop *main_loop;
> >>>>>>>> +    guint conn_id;
> >>>>>>>> +    GSocket *conn_sock;
> >>>>>>>> +    GIOChannel *conn_channel;
> >>>>>>>> +    guint listen_id;
> >>>>>>>> +    GSocket *listen_sock;
> >>>>>>>> +    GIOChannel *listen_channel;
> >>>>>>>> +    const char *path;
> >>>>>>>> +    const char *method;
> >>>>>>>> +    bool virtio; /* fastpath to check for virtio to deal with 
> >>>>>>>> poll() quirks */
> >>>>>>>> +    GACommandState *command_state;
> >>>>>>>> +    GLogLevelFlags log_level;
> >>>>>>>> +    FILE *log_file;
> >>>>>>>> +    bool logging_enabled;
> >>>>>>>> +};
> >>>>>>>> +
> >>>>>>>> +static void usage(const char *cmd)
> >>>>>>>> +{
> >>>>>>>> +    printf(
> >>>>>>>> +"Usage: %s -c<channel_opts>\n"
> >>>>>>>> +"QEMU Guest Agent %s\n"
> >>>>>>>> +"\n"
> >>>>>>>> +"  -c, --channel     channel method: one of unix-connect, 
> >>>>>>>> virtio-serial, or\n"
> >>>>>>>> +"                    isa-serial (virtio-serial is the default)\n"
> >>>>>>>> +"  -p, --path        channel path (%s is the default for 
> >>>>>>>> virtio-serial)\n"
> >>>>>>>> +"  -l, --logfile     set logfile path, logs to stderr by default\n"
> >>>>>>>> +"  -f, --pidfile     specify pidfile (default is %s)\n"
> >>>>>>>> +"  -v, --verbose     log extra debugging information\n"
> >>>>>>>> +"  -V, --version     print version information and exit\n"
> >>>>>>>> +"  -d, --daemonize   become a daemon\n"
> >>>>>>>> +"  -h, --help        display this help and exit\n"
> >>>>>>>> +"\n"
> >>>>>>>> +"Report bugs to<address@hidden>\n"
> >>>>>>>> +    , cmd, QGA_VERSION, QGA_VIRTIO_PATH_DEFAULT, 
> >>>>>>>> QGA_PIDFILE_DEFAULT);
> >>>>>>>> +}
> >>>>>>>> +
> >>>>>>>> +static void conn_channel_close(GAState *s);
> >>>>>>>> +
> >>>>>>>> +static const char *ga_log_level_str(GLogLevelFlags level)
> >>>>>>>> +{
> >>>>>>>> +    switch (level&     G_LOG_LEVEL_MASK) {
> >>>>>>>> +        case G_LOG_LEVEL_ERROR:
> >>>>>>>> +            return "error";
> >>>>>>>> +        case G_LOG_LEVEL_CRITICAL:
> >>>>>>>> +            return "critical";
> >>>>>>>> +        case G_LOG_LEVEL_WARNING:
> >>>>>>>> +            return "warning";
> >>>>>>>> +        case G_LOG_LEVEL_MESSAGE:
> >>>>>>>> +            return "message";
> >>>>>>>> +        case G_LOG_LEVEL_INFO:
> >>>>>>>> +            return "info";
> >>>>>>>> +        case G_LOG_LEVEL_DEBUG:
> >>>>>>>> +            return "debug";
> >>>>>>>> +        default:
> >>>>>>>> +            return "user";
> >>>>>>>> +    }
> >>>>>>>> +}
> >>>>>>>> +
> >>>>>>>> +bool ga_logging_enabled(GAState *s)
> >>>>>>>> +{
> >>>>>>>> +    return s->logging_enabled;
> >>>>>>>> +}
> >>>>>>>> +
> >>>>>>>> +void ga_disable_logging(GAState *s)
> >>>>>>>> +{
> >>>>>>>> +    s->logging_enabled = false;
> >>>>>>>> +}
> >>>>>>>> +
> >>>>>>>> +void ga_enable_logging(GAState *s)
> >>>>>>>> +{
> >>>>>>>> +    s->logging_enabled = true;
> >>>>>>>> +}
> >>>>>>>
> >>>>>>> Just to check I got this right, this is needed because of the fsfreeze
> >>>>>>> command, correct? Isn't it better to have a more descriptive name, 
> >>>>>>> like
> >>>>>>> fsfrozen?
> >>>>>>>
> >>>>>>> First I thought this was about a log file. Then I realized this was
> >>>>>>> probably about letting the user log in, but we don't seem to have this
> >>>>>>> functionality so I got confused.
> >>>>>>>
> >>>>>>
> >>>>>> Yup, this is currently due to fsfreeze support. From the perspective of
> >>>>>> the fsfreeze command the explicit "is_frozen" check makes more sense,
> >>>>>> but the reason it affects other RPCs is because because we can't log
> >>>>>> stuff in that state. If an RPC shoots itself in the foot by writing to 
> >>>>>> a
> >>>>>> frozen filesystem we don't really care so much, and up until recently
> >>>>>> that case was handled with a pthread_cancel timeout mechanism (was
> >>>>>> removed for the time being, will re-implement using a child process 
> >>>>>> most
> >>>>>> likely).
> >>>>>>
> >>>>>> What we don't want to do is give a host a way to bypass the expectation
> >>>>>> we set for guest owners by allowing RPCs to be logged. So that's what
> >>>>>> the check is based on, rather than lower level details like *why*
> >>>>>> logging is disabled. Also, I'd really like to avoid a precedence where 
> >>>>>> a
> >>>>>> single RPC can place restrictions on all the others, so the logging
> >>>>>> aspect seemed general enough that it doesn't necessarily provide that
> >>>>>> precedence.
> >>>>>>
> >>>>>> This has come up a few times without any real consensus. I can probably
> >>>>>> go either way, but I think the check for logging is easier to set
> >>>>>> expectations with: "if logging is important from an auditing
> >>>>>> perspective, don't execute this if logging is disabled". beyond that,
> >>>>>> same behavior/symantics you'd get with anything that causes a write() 
> >>>>>> on
> >>>>>> a filesystem in a frozen state: block.
> >>>>>
> >>>>> While I understand your arguments, I still find the current behavior
> >>>>> confusing: you issue a guest-open-file command and get a 
> >>>>> QgaLoggingFailed
> >>>>> error. The first question is: what does a log write fail have to do 
> >>>>> with me
> >>>>> opening a file? The second question is: what should I do? Try again? 
> >>>>> Give up?
> >>>>>
> >>>>
> >>>> I agree it's a better solution for the client there, but at the same 
> >>>> time:
> >>>>
> >>>> guest_privileged_ping():
> >>>>      if fsfreeze.status == FROZEN:
> >>>>        syslog("privileged ping, thought you should know")
> >>>>      else:
> >>>>        return "error, filesystem frozen"
> >>>>
> >>>> Seems silly to the client as well. "Why does a ping command care about
> >>>> the filesystem state?"
> >>>>
> >>>> Inability to log is the true error. That's also the case for the
> >>>> guest-file-open command.
> >>>
> >>> There's a difference. In the guest ping the inability to log is an
> >>> internal agent error, it's not interesting to the client. We could just
> >>> ignore the error and reply back (unless we define that 
> >>> guest-privileged-ping
> >>> requires writing to disk).
> >>>
> >>
> >> To clarify, these's 2 different types of errors here and their relation
> >> to fsfreeze is causing the separation to get munged a bit:
> >>
> >> 1) Logging/accounting errors: even though we try to make it clear
> >> wherever possible this solution is based on a "trusted" hypervisor that
> >> already has substantial privileges over guests (direct access to images,
> >> etc), one of our internal use cases is the ability for customers to
> >> properly account for actions initiated by the guest agent. Shutdowns,
> >> whats files were read/written, etc.
> >
> > This is fine.
> >
> >> For some commands, this accounting is not important. guest-ping, or
> >> guest-info, for instance, is uninteresting from a security accounting
> >> perspective. So logging is in fact optional and logging failures are
> >> silently ignored.
> >>
> >> guest-shutdown, guest-file-open, guest-fsfreeze, however, are important.
> >> So the QgaLoggingError we currently report really means "this command
> >> requires logging, but logging has been disabled". You're right that this
> >> is because of fsfreeze, but it's not because of the filesystem state.
> >> guest-file-open on a network mount that wasn't frozen would *still*, by
> >> design, report an error due to inability to log.
> >
> > Completely ignoring the file-system state and considering only what you
> > say about security, this doesn't make sense to me. Actually, I think it's
> > a flaw, because a client could open the daemon's log file and write garbage
> > on it.
> >
> > But, how does the ability to log protects you from anything? I would 
> > understand
> > it's importance if the user had to provide credentials to use the agent,
> > but s/he doesn't. It's like you were unable to use a linux box because 
> > syslogd
> > is not running.
> >
> > This looks like a misfeature to me. If you really want to provide it, then
> > you could provide a --enable-log-error which, when enabled, would make the
> > daemon not to execute *any* command if the it's able to log its actions.
> > Otherwise log errors are just ignored.
> >
> 
> Hmm, you're right. I'd thought that if you clobber the log file at least 
> the RPC that clobbered the log file would get logged, but since we only 
> log the guest-file-open, and we do that before the guest-file-write, 
> they could overwrite and fabricate the log file history and nothing 
> would be obviously suspicious.
> 
> Since we can't build any accounting around something like that I'm fine 
> with having the logging silently fail in a freeze state for now.
> 
> A better thought out accounting mechanism can be added as an optional 
> feature later if it comes to that.

Yes.

> >> For this case I do see your point about the error not being very helpful
> >> to users, which is why I think something like:
> >>
> >> "Guest agent failed to log RPC due to frozen filesystems"
> >>
> >> Is the best way to report it. If we need to fix up the error id to
> >> something like QgaLoggingToFrozenFilesystem I'd be fine with that.
> >>
> >> 2) EWOULDBLOCK or I/O errors due to writing to frozen filesystems: this
> >> is currently treated as a PEBKAC and not enforced/reported at all. I
> >> wouldn't be against disabling guest-file-write as a side-effect of
> >> freezing, but that's not totally correct. Writes to non-frozen
> >> filesystems like networked or sysfs mounts is technically still okay,
> >> for instance. We store path information in guest-file-open and use it to
> >> scan against fsfreeze's state info about frozen mounts to handle this a
> >> little better.
> >
> > Yes, I agree with you here.
> >
> 
> Treating I/O errors/deadlocks due to fsfreeze as a user issue (as things 
> are currently), or working around it? If the latter, a), b), or c)?

Option b) makes sense to me too.

> 
> >>
> >> Even then you still have RPCs like guest-shutdown that may do a syslog()
> >> that would cause the command to freeze, or in the future we may add the
> >> ability for the guest agent to execute user/distro-installed scripts
> >> that may/may not need to write to the filesystem. Some these might even
> >> be intended to run when the filesystem is frozen...cleanup scripts for
> >> databases and whatnot for snapshotting is something I've seen come up.
> >>
> >> We can
> >>
> >> a) be very restrictive, which I'm not totally against...my initial
> >> inclination was to disable everything except
> >> guest-fsfreeze-thaw/guest-fsfreeze-status while frozen, but this has
> >> some limitations that aren't as obscure/unlikely as one would hope.
> >>
> >> b) we can be completely unrestrictive about it and give users the same
> >> experiences they'd get at a command-line if they, or another user on the
> >> system, was messing with fsfreeze.
> >>
> >> c) try to capture some common cases, but make it clear that you can
> >> still shoot yourself in the foot using the guest agent on a frozen
> >> filesystem, evaluating when to enforce this in code on a case-by-case 
> >> basis.
> >>
> >> Personally I like b), mostly because it's the least work, but also
> >> because it's actually the easiest way to set expectations about fsfreeze.
> >>
> >>> The guest-file-write command, on the other hand, clearly requires to write
> >>> to disk, so a client would expect a EWOULDBLOCK error.
> >>>
> >>> EWOULDBLOCK looks good to me. We could define as general rule that
> >>> commands don't block, so clients should always expect a EWOULDBLOCK.
> >>>
> >>
> >> This falls under 2)
> >>
> >>>> There's nothing wrong with opening a read-only
> >>>> file on a frozen filesystem, it's the fact that we can't log the open
> >>>> that causes us to bail out..
> >>>
> >>> But why do we bail out? Why can't we just ignore the fact we can't log?
> >>>
> >>
> >> This falls under 1)
> >>
> >>>> So, what if we just munge the 2 to give the user the proper clues to fix
> >>>> things, and instead return an error like:
> >>>>
> >>>> "Guest agent failed to log RPC (is the filesystem frozen?)"?
> >>>
> >>> The 'desc' part of an error is a human error descriptor, clients shouldn't
> >>> parse it. The real error is the error class.
> >>>
> >>
> >> As mentioned above, we could change the error id itself to capture both
> >> the immediate error and the underlying error.
> >>
> >
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]