qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 1/2] Add virtagent file system freeze/thaw


From: Michael Roth
Subject: Re: [Qemu-devel] [PATCH 1/2] Add virtagent file system freeze/thaw
Date: Fri, 04 Feb 2011 10:51:07 -0600
User-agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.2.13) Gecko/20101207 Thunderbird/3.1.7

On 02/04/2011 05:03 AM, Jes Sorensen wrote:
On 02/03/11 18:41, Michael Roth wrote:
On 02/02/2011 02:48 AM, Jes Sorensen wrote:
Yes very much so[1] - one reason why it would be nice to have virtagent
use threads to execute the actual commands. We should probably add a
flag to agent commands indicating whether they issue disk I/O or not, so
we can block attempts to execute commands that do so, while the guest is
frozen.

**Warning, epic response**

For things like logging and i/o on a frozen system...I agree we'd need
some flag for these kinds of situations. Maybe a disable_logging()
flag....i really don't like this though... I'd imagine even syslogd()
could block virtagent in this type of situation, so that would need to
be disabled as well.

One way to resolve this would be to have the logging handled in it's own
thread, which uses non blocking calls to do the actual logging.
Obviously we'd have to use a non file system based communication method
between the main thread and the logging thread :)


I suspect syslogd() already buffers to some extent. But the problem there, as well the problem with having our own buffered logging implementation, is that we can't rely on being able to log an arbitrary number of messages. As some point that interface would need to either block, or stop dropping log messages.

If it blocks, we're deadlocked again. If it drops messages, it's trivial for someone to flood it with messages after the fsfreeze() to get back into a state where they can execute RPC without any accounting.

So a seperate logging thread doesn't buy us much, and what we come up with is rely gonna come down to syslogd:

1) if syslogd blocks, we must disable logging after fsfreeze(). Buffering, on syslogd's side or our own, only buys us time. If we disable logging, we must only allow absolutely required/benign RPCs. fsstatus/fsthaw are required, things like va.ping are benign, and potentially useful, in this particular situation. copyfile/shutdown/exec/etc should be considered "high-risk"...we want to make sure these are logged.

2) if syslogd doesnt block, it either drops messages at some point with no indication to us, or it drops them and provides some indication. If there's no indication, we must treat this the same way we treat 1), since we must assume that logging is effectively disabled. So only required or benign RPCs.

if we get some indication when a call to syslog() fails to log/buffer, we can allow all RPCs, but treat failures to log as a cue to immediately cleanup and exit the RPC. fsfreeze() under this circumstance will need to make sure it only logs after it does the unfreeze, else we may never be able to unfreeze at that point forward.

So the solution is dependent on syslogd's behavior. Ill have to do some testing to confirm...but I think the above covers the possibilities.

But doing so completely subverts our attempts and providing proper
accounting of what the agent is doing to the user. A user can freeze the
filesystem, knowing that logging would be disabled, then prod at
whatever he wants. So the handling should be something specific to
fsfreeze, with stricter requirements:

If a user calls fsfreeze(), we disable logging, but also disable the
ability to do anything other than fsthaw() or fsstatus(). This actually
solves the potential deadlocking problem for other RPCs as well...since
they cant be executed in the first place.

I disagree that we should block all calls, except for fsfreeze/fsstatus/
fsthaw in this case. There are other calls that could be valid in this
situations, so I think it needs to be evaluated on a case by case basis.

So a solution for these situations is still needed, and I'm starting to
agree that threads are needed, but I don't think we should do RPCs
concurrently (not sure if that's what is being suggested or not). At
least, there's no pressing reason for it as things currently stand
(there aren't currently any RPCs where fast response times are all that
important, so it's okay to serialize them behind previous RPCs, and
HMP/QMP are command at a time), and it's something that Im fairly
confident can be added if the need arises in the future.

Eventually I think we will need to be able to support concurrent RPC
calls. There can be situations where an operation takes a long time
while it is valid to be able to ping the guest agent to verify that it
is still alive etc.

Currently we're limited to 1 RPC at a time by the monitor/hmp/qmp (except for some niche cases where virtagent has multiple requests in flight). Ideally this will remain the interface we expose to users, but there could be situations where this isn't the case...for instance, other bits of qemu making direct calls into virtagent. I think we can add support for concurrent RPCs incrementally though...I've thought over the implementation details a bit and it seems to be fairly straightforward. I think we need to get the basic use cases down first though.


Cheers,
Jes





reply via email to

[Prev in Thread] Current Thread [Next in Thread]