|
From: | Eric Blake |
Subject: | Re: [PATCH 3/3] nbd: Use shutdown(SHUT_WR) after last item sent |
Date: | Fri, 27 Mar 2020 12:42:21 -0500 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.6.0 |
On 3/27/20 11:35 AM, Daniel P. Berrangé wrote:
On Fri, Mar 27, 2020 at 11:19:36AM -0500, Eric Blake wrote:Although the remote end should always be tolerant of a socket being arbitrarily closed, there are situations where it is a lot easier if the remote end can be guaranteed to read EOF even before the socket has closed. In particular, when using gnutls, if we fail to inform the remote end about an impending teardown, the remote end cannot distinguish between our closing the socket as intended vs. a malicious intermediary interrupting things, and may result in spurious error messages.Does this actually matter in the NBD case ? It has an explicit NBD command for requesting shutdown, and once that's processed, it is fine to just close the socket abruptly - I don't see a benefit to a TLS shutdown sequence on top.
You're right that the NBD protocol has ways for the client to advertise it will be shutting down, AND documents that the server must be robust to clients that just abruptly disconnect after that point. But we don't have control over all such servers, and there may very well be a server that logs an error on abrupt closure, where it would be silent if we did a proper gnutls_bye. Which is more important: maximum speed in disconnecting after we expressed intent, or maximum attempt at catering to all sorts of remote implementations that might not be as tolerant as qemu is of an abrupt termination?
AFAIK, the TLS level clean shutdown is only required if the application protocol does not have any way to determine an unexpected shutdown itself.
'man gnutls_bye' states:Note that not all implementations will properly terminate a TLS connec‐ tion. Some of them, usually for performance reasons, will terminate only the underlying transport layer, and thus not distinguishing between a malicious party prematurely terminating the connection and
normal termination.You're right that because the protocol has an explicit message, we can reliably distinguish any early termination prior to NBD_OPT_ABORT/NBD_CMD_DISC as being malicious, so the only case where it matters is if we have a premature termination after we asked for clean shutdown, at which point a malicious termination didn't lose any data. So on that front, I guess you are right that not using gnutls_bye isn't going to have much impact.
This is relevant for HTTP where the connection data stream may not have a well defined end condition. In the NBD case though, we have an explicit NBD_CMD_DISC to trigger the disconnect. After processing that message, an EOF is acceptable regardless of whether , before processing that message, any EOF is a unexpected.Or, we can end up with a deadlock where both ends are stuck on a read() from the other end but neither gets an EOF.If the socket has been closed abruptly why would it get stuck in read() - it should see EOF surely ?
That's what I'm trying to figure out: the nbdkit testsuite definitely hung even though 'qemu-nbd --list' exited, but I haven't yet figured out whether the bug lies in nbdkit proper or in libnbd, nor whether a cleaner tls shutdown would have prevented the hang in a more reliable manner. https://www.redhat.com/archives/libguestfs/2020-March/msg00191.html
-- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
[Prev in Thread] | Current Thread | [Next in Thread] |