Re: [Qemu-block] RFC block/iscsi command timeout

From:	Peter Lieven
Subject:	Re: [Qemu-block] RFC block/iscsi command timeout
Date:	Tue, 02 Jun 2015 16:45:02 +0200
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0

On 26.05.2015 at 12:21, Paolo Bonzini wrote:


On 26/05/2015 12:06, Kevin Wolf wrote:

On 26.05.2015 at 11:44, Paolo Bonzini wrote:


On 26/05/2015 11:37, Kevin Wolf wrote:

If we run into a timeout we theoretically have the following options:
  - reconnect
  - retry
  - error

I would reconnect as Ronnie proposed.

Just trying to reconnect indefinitely might not be the best option.
Consider the situation where you're inside a bdrv_drain_all(), which
blocks qemu completely. Trying to reconnect once or twice is probably
fine, but if that doesn't work, eventually you want to return an error
so that qemu is unstuck.
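
The "reconnect once or twice, then return an error" idea can be sketched roughly as follows. This is only an illustration in Python, not QEMU or libiscsi code: `CommandTimeout`, `send_command`, `reconnect` and `max_reconnects` are hypothetical names, and the sketch says nothing about whether returning the error is safe for the guest.

```python
class CommandTimeout(Exception):
    """Raised by the (hypothetical) transport when a command times out."""


def run_with_bounded_reconnect(send_command, reconnect, max_reconnects=2):
    """Run a command; on timeout, reconnect and retry a bounded number of times.

    Once max_reconnects reconnect attempts have been used up, the timeout
    is propagated to the caller instead of blocking forever.
    """
    attempts = 0
    while True:
        try:
            return send_command()
        except CommandTimeout:
            if attempts >= max_reconnects:
                raise  # give up so the caller is not stuck indefinitely
            attempts += 1
            reconnect()


# Example: a target that times out twice and then answers.
calls = {"sent": 0, "reconnects": 0}

def flaky_send():
    calls["sent"] += 1
    if calls["sent"] <= 2:
        raise CommandTimeout()
    return "GOOD"

def count_reconnect():
    calls["reconnects"] += 1

result = run_with_bounded_reconnect(flaky_send, count_reconnect)

# Example: a target that never answers; the error surfaces after two reconnects.
def dead_send():
    raise CommandTimeout()

try:
    run_with_bounded_reconnect(dead_send, lambda: None)
    gave_up = False
except CommandTimeout:
    gave_up = True
```

With the default budget of two reconnects, the flaky target succeeds on the third send, while the dead target produces an error instead of hanging.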

Whenever the topic of timeouts is brought up, I'm worried that
introducing timeouts (and doing anything except reconnecting) is the
same as NFS's soft option, which can actually cause data corruption.
So, why would it be safe?

How would it cause data corruption for qemu, i.e. which of the block
layer assumptions would be broken?

Reordering of operations.  Say you have:

      guest -> QEMU        write A to sector 1
      QEMU -> NFS          write A to sector 1
      QEMU -> guest        write A to sector 1 timed out
      guest -> QEMU        write B to sector 1

At this point the two outstanding writes are for the same sector and
have different payloads, so it's undefined which one wins.

      QEMU -> NFS          write B to sector 1
      NFS -> QEMU          write B to sector 1 completed
      QEMU -> guest        write B to sector 1 completed
      NFS -> QEMU          write A to sector 1 completed
                           (QEMU doesn't report this to the guest)

The guest thinks it has written B, but it's possible that the storage
has written A.
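
The sequence above can be replayed with a small toy model. This is a minimal sketch in Python under stated assumptions: `ToyStorage` and `simulate_soft_timeout` are hypothetical names, the "storage" is just a dictionary, and the ordering in which the two writes land is hard-coded to the unlucky case described above rather than produced by a real network.

```python
class ToyStorage:
    """Applies writes in arrival order, which may differ from issue order."""

    def __init__(self):
        self.sectors = {}

    def apply(self, sector, payload):
        self.sectors[sector] = payload


def simulate_soft_timeout():
    storage = ToyStorage()
    guest_view = {}   # what the guest believes is on disk
    in_flight = []    # requests sent to the storage but not yet applied

    # Guest writes A to sector 1; the request is sent, the reply is delayed.
    in_flight.append((1, "A"))
    # The timeout fires: the guest is told the write failed, but the
    # request itself is still in flight and cannot be recalled.
    # Guest then writes B to the same sector.
    in_flight.append((1, "B"))

    # Unlucky ordering: the storage processes B first, then the delayed A.
    for sector, payload in reversed(in_flight):
        storage.apply(sector, payload)
        if payload == "B":
            # Only B's completion is reported to the guest.
            guest_view[sector] = payload

    return guest_view[1], storage.sectors[1]


guest_sees, disk_has = simulate_soft_timeout()
print("guest believes:", guest_sees, "| storage holds:", disk_has)
```

In this replay the guest ends up believing sector 1 holds B while the toy storage holds A, which is the mismatch the scenario describes.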


So you would go for infinite reconnecting? We can SIGKILL then anyway.

As I said before, my idea would be a default of 5000 ms for all sync
calls and no timeout for all async calls coming from the block layer.

A user-settable timeout can optionally be specified via command-line
options to define a timeout for both sync and async calls.
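
As a rough illustration of that policy (a Python sketch only, not the actual block/iscsi code), the effective timeout could be resolved like this. The function name, the constant names, and the use of `None` to mean "no timeout" are assumptions made for the example; the mail does not name any option or variable.

```python
DEFAULT_SYNC_TIMEOUT_MS = 5000  # proposed default for sync calls
NO_TIMEOUT = None               # assumed marker for "wait indefinitely"


def effective_timeout_ms(is_sync, user_timeout_ms=None):
    """Return the timeout to apply to a command, in milliseconds.

    A user-supplied value overrides the defaults for both sync and async
    calls; otherwise sync calls get 5000 ms and async calls get no timeout.
    """
    if user_timeout_ms is not None:
        return user_timeout_ms
    return DEFAULT_SYNC_TIMEOUT_MS if is_sync else NO_TIMEOUT


sync_default = effective_timeout_ms(is_sync=True)
async_default = effective_timeout_ms(is_sync=False)
async_override = effective_timeout_ms(is_sync=False, user_timeout_ms=30000)
```

How a user value of 0 should be interpreted is left open here, since the proposal does not say.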

Peter
