|
From: | Gavin Shan |
Subject: | Re: [PATCH] hw/char/pl011: Output characters using best-effort mode |
Date: | Mon, 24 Feb 2020 10:57:16 +1100 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.2.0 |
Hi Marc, On 2/21/20 8:09 PM, Marc Zyngier wrote:
On 2020-02-21 04:24, Gavin Shan wrote:On 2/20/20 9:10 PM, Peter Maydell wrote:On Thu, 20 Feb 2020 at 09:10, Marc Zyngier <address@hidden> wrote:On 2020-02-20 06:01, Gavin Shan wrote:This fixes the issue by using newly added API qemu_chr_fe_try_write_all(), which provides another type of service (best-effort). It's different from qemu_chr_fe_write_all() as the data will be dropped if the backend has been running into so-called broken state or 50 attempts of transmissions. The broken state is cleared if the data is transmitted at once.I don't think dropping the serial port output is an acceptable outcome.Agreed. The correct fix for this is the one cryptically described in the XXX comment this patch deletes: - /* XXX this blocks entire thread. Rewrite to use - * qemu_chr_fe_write and background I/O callbacks */ The idea is that essentially we end up emulating the real hardware's transmit FIFO: * as data arrives from the guest we put it in the FIFO * we try to send the data with qemu_chr_fe_write(), which does not block * if qemu_chr_fe_write() tells us it did not send all the data, we use qemu_chr_fe_add_watch() to set up an I/O callback which will get called when the output chardev has drained enough that we can try again * we make sure all the guest visible registers and mechanisms for tracking tx fifo level (status bits, interrupts, etc) are correctly wired up Then we don't lose data or block QEMU if the guest sends faster than the chardev backend can handle, assuming the guest is well-behaved -- just as with a real hardware slow serial port, the guest will fill the tx fifo and then either poll or wait for an interrupt telling it that the fifo has drained before it tries to send more data. There is an example of this in hw/char/cadence_uart.c (and an example of how it works for a UART with no tx fifo in hw/char-cmsdk-apb-uart.c, which is basically the same except the 'fifo' is just one byte.) You will also find an awful lot of XXX comments like the above one in various UART models in hw/char, because converting an old-style simple blocking UART implementation to a non-blocking one is a bit fiddly and needs knowledge of the specifics of the UART behaviour. The other approach here would be that we could add options to relevant chardev backends so the user could say "if you couldn't connect to the tcp server I specified, throw away data rather than waiting", where we don't have suitable options already. If the user specifically tells us they're ok to throw away the serial data, then it's fine to throw away the serial data :-)I was intended to convince Marc that it's fine to lose data if the serial connection is broken with an example. Now, I'm taking the example trying to convince both of you: Lets assume we have a ARM board and the UART (RS232) cable is unplugged and plugged in the middle of system booting. I think we would get some output lost. We're emulating pl011 and I think it would have same behavior. However, I'm not sure if it makes sense :)But the case you describe in the commit message is not that one. The analogy is that of a serial port *plugged* and asserting flow control.
Thanks for your time on the discussion. Well, I would say we saw two side of a coin. TCP connection isn't bidirectional until accept() is called on server side. The connection isn't fully functional until two directions are finalized. It would be unplug if the connection is treated as the cable :)
Another thing is that the "system" as been constructed this way by the user. QEMU is not in a position to choose and output what is convenient, when it is convenient. In my world, the serial output is absolutely crucial. This is where I look for clues about failures and odd behaviours, and I rely on the serial port emulation to be 100% reliable (and for what it's worth, the Linux kernel can output to the serial port asynchronously, to some extent). [...]
Yep, totally agreed :)
If above analysis is correct and the first approach doesn't work out. We have to consider the 2nd approach - adding option to backend to allow losing data. I'm going to add "allow-data-lost" option for TYPE_CHARDEV_SOCKET. With the option, a back-off algorithm in tcp_chr_write(): The channel is consider as broken if it fails to transmit data in last continuous 5 times. The transmission is still issued when the channel is in broken state and recovered to normal state if transmission succeeds for once.That'd be an option if you could configure the UART with something that says "no flow control". In that case, dropping data on the floor becomes perfectly acceptable, as it requires buy-in from the user.
Yep, the point is to has user's buy-in and it seems an explicit option like "allow-data-lost" fills the gap, but it seems Peter isn't reaching conclusion or decision yet. Lets see what's that finally :) Thanks, Gavin
[Prev in Thread] | Current Thread | [Next in Thread] |