coreutils



From: Carl Edquist
Subject: Re: [PATCH] tee: Add --pipe-check to allow instantly detecting closed outputs
Date: Fri, 2 Dec 2022 05:40:03 -0600 (CST)

On Wed, 30 Nov 2022, Arsen Arsenović wrote:

Carl Edquist <edquist@cs.wisc.edu> writes:

It sounds like one way or another you want to copy your endless but intermittent input to multiple output pipes, but you want to quit as soon as all the output pipes become broken.

Precisely. The most important requirement there is that the tee-based substitute imitates the lifetime of its longest-lived output. Now I'm thinking, maybe --pipe-check should also block SIGPIPE, to prevent the race between poll, process death and write, which would result in the process getting killed, as happens right now. To see what I mean, try ``tee >(sleep 100) >(:)'' and press enter after a bit; a race could make --pipe-check behave like that.

Right, you need to ignore SIGPIPE (like "tee -p" does) for multiple outputs if any of them can exit early... Sometimes I forget that '-p' is not on by default for tee. I don't think I've encountered a use-case for specifically wanting this option to be off.
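
To illustrate the point about ignoring SIGPIPE, here is a minimal Python sketch (not tee's actual C code). With SIGPIPE ignored, a write to a broken pipe fails with EPIPE instead of killing the process, so the writer can drop that output and carry on. The helper name write_to_all is hypothetical, and short writes are not handled, for brevity.

```python
import os
import signal

# Ignore SIGPIPE so that writing to a broken pipe fails with EPIPE
# (BrokenPipeError in Python) instead of terminating the process.
# CPython already does this at startup; it is spelled out here because
# a C program such as tee has to request it explicitly.
signal.signal(signal.SIGPIPE, signal.SIG_IGN)


def write_to_all(outputs, data):
    """Write data to every fd in outputs and return the fds still usable.

    Hypothetical helper: an output whose write fails with EPIPE is
    closed and dropped, roughly in the spirit of "tee -p".
    """
    alive = []
    for fd in outputs:
        try:
            os.write(fd, data)
            alive.append(fd)
        except BrokenPipeError:
            os.close(fd)
    return alive
```

Note that this only notices a broken output at the moment something is written to it, which is exactly the limitation the rest of the thread is about.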


I'll keep this in mind for v2, which is currently waiting on me having some time to research the portability of this whole thing, and on a decision about whether to even include this feature.

On the topic of implementation - I was thinking more about a general solution for filter utils, and I am thinking the key thing is to provide a replacement (wrapper) for read(2), that polls two fds together (one input and one output), with no timeout.

It would check for POLLIN on input (in which case do the read()). Otherwise, if there is an error (POLLERR or POLLHUP) on input, treat it as EOF. Otherwise, if there's an error on output, remove this output, or handle it similarly to SIGPIPE/EPIPE.

(Nothing is written to the output fd here, it's just used for polling.)
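
The read(2) wrapper described above might look roughly like the following Python sketch. It is only an illustration under assumptions: the names read_or_detect_broken and OutputBroken are made up here, and it relies on the Linux behavior that poll(2) reports POLLERR on the write end of a pipe once all readers are gone.

```python
import os
import select


class OutputBroken(Exception):
    """Hypothetical signal that the watched output became a broken pipe."""


def read_or_detect_broken(in_fd, out_fd, size=65536):
    """Block until input is readable or the output is broken.

    Returns the bytes read (b"" means EOF) or raises OutputBroken.
    Nothing is ever written to out_fd; it is only polled.
    """
    poller = select.poll()
    poller.register(in_fd, select.POLLIN)
    # Event mask 0: POLLERR/POLLHUP/POLLNVAL are reported regardless.
    poller.register(out_fd, 0)
    while True:
        events = dict(poller.poll())  # no timeout: wait indefinitely
        in_ev = events.get(in_fd, 0)
        out_ev = events.get(out_fd, 0)
        if in_ev & select.POLLIN:
            return os.read(in_fd, size)
        if in_ev & (select.POLLERR | select.POLLHUP):
            return b""  # error or hangup on input: treat as EOF
        if out_ev & (select.POLLERR | select.POLLHUP | select.POLLNVAL):
            raise OutputBroken(out_fd)
```

The order of the checks follows the description: pending input wins, then an input error counts as EOF, and only then is the output's state considered.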

...

Although tee has multiple outputs, you only need to monitor a single output fd at a time. Because the only case you actually need to catch is when the final valid output becomes a broken pipe. (So I don't think it's necessary to poll(2) all the output fds together.)
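
A sketch of the "one output at a time" idea, again in Python and again only an assumption about how it could be arranged (the function name tee_copy is hypothetical). Only the first remaining output is polled alongside the input. When it breaks it is dropped and the next one is watched, so the loop ends once the last valid output is gone, even if no input ever arrives.

```python
import os
import select

BROKEN = select.POLLERR | select.POLLHUP | select.POLLNVAL


def tee_copy(in_fd, outputs, size=65536):
    """Copy in_fd to all outputs until EOF or until every output is broken.

    Assumes SIGPIPE is ignored (CPython's default), so a write to a
    broken pipe raises BrokenPipeError. Short writes are not handled.
    """
    outputs = list(outputs)
    while outputs:
        watched = outputs[0]  # only one output is polled at a time
        poller = select.poll()
        poller.register(in_fd, select.POLLIN)
        poller.register(watched, 0)  # errors/hangups only
        events = dict(poller.poll())
        in_ev = events.get(in_fd, 0)
        if in_ev & select.POLLIN:
            data = os.read(in_fd, size)
            if not data:
                return "eof"
            alive = []
            for fd in outputs:
                try:
                    os.write(fd, data)
                    alive.append(fd)
                except BrokenPipeError:
                    pass  # found out the ordinary way, on write
            outputs = alive
        elif in_ev & (select.POLLERR | select.POLLHUP):
            return "eof"
        elif events.get(watched, 0) & BROKEN:
            outputs.remove(watched)  # next pass watches another output
    return "all outputs broken"
```

Outputs that are not currently watched are still discovered to be broken on the next write, so polling them all together would only matter for the last one standing.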

I think this general approach of using poll(2) for a single input along with a single output could be used for both tee and other filters that only write to stdout.

(But again, "tail -f" is different, because it checks for regular files to grow as the source of intermittent input - which I don't think is something you can use poll() to wait for.)
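
A small Python check of why poll() does not help with growing regular files: for a regular file, poll reports "readable" even when the offset is already at end of file, so it cannot be used to wait for new data. The helper name poll_says_readable is made up for this illustration.

```python
import os
import select
import tempfile


def poll_says_readable(fd):
    """Return True if poll() reports POLLIN for fd right now."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    events = dict(poller.poll(0))  # timeout 0: report current state only
    return bool(events.get(fd, 0) & select.POLLIN)


with tempfile.TemporaryFile() as f:
    f.write(b"one line\n")
    f.flush()
    fd = f.fileno()
    os.lseek(fd, 0, os.SEEK_END)  # position at EOF, like tail -f after catching up
    at_eof_readable = poll_says_readable(fd)  # poll still claims readable
    at_eof_data = os.read(fd, 100)  # yet there is nothing to read
```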


I'll try to put together a patch to demo what I have in mind...


But if you don't have control over that, the fundamental problem is detecting broken pipes *without writing to them*, and I don't think that can be solved with any amount of extra pipes and fd redirection...

I imagine that, technically, this is attainable by editing the process substitutions involved to also signal the original process back; however, this feels less elegant and generally useful than tee handling this, given that tee's use-case is redirecting data to many places.

Ah, yeah, I bet you could rig something up waiting on processes and/or using signals. In the general case there is a slight difference between "waiting for a process to terminate" and "waiting for the pipe to become broken" (the process attached to the read-end of the pipe could close its stdin early, or it could exit after forking a child that keeps the pipe open), but yeah in the most common case the two events happen together.
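
The difference between "process terminated" and "pipe broken" can be simulated without forking. In this Python sketch a dup()'d descriptor stands in for the copy of the read end that a forked child would inherit; that substitution is an assumption made to keep the example deterministic. The pipe only counts as broken once the last copy of the read end is closed (the check relies on Linux reporting POLLERR on the write end).

```python
import os
import select


def pipe_is_broken(write_fd):
    """Return True if poll() reports an error or hangup on write_fd."""
    poller = select.poll()
    poller.register(write_fd, 0)  # errors/hangups are always reported
    events = dict(poller.poll(0))
    return bool(events.get(write_fd, 0) & (select.POLLERR | select.POLLHUP))


r, w = os.pipe()
inherited = os.dup(r)  # stands in for a forked child's copy of the read end
os.close(r)            # "the original reader exits"
broken_after_exit = pipe_is_broken(w)        # a reader still exists
os.close(inherited)    # the last reader goes away
broken_after_last_close = pipe_is_broken(w)  # now the pipe is broken
os.close(w)
```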


Have a nice day :)

Carl

