Re: how to save/restore (all?) bindings


From: Christoph Anton Mitterer
Subject: Re: how to save/restore (all?) bindings
Date: Sat, 21 Oct 2023 02:29:52 +0200
User-agent: Evolution 3.50.0-1

On Fri, 2023-10-20 at 15:56 +0900, Koichi Murase wrote:
> 
> I don't think Bash 5.0 is considered an ancient version.

I meant the 3.x versions that macOS seems to stick with.


> Also, if you would like to contribute to fzf with this, it's not you
> who determines whether a specific version is ancient or not. You need
> to support Bash 3.2 for macOS as the current fzf setting does.

Sure and I didn't mean to "decide" what's too ancient or good enough
for fzf.
I had indeed tried to make my contributions to fzf work with the old
versions too. But as I've mentioned before, it seems fzf's upstream
doesn't want to add any larger features, which was however just the
thing I'd hoped to do.

A first little thing (using bash-completion's _known_hosts_real() for
getting hostnames) even got merged:
https://github.com/junegunn/fzf/commit/d718747c5b115838af7ceaf0d4a7f1c84ed7d98b
(and, AFAIU, did not require any recent bash versions or even strictly
depend on bash-completion) but was subsequently dropped again, when
upstream learned that in recent bash-completion versions that method
got renamed.

Therefore, any more far-reaching ideas (e.g. [0] or [1]) seem to have
little chance of getting accepted in fzf.

If that is however anyway out of reach, and I have to do something on
my own instead, I can also just set some arbitrary minimum supported
version of bash, and keep the code much simpler.


You're very active in bash-completion development... so maybe you can
comment on whether it would make sense to add optional fzf integration
right in there?

E.g. as a configurable (perhaps even per command, or even per
completed option) alternative to the standard <Tab>-completion?!
That would allow a user to configure whether e.g. <Tab> always gives
the normal completion, or always the choices via fzf, or decides on a
case by case basis.
And perhaps something like <Shift-Tab> for always-fzf?




> > Translated to my use case you'd do:
> > 
> > bind '"\ec":"\xC0\a\xC0\r\xC0\n"' <= \ec starts the whole thing off
> > bind -x '"\xC0\a": store_and_set' <= store_and_set would run fzf, store
> >                                      the old READLINE_* run fzf and set
> >                                      the new (to cd something)
> > bind '"\xC0\r":accept-line'       <= would execute the current
> >                                      READLINE_* and thus the cd
> > bind -x '"\xC0\n": store_and_set' <= would restore the old READLINE_*
> > 
> > Is that about the idea?
> 
> Right. It should be noted that if you are going to include it in a
> framework (such as fzf) that can be shared by many users, instead of
> \xC0\a, \xC0\r, etc., you need to choose a longer invalid UTF-8 that
> is unlikely to conflict with the keybindings of users or other
> frameworks.

That would, however, still not be guaranteed to be free of conflicts,
if only because someone might copy&paste the sequences that I use
without considering that this might cause trouble.
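
To illustrate what a longer, tool-specific sequence could look like:
the sketch below builds a keyseq string from an overlong (hence
invalid) UTF-8 lead-in plus a readable tag. The function name
__mytool_keyseq, the \xC1\xBF lead-in and the trailing ';' are all made
up for this illustration and are not taken from fzf or readline. It
lowers the chance of a collision but cannot rule one out.

```shell
# Hypothetical helper: print a readline keyseq made of an overlong
# UTF-8 lead-in (\xC1\xBF, an assumption) and a "<tool>_<action>;" tag.
__mytool_keyseq() {
    # Only accept characters that need no escaping inside a keyseq.
    case "$1" in
        ''|*[!A-Za-z0-9_]*) return 1 ;;
    esac
    case "$2" in
        ''|*[!A-Za-z0-9_]*) return 1 ;;
    esac
    # The doubled backslashes make printf emit literal "\x.." escapes,
    # which readline itself translates when the binding is installed.
    printf '\\xC1\\xBF%s_%s;' "$1" "$2"
}
```

It could then be used e.g. as
bind -x "\"$(__mytool_keyseq mytool save)\": some_function"
where some_function is again only a placeholder.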


> > 1) I assume that this would also prevent the problem that some other
> >    keyseq could break the whole thing like above?
> >    I.e. once \ec is recognised, it first runs \xC0\a then \xC0\r and
> >    then \xC0\n and nothing can come in between?
> 
> Right. Readline macros *replace* byte sequences. Macros are not
> something prepending byte sequences.

Okay, so they're assured to be executed in that order.

But I think it's still possible to have it broken (see further below).


> > 2) But the downside is, I couldn't set/restore the keybindings other
> >    than the "main" one, right?!
> 
> You shouldn't set/restore keybindings just to call a readline function
> from `bind -x'.

> If you would like to call `accept-line' to make sure
> that the prompt is recalculated, you don't have to even change the
> keybindings (as suggested above).

That I don't understand. Even above I need to set e.g. \xC0\r to make
sure that it's accept-line? And have some special \xC0\a to store the
current readline and set the cd command and some special \xC0\n to
restore it.

Or do you mean something else?


> What is your final goal? Is it to reset keybindings or to update the
> directory name in the prompt after changing directory in `bind -x'?
> If it is the former you could pursue changing the keybindings in
> `bind -x', but if it is the latter there is no reason to use fragile
> keybinding resets.

My goals would (I guess) be:

- Let the user assign 1-n keyseqs, which call a widget function that
  changes the CWD, with each of the 1-n keyseqs allowing the widget to
  be called with different options (allowing for e.g. different
  starting points for the find, different filters of filesystems like
  remote-fs-types, different fzf options, etc.).

- Try to limit modifications of the shell environment as far as
  reasonably possible (i.e. in terms of global vars that I use) in
  order to prevent collisions.
  Of course I'll have at least one function for my widget and at least
  one keybinding (e.g. Alt-C) to start it.
  But I would want to avoid having e.g. one function that stores the
  current READLINE_* and sets the cd command, and another one that
  restores the stuff, when both can be in one.

- Rule out any chances that key bindings set up by my code break
  something else (like other tools, that do similar stuff) OR that
  other such tools break mine.

  This excludes of course the one "main" keybinding that the user has
  to set up in order to launch the widget.

  But apart from that, I think it would be best, that no keybindings
  that I need are already/still set up AND that nothing I set up is
  left behind.
  I.e. all the helper key bindings, should ideally only be there when
  needed.

  Because even if I choose keyseqs that are highly unlikely to be used
  by the user (or by some other tool that does similar things to mine),
  there is some chance of a collision, if only because another tool
  copy&pasted code from "mine" and thereby uses the same keyseqs.
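
The "both in one" point from the second goal could look roughly like
this. It is only a minimal sketch: the function name and the two
__mytool_saved_* variables are placeholders of mine, not anything that
exists in fzf.

```shell
# Hypothetical helper: one function for both phases, selected by $1.
__mytool_readline_state() {
    case "$1" in
        save)
            # Remember the user's current line and cursor position.
            __mytool_saved_line=$READLINE_LINE
            __mytool_saved_point=$READLINE_POINT
            ;;
        restore)
            # Put the remembered line back and drop the temporaries.
            READLINE_LINE=$__mytool_saved_line
            READLINE_POINT=$__mytool_saved_point
            unset -v __mytool_saved_line __mytool_saved_point
            ;;
        *)
            return 2
            ;;
    esac
}
```

The two helper bindings would then call the same function with
different arguments, e.g.
bind -x '"\xC0\a": __mytool_readline_state save'
which still costs two global variables, but only one function name.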


> Please read my first reply. First, invalid UTF-8 sequences will never
> happen in the input stream from the terminal.

Yes... but a) is that still guaranteed if someone uses a non-UTF-8
locale?
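
Whether the guarantee holds there is a separate question, but at least
detecting the case is simple. A minimal sketch (the function name is
hypothetical; it relies on the POSIX `locale charmap' command and, as an
assumption, only accepts the spelling "UTF-8"):

```shell
# Hypothetical helper: succeed only if the current locale's character
# map is reported as UTF-8.
__mytool_locale_is_utf8() {
    case "$(locale charmap 2>/dev/null)" in
        UTF-8) return 0 ;;
        *)     return 1 ;;
    esac
}
```

A widget could call this once at setup and refuse to install the
invalid-UTF-8 helper bindings (or fall back to something else) when it
fails.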


> Next, the code space of
> invalid UTF-8 sequence is large, so you can choose a larger code that
> is unlikely to conflict. For example, the codespace of overlong UTF-8
> encodings [4] has the size of (128 + 2048 + 65536 =) 67712. If you
> think 67712 is not enough, you can use one of them with sufficiently
> large code as an introducer to an arbitrary length sequence, then you
> have effectively an infinite size of the codespace.
> 
> [4] https://en.wikipedia.org/wiki/UTF-8#Overlong_encodings

And b) what if someone uses the same key seqs (and either my code
breaks his or vice versa)?

Even if I use larger invalid sequences, I still only get a very high
probability that no collisions will happen.
And even that only works if everyone who does such magic picks their
own sequences.
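
For reference, the 67712 quoted above can be reproduced from the byte
ranges of the overlong forms (ranges as described in [4]; the variable
names are just for this sketch):

```shell
# Count overlong UTF-8 encodings per length (64 continuation-byte
# values per trailing byte).
two_byte=$(( 2 * 64 ))         # C0 80 .. C1 BF          -> 128
three_byte=$(( 32 * 64 ))      # E0 80 80 .. E0 9F BF    -> 2048
four_byte=$(( 16 * 64 * 64 ))  # F0 80 80 80 .. F0 8F BF BF -> 65536
total=$(( two_byte + three_byte + four_byte ))
printf '%d\n' "$total"         # prints 67712
```

So the space is large, but it is a fixed, publicly known space, which
is why it only gives a probability and no guarantee.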


I'm not sure whether that's enough certainty when e.g. the user may
have a current readline of "rm -rf *", sees "Oh, I'm in the wrong dir,
so let's Alt-C to where I really wanna be", and because of some
collision my widget would then actually execute the current readline
and not the changed one with the cd command.

Your idea would have been about this, right?
   bind    '"\ec":    "\xC0\a\xC0\r\xC0\n"'
   bind -x '"\xC0\a": store_current_and_set_new_readline'
   bind    '"\xC0\r": accept-line'
   bind -x '"\xC0\n": restore_old_readline'
And that would have been set only once, right?

If, by bad luck, something changes \xC0\a to do something else or maybe
nothing, the following \xC0\r would already have caused a "rm -rf *" in
the current readline to be executed.
Or, by bad luck, my definition may overwrite one that something else
depends upon.


And yes I can use longer keyseqs and reduce probability... or... it
stays just as high if the next guy (or ChatGPT ^^) re-uses "my" code.
Just as I could have re-used your \xC0\a .


What I can live with is:
My code not working at all (in the sense of: failing gracefully)
because bash is too old, or e.g. the restoration phase failing and the
shell remaining in a completely unusable state because *all* bindings
are gone.
Obviously I wouldn't want that either, but it's IMO far less
problematic (user can simply start a new shell) than...

... what I can rather not live with:
Which is e.g. breaking something else, which may cause arbitrary
damage, or something else breaking my tool, *if* that could cause then
some arbitrary command (e.g. what the user has in the readline) to be
executed.

Obviously, some things cannot be prevented, even if I have only the one
starting binding like:
   bind -x '"\ec": mytool'
which sets everything up (helper keybindings, etc.) and restores
everything afterwards:
- other code could overwrite mytool
- other code could overwrite the \ec binding

But then, none of mytool would be part of any possible damage. \ec
would already call something else and/or mytool would no longer be
mine.
And if my store&restore works right, I cannot break anyone else's
bindings.
That's the hope at least.
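
A rough idea of the store&restore part, as far as the documented bind
options go: `bind -p' and `bind -s' print function and macro bindings
in a re-readable format, and `bind -f' reads such a file back. The
function names below are hypothetical, and the sketch deliberately does
NOT cover bindings made with `bind -x', nor does it remove key
sequences that were bound after the snapshot was taken.

```shell
# Keep only lines describing active bindings; `bind -p' lists unbound
# functions as comment lines such as "# vi-yank-to (not bound)".
__mytool_active_bindings() {
    grep -v -e '^#' -e '^[[:space:]]*$'
}

# Snapshot function and macro bindings of the current keymap into the
# file named by $1 (`bind -x' bindings are not included).
__mytool_save_bindings() {
    { bind -p; bind -s; } 2>/dev/null | __mytool_active_bindings > "$1"
}

# Re-apply a snapshot written by __mytool_save_bindings. This re-binds
# what is in the file but does not unbind anything else.
__mytool_restore_bindings() {
    bind -f "$1"
}
```

Anything that must disappear again (the helper keyseqs) would still
have to be removed explicitly, e.g. with `bind -r'.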


> > So whatever I do it seems I have to live with possible breakage
> > because of keyseqs:
> > - either (when using "my" approach) because there are other keyseqs
> >   in-between, that break my own
> 
> No. It doesn't happen with macros.

What can at least happen with macros (but also with "my" approach) is
the following.

Assume you'd have it set up like you propose:
bind '"\ec": "\xC0\a\xC0\r\xC0\n"'
bind -x '"\xC0\a": echo store and set the cd'
bind    '"\xC0\r": accept-line'
bind -x '"\xC0\n": echo restore'

*If* they're still all set up like this (which is IMO at best likely,
but not guaranteed) when one presses \ec then all are guaranteed to be
executed in a row.

But a user could still have e.g.
  PROMPT_COMMAND+=("bind -x '\"\xC0\n\": echo pwned'")
set by some other tool, which also tries to play smart games.
Or the same via some command substitution in PS1.

That would overwrite the binding *while* the macros are executed.



> > - or (when using your approach) because I may break other
> >   keybindings or others may break mine?
> 
> If you would like to avoid conflicts, you should choose a larger code
> for the invalid UTF-8 sequences. There is still a possibility of
> conflicts but it's unlikely to happen unless the user tries to make
> the conflict on purpose.

I guess that's the main point where we differ in views.

You seem to think it's enough to make collisions unlikely; I am
doubtful whether that's enough, and whether it's even realistic to try
that (I might just as well have simply re-used your \xC0\a and \xC0\r
and thereby already caused a possible conflict with anyone else who
uses your code from
https://lists.gnu.org/archive/html/help-bash/2022-02/msg00023.html ).



> 
> > So... I continued to think and what if in step (2) I don't just set
> > the bindings I need, but completely remove any others (before doing
> > the printf '\e[5n')?
> 
> If you stick to \e[5n and \e[0n, these bindings are not needed at all.
> I provided the solution with macros bound to the invalid UTF-8
> sequences as an alternative way to invoke a readline function. There
> is really no reason to do both of two solutions to achieve the same
> thing.

What I don't understand there is: Your solution starts off with
   bind '"\ec": "\xC0\a\xC0\r\xC0\n"'

With that, I can:
- either only trust/hope that all these bindings are still "mine"
  and risk damage if not
or
- I can at best use \xC0\a to re-setup the following \xC0\r and \xC0\n
  to what I want them to be.
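
For the first option one could at least detect a changed helper binding
before relying on it. A minimal sketch, assuming the exact dump line
was recorded right after setting the binding (the function name is
hypothetical, and this does not close the window between the check and
the actual use):

```shell
# Hypothetical helper: succeed only if the exact line $1 occurs in the
# binding dump read from stdin (fixed-string, whole-line match).
__mytool_binding_intact() {
    grep -Fxq -- "$1"
}
```

It would be fed from the current dump, e.g.
bind -p | __mytool_binding_intact "$recorded_line"
where recorded_line is whatever `bind -p' printed for that keyseq when
the binding was installed.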



> > Even if some keyseq is already in the queue, e.g. because the user
> > did a fast \ecXXXX, XXXX, whatever it is, should(?) have no effect
> > when it's finally processed after the function returns after the
> > printf '\e[5n'
> 
> XXXX may invoke the temporary binding that you set before sending
> \e[5n.

The function that \ec calls, would remove all bindings other than the
three further ones needed, i.e.:
    bind    '"\e[0n": "\xC0\r\xC0\a"'
    bind    '"\xC0\r":  accept-line'
    bind -x '"\xC0\a":  __fzf_bash_integration_widget_cd -_'

So only these could be injected and have an effect, right?
And I could also use invalid UTF-8 for two of them, so they shouldn't
appear, anyway.

If XXXX were \xC0\r ... well, that's what should happen next anyway,
so the only problem would be that I get another (empty) accept-line
when the \e[0n finally arrives.

If XXXX were \xC0\a ... I could check whether the CWD is already
changed and READLINE_LINE empty. If not I'd know that this must have
happened, and I wouldn't restore the old READLINE_*, but would restore
the old keybindings, and perhaps remove \e[0n afterwards.
If \e[0n then arrives it should have no effect.

If XXXX were \e[0n e.g. another one caused by something else.
Well the first one (whether "mine" or not) would cause what I anyway
want to do. Which would also reset \e[0n to what it was before, which
is probably what the sender of the 2nd one wanted to cause.


And no, I'm not answering my own questions, I simply wonder if it would
work like that and what's the best solution.


> > Does that mean, that I can even do my original schema (which is the
> > code from __fzf_bash_integration_widget_cd() as given here:
> > [...]
> > and be safe?
> 
> You can if it is fine that user inputs are fed before \e[5n arrives.
> However, there are probably more such situations than you think. When
> the user pastes a text into the terminal, many bytes will be sent at
> the same time in a packet. In this case, when the shell receives the
> initial sequence (\ec), the following bytes arrive in the shell even
> before the `bind -x' function bound to \ec is run. For another
> example, when the user is connecting to a remote host with a slow
> connection, the user inputs can also be combined into a single packet.

I see. So I assume \ec would eventually be executed.

The following bytes - would they already be in the shell, waiting to be
processed as keyseqs... or would they be eaten up by fzf (once I've
started that)?

If it would be the former, and if my ideas from above would work out...
then wouldn't the worst thing that happens be, that all these follow up
bytes are ignored, because I've removed their key bindings?



Thanks,
Chris.



[0] https://github.com/junegunn/fzf/issues/3461
    This is basically what this mail thread here is about: me trying to
    make the whole thing resilient against collisions with other
    keybindings.
    But I don't see at all how this could be made working with bash
    versions that don't have bind -x .
[1] https://github.com/junegunn/fzf/pull/3476#issuecomment-1760550281
    Where I outlined some ideas, e.g. that fzf's bash integration
    should be able to simply re-use whatever bash-completion produces,
    etc.


