Re: [TUHS] Re: A fuzzy awk. (Was: The 'usage: ...' message.)
From: G. Branden Robinson
Subject: Re: [TUHS] Re: A fuzzy awk. (Was: The 'usage: ...' message.)
Date: Wed, 29 May 2024 01:39:32 -0500
Hi Frederic,
At 2024-05-29T05:21:38+0200, Frederic Chartier wrote:
> On 2024-05-20 09:00 -0500, G. Branden Robinson wrote:
> > For grins, and for a data point from elsewhere in GNU-land, GNU
> > troff is pretty robust to this sort of thing. Much as I might like
> > to boast of having improved it in this area, it appears to have
> > already come with iron long johns courtesy of James Clark and/or
> > Werner Lemberg. I threw troff its own ELF executable as a crude
> > fuzz test some years ago, and I don't recall needing to fix anything
> > except unhelpfully vague diagnostic messages (a phenomenon I am
> > predisposed to observe anyway).
> >
> > I did notice today that in one case we were spewing back out
> > unprintable characters (newlines, character codes > 127) _in_ one
This would have been better said as "character codes > 159"; see below.
> > (but only one) of the diagnostic messages, and while that's ugly,
> > it's not an obvious exploitation vector to me.
>
> Going off-topic but I need a clarification. Are you saying that
> you wouldn't consider writing arbitrary characters to a terminal
> a security risk ?
In _groff_? No.
> To rephrase that in the form of a scenario :
>
> 1. Attacker crafts file that, when directly or indirectly
> processed by our program, causes it to include string /s/ in
> an error message,
>
> 2. Victim runs our program. Error message goes to standard error
> which is written to the victim's terminal.
>
> Is there no value of /s/ that could be considered harmful ?
I would not make that claim.
However, this is not groff's problem to solve. Its programs' standard
error streams (like its standard output streams) can be redirected.
Consider:
groff -Tps -mm whatever.mm > /dev/sda
Clearly, if one has write privileges to /dev/sda, this is an obvious
denial-of-service attack (assuming a device "lives" at /dev/sda). And
possibly something worse if whatever.mm is a nefariously crafted
document. (It could, perhaps, use `output` or `device` requests to
inject interesting material into the output despite postprocessing by
grops(1).)
But, in normal circumstances, redirecting output streams to files
(mundane files, not block devices) is not considered a hazardous
operation.
So what if one of those streams goes to a terminal emulator?
Well, certainly, a terminal emulator, or a shell, can decide to perform
any operation it wants upon receiving certain input.
I can do the following today without leveraging any sloppy handling of
diagnostic messages in GNU troff.
$ `printf '.tm rm -rf /\n' | groff 2>&1` # DO NOT RUN THIS
Even worse, if someone tricks me into specifying the `-U` flag...
$ `printf '.sy rm -rf /\n' | groff -U` # DO NOT RUN THIS, EITHER
But these hazards have been well known for decades and the latter is why
GNU troff _has_ "safer" mode as a default, whereas AT&T troff did not.
I have certainly seen terminal emulator (mis)behavior such that, after
accidentally catting some binary file to the terminal, some garbage
characters got stuffed back into the input stream and populated my shell
prompt, and, alarmingly, would be executed by the shell if I pressed
enter without killing the line first.
I haven't seen that problem in years. As I understand it, the
"bracketed paste mode" now supported by GNU Bash (and some other shells)
and by the XTerm terminal emulator (possibly among others) has closed
this door. (Perhaps bracketed paste mode proper didn't close this
security hole, but contemporaneous work around it did. I'm pretty vague
on the history. Anyone have any pointers to good resources?)
I'd say that, largely, the problem that concerns you is one that shells
and terminal emulators need to address, and to my knowledge, they have.
There does remain another point to consider. What, exactly, _can_ be
injected into the standard error stream by the mechanism described
earlier in the thread?
Here is the new diagnostic (still awaiting my push--I've had my
attention on other things for a few days).
35 troff:/usr/bin/troff: error: invalid positional argument number
(unprintable)
This diagnostic occurs only when an invalid escape sequence of the form
\$1
\$2
\$3
...
\$9
or
\$(01
...
\$(99
or (in a GNU troff extension)
\$[12345]
is encountered.
Unless you can wreak havoc with only 1 or 2 bytes, the first two look
like an unlikely vector. Can your exploit tolerate interruption by a
newline and another *roff diagnostic message?
So let's try something.
$ hd /tmp/branden/nefarious.groff
00000000 2e 74 6d 20 74 68 69 73 20 69 73 20 61 20 74 65 |.tm this is a te|
00000010 73 74 20 6f 66 20 73 65 74 74 69 6e 67 20 74 68 |st of setting th|
00000020 65 20 78 74 65 72 6d 20 74 69 74 6c 65 20 76 69 |e xterm title vi|
00000030 61 20 61 20 67 72 6f 66 66 20 64 6f 63 75 6d 65 |a a groff docume|
00000040 6e 74 0a 48 65 6c 6c 6f 2c 20 5c 24 5b 1b 5d 30 |nt.Hello, \$[.]0|
00000050 3b 65 76 69 6c 20 6c 61 75 67 68 74 65 72 07 5d |;evil laughter.]|
00000060 0a 77 6f 72 6c 64 21 0a |.world!.|
00000068
$ /usr/bin/groff --version|head -n 1
GNU groff version 1.22.4
That should be old enough to have problems.
$ /usr/bin/groff -Tascii -ww /tmp/branden/nefarious.groff | cat -s
this is a test of setting the xterm title via a groff document
troff: /tmp/branden/nefarious.groff:2: warning: invalid input character code 27
troff: /tmp/branden/nefarious.groff:2: empty escape name
troff: /tmp/branden/nefarious.groff:2: warning: can't find character with input code 7
Hello, 0;evil laughter] world!
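For anyone who wants to repeat the experiment, the test file can be rebuilt from the hex dump above. This is a minimal sketch; the scratch path is an assumption (the original lived at /tmp/branden/nefarious.groff), and the bytes are taken directly from the dump.

```shell
#!/bin/sh
# Assumption: any writable scratch path will do.
f=${TMPDIR:-/tmp}/nefarious.groff

# \033 is ESC and \007 is BEL; \\ yields the single backslash of "\$[".
printf '.tm this is a test of setting the xterm title via a groff document\nHello, \\$[\033]0;evil laughter\007]\nworld!\n' > "$f"

# The hex dump ends at offset 0x68, so the file should be 104 bytes long.
wc -c < "$f"

# Inspect the bytes without sending them raw to the terminal.
od -An -tx1 "$f"
```

Viewing the result through od (or hd, as above) rather than cat keeps the ESC and BEL bytes away from the terminal emulator.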
So there are (and apparently have been for many years) a couple of
barriers to injecting arbitrary content including escape sequences into
the standard error stream via GNU troff.
1. The escape character itself gets rejected in input. You're going to
have a hard time getting a 7-bit clean ECMA-48/ISO 6429/"ANSI"
escape sequence[0] started if the escape character is filtered out.
Invalid input characters are discarded early in GNU troff's lexical
analysis; for the purpose of _parsing_ a *roff document, they don't
exist.
https://www.gnu.org/software/groff/manual/groff.html.node/Identifiers.html
2. ECMA-48 "operating system commands" must begin with the sequence
ESC ]. This is a frustrating coincidence for wicked groff documents
seeking to do evil: in my example above, the discard of the escape
character per the previous item leaves an escape sequence that looks
like this:
\$[]
...which is also invalid.
Hence:
troff: /tmp/branden/nefarious.groff:2: empty escape name
(That diagnostic seems a little vague to me. Maybe I'll change it.)
3. Having decided that the escape sequence is invalid, GNU troff leaves
escape-sequence-handling state, and interprets remaining characters
as formatted text (or whatever the enclosing context was).
Thus:
Hello, 0;evil laughter] world!
Disappointingly for our miscreant, the xterm title is unmolested.
4. There are other ECMA-48 escape sequences. Many are of a form
starting with ESC [. The opening bracket will be syntactically
valid in more groff escape sequences, but the removal of the escape
_character_ will still happen and frustrate attempts at _terminal_
escape sequence injection to the standard error stream.
5. There is the CSI approach. This is rapidly dying because, in UTF-8
environments, you can't use CSI (0x9B) to introduce escape sequences
anymore: that byte is also a valid continuation byte in a UTF-8
character sequence. So any tool that consumes UTF-8 will interpret it
thus (possibly discarding it altogether if the UTF-8 sequence is
invalid) and not switch to escape sequence interpretation mode.
6. But, you may argue, groff doesn't interpret UTF-8 input yet. One of
the main reasons it doesn't in fact rides to our rescue here. groff
uses some of the code points in the C1 control area (0x80-0x9F) to
represent objects of interest to the parser.
https://git.savannah.gnu.org/cgit/groff.git/tree/src/roff/troff/input.h?h=1.23.0#n22
It turns out that 0x9B (CSI) (155 decimal, 233 octal) is _not_
allocated to any such meaning in GNU troff.
So can we pass it in?
$ printf 'hello,\233world' | groff -Tascii -ww | cat -s
troff:<standard input>:1: warning: invalid input character code 155
hello,world
Nope.
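The byte arithmetic in items 5 and 6 is easy to check from the shell. The sketch below is illustrative only and needs no groff; the function names are hypothetical, and U+271B is an arbitrary character picked because its UTF-8 encoding happens to end in 0x9B.

```shell
#!/bin/sh
# Hypothetical helper: print 0x9B in decimal and octal, the two forms
# quoted in the text (155 and 233).
csi_codes() {
    printf '%d %o\n' 0x9B 0x9B
}

# Hypothetical helper: dump the UTF-8 encoding of U+271B, which is
# E2 9C 9B. Its last byte is 0x9B serving as a continuation byte,
# not as CSI.
utf8_with_9b() {
    printf '\342\234\233' | od -An -tx1
}

csi_codes
utf8_with_9b
```

A consumer that decodes UTF-8 sees that trailing 0x9B as part of a character, which is why an 8-bit CSI cannot be relied upon in such environments.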
In fact let's go ahead and explore the space of C0 and C1 controls.
$ for dec in $(seq 0 31) $(seq 128 159); do oct=$(printf "%o" $dec); \
printf "\\$oct\\n" | ~/groff-stable/bin/groff -Tascii -ww -z; done
troff:<standard input>:1: warning: invalid input character code 0
troff:<standard input>:1: warning: character with input code 2 not defined
troff:<standard input>:1: warning: character with input code 3 not defined
troff:<standard input>:1: warning: character with input code 4 not defined
troff:<standard input>:1: warning: character with input code 5 not defined
troff:<standard input>:1: warning: character with input code 6 not defined
troff:<standard input>:1: warning: character with input code 7 not defined
troff:<standard input>:1: warning: invalid input character code 11
troff:<standard input>:1: warning: character with input code 12 not defined
troff:<standard input>:1: warning: invalid input character code 13
troff:<standard input>:1: warning: invalid input character code 14
troff:<standard input>:1: warning: invalid input character code 15
troff:<standard input>:1: warning: invalid input character code 16
troff:<standard input>:1: warning: invalid input character code 17
troff:<standard input>:1: warning: invalid input character code 18
troff:<standard input>:1: warning: invalid input character code 19
troff:<standard input>:1: warning: invalid input character code 20
troff:<standard input>:1: warning: invalid input character code 21
troff:<standard input>:1: warning: invalid input character code 22
troff:<standard input>:1: warning: invalid input character code 23
troff:<standard input>:1: warning: invalid input character code 24
troff:<standard input>:1: warning: invalid input character code 25
troff:<standard input>:1: warning: invalid input character code 26
troff:<standard input>:1: warning: invalid input character code 27
troff:<standard input>:1: warning: invalid input character code 28
troff:<standard input>:1: warning: invalid input character code 29
troff:<standard input>:1: warning: invalid input character code 30
troff:<standard input>:1: warning: invalid input character code 31
troff:<standard input>:1: warning: invalid input character code 128
troff:<standard input>:1: warning: invalid input character code 129
troff:<standard input>:1: warning: invalid input character code 130
troff:<standard input>:1: warning: invalid input character code 131
troff:<standard input>:1: warning: invalid input character code 132
troff:<standard input>:1: warning: invalid input character code 133
troff:<standard input>:1: warning: invalid input character code 134
troff:<standard input>:1: warning: invalid input character code 135
troff:<standard input>:1: warning: invalid input character code 136
troff:<standard input>:1: warning: invalid input character code 137
troff:<standard input>:1: warning: invalid input character code 138
troff:<standard input>:1: warning: invalid input character code 139
troff:<standard input>:1: warning: invalid input character code 140
troff:<standard input>:1: warning: invalid input character code 141
troff:<standard input>:1: warning: invalid input character code 142
troff:<standard input>:1: warning: invalid input character code 143
troff:<standard input>:1: warning: invalid input character code 144
troff:<standard input>:1: warning: invalid input character code 145
troff:<standard input>:1: warning: invalid input character code 146
troff:<standard input>:1: warning: invalid input character code 147
troff:<standard input>:1: warning: invalid input character code 148
troff:<standard input>:1: warning: invalid input character code 149
troff:<standard input>:1: warning: invalid input character code 150
troff:<standard input>:1: warning: invalid input character code 151
troff:<standard input>:1: warning: invalid input character code 152
troff:<standard input>:1: warning: invalid input character code 153
troff:<standard input>:1: warning: invalid input character code 154
troff:<standard input>:1: warning: invalid input character code 155
troff:<standard input>:1: warning: invalid input character code 156
troff:<standard input>:1: warning: invalid input character code 157
troff:<standard input>:1: warning: invalid input character code 158
troff:<standard input>:1: warning: invalid input character code 159
That's GNU troff 1.23.0 output. groff 1.22.4 looks the same. GNU troff
has been filtering out this sort of garbage for a long time.[1]
The reason I attacked this problem in the first place is that the error
diagnostics _looked_ bad, and, I thought, might lead a user to worry
that GNU troff wasn't sanitizing input as scrupulously as it in fact
does. Less forgiving input validation has indeed been a theme of my
changes to groff over the past 5 years or so, but this sanitation long
predates my participation in its development.
I won't say there isn't _any_ way to put wicked things on the standard
error stream with GNU troff. To make such a claim with any confidence
would require more time than I've given the issue tonight, and likely
the assistance of other people with greater expertise in exploitation.
But fundamentally I think the problem you raise isn't groff's to solve.
Terminal emulators need to be careful with input that doesn't come from
an actual human input device, and the existence of bracketed-paste mode
suggests that they are exercising more caution in this area.
Nevertheless, beyond discarding ESC and CSI characters--among other
control codes less promising for system penetration--from the input
stream, after I push the change noted earlier in the thread, GNU troff
will prove less useful a vehicle for writing discomfiting content to a
terminal device via the standard error stream.
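The general idea of substituting a placeholder for unprintable content can be sketched in shell. This is not how GNU troff implements it (troff is C++, and the change is described above only by its output); `safe_diag` is a hypothetical helper, and treating anything outside printable ASCII as unprintable is an assumption made for the sake of a short example.

```shell
#!/bin/sh
# Hypothetical helper, not groff code: echo "$1" unchanged if every byte
# is printable ASCII; otherwise print a fixed placeholder instead.
safe_diag() {
    # Count the bytes left over after deleting printable characters.
    # LC_ALL=C makes tr classify single bytes rather than characters.
    bad=$(printf '%s' "$1" | LC_ALL=C tr -d '[:print:]' | wc -c | tr -d ' ')
    if [ "$bad" -gt 0 ]; then
        printf '(unprintable)\n'
    else
        printf '%s\n' "$1"
    fi
}

# A harmless argument passes through untouched.
safe_diag '12345'
# An argument carrying ESC and BEL is replaced wholesale.
safe_diag "$(printf '\033]0;evil laughter\007')"
```

Replacing the whole string, rather than stripping only the offending bytes, avoids handing an attacker the printable remainder of a partially filtered sequence.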
And I hope that makes us all feel a little bit (unprintable) better. ;-)
Regards,
Branden
[0]
https://www.ecma-international.org/wp-content/uploads/ECMA-48_5th_edition_june_1991.pdf
[1] One might wonder why a few values low in C0 are not complained
about. They are code points with syntactical meaning to *roff (not
just GNU troff), such as the typographical leader, backspace[2],
horizontal tab, line feed (newline), and a handful of control
characters that historically have been used in formatted output
comparison operators as delimiters (^B, ^C, ^G; ^D, ^E, and ^F are
less often seen). Because GNU troff tracks the "input level" of
delimited arguments, this subterfuge is not necessary in documents
prepared for our formatter.
https://www.gnu.org/software/groff/manual/groff.html.node/Compatibility-Mode.html#index-input-level-in-delimited-arguments
[2] I don't think I've actually documented the semantics of an input
backspace in the GNU troff manual. And at present it says,
seemingly wrongly, that 0x08 is an invalid input character! A task
awaits...