bug-gnulib
Re: quotearg improvements [was: filenames in error messages]


From: Eric Blake
Subject: Re: quotearg improvements [was: filenames in error messages]
Date: Wed, 13 Feb 2008 18:12:50 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.9) Gecko/20071031 Thunderbird/2.0.0.9 Mnenhy/0.7.5.666

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Karl Berry on 2/13/2008 5:45 PM:
|
|     the "c" quoting style now outputs "\"?\"\"?/\""
|     ("?""?/") rather than "\"?\\?/\"" ("?\?/"),
|
| Sorry, I'm not following this.  What's the original filename?

Consider the original filename of `dir??/file'.  Before my patch, the
c_quoting_style converted it to `"dir?\?/file"', since `??/' is a trigraph
for `\', but that is not a valid C string.  Right now, the output is
`"dir?""?/file"', i.e. two concatenated C strings, so that a C parser
would unambiguously recognize the quoted output, even if it is parsing
trigraphs.
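
To make the splitting concrete, here is a minimal sketch of the idea.  It
is not the quotearg code; quote_split_trigraphs is a hypothetical helper
written only for illustration, and it handles nothing beyond the trigraph
split plus escaping of `"' and `\'.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of gnulib.  Return a malloc'd copy of IN
   wrapped in double quotes.  Whenever two question marks are followed by
   a character that would complete a trigraph, close and reopen the
   string literal between the two question marks.  Returns NULL if
   allocation fails; the caller frees the result.  */
static char *
quote_split_trigraphs (const char *in)
{
  /* The nine characters that can end a trigraph.  */
  static const char trigraph_tail[] = "=(/)'<!>-";
  size_t n;
  size_t i;
  char *out;
  char *p;

  assert (in != NULL);
  n = strlen (in);
  /* Worst case: every byte grows to 3 bytes, plus 2 quotes and a NUL.  */
  out = malloc (3 * n + 3);
  if (out == NULL)
    return NULL;

  p = out;
  *p++ = '"';
  for (i = 0; i < n; i++)
    {
      char c = in[i];
      if (c == '?' && in[i + 1] == '?' && in[i + 2] != '\0'
          && strchr (trigraph_tail, in[i + 2]) != NULL)
        {
          /* Emit the first question mark, then split the literal.  The
             second question mark is emitted on the next iteration.  */
          *p++ = '?';
          *p++ = '"';
          *p++ = '"';
        }
      else if (c == '"' || c == '\\')
        {
          *p++ = '\\';
          *p++ = c;
        }
      else
        *p++ = c;
    }
  *p++ = '"';
  *p = '\0';
  return out;
}
```

Calling quote_split_trigraphs ("dir?" "?/file") yields `"dir?""?/file"'.
The argument is itself written as two concatenated literals so that the
example source does not contain a trigraph.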

|
|     this assumes that C string concatenation is acceptable in that style
|
| Then we'll have to say that.  I did not imagine that it would be
| necessary.  Indeed, it seems problematic to me, it means a parsing
| program has to recognize whether the character after the first string
| constant is another string constant or (I guess) a :.  That seems like
| nontrivial complexity to be adding.

Maybe it's worth adding another flag to the quotearg module.  When off
(the default), output trigraphs without worrying about extra quoting, since
trigraphs default to off in gcc; when enabled, output concatenated C
strings wherever the output would otherwise contain a trigraph.

|
|     #include "quotearg.h"
|     ...
|     set_quoting_style (NULL, c_maybe_quoting_style);
|     quotearg_colon (string);
|
| Excellent.
|
| Can we add something to the .texi about this?
|

I'll try to spend some time on this.

|
| Meanwhile, I had sent a proposed simple change to rms for standards.texi
| about this.  No problem with the principle, but he wants to specify the
| exact list of troublesome characters and one escape to use for each, not
| just say "like C string constants".
|
| I suppose we could always use \OOO, but somehow using \n and the like
| seems like it would be much more readable.  So it'll take me a little
| time to work up that list.  And I'm not sure what effect this new
| wrinkle will have on your code, sorry.

For C strings, the code already outputs \a, \b, \f, \n, \r, \t, \v, \\,
\"; and for all other non-printable characters, a 3-digit \nnn octal
(except for NUL, which is abbreviated to \0 if the next character output
is not a digit, but never a 2-digit octal).  But now you've made me worry
whether we should also quote shell characters.  For example, should it be:

program:question mark?:line: message
or
program:"question mark?":line: message

since both ' ' and '?' are special to the shell, but not to C strings?
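
For reference, the C-string escaping rules described above (named
escapes, 3-digit octal for other non-printable bytes, and the NUL special
case) can be sketched as follows.  This is an illustration only, not the
quotearg implementation; escape_c_bytes is a hypothetical name, and the
sketch assumes the "C" locale for isprint.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of gnulib.  Write the escaped form of
   the LEN bytes at IN into OUT, which must have room for 4 * LEN + 1
   bytes.  Returns the length of the result, excluding the final NUL.  */
static size_t
escape_c_bytes (const char *in, size_t len, char *out)
{
  char *p = out;
  size_t i;

  assert (in != NULL && out != NULL);
  for (i = 0; i < len; i++)
    {
      unsigned char c = (unsigned char) in[i];
      char esc = 0;

      switch (c)
        {
        case '\a': esc = 'a'; break;
        case '\b': esc = 'b'; break;
        case '\f': esc = 'f'; break;
        case '\n': esc = 'n'; break;
        case '\r': esc = 'r'; break;
        case '\t': esc = 't'; break;
        case '\v': esc = 'v'; break;
        case '\\': esc = '\\'; break;
        case '"': esc = '"'; break;
        }

      if (esc != 0)
        {
          *p++ = '\\';
          *p++ = esc;
        }
      else if (c == '\0')
        {
          /* NUL is shortened to \0 unless a digit follows, in which
             case the full 3-digit form is needed to stay unambiguous.  */
          int next_is_digit = (i + 1 < len
                               && isdigit ((unsigned char) in[i + 1]));
          p += sprintf (p, "%s", next_is_digit ? "\\000" : "\\0");
        }
      else if (!isprint (c))
        p += sprintf (p, "\\%03o", (unsigned int) c);
      else
        *p++ = (char) c;
    }
  *p = '\0';
  return (size_t) (p - out);
}
```

Given the eight bytes 'a', TAB, 'b', NUL, '1', NUL, 'z', 0x01, this
produces a\tb\0001\0z\001: the first NUL is followed by a digit and so
gets the 3-digit form, while the second is abbreviated.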

- --
Don't work too hard, make some time for fun as well!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHs5WS84KuGfSFAYARArM7AJ0VWZFSlKuzvihLBB80o4MozFBGiwCfTq0k
P8ha9GbMAIgVfr1K5h8Vb6k=
=bItm
-----END PGP SIGNATURE-----



