Re: [Monotone-devel] sketch of i18n specification
From: Ori Berger
Subject: Re: [Monotone-devel] sketch of i18n specification
Date: Thu, 20 Nov 2003 01:29:51 +0200
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.4; MultiZilla v1.5.0.2g) Gecko/20031007
graydon hoare wrote:
> please let me know what's wrong with it.
I would, if there was something wrong, but there isn't.
All in all, this is a great solution, which -- once polished --
should probably be compiled into an RFC-like-thingy for other
protocols and programs - most of the things are not monotone
specific at all.
That said, a few comments:
1. filenames:
.. snipped
> - FIXME: what do we do about case sensitivity on Windows?
.. snipped
> - a path component is a sequence of any UTF-8 character codes
>   except:
>     all codes less than 0x20 (ASCII SPACE)
>     0x22 (ASCII " )
>     0x2A (ASCII * )
>     0x2F (ASCII / )
>     0x3A (ASCII : )
>     0x3C (ASCII < )
>     0x3E (ASCII > )
>     0x3F (ASCII ? )
>     0x5C (ASCII \ )
>     0x7C (ASCII | )
>     0x7F (ASCII DEL)
This list should also include '[', '{', ']', '}' (things that need to
be escaped in shell), if I understand the rationale. Also, Windows
needs to escape "," (comma) in the shell.
This, together with the FIXME line above, raises the question: what
does one do when a filename _does_ contain said characters? Refusing
to take it into monotone is a possible solution, but I think there's
a better idea: Use a standard escaping mechanism, e.g., the URL %xx
escaping. This way, the file can be manipulated, tracked, etc.
However, if you try to check it out, you'll get a legal file name.
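
A minimal sketch of the %xx escaping idea in Python, for illustration only. The forbidden set is copied from the quoted list; escaping '%' itself is an added assumption (needed so the mapping can be reversed), and the function names are hypothetical, not part of monotone.

```python
# Code points the quoted draft forbids in a path component.
# 0x20 (space) itself is allowed; only codes below it are excluded.
FORBIDDEN = set(range(0x20)) | {
    0x22, 0x2A, 0x2F, 0x3A, 0x3C, 0x3E, 0x3F, 0x5C, 0x7C, 0x7F,
}
# Assumption: '%' (0x25) is escaped too, so unescaping is unambiguous.
ESCAPED = FORBIDDEN | {0x25}


def escape_component(name: str) -> str:
    """Replace every escaped code point with %XX (two uppercase hex digits)."""
    out = []
    for ch in name:
        cp = ord(ch)
        out.append("%{:02X}".format(cp) if cp in ESCAPED else ch)
    return "".join(out)


def unescape_component(escaped: str) -> str:
    """Inverse of escape_component, for input produced by it."""
    out = []
    i = 0
    while i < len(escaped):
        if escaped[i] == "%":
            out.append(chr(int(escaped[i + 1:i + 3], 16)))
            i += 3
        else:
            out.append(escaped[i])
            i += 1
    return "".join(out)
```

With this sketch, a name such as "a:b*c" would be stored and checked out as "a%3Ab%2Ac", and only translated back where a trusted cert allows it.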
Now, add a certificate for the file that certifies a translation
back to "standard" chars, for a given operating system. Thus, only
if I trust a key that certifies a file as "unix-shell-safe", will it
translate back to native chars on Unix; and only if I trust a key
that certifies a file as "win32-shell-safe", will it translate back
to native chars on Windows.
The monotone program (or a graphical front end) could certify a file
as safe for the architecture it runs on when it is added or renamed.
The Windows case-sensitivity problem can be fixed similarly: if the
same directory contains two names that map to the same case-insensitive
name, both should be given a "win32-alternate-name" cert, or neither
will be checked out.
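
Detecting such collisions could look roughly like the Python sketch below. It uses str.casefold() as a stand-in for Windows case-insensitivity, which is an approximation (the real filesystem rules differ in corner cases); the function name is hypothetical.

```python
from collections import defaultdict


def case_collisions(names):
    """Group the names in one directory that differ only by case.

    Returns a dict mapping the folded name to the sorted list of
    colliding originals; names without a collision are omitted.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}
```

Every name in a returned group would need an alternate name before the directory can be checked out on a case-insensitive filesystem.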
> - manifests are constructed from the internal form (UTF-8). the
>   LC_COLLATE category is *not* used to sort manifest entries.
It's a good thing you state that. So many descriptions leave such
important details out.
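
To illustrate why this matters: a locale-independent ordering can be had by comparing the UTF-8 bytes of each path. The draft only says that LC_COLLATE is not used; sorting by raw bytes is an assumption made for this Python sketch, and the function name is hypothetical.

```python
def manifest_order(paths):
    """Sort paths by their UTF-8 byte sequences, ignoring the locale."""
    return sorted(paths, key=lambda p: p.encode("utf-8"))
```

Under this ordering "B" sorts before "a", and non-ASCII names sort after all ASCII ones, on every machine regardless of its locale settings.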
2. file contents:
.. snipped
> - as an abbreviation, setting the persistent attribute "text" with
>   value "true" will enable both character and line ending conversion.
Don't do that. What if text, lineconv and charconv disagree? "Text"
could be a command line abbreviation, but it shouldn't be a real
certificate, I think.
> - file SHA1 values are calculated from the internal form.
The doc should note that sha1sum, run on the checked-out files, won't
be able to verify a manifest in this case. That's a reasonable price
for cross-platform text files.
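
A small Python sketch of the mismatch. It assumes the internal form is UTF-8 with LF line endings (the LF part is an assumption; the draft only mentions character and line ending conversion), and the helper names are hypothetical.

```python
import hashlib


def to_internal(on_disk: bytes, charset: str) -> bytes:
    """Convert on-disk bytes to the assumed internal form: UTF-8, LF endings."""
    text = on_disk.decode(charset)
    text = text.replace("\r\n", "\n")
    return text.encode("utf-8")


def sha1_hex(data: bytes) -> str:
    """Hex SHA1 digest of a byte string."""
    return hashlib.sha1(data).hexdigest()


# A Latin-1 file with CRLF line endings, as it might exist on Windows.
on_disk = "h\u00e9llo\r\n".encode("latin-1")
internal = to_internal(on_disk, "latin-1")

# sha1sum sees on_disk, while the manifest would record the hash of internal.
hashes_match = sha1_hex(on_disk) == sha1_hex(internal)
```

Here hashes_match is False, because the two byte strings differ in both character encoding and line ending.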
3. UI messages:
> - UI messages are displayed via calls to gettext().
What about programs that want to run monotone and parse the output?
There should be some "-machinereadable" switch that makes all output
suitable for use by other programs.
6. cert values:
> - subject to character set and line ending conversion unless
>   overridden by a hook.
I'm not sure I understand this. When and how would the hook be applied?
Is binary zero a legal character in a cert value?
Ori.