From: Tom Lord
Subject: Re: [Gnu-arch-users] Increasing the filename space (Or: begging for trouble?)
Date: Tue, 3 Feb 2004 14:52:55 -0800 (PST)

    > From: Christian Thäter <address@hidden>

    > 4) some of the current commands print formatted data like
    >    file/patch lists; these lists might be '--escaped' if one intends
    >    to process them with a script

That should be the default.


    > 5) tla escape 'foo bar' -> 'foo\40bar'
    >    tla unescape 'foo\40bar' -> 'foo bar'

    > People told me that some external utilities parse tla metadata.
    > I didn't check, but if they do, they need to be escaping-aware
    > and need to be updated. These commands are just used as the
    > authoritative escaping engine for such scripts to ease
    > maintenance.

Great.   Much more useful would be a document specifying the encoding.

I think that the specific encoding you are showing is problematic.  To
illustrate, let me write a filename with each character enclosed in
'<...>' and the tab character written as <TAB>.  What does your
encoding produce for:

        <f><o><o><TAB><6><b><a><r>

Does that escape to:

        foo\116bar

?

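The tab is octal 11 and the next character is a literal `6', so a
decoder cannot tell where the escape ends: \116 is also the octal
code for `N'.  Please keep in mind that as Unicode support is added,
the set of whitespace characters grows beyond just those in ASCII.
As far as I am concerned, that escaped string spells (at best):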

        fooNbar
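
To make the ambiguity concrete, here is a minimal Python sketch.  It
assumes the proposed encoding is "backslash followed by the octal code,
with no fixed width or terminator", which is only inferred from the
'foo\40bar' example; naive_escape and naive_unescape are hypothetical
names, not tla commands.

```python
import re

def naive_escape(name):
    """Assumed encoding: whitespace and backslash become \\<octal>, variable width."""
    out = []
    for ch in name:
        if ch.isspace() or ch == "\\":
            out.append("\\%o" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)

def naive_unescape(text):
    """Greedy decoder: consume up to three octal digits after a backslash."""
    return re.sub(r"\\([0-7]{1,3})", lambda m: chr(int(m.group(1), 8)), text)

original = "foo\t6bar"
escaped = naive_escape(original)      # the tab (octal 11) runs into the literal 6
decoded = naive_unescape(escaped)     # the decoder reads \116, i.e. 'N'
print(repr(escaped), repr(decoded), decoded == original)
```

The round trip fails because nothing marks where the escape ends.  A
fixed width or an explicit terminator would remove the ambiguity.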



    > 6) the escaping engine is table driven; by default it currently
    > uses a brute-force search which is somewhere around O(N*M).

Something seems to have gone fairly wrong, then -- but at least it is
captured behind one of your internal abstraction barriers.

    > The lookup strategy can be changed. For statically defined tables
    > like those in tla, it would be nice to create a perfect hash at
    > compile time to make it much faster.

Alternatively, choose a better syntax that doesn't create the problem
you're trying to solve.

Actually -- I have a syntax that I'd prefer:  the one that's shaping
up for Pika Scheme which would permit either:

        foo\(tab)bar

or

        foo\(U+9)bar

or 
        foo\(U+0009)bar

etc.

The latter forms should require no associative lookup of character
names.
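
A rough Python sketch of how such a delimited syntax could be encoded
and decoded.  The choice of which characters to escape (whitespace,
backslash, non-printables) and the small NAMED table are assumptions
made for illustration; the only syntax taken from above is the
\(tab), \(U+9) and \(U+0009) forms.

```python
import re

# Hypothetical name table; only needed to *read* forms like \(tab).
NAMED = {"tab": "\t", "space": " ", "newline": "\n"}

_TOKEN = re.compile(r"\\\(([^()]*)\)")

def escape(name):
    """Emit \\(U+HEX) for backslash, whitespace and non-printable characters."""
    out = []
    for ch in name:
        if ch == "\\" or ch.isspace() or not ch.isprintable():
            out.append("\\(U+%X)" % ord(ch))   # no table lookup required
        else:
            out.append(ch)
    return "".join(out)

def unescape(text):
    """Decode \\(U+HEX) and \\(name) tokens; malformed input is not validated."""
    def decode(match):
        body = match.group(1)
        if body.upper().startswith("U+"):
            return chr(int(body[2:], 16))      # leading zeros are harmless
        return NAMED[body]                     # named form: table lookup
    return _TOKEN.sub(decode, text)

print(escape("foo\t6bar"))
print(unescape("foo\\(tab)bar") == unescape("foo\\(U+0009)bar"))
```

Because the closing parenthesis terminates every escape, a digit that
follows an escaped tab can no longer be absorbed into it.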

    > For dynamically built/loaded escaping tables

An interesting exercise but I'm not sure why one would want such a
thing for the problem at hand.

    > an AVL tree might be the choice (easy insert, no deletes). Note:
    > I consider these parts a fun exercise with some real value, but
    > they have no priority.

Good.

    >> Also: have you been using and do your patches include updates to both
    >> the tests run by `make test' and the `changeset-tests'?   The latter
    >> is absolutely critical.

    > Yes! Do you want a forked copy of the test suite (which will slow
    > the tests down), or would it be enough if the current tests are
    > changed to contain spaces in filenames? (I prefer that, since it
    > has no other side effect and doesn't make it twice as slow.)

You can just add some `make test' tests that involve spaces.

The changeset-tests should really exercise the mechanism using
randomly generated filenames.
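
As an illustration of the idea only (this is not the changeset-tests
harness), a randomized round-trip check might look like the Python
sketch below.  The escape and unescape functions are hypothetical
stand-ins using a \(U+HEX) form, and ALPHABET is an arbitrary choice
biased toward awkward characters.

```python
import random
import re

def escape(name):
    """Stand-in encoder: \\(U+HEX) for backslash, whitespace, non-printables."""
    return "".join(
        "\\(U+%X)" % ord(ch)
        if ch == "\\" or ch.isspace() or not ch.isprintable() else ch
        for ch in name
    )

def unescape(text):
    """Stand-in decoder for the \\(U+HEX) form only."""
    return re.sub(r"\\\(U\+([0-9A-Fa-f]+)\)",
                  lambda m: chr(int(m.group(1), 16)), text)

# ASCII and Unicode whitespace, digits, backslash and the escape syntax's
# own punctuation.  '/' and NUL are left out since they cannot appear in
# a Unix filename component.
ALPHABET = "abc019 \t\n\\()+U\u00a0\u2003\u00e4"

def random_filename(rng, max_len=12):
    length = rng.randint(1, max_len)
    return "".join(rng.choice(ALPHABET) for _ in range(length))

def check_round_trip(trials=1000, seed=0):
    rng = random.Random(seed)          # fixed seed keeps failures reproducible
    for _ in range(trials):
        name = random_filename(rng)
        escaped = escape(name)
        # Escaped names must be safe to split on whitespace in scripts.
        assert not any(c.isspace() for c in escaped), repr(escaped)
        assert unescape(escaped) == name, (name, escaped)
    return trials

print(check_round_trip())
```

Seeding the generator means a failing name can be reproduced from the
seed alone.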


-t



