
[Monotone-devel] Re: A Two-Fold Proposal: On Formats And Front-Ends


From: Graydon Hoare
Subject: [Monotone-devel] Re: A Two-Fold Proposal: On Formats And Front-Ends
Date: Tue, 4 Oct 2005 14:02:14 -0700

Hi,

I understand your concern, but I think you're over-stating the problem
(or failing to note the slow direction we're already moving in).
Internally, most of monotone's text formats are converging to a single
style: basic_io. Contrary to what you've suggested, whitespace is not
significant in parsing basic_io. Whitespace *is* dictated and
normalized in basic_io, because the output of basic_io is meant to be
hashed, and whitespace is still hashable bytes. We have to use *some*
level of custom-notation because almost no other text notations
normalize whitespace. Hashing imposes unique requirements.
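To make the hashing point concrete, here is a minimal sketch in Python of a printer that dictates whitespace so the same data always produces the same bytes, and therefore the same hash. It is not monotone's code. The function names, the data layout (a list of stanzas, each a list of symbol/value pairs), the right-aligned symbol padding, the escaping rules, and the choice of SHA-1 are all assumptions made for illustration.

```python
import hashlib


def quote(s):
    # Assumed escaping for this sketch: backslash and double quote only.
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def render(stanzas):
    """Print stanzas in one fixed layout.

    `stanzas` is a list of non-empty stanzas; each stanza is a list of
    (symbol, values) pairs. A `str` value is printed as a quoted string,
    a `bytes` value as a hex blob in square brackets.
    """
    out = []
    for stanza in stanzas:
        # Right-align symbols within a stanza (layout assumed for this sketch).
        width = max(len(symbol) for symbol, _ in stanza)
        for symbol, values in stanza:
            parts = [symbol.rjust(width)]
            for v in values:
                if isinstance(v, bytes):
                    parts.append("[" + v.hex() + "]")
                else:
                    parts.append(quote(v))
            # Exactly one space between tokens, one entry per line.
            out.append(" ".join(parts) + "\n")
        # Exactly one blank line between stanzas.
        out.append("\n")
    return "".join(out[:-1])  # no blank line after the last stanza


def ident(stanzas):
    # Hash the normalized text; every whitespace byte is part of the input.
    return hashlib.sha1(render(stanzas).encode("utf-8")).hexdigest()
```

Because the printer leaves no freedom in where spaces and newlines go, two programs holding the same data will compute the same identifier, which a free-form notation would not guarantee.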

The framework for parsing basic_io is LL(1), predictive. As simple as
possible. Read a token. The token must be a symbol. If the symbol is
one you know, branch into the thing which reads the sequence of tokens
(symbols, strings, or hex-blobs) for that symbol. Repeat. It's like
s-expressions or JSON, only it causes slightly less gnashing of teeth
if you print it out to a user. Initially, basic_io was JSON-ish, but
everyone complained that there were too many braces and colons, and
wanted "special formats" for contexts where users see it. So we
revised it to look nicer: the whitespace creates visual groupings in
the token stream, but is not significant to parsing.
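The read-a-symbol-then-branch loop described above can be sketched in a few lines of Python. This is an illustration, not monotone's parser: the token shapes (bare symbols, double-quoted strings with backslash escapes, hex blobs in square brackets), the symbol names in GRAMMAR, and the function names are all assumptions.

```python
import re

# Assumed token shapes: symbol, "quoted string", [hex blob].
# Leading whitespace of any kind is skipped and carries no meaning.
TOKEN = re.compile(r'''
    \s*(?:
        (?P<symbol>[A-Za-z_][A-Za-z0-9_]*)
      | "(?P<string>(?:[^"\\]|\\.)*)"
      | \[(?P<hex>[0-9a-fA-F]*)\]
    )''', re.VERBOSE)

# Hypothetical grammar: each known symbol names the token kinds that follow it.
GRAMMAR = {
    "old_revision": ("hex",),
    "add_file": ("string",),
    "patch": ("string",),
    "from": ("hex",),
    "to": ("hex",),
}


def tokenize(text):
    pos = 0
    end = len(text.rstrip())  # ignore trailing whitespace
    while pos < end:
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"bad token at offset {pos}")
        kind = m.lastgroup
        value = m.group(kind)
        if kind == "string":
            value = re.sub(r'\\(.)', r'\1', value)  # undo backslash escapes
        yield kind, value
        pos = m.end()


def parse(text, grammar=GRAMMAR):
    tokens = tokenize(text)
    entries = []
    for kind, value in tokens:
        # Each entry must begin with a known symbol.
        if kind != "symbol":
            raise ValueError(f"expected a symbol, got {kind}")
        if value not in grammar:
            raise ValueError(f"unknown symbol {value!r}")
        # The symbol alone decides what to read next.
        args = []
        for expected in grammar[value]:
            arg_kind, arg_value = next(tokens, (None, None))
            if arg_kind != expected:
                raise ValueError(f"{value}: expected {expected}, got {arg_kind}")
            args.append(arg_value)
        entries.append((value, tuple(args)))
    return entries
```

Under these assumptions, an aligned multi-line stanza and the same tokens squeezed onto one line parse to the same list of entries, since the layout only groups tokens visually.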

Revisions, MT/options, MT/work, and .mt-attrs use basic_io. Manifests
and .mt-attrs are going away, replaced by rosters. Rosters use
basic_io. I intend all future formats in monotone to use basic_io.

basic_io was designed to be *relatively tolerable* for humans to read
as well as machines. Hence the stanza-alignment, stanza-based line
breaking, and use of quoting. This was to reduce the need for multiple
output formats in the "status" and "commit" commands: we just print
the internal, hashed representation to the screen. My intention is to
change the "ls certs", "ls keys" and "log" outputs to be basic_io
someday as well; probably when certs get their much-needed overhaul.
Sooner if someone else helps :)

There is a small, mostly-unused corner of monotone's i/o machinery
called "packets", which were intended, long ago, to generate and
consume non-whitespace-sensitive transport encodings, for example,
when sending things through email. This was added back when whitespace
actually played a role in some parsing operations. Since all of the
packet objects in question are being shifted to basic_io anyways
(which normalizes whitespace after parsing whitespace-insensitively),
we should probably discard the packet format too. A lot of this is
about available effort and time.

Some commands, as you've noted, still stand out. I can identify three
families of commands here:

1. There are commands which produce simple lists of newline-delimited filenames.
2. There are 'automate' commands which produce custom formats.
3. There is a --brief format.

I am fine with #1, in the sense that there is already special
command-line treatment for filenames (all non-option, positional
arguments are, or should be, treated as filenames), and the "unix
tradition" of, for example, piping a list of filenames to xargs is
worth supporting. I did not implement #2 or #3, and honestly I would not
have done them the way they're done. But you know, it's not 100% under
my control. Not even 10%. I think I was even on vacation when automate
happened.

I can tell you what my preference is, though: I'd prefer if the
automate commands all pumped out basic_io stanzas. I'd prefer if you
could send basic_io stanzas to monotone as command sequences (say, for
monotone stdio). I'd prefer if all commands could be invoked via
stdio. And I'd prefer, rather than per-command things like --brief,
that we do what marcel suggested (and what, if you look, the ROADMAP
file has listed for some time), and give lua hooks a chance to control
output formatting in general, so that if you *don't* want basic_io,
there's something simple and general you can do about it.

-graydon



