From: Matt Ball
Subject: Re: [Bug-indent] Re: status of indent
Date: Wed, 29 Sep 2004 07:39:10 -0600
User-agent: Internet Messaging Program (IMP) 3.2.2
Hi Charlie,
See comments below:
Quoting Charlie Gordon <address@hidden>:
> Probably more likely on savannah ;-)
(savannah.gnu.org) - I'll look into it. It doesn't quite seem as stable as
sourceforge, but it seems more appropriate. In any case, I probably need to
start with my own server, then move over to one of these systems after reaching
some stability.
> > I've got several ideas, including the following:
>
> > - Change the parser to use lex/yacc (i.e. flex/bison)
>
> Why ?
>
> The C tokens are simple enough that lex/flex is quite overkill, and what
> grammar are you going to parse ?
>
> Are you going to do preprocessing ?
Not exactly...
> What about comments, are you going to strip them out ?
No, comments need to stay in.
> What about ifdef'd sections, are you going to skip them ?
The pretty-printer needs to account for all compile paths simultaneously, just
like the current indent.
> What about C-like languages and variants, what grammar will you enforce ?
Any language that does not have a preprocessor will be relatively easy, with
the appropriate grammar rules. Each language may need its own grammar. The
hope is that yacc will make it relatively easy to add new languages.
> Given the type of constructs you want to identify in the unpreprocessed
> token stream, a grammar will hardly fit the purpose. I don't think indent
> should even try to make too much sense out of the source code. It is so
> unlikely indent will have the proper include paths set up right, even the
> include files might not be there, not to mention command line definitions,
> architecture dependent defines, compiler specific extensions and typedefs...
>
>
> Forget it, indent should only look at the current file and resort to simple
> heuristics for block and statement extraction.
Including a grammar makes it easier to line up statements, because the
program can more easily identify similar structures. The grammar can also help
identify good places at which to break up long lines. Typically, a line looks
better when it is broken at the binary operator with the lowest precedence.
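
To make the line-breaking idea concrete, here is a minimal sketch in C. It is
not taken from indent; find_break() and binop_prec() are hypothetical names,
and the operator table is a small assumed subset (||, &&, +, -, *, /). It
scans an expression, ignores anything inside parentheses, and reports the
offset of the lowest-precedence binary operator at the top level. A real
implementation would work on tokens from the parser and would have to handle
operators such as -> and ++, which this sketch does not.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Precedence of the binary operator starting at s (higher binds tighter),
 * or 0 if s does not start one of the operators in this small subset.
 * On success *len receives the operator's length in characters. */
static int binop_prec(const char *s, size_t *len)
{
    if (s[0] == '|' && s[1] == '|') { *len = 2; return 1; }
    if (s[0] == '&' && s[1] == '&') { *len = 2; return 2; }
    if (s[0] == '+' || s[0] == '-') { *len = 1; return 3; }
    if (s[0] == '*' || s[0] == '/') { *len = 1; return 4; }
    return 0;
}

/* Return the offset of the rightmost lowest-precedence binary operator
 * outside all parentheses, or -1 if there is none. */
long find_break(const char *expr)
{
    assert(expr != NULL);
    long best = -1;
    int best_prec = 0;
    int depth = 0;
    int prev_operand = 0;   /* did the previous token end an operand? */
    size_t i = 0;

    while (expr[i] != '\0') {
        char c = expr[i];
        size_t len = 1;
        int prec;

        if (isspace((unsigned char)c)) { i++; continue; }
        if (c == '(') { depth++; prev_operand = 0; i++; continue; }
        if (c == ')') { depth--; prev_operand = 1; i++; continue; }

        prec = binop_prec(expr + i, &len);
        if (prec > 0 && prev_operand) {
            /* Binary operator: only top-level ones are break candidates. */
            if (depth == 0 && (best < 0 || prec <= best_prec)) {
                best = (long)i;
                best_prec = prec;
            }
            prev_operand = 0;
            i += len;
            continue;
        }

        /* Operand character, or an operator in unary position. */
        prev_operand = (isalnum((unsigned char)c) || c == '_');
        i++;
    }
    return best;
}
```

For "a * b + c * (d || e)" this picks the '+', since the '||' is nested inside
parentheses and the '*' operators bind more tightly.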
The grammar is actually relatively simple, with the exception of the #if,
#elif, #else, and #endif statements. These statements create multiple parse
sequences, all of which must be considered by the pretty-printer. The current
version of indent handles the #else and #elif sections by restoring the
indentation from the original #if. This becomes very important with C code
that has unbalanced brackets within conditional compile sections. A new
grammar would need to emulate this behavior, likely by saving the parse stack
for each #if, then restoring it for each #elif and #else, and popping it for
each #endif.
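
The save/restore/pop scheme can be sketched as follows. This is only an
illustration under simplifying assumptions: struct fmt_state and feed_line()
are hypothetical names, a single brace-depth counter stands in for the full
parse stack, directives are assumed to have no space after the '#', and braces
inside strings or comments are not treated specially.

```c
#include <assert.h>
#include <string.h>

#define MAX_NEST 32

struct fmt_state {
    int depth;            /* current brace depth (stand-in for the parse stack) */
    int saved[MAX_NEST];  /* depth recorded at each enclosing #if */
    int nsaved;           /* number of open #if levels */
};

/* Update the state for one source line. */
void feed_line(struct fmt_state *st, const char *line)
{
    assert(st != NULL && line != NULL);

    while (*line == ' ' || *line == '\t')
        line++;

    if (strncmp(line, "#if", 3) == 0) {
        /* #if, #ifdef, #ifndef: save the state at the branch point. */
        assert(st->nsaved < MAX_NEST);
        st->saved[st->nsaved++] = st->depth;
    } else if (strncmp(line, "#el", 3) == 0) {
        /* #elif, #else: restart from the state saved at the #if. */
        if (st->nsaved > 0)
            st->depth = st->saved[st->nsaved - 1];
    } else if (strncmp(line, "#endif", 6) == 0) {
        /* #endif: drop the saved state, keep what the last branch left. */
        if (st->nsaved > 0)
            st->nsaved--;
    } else {
        for (; *line != '\0'; line++) {
            if (*line == '{')
                st->depth++;
            else if (*line == '}' && st->depth > 0)
                st->depth--;
        }
    }
}
```

With this, an opening brace that appears in both the #if and the #else branch
is counted once rather than twice, so the single closing brace after the
#endif brings the depth back to zero.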
>
> > - Improve the source code analysis so that the -T option isn't necessary
> > to identify typedefs (identify function argument declarations, and assume
> > "A *B;" is a declaration, not a useless multiply statement).
>
> Why even assume it can be a multiply statement ?
>
Here is a more precise example of why -T is currently needed. Take these
functions:
    typedef int A;

    void myfunc(void)
    {
        unsigned long i;
        A *B;
        ...
    }
versus...
    int A;

    void myfunc(void)
    {
        unsigned long B;
        A * B; // Useless multiply statement
        ...
    }
indent cannot distinguish between these usages without somehow knowing that A
is a typedef instead of a variable. My suggestion was to simply assume that
the programmer won't write a useless multiply statement, and to treat such
statements as pointer declarations. This resolution would probably need a
more advanced grammar than indent currently has.
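
As a rough sketch of that assumption (not how indent works today, and far
simpler than a grammar rule), a check for the "identifier * identifier ;"
shape might look like this. looks_like_pointer_decl() and its helpers are
hypothetical names; the function only inspects a single statement string.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

static const char *skip_ws(const char *p)
{
    while (isspace((unsigned char)*p))
        p++;
    return p;
}

/* Skip one C identifier; return NULL if p does not start one. */
static const char *skip_ident(const char *p)
{
    if (!(isalpha((unsigned char)*p) || *p == '_'))
        return NULL;
    while (isalnum((unsigned char)*p) || *p == '_')
        p++;
    return p;
}

/* Return 1 if stmt has the shape "ident *[*...] ident ;", which the
 * heuristic treats as a pointer declaration, else 0. */
int looks_like_pointer_decl(const char *stmt)
{
    assert(stmt != NULL);
    const char *p = skip_ident(skip_ws(stmt));
    int stars = 0;

    if (p == NULL)
        return 0;
    p = skip_ws(p);
    while (*p == '*') {
        stars++;
        p = skip_ws(p + 1);
    }
    if (stars == 0)
        return 0;
    p = skip_ident(p);
    if (p == NULL)
        return 0;
    return *skip_ws(p) == ';';
}
```

Both "A *B;" and "A * B;" are classified as declarations, while "x = A * B;"
and "A * 2;" are not, because a multiplication whose result is used (or whose
operand is a constant) does not have this shape.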
> > - Add an option to line up similar components of parallel structures (i.e.
> > argument declarations, structure initializer tables, bit field ':',
> > parallel assignments, etc)
>
> That would be nice!
As I mentioned above, lining up parallel structures is difficult to do without
a parse tree. Indent currently lines up a couple of things (declarations,
comments, etc), but largely ignores the more complicated structures, such as
parallel assignments to similar members of a struct.
The current indent has to special-case everything that it lines up. My hope is
to implement a generic solution that lines up everything that looks like it
should line up.
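
For one of those cases, parallel assignments, the desired output can be
sketched in C as below. This is only an illustration of the effect, not the
generic solution: align_assignments() and lhs_len() are hypothetical names,
and the code assumes each line contains at most one plain '=' (it would
mishandle '==', '<=', and similar operators, which a parse tree would let the
formatter tell apart).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Length of the text before the first '=', without trailing blanks,
 * or -1 if the line contains no '='. */
static int lhs_len(const char *line)
{
    const char *eq = strchr(line, '=');
    int n;

    if (eq == NULL)
        return -1;
    n = (int)(eq - line);
    while (n > 0 && (line[n - 1] == ' ' || line[n - 1] == '\t'))
        n--;
    return n;
}

/* Write the lines to out with their '=' signs in one column.
 * Returns the number of characters written, or -1 if out is too small. */
int align_assignments(const char *lines[], size_t n, char *out, size_t outsz)
{
    assert(lines != NULL && out != NULL && outsz > 0);
    int width = 0;
    size_t used = 0;

    out[0] = '\0';

    /* First pass: find the widest left-hand side. */
    for (size_t i = 0; i < n; i++) {
        int len = lhs_len(lines[i]);
        if (len > width)
            width = len;
    }

    /* Second pass: pad every left-hand side to that width. */
    for (size_t i = 0; i < n; i++) {
        int len = lhs_len(lines[i]);
        int w;

        if (len < 0) {
            w = snprintf(out + used, outsz - used, "%s\n", lines[i]);
        } else {
            const char *rhs = strchr(lines[i], '=') + 1;
            while (*rhs == ' ' || *rhs == '\t')
                rhs++;
            w = snprintf(out + used, outsz - used, "%-*.*s = %s\n",
                         width, len, lines[i], rhs);
        }
        if (w < 0 || (size_t)w >= outsz - used)
            return -1;
        used += (size_t)w;
    }
    return (int)used;
}
```

Given "p->x = 1;", "p->width = 20;" and "p->y=3;", the left-hand sides are
padded to the width of "p->width" so that the three '=' signs share a column.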
> Configuration by example would be nice too : give indent a template showing
> how exactly you want code to look and make it determine its internal
> configuration from the actual formatting of the template. It shouldn't be
> too difficult to implement, and indent could complain about what part of the
> template it cannot reproduce. This template file could be stored locally or
> somewhere up the directory tree (a hidden .indent file for example, just
> like .cvsignore).
This would be a nice feature to add. I don't think it's as easy as you think,
though, otherwise it would have been added 10 years ago. We can look into
it...
More notes:
I had actually sent the original message on July 13th, and it was just posted
last week. David responded to me on July 23rd. It sounds like he's been busy
fixing up a house.
I'll see what I can do to pick up indent. My thought is to place only bug
fixes onto the 2.2 branch, and put a group of new features into 2.3.0. Once
2.3.0 is released, 2.3.1 and later releases would contain only bug fixes, with
2.4 containing new features.
At some point I would release 3.0.0, which would be a re-write using lex and
yacc, but would otherwise be a drop-in replacement for the 2.x series. Version
3.1 would then contain some of the new features, like lining up similar
constructs.
As a disclaimer, all these plans rely on me having sufficient free time.
Depending on the curve balls life throws at me, I may or may not be able to
live up to these expectations. We'll see...
-Matt