bug-global

Re: Tony.RE: GNU Global Parsing Suffixless Files Patch


From: Shigio YAMAGUCHI
Subject: Re: Tony.RE: GNU Global Parsing Suffixless Files Patch
Date: Thu, 6 Oct 2016 15:10:17 +0900

> At one stage I thought of extending the gtags file format to include
> an optional language override, it's similar to your file list idea...
> However as I used global more I started to shy away from that as it's
> high maintenance and would break automatic recursive update on file addition.
>
> For example: If you're working on a project that has non-standard file
> naming conventions and/or has particular file types in odd places (like
> my texi/inc example) then if you used a file list/type approach you'd
> need to update that each time you added another suffixless header file.

That it cannot be automated is a misunderstanding. You can automate it
simply by writing the following script and using it instead of 'global -u'.

        [global-u.sh]
        +-----------------------------------------------------------------
        |#!/bin/sh
        |root=$(global -pr) && cd "$root"               # Move to the project root
        |if [ $? = 0 ]; then
        |       find ..... > cppfiles                   # Make cppfiles
        |       gtags -i --force-language=cpp:cppfiles
        |fi
        +-----------------------------------------------------------------

global(1) can be called from a shell script like this.
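
The 'find .....' step in the script above is left open on purpose. As a minimal sketch of one way to fill it in, assuming the files to override are the suffixless ones under a directory such as include/, a helper like the following could build the file list. The name list_suffixless is hypothetical, and --force-language with a file list is only the option proposed in this thread, not a released gtags feature.

```shell
#!/bin/sh
# Hypothetical helper (not part of GNU Global): print every regular file
# under directory $1 whose base name contains no dot, i.e. has no suffix.
# The output is sorted so that the resulting file list is stable.
list_suffixless() {
    find "$1" -type f ! -name '*.*' | LC_ALL=C sort
}
```

Usage, from the project root: 'list_suffixless include > cppfiles', followed by the proposed 'gtags -i --force-language=cpp:cppfiles'.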

> So as I understand it we would have --language-force=<Language>:<Specifier>
> where <Specifier> would be one of:
>         *x- Existing langmap style extension list e.g. `.c.h'.
>         *x- File only glob pattern e.g. `([Mm]akefile)'.
>         *x - A mixture of the above two e.g. `.c.h([Mm]akefile)(*.inc)'
>          x - A dumb path substring match (possibly with the caveat that
>              it must start with ./ or / to distinguish it from the above?)
>              e.g. '/include/'.
>          ? - A bare name of a file list in the config e.g. `cppfiles'?
> Those entries marked with * would also apply to langmap config entries as well.
> Those entries marked with x meet my requirements/wishlist.
> With those additional features marked with x and your proposed priority
> list as detailed yesterday I would say that would give maximum benefit
> without too much extra cost (famous last words!).
>
> One additional feature/thought is that one could have a language type of auto
> that would mean do normal file type detection. Thus the above example would read:
>
>          --force-language=cpp:include --force-language=auto:.c \
> --force-language=auto:([Mm]akefile)

Having heard your explanation, I think we should consider the file list again.
It is suitable for dealing with files that follow no naming rule, like C++
include files. Otherwise, the specification becomes too large.

Though the file list also has drawbacks, it has one big advantage: you can
entrust the job of selecting files to external programs (or humans).
For 'selecting files', there is no better program than find(1).

> Do you have that patch for ctags to give out references? And if applied could
> gtags make use of them? If so would you be happy to send that to me?

It was merged into Universal Ctags. But there is no parser which uses the mechanism yet.
(See makeSimpleRefTag in main/parse.c)

Regards,
Shigio


2016-10-06 0:28 GMT+09:00 Cooper, Anthony <address@hidden>:
SECURITY CLASSIFICATION: OFFICIAL




> -----Original Message-----
> From: address@hidden [mailto:address@hidden] On Behalf Of
> Shigio YAMAGUCHI
> Sent: 05 October 2016 02:56
> To: Cooper, Anthony
> Cc: address@hidden
> Subject: Re: GNU Global Parsing Suffixless Files Patch
>
> > Q: I'm assuming any glob patterns would implicitly be anchored to
> > the end of the path string (as they are in bash)?
>
> Yes. In ctags, '(<pattern>)' matches file names, not path names, like '.c.h'.

:-)
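
To illustrate the point about file names versus path names, here is a rough sketch of that matching rule in shell. The function name matches_name_glob is hypothetical; it only mimics the behaviour described above and is not how ctags or gtags implement it.

```shell
#!/bin/sh
# Hypothetical illustration: succeed if the base name of path $1 matches
# glob $2. Directory components are stripped first, so the pattern is
# compared against the whole file name and never against the path.
matches_name_glob() {
    name=${1##*/}            # keep only the part after the last '/'
    case "$name" in
        $2) return 0 ;;      # unquoted on purpose: $2 is used as a pattern
    esac
    return 1
}
```

With this rule, 'matches_name_glob src/Makefile "[Mm]akefile"' succeeds, while a file called rules inside a directory named Makefile.d does not match.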

>
> > Yes I know... In fact after originally looking at global and ctags I
> > thought how potentially dangerous ctags's --force-language option
> > was and that's why I called my extension suffixless_langmap.
> > My intention was  that this option wouldn't force anything but
> > instead provide a default language when there wasn't a file suffix.
> >
> > For example, in project include directories you quite often get other
> > artefacts like .c, .texi, .html (I know that these get excluded) and
> > .inc files (MSVS). If the --force-language override option is used
> > on those include directories then files with a suffix don't
> > automatically get handled the way they should. Instead you'd
> > possibly have to put in additional more specific --force-language
> > overrides to reinstate default behaviour for certain extensions. E.g.:
>
> You are right. It is an important point. You should be able to control this finely.
>
> How about using a 'file list' instead of a direct path.
>
> --language-force=<lang>:<file list>
>
> File list is a file which lists file names.
>
> e.g.
> [cppfiles]
> +-----------------------------
> |include/c++/4.8/algorithm
> |include/c++/4.8/bits/stl_algo.h
> |include/c++/5.1/algorithm
>
> $ gtags --language-force=cpp:cppfiles
>
> You can use find(1) command to make a file list.
> This will satisfy your request too, because find(1) has both glob and
> regex. :)
>
> New priority:
> [high]
> 1. --language-force=<lang>:<file list>
> 2. langmap=<lang>:<suffix or glob pattern list>
> [low]
>
> What do you think?
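
The priority order above can be sketched roughly as follows: a path named in the file list gets the forced language, and everything else falls back to a suffix lookup. This is only an illustration of the proposal. The function resolve_language is hypothetical, and the small suffix table stands in for a real langmap.

```shell
#!/bin/sh
# Hypothetical illustration of the proposed priority:
#   1. [high] the path is listed in the file list -> forced language
#   2. [low]  otherwise a langmap-style suffix lookup decides
# $1: path to classify, $2: file list (one path per line), $3: forced language
resolve_language() {
    # -F fixed string, -x whole line, -q quiet: exact match on one list entry
    if grep -Fxq -- "$1" "$2"; then
        echo "$3"
        return 0
    fi
    # Stand-in for a langmap; this mapping is an assumption for the example.
    case "$1" in
        *.c|*.h)     echo c ;;
        *.cpp|*.hpp) echo cpp ;;
        *)           echo unknown ;;
    esac
}
```

With a cppfiles list containing include/c++/4.8/algorithm, that path resolves to cpp, while src/main.c still resolves to c through the suffix lookup.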

An interesting idea :-). Upon reflection I'm actually quite happy with what you proposed yesterday - sorry perhaps I should have been clearer at the time...

At one stage I thought of extending the gtags file format to include an optional language override, it's similar to your file list idea... However as I used global more I started to shy away from that as it's high maintenance and would break automatic recursive update on file addition.

For example: If you're working on a project that has non-standard file naming conventions and/or has particular file types in odd places (like my texi/inc example) then if you used a file list/type approach you'd need to update that each time you added another suffixless header file. However with your path/specific glob approach and priority scheme (let's call this prio-path-glob):

        --force-language=cpp:include --force-language=c:.c --force-language=makefile:([Mm]akefile) ...

This does the job quite nicely. You wouldn't need to update any config unless there was a new file type that needed to be excluded (unlikely within an existing project). You could just run global -u and update as normal. If given the file list feature I would avoid using it because of the need to maintain it. Two of the really cool things about gtags are that you just type gtags and it does it all for you (unless you have non-standard stuff), and that global -u picks up updates and new files.

The only `upsides' my `explicitly select the overridden files with RE' approach had over yours were:
        1) RE patterns are more powerful and succinct - would deal with cases we haven't thought of.
        2) You're explicitly selecting what you want to override.

So 1 is overkill as agreed (the prio-path-glob approach will meet all the requirements we can think of) so that's gone; and as for 2 if prio-path-glob were used instead you'd probably only need to have a couple of file type override directives in there anyway, as the skip list will weed out most exceptions anyway. So upon reflection I feel that a file type list would add extra complexity that isn't needed. If you have a specific requirement for it yourself then could we have it in addition to what you proposed yesterday please?

So as I understand it we would have --language-force=<Language>:<Specifier> where <Specifier> would be one of:
        *x- Existing langmap style extension list e.g. `.c.h'.
        *x- File only glob pattern e.g. `([Mm]akefile)'.
        *x - A mixture of the above two e.g. `.c.h([Mm]akefile)(*.inc)'
          x - A dumb path substring match (possibly with the caveat that it must start with ./ or / to distinguish it from the above?) e.g. '/include/'.
         ? - A bare name of a file list in the config e.g. `cppfiles'?
Those entries marked with * would also apply to langmap config entries as well. Those entries marked with x meet my requirements/wishlist.
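
To make the list above concrete, here is a rough sketch of how the five <Specifier> forms could be told apart. It assumes the caveat mentioned above, i.e. that a path substring must start with ./ or /, so a bare word is taken to be a file list name. The function classify_specifier and its output labels are hypothetical.

```shell
#!/bin/sh
# Hypothetical classifier for the <Specifier> forms discussed above.
# Order matters: './...' must be tested before the generic leading-dot
# cases, otherwise a path would be mistaken for a suffix list.
classify_specifier() {
    case "$1" in
        /*|./*)    echo path ;;       # dumb path substring, e.g. /include/
        .*"("*)    echo mixed ;;      # suffixes plus globs, e.g. .c.h(*.inc)
        .*)        echo suffixes ;;   # langmap style list, e.g. .c.h
        "("*")")   echo glob ;;       # file-only glob(s), e.g. ([Mm]akefile)
        *)         echo filelist ;;   # bare name, e.g. cppfiles
    esac
}
```

Under this rule the 'cpp:include' form used in the example would be read as a file list name, so it would have to be written as 'cpp:/include/' to mean a path match.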

With those additional features marked with x and your proposed priority list as detailed yesterday I would say that would give maximum benefit without too much extra cost (famous last words!).

One additional feature/thought is that one could have a language type of auto that would mean do normal file type detection. Thus the above example would read:

        --force-language=cpp:include --force-language=auto:.c --force-language=auto:([Mm]akefile) ...

This is similar to ctags usage of the `auto' language designator. Only an idea that just occurred to me, I'm not deliberately trying to add more work - honest!

On a completely different note...

Do you have that patch for ctags to give out references? And if applied could gtags make use of them? If so would you be happy to send that to me? I understand if it's awkward etc.

BTW I was fiddling around mixing parsers today in the same run (using internal gtags and then falling back on ctags for unsupported files). Very easy to do and it works so well :-). Many thanks.

Regards,

--
Shigio YAMAGUCHI <address@hidden>
PGP fingerprint: D1CB 0B89 B346 4AB6 5663  C4B6 3CA5 BBB3 57BE DDA3
