From: Shigio YAMAGUCHI
Subject: Re: Tony.RE: GNU Global Parsing Suffixless Files Patch
Date: Thu, 6 Oct 2016 15:10:17 +0900
> -----Original Message-----
> From: address@hidden [mailto:address@hidden] On Behalf Of
> Shigio YAMAGUCHI
> Sent: 05 October 2016 02:56
> To: Cooper, Anthony
> Cc: address@hidden
> Subject: Re: GNU Global Parsing Suffixless Files Patch
>
> > Q: I'm assuming any glob patterns would implicitly be anchored to
> > the end of the path string (as they are in bash)?
>
> Yes. In ctags, '(<pattern>)' matches file names, not path names, just as '.c.h' does.
:-)
>
> > Yes I know... In fact after originally looking at global and ctags I
> > thought how potentially dangerous ctags's --force-language option
> > was and that's why I called my extension suffixless_langmap.
> > My intention was that this option wouldn't force anything but
> > instead provide a default language when there wasn't a file suffix.
> >
> > For example, in project include directories you quite often get
> > other artefacts like .c, .texi, .html (I know that these get excluded) and
> > .inc files (MSVS). If the --force-language override option is used
> > on those include directories then files with a suffix don't
> > automatically get handled the way they should. Instead you'd
> > possibly have to put in additional more specific --force-language
> > overrides to reinstate default behaviour for certain extensions. E.g.:
>
> You are right. It is an important point. You should be able to control it finely.
>
> How about using a 'file list' instead of a direct path.
>
> --language-force=<lang>:<file list>
>
> File list is a file which lists file names.
>
> e.g.
> [cppfiles]
> +-----------------------------
> |include/c++/4.8/algorithm
> |include/c++/4.8/bits/stl_algo.h
> |include/c++/5.1/algorithm
>
> $ gtags --language-force=cpp:cppfiles
>
> You can use find(1) command to make a file list.
> This will satisfy your request too, because find(1) has both glob and
> regex. :)
>
> New priority:
> [high]
> 1. --language-force=<lang>:<file list>
> 2. langmap=<lang>:<suffix or glob pattern list>
> [low]
>
> What do you think?
An interesting idea :-). Upon reflection I'm actually quite happy with what you proposed yesterday - sorry perhaps I should have been clearer at the time...
At one stage I thought of extending the gtags file format to include an optional language override, which is similar to your file list idea... However, as I used global more I started to shy away from that, as it's high maintenance and would break automatic recursive update on file addition.
For example: if you're working on a project that has non-standard file naming conventions and/or has particular file types in odd places (like my texi/inc example), then with a file list/type approach you'd need to update the list each time you added another suffixless header file. However, with your path/specific glob approach and priority scheme (let's call this prio-path-glob):
--language-force=cpp:include --language-force=c:.c --language-force=makefile:([Mm]akefile) ...
This does the job quite nicely. You wouldn't need to update any config unless there was a new file type that needed to be excluded (unlikely within an existing project). You could just run global -u and update as normal. If given the file list feature I would avoid using it because of the need to maintain it. A couple of the really cool things about gtags are that you just type gtags and it does it all for you (unless you have non-standard stuff), and that global -u picks up updates and new files.
The only `upsides' my `explicitly select the overridden files with RE' approach had over yours were:
1) RE patterns are more powerful and succinct - would deal with cases we haven't thought of.
2) You're explicitly selecting what you want to override.
Point 1 is overkill, as agreed (the prio-path-glob approach will meet all the requirements we can think of), so that's gone. As for 2, with prio-path-glob you'd probably only need a couple of file type override directives in there, as the skip list will weed out most exceptions anyway. So upon reflection I feel that a file type list would add extra complexity that isn't needed. If you have a specific requirement for it yourself then could we have it in addition to what you proposed yesterday please?
So as I understand it we would have --language-force=<Language>:<Specifier> where <Specifier> would be one of:
*x - Existing langmap style extension list, e.g. `.c.h'.
*x - File only glob pattern, e.g. `([Mm]akefile)'.
*x - A mixture of the above two, e.g. `.c.h([Mm]akefile)(*.inc)'.
 x - A dumb path substring match (possibly with the caveat that it must start with ./ or / to distinguish it from the above?), e.g. `/include/'.
 ? - A bare name of a file list in the config, e.g. `cppfiles'?
Those entries marked with * would also apply to langmap config entries as well. Those entries marked with x meet my requirements/wishlist.
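To make the specifier forms above concrete, here is a rough Python sketch of how one might be split up and matched. It is only an illustration of my reading of the list, not how gtags does or would implement it: the function names parse_specifier and matches are made up, and the rule that a leading `/' or `./' marks a path substring is the caveat suggested above rather than anything agreed.

```python
import fnmatch
import os
import re


def parse_specifier(spec):
    """Split a specifier into (suffixes, globs, path_substring).

    Hypothetical helper: the syntax follows the list in this mail.
    """
    # Assumption: a leading "/" or "./" marks a dumb path substring.
    if spec.startswith("/") or spec.startswith("./"):
        return [], [], spec
    # Parenthesised groups are file-name globs, e.g. "([Mm]akefile)".
    globs = re.findall(r"\(([^)]*)\)", spec)
    # Whatever remains is a langmap-style suffix list, e.g. ".c.h".
    rest = re.sub(r"\([^)]*\)", "", spec)
    suffixes = ["." + s for s in rest.split(".") if s]
    return suffixes, globs, None


def matches(spec, path):
    """Return True if path is selected by the specifier."""
    suffixes, globs, substring = parse_specifier(spec)
    if substring is not None:
        return substring in path
    # Suffixes and globs apply to the file name only, not the whole path.
    name = os.path.basename(path)
    if any(name.endswith(s) for s in suffixes):
        return True
    return any(fnmatch.fnmatchcase(name, g) for g in globs)
```

With this sketch, `.c.h([Mm]akefile)(*.inc)' selects src/Makefile and msvc/defs.inc but not include/algorithm, while `/include/' selects any path containing that substring.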
With those additional features marked with x and your proposed priority list as detailed yesterday I would say that would give maximum benefit without too much extra cost (famous last words!).
One additional feature/thought: one could have a language type of `auto', meaning `do normal file type detection'. Thus the above example would read:
--language-force=cpp:include --language-force=auto:.c --language-force=auto:([Mm]akefile) ...
This is similar to ctags usage of the `auto' language designator. Only an idea that just occurred to me, I'm not deliberately trying to add more work - honest!
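As a rough illustration of how `auto' could interact with the priority list, here is a small Python sketch. Everything in it is hypothetical: resolve_language, the predicate lambdas and the sample langmap are invented for the example, and the rule that the last matching directive wins is purely my assumption, since the ordering between several --language-force directives has not been spelled out in this mail.

```python
import os


def resolve_language(path, forced, langmap):
    """Return the language for path, or None if nothing applies.

    forced  -- list of (language, predicate) pairs, one per directive
    langmap -- dict mapping a file suffix to a language
    """
    chosen = None
    for lang, predicate in forced:
        if predicate(path):
            chosen = lang  # assumption: the last matching directive wins
    if chosen is not None and chosen != "auto":
        return chosen
    # "auto" (or no forced match) falls back to normal suffix detection.
    suffix = os.path.splitext(path)[1]
    return langmap.get(suffix)


# Stand-ins for: cpp:include, auto:.c, auto:([Mm]akefile)
forced = [
    ("cpp", lambda p: "include" in p),
    ("auto", lambda p: p.endswith(".c")),
    ("auto", lambda p: os.path.basename(p) in ("Makefile", "makefile")),
]
langmap = {".c": "c", ".h": "c", ".cpp": "cpp"}
```

Under those assumptions a suffixless header such as include/c++/4.8/algorithm comes out as cpp, whereas include/helper.c drops back to the ordinary langmap and is treated as c.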
On a completely different note...
Do you have that patch for ctags to give out references? And if applied could gtags make use of them? If so would you be happy to send that to me? I understand if it's awkward etc.
BTW I was fiddling around mixing parsers today in the same run (using internal gtags and then falling back on ctags for unsupported files). Very easy to do and it works so well :-). Many thanks.
Regards,