bug#27978: Detection of section name in man.el


From: Eli Zaretskii
Subject: bug#27978: Detection of section name in man.el
Date: Fri, 18 Aug 2017 11:49:57 +0300

> From: Grégory Mounié
>       <Gregory.Mounie@imag.fr>
> Date: Sun, 6 Aug 2017 01:44:19 +0200
> 
>   When parsing manual pages in languages with non-ASCII letters, section
> names containing non-ASCII letters are not added to the table of contents.
> 
>   I noticed the bug while reading the French bash manual: the quite useful
> "COMMANDES INTERNES DE l'INTERPRÉTEUR" section (SHELL BUILTIN
> COMMANDS) does not appear, because of the letter É.
> 
>   I propose using character classes instead of ASCII intervals in the
> appropriate regexp defvars. This should not change anything for English
> manuals, and it should work for many other languages.

Thanks, I pushed these changes with some minor adjustments.
Specifically:

> -(defvar Man-section-regexp "[0-9][a-zA-Z0-9+]*\\|[LNln]"
> +(defvar Man-section-regexp "[[:digit:]][[:alnum:]+]*\\|[LNln]"
>    "Regular expression describing a manpage section within parentheses.")

I didn't change this one, because I think a section always uses only
ASCII letters and numbers, as in ".1n".  If you disagree, can you show
an example where this is not so?
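
To make the ASCII-only claim easy to check, here is a minimal sketch that
tests candidate section strings against the unchanged regexp.  The helper
name demo-man-section-p and the sample strings are hypothetical, chosen
only for illustration; they are not part of man.el.

```elisp
;; Sketch only: `demo-man-section-p' is a hypothetical helper, not in man.el.
(defun demo-man-section-p (section)
  "Return t if SECTION matches the ASCII-only section regexp in full."
  (let ((case-fold-search nil)
        ;; Same pattern as the unchanged `Man-section-regexp', wrapped in a
        ;; shy group and anchored to the start and end of the string.
        (re (concat "\\`\\(?:" "[0-9][a-zA-Z0-9+]*\\|[LNln]" "\\)\\'")))
    (and (string-match-p re section) t)))

(mapcar #'demo-man-section-p '("1" "1n" "3p" "n" "é"))
;; => (t t t t nil)
```

A section name containing a non-ASCII letter would return nil here, which
is the kind of counterexample asked for above.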

> -(defvar Man-heading-regexp "^\\([A-Z][A-Z0-9 /-]+\\)$"
> +(defvar Man-heading-regexp "^\\([[:upper:]][[:upper:][:digit:] /-]+\\)$"
>    "Regular expression describing a manpage heading entry.")

I see no reason to replace 0-9 with [:digit:] here, since I think
non-ASCII digits will never be used in this context.  Do you agree?
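
For reference, a minimal sketch of the difference between the two heading
patterns, using [:upper:] for letters while keeping 0-9 for digits.  The
helper demo-man-heading-p and the sample heading "VALEUR RENVOYÉE" are
hypothetical and only illustrate the character classes; the second regexp
is one possible form, not necessarily what ends up in man.el.

```elisp
;; Sketch only: `demo-man-heading-p' is a hypothetical helper, not in man.el.
(defun demo-man-heading-p (regexp line)
  "Return t if LINE matches the heading REGEXP, with case folding disabled."
  ;; With `case-fold-search' non-nil, [A-Z] and [:upper:] would also match
  ;; lower-case letters, so bind it to nil for a strict comparison.
  (let ((case-fold-search nil))
    (and (string-match-p regexp line) t)))

(let ((ascii-re "^\\([A-Z][A-Z0-9 /-]+\\)$")
      (class-re "^\\([[:upper:]][[:upper:]0-9 /-]+\\)$"))
  (list (demo-man-heading-p ascii-re "VALEUR RENVOYÉE")   ; É is outside A-Z
        (demo-man-heading-p class-re "VALEUR RENVOYÉE")   ; É is [:upper:]
        (demo-man-heading-p ascii-re "SEE ALSO")
        (demo-man-heading-p class-re "SEE ALSO")))
;; => (nil t t t)
```

Plain ASCII headings behave the same under both patterns, so only
headings with non-ASCII upper-case letters are affected.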

Incidentally, I see quite a few similar regexps elsewhere in man.el.
Did you audit all of them and establish that they don't need similar
changes?  If not, would you like to propose similar changes there?




