bug#27978: Detection of section name in man.el
From: Eli Zaretskii
Subject: bug#27978: Detection of section name in man.el
Date: Fri, 18 Aug 2017 11:49:57 +0300
> From: Grégory Mounié <Gregory.Mounie@imag.fr>
> Date: Sun, 6 Aug 2017 01:44:19 +0200
>
> When parsing manual in languages with non-ascii letters, the section
> names using non-ascii letters are not added to the table of content.
>
> I noticed the bug reading the French bash manual: the quite useful
> "COMMANDES INTERNES DE l'INTERPRÉTEUR" section does not appear (SHELL
> BUILTIN COMMAND). (because of the É letter)
>
> I propose to use Character class instead of ascii interval in the
> appropriate regexp defvar. It should not change anything for english
> manual and it should work for many other languages.
Thanks, I pushed these changes with some minor adjustments.
Specifically:
> -(defvar Man-section-regexp "[0-9][a-zA-Z0-9+]*\\|[LNln]"
> +(defvar Man-section-regexp "[[:digit:]][[:alnum:]+]*\\|[LNln]"
> "Regular expression describing a manpage section within parentheses.")
I didn't change this one, because I think a section always uses only
ASCII letters and numbers, as in ".1n". If you disagree, can you show
an example where this is not so?
> -(defvar Man-heading-regexp "^\\([A-Z][A-Z0-9 /-]+\\)$"
> +(defvar Man-heading-regexp "^\\([[:upper:]][[:upper:][:digit:] /-]+\\)$"
> "Regular expression describing a manpage heading entry.")
I see no reason to replace 0-9 with [:digit:] here, since I think
non-ASCII digits will never be used in this context. Do you agree?
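
For illustration, here is a minimal sketch of what the [:upper:] change buys. It assumes the adjusted regexp swaps only A-Z for [:upper:] and keeps 0-9; the exact committed form is not quoted in this message. The `my-` names are hypothetical and the heading strings are made-up examples, not taken from man.el.

```elisp
;; Hypothetical names; the sample strings are illustrative only.
(defvar my-heading-regexp-ascii "^\\([A-Z][A-Z0-9 /-]+\\)$"
  "The ASCII-only heading regexp quoted in the patch above.")

(defvar my-heading-regexp-upper "^\\([[:upper:]][[:upper:]0-9 /-]+\\)$"
  "Assumed adjusted form: [:upper:] for letters, 0-9 kept for digits.")

(defun my-heading-p (regexp line)
  "Return t if LINE matches REGEXP as a heading line, else nil."
  ;; Bind `case-fold-search' to nil: with case folding on, both
  ;; [A-Z] and [[:upper:]] would also match lowercase letters.
  (let ((case-fold-search nil))
    (and (string-match-p regexp line) t)))

(my-heading-p my-heading-regexp-ascii "SYNOPSIS")         ; ASCII-only heading
(my-heading-p my-heading-regexp-ascii "VALEUR RENVOYÉE")  ; É is outside A-Z
(my-heading-p my-heading-regexp-upper "VALEUR RENVOYÉE")  ; É is in [:upper:]
```

Evaluating the three calls in order should show the ASCII regexp accepting the first heading and rejecting the second, while the [:upper:] variant accepts the accented one.
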
Incidentally, I see quite a few similar regexps elsewhere in man.el.
Did you audit all of them and establish that they don't need similar
changes? If not, would you like to propose similar changes there?