simplifying configuration of encoded characters/entities output
From: Patrice Dumas
Subject: simplifying configuration of encoded characters/entities output
Date: Wed, 29 Dec 2021 13:35:05 +0100
Hello,
I would like to simplify the customization and code determining what is
output for characters/entities.
For Info and Plaintext, I propose to remove the check on whether
documentencoding is set, leaving two possibilities:
* --enable-encoding (the default): try to output encoded Unicode
characters everywhere, be it for accents like @'e, @-commands like
@l{}, or dashes and quotes.
* --disable-encoding: use ASCII everywhere.
If we want to treat some categories differently, we should add
specific customization options per class, but I think that would
add unnecessary complexity.
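
To make the two Info/Plaintext modes concrete, here is a minimal Python
sketch. It is not texi2any code: the function name, the small fallback
table and the "?" placeholder are all assumptions made for illustration.

```python
# Hypothetical sketch of the two proposed Info/Plaintext modes.
# The table below is illustrative only and covers a handful of characters.
ASCII_FALLBACK = {
    "\u00e9": "e'",   # e with acute accent, as produced by @'e
    "\u0142": "/l",   # l with stroke, as produced by @l{}
    "\u2014": "--",   # long dash
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote
}


def plaintext_text(text: str, enable_encoding: bool) -> str:
    """Return text as it would be output in each of the two modes."""
    if enable_encoding:
        # --enable-encoding: keep the Unicode characters as they are.
        return text
    # --disable-encoding: ASCII everywhere. Characters missing from the
    # table become "?" (an assumption of this sketch).
    return "".join(
        ch if ch.isascii() else ASCII_FALLBACK.get(ch, "?") for ch in text
    )
```

With this sketch, the same input gives "café" with enable_encoding set
and "cafe'" without it, with no third case depending on documentencoding.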
Here is my proposal for HTML:
* remove FALLBACK_TO_NUMERIC_ENTITY, always setting it for HTML (and
either never or always setting it for TexinfoXML; I am not sure which,
and it probably does not matter much).
* remove ENABLE_ENCODING_USE_ENTITY.
* if ENABLE_ENCODING is set, try to output encoded Unicode characters
everywhere, be it for accents like @'e, @-commands like @l{}, or
dashes and quotes.
That would mean 3 possibilities for HTML:
* default: use named entities if possible, fall back to numeric entities
* --enable-encoding: output encoded characters
* USE_NUMERIC_ENTITY: output numeric entities
Note that in most if not all cases, the actual output would still be
guarded by the OUTPUT_ENCODING_NAME value, such that the conversions
with ENABLE_ENCODING set are only done when they are known to be
possible.
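
The three HTML possibilities and the output encoding guard could be
sketched as follows. This is a rough Python illustration, not the
converter's code: the function and parameter names are hypothetical, the
entity table is Python's HTML 4 one from html.entities, and the choice to
check USE_NUMERIC_ENTITY before ENABLE_ENCODING is an assumption, since
the precedence is not settled above.

```python
from html.entities import codepoint2name  # HTML 4 named entities


def html_char(ch: str,
              enable_encoding: bool = False,
              use_numeric_entity: bool = False,
              output_encoding: str = "utf-8") -> str:
    """Return the HTML output for one character under the proposed modes."""
    if ord(ch) < 128:
        # ASCII passes through; escaping of &, < and > is out of scope here.
        return ch
    if use_numeric_entity:
        # Assumption: USE_NUMERIC_ENTITY takes precedence over the rest.
        return f"&#{ord(ch)};"
    if enable_encoding:
        try:
            # Guard: only emit the character if the output encoding has it.
            ch.encode(output_encoding)
            return ch
        except UnicodeEncodeError:
            pass  # not representable, fall through to entities
    name = codepoint2name.get(ord(ch))
    if name is not None:
        return f"&{name};"      # default: named entity when one exists
    return f"&#{ord(ch)};"      # otherwise fall back to a numeric entity
```

For example, html_char("\u00e9") gives "&eacute;", while "\u0142" has no
name in that table and gives "&#322;". With enable_encoding set and
output_encoding="latin-1", "\u00e9" is output as is, but "\u0142" still
falls back to "&#322;" because it cannot be encoded.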
Opinions, ideas?
--
Pat