texmacs-dev

Re: [Texmacs-dev] Chinese Localization Issues


From: Joris van der Hoeven
Subject: Re: [Texmacs-dev] Chinese Localization Issues
Date: Wed, 2 Oct 2013 15:54:48 +0200
User-agent: Mutt/1.5.20 (2009-06-14)

Hi,

On Wed, Oct 02, 2013 at 05:31:39PM +0800, jiazhaoconga wrote:
> > Would it be possible to send us a short example text which
> > illustrates the various points, with how things should
> > or should not be done?
> 
> I will post a web page showing how TeXmacs and LaTeX deal with
> each point. I will give the link later.

Thanks for the further details; I looked at the posted webpage.

> > Does this mean that the various conventions that you mention might
> > vary according to the context?  For instance, should I understand that
> > some Chinese do put spaces between words, and that others don't,
> > but that the latter is more frequent?
> 
> The CJK and ctex LaTeX packages provide many options, but in fact
> Chinese typesetting conventions don't vary much.

OK, so we probably should reduce the number of options then,
and keep things user friendly.  I restudied several CJK documents
that I took randomly from the web and it seems to me that typesetting
conventions are pretty similar, which is good news.

> The space problem exists in the "source code": Chinese does not use spaces
> to separate characters or to group them into words, because there are
> thousands of Chinese "characters", and a "word" is usually made of fewer
> than four "characters". Extra spaces have no use in the final typesetting,
> but they improve the readability of LaTeX source. This is not a problem for
> TeXmacs because it is WYSIWYG: you are not editing TeXmacs source directly.
> So users are responsible for not typing extra spaces and for not abusing
> spaces to create alignment.  The space problem is a LaTeX problem; it
> affects TeXmacs in the "LaTeX import" part.

OK, so we have both the LaTeX conversion issue and the internal
typesetting issue.  Please discuss the LaTeX conversion issue with
Francois; he will take care of that.
As to the CJK typesetting, I just SVN-committed various improvements:

  - I now forbid line breaks before punctuation symbols.

  - I added a special "CJK punctuation space" after punctuation symbols,
    which is zero by default, but which is allowed to be reduced or extended
    when justifying text.
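
A minimal sketch of the first rule, for illustration only: it is not the
TeXmacs implementation, and the punctuation set, the function names and
the fixed "width in characters" model are all assumptions.

```python
# Hypothetical, deliberately small set of CJK punctuation symbols that
# must not appear at the start of a line.
NO_BREAK_BEFORE = set("，。、；：？！）」』》】")


def break_positions(text):
    """Indices i where a break between text[i-1] and text[i] is allowed."""
    return [i for i in range(1, len(text)) if text[i] not in NO_BREAK_BEFORE]


def wrap(text, width):
    """Greedy wrap into lines of at most `width` characters.

    A break is moved to the left until it no longer falls directly
    before a punctuation symbol.
    """
    allowed = set(break_positions(text))
    lines, start = [], 0
    while len(text) - start > width:
        cut = start + width
        while cut > start + 1 and cut not in allowed:
            cut -= 1
        if cut not in allowed:
            # No legal break on this line: fall back to a hard break.
            cut = start + width
        lines.append(text[start:cut])
        start = cut
    lines.append(text[start:])
    return lines


sample = "你好，世界。今天天气很好！"
wrapped = wrap(sample, 5)
```

With this sample, the natural break after five characters would put "。"
at the start of the second line, so the break is moved one character to
the left.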

The result is quite pleasant to the eye, so please test it.
I see a few remaining issues:

  - Text does not really look justified when a line ends with a punctuation
    symbol, because CJK punctuation symbols have the logical width of
    a character, but a much smaller "ink width".  Any ideas about what
    I should do about this problem?  Leave it as is, or try to somehow
    work with the real widths of punctuation symbols?

  - No justification will take place yet if a line contains no punctuation
    symbols at all.  I plan to provide limited support for letter-based
    stretching (also for European languages), but this is still on the list.
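
To make the second point concrete, here is a rough sketch of how the
missing width of a line could be distributed.  It is an assumption-laden
illustration rather than the actual algorithm: the punctuation set, the
function name and the equal-share policy are hypothetical, and the
fallback branch models the planned letter-based stretching, not
anything that exists yet.

```python
# Hypothetical set of punctuation symbols followed by a stretchable space.
PUNCTUATION = set("，。、；：？！")


def distribute_slack(line, slack):
    """Return the extra space to insert after each character but the last.

    The slack is shared equally between the gaps that follow punctuation
    symbols.  If the line has no such gap, it is spread over all
    inter-character gaps instead (letter-based stretching).
    """
    gaps = [0.0] * max(len(line) - 1, 0)
    after_punct = [i for i in range(len(gaps)) if line[i] in PUNCTUATION]
    targets = after_punct or list(range(len(gaps)))
    for i in targets:
        gaps[i] = slack / len(targets)
    return gaps


with_punct = distribute_slack("你好，世界", 2.0)
without_punct = distribute_slack("你好世界", 3.0)
```

A negative `slack` models compressing the line rather than extending it.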

> > Should we have a length unit for the width of a character? em?
> > How to determine it from the font (I noticed that most characters
> > have exactly the same width most of the time; I probably should
> > use that fact)?
> 
> I think the length unit is pt.

This is usually much smaller than one character.
What is the (LaTeX) length unit for the horizontal width of a character
(and for the height of a character in cases where width and height
do not coincide)?

> > As you may have noticed, TeXmacs uses indentation in 'article' style,
> > but not in 'generic' style, where paragraphs are separated by a large
> > vertical space.  Do similar conventions make sense in Chinese?
> 
> It's OK to have paragraphs separated by a large vertical space instead of
> an indent in generic style. In some styles, every paragraph is indented
> except the first one in each section.

OK, so conventions are basically similar, except that we should default
to a first-line indentation of exactly two characters whenever first-line
indentation is being used.

> > This brings me to the issue of line breaking: I guess that you need
> > the equivalent of justified text.  If I understand you well,
> > then the only spaces which I am allowed to expand or compress are
> > spaces around punctuation characters.  Am I allowed to put very tiny
> > additional spaces around characters themselves when needed
> > (e.g. for lines without any punctuation characters)?
> 
> Yes, this is about punctuation kerning. Tiny additional spaces
> around characters are allowed.

OK, so this is still to be implemented.

> >> 5. Font. Chinese fonts are very different from English fonts.
> >
> > I see; so you suggest that the font menu should be adapted in
> > the case of Chinese?  Anyway, a new font browser is being developed,
> > you can test it by 'Tools -> Experimental -> New style fonts'.
> > Please let me know of your comments.
> 
> TeXmacs seems to recognize only the fonts listed in
> `TeXmacs/fonts/font-database.scm`, but not system fonts?

I will provide a utility for searching for system fonts and adding them
to your personal font database.  However, the TeXmacs font database contains
useful extra information about fonts (e.g. properties such as 'sans serif',
'small caps', and many subtle automatically determined properties).
This allows TeXmacs to find appropriate substitution fonts when
sending a document to somebody else, and also to find adequate fonts
for various kinds of markup (strong, emphasize, name, etc.).
Unfortunately, we will have to build this enriched database ourselves.
The current database contains all default system fonts under
Linux, MacOS and Windows.

> > Do you suggest that we modify the 'em' and 'strong' tags so as
> > to use different fonts / bold sans serif fonts?
> 
> Yes, that should happen automatically.

What are your suggested substitutes for 'italic' and 'small caps'?
I.e., is there some distinctive property about the 'look and feel' of
a CJK font to make it suitable as an 'italic' or 'small caps' font?

> In LaTeX, the CJK package masks English character entries in the map file
> or something like that. CM fonts are fine when used with Chinese documents.
> I think we need two environment variables for two fonts.

In the upcoming font system, I allow complex font names which
are really lists of fonts in which I search for the first match.
So this seems doable and merely a matter of the right user interface.
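
As a rough illustration of the "first match in a list of fonts" idea
(not the upcoming font system itself), consider the sketch below.  The
font names, the coverage predicates and the function name are all
hypothetical; a real implementation would query the glyph tables of
the fonts.

```python
# Hypothetical coverage tests standing in for real glyph lookups.
COVERAGE = {
    "latin-font": lambda ch: ord(ch) < 0x0250,          # Latin ranges
    "cjk-font": lambda ch: 0x4E00 <= ord(ch) <= 0x9FFF,  # CJK ideographs
}


def first_matching_font(font_list, ch):
    """Return the first font in `font_list` that covers `ch`, else None."""
    for name in font_list:
        covers = COVERAGE.get(name)
        if covers is not None and covers(ch):
            return name
    return None


# A "complex font name" modelled as an ordered list of fonts.
complex_font = ["latin-font", "cjk-font"]
chosen = [first_matching_font(complex_font, ch) for ch in "A中"]
```

Because the list is ordered, putting the Latin font first gives the
effect of the two separate font settings requested above: Latin
characters come from the first font and Chinese characters fall through
to the second.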

Best wishes, --Joris


