groff

Re: hyphenating non-english characters


From: Gáspár Gergő
Subject: Re: hyphenating non-english characters
Date: Thu, 1 Aug 2024 10:23:31 +0200

Hi Branden!

Thank you for your thorough answer!

> You didn't indicate where the hyphenation pattern file came from
> exactly (though its name is suggestive), or attach it, but odds are it's
> UTF-8 encoded, and that is a problem for GNU troff in its present state,
> which supports only single-byte encodings.  

> The good news is that if you convert the hyphenation pattern file to ISO
> Latin-2 (ISO 8859-2), you should be able to use it.

Unfortunately, I got stuck at this point, because the hyphenation file that I
used is encoded in latin1. Sorry for not including it; I got the file from
here: https://ctan.org/tex-archive/language/hungarian/hyphenation

Another odd thing is that, while it does contain the á, é, í, ó, ö and ü
characters, it omits the ő and ű characters entirely. I am guessing this is
because of the limitations of the latin1 encoding, since, out of the "long"
vowels Hungarian uses, it is missing only ű and ő. I don't understand the
choice of latin1 at all. Maybe the reason this all fails is a subpar pattern
file.
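
The guess about latin1 can be checked quickly. The following is only a minimal Python sketch, not part of the attached files; the helper names `encodable` and `missing_from` and the vowel list are made up for illustration. It tries to encode each accented Hungarian vowel in ISO 8859-1 (latin1) and ISO 8859-2 (latin2) and reports the ones that have no code point.

```python
# Hypothetical helper names; only the standard library codecs are used.
HUNGARIAN_ACCENTED = "áéíóöőúüű"


def encodable(ch: str, encoding: str) -> bool:
    """Return True if `ch` can be represented in the given encoding."""
    try:
        ch.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def missing_from(encoding: str) -> list[str]:
    """List the accented Hungarian vowels the encoding cannot represent."""
    return [ch for ch in HUNGARIAN_ACCENTED if not encodable(ch, encoding)]


if __name__ == "__main__":
    for enc in ("iso-8859-1", "iso-8859-2"):
        print(enc, "cannot represent:", missing_from(enc))
```

If the guess is right, only ő and ű should be reported for latin1 and nothing for latin2, which would fit the advice to use a Latin-2 pattern file.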

But since you said that latin1 is the default, I don't really see why the
hcode requests fail when they refer to characters contained in the latin1 set.
I am attaching a small sample to reproduce the error, along with the TeX
pattern file.
I am using groff 1.23.0.

This encoding business is out of my depth, so sorry if I'm missing something 
obvious.

Best,
Gergő

Attachment: sample.mom
Description: Text document

Attachment: huhyphn.tex
Description: TeX document

