lilypond-user

From: Alexander Kobel
Subject: Re: UTF-8-aware backwards string searching in Guile, or: fixing centered lyrics ignoring punctuation
Date: Wed, 31 Oct 2018 11:05:48 +0100

Gosh,

once again I'm flabbergasted by the expertise and helpfulness of the folks on this list. Thanks a ton!

As far as I can tell, your corrected version works like a charm (and even fixed a minor misalignment of the "à" syllable that I had not spotted earlier). And IIUC, the only drawback of the quick'n'dirty variant is that it does not canonicalize the strings involved, so two different representations of the same glyph (e.g., precomposed or built with combining characters) will not match. But this is extremely unlikely to matter, given that they are usually entered by the same user in the same way, and that the set of characters to compare against is fairly limited and probably has a unique encoding anyway.
So, problem solved for me.
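
To illustrate the canonicalization caveat, here is a minimal sketch in Python (not Guile, and not part of the patch discussed in this thread). It shows that a precomposed "à" and an "a" followed by a combining grave accent compare as different strings until both are normalized to the same form. The helper name `same_glyph` is made up for this example.

```python
import unicodedata

# Two representations of the same glyph "à":
precomposed = "\u00e0"   # LATIN SMALL LETTER A WITH GRAVE (one code point)
decomposed = "a\u0300"   # "a" followed by COMBINING GRAVE ACCENT (two code points)


def same_glyph(a: str, b: str) -> bool:
    """Compare two strings after canonical composition (NFC).

    Hypothetical helper for illustration only.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A plain comparison sees two different code point sequences.
naive_match = precomposed == decomposed

# After normalization, both are the single code point U+00E0.
normalized_match = same_glyph(precomposed, decomposed)

print(naive_match, normalized_match)
```

A comparison that skips this step only fails when the lyrics and the punctuation set use different representations, which is the unlikely case described above.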

I can only hope that you did not spend the 4.75 hours between your mails for the "quick fix"...


Cheers,
Alex


On 31/10/2018 05.23, Aaron Hill wrote:
On 2018-10-30 4:56 pm, Aaron Hill wrote:
On 2018-10-30 10:01 am, Alexander Kobel wrote:
Does anyone have a hint how to approach this one? (Or is the answer
just: be patient and hope for Guile v2?)

The only hint here is to replace the built-in functions with ones
that understand UTF-8 encoding and can perform the work needed.  There
may well be someone online who has already done this work,
which would save you from having to do it yourself.

Otherwise, the basic strategy is to replace string->list with a
version that decodes UTF-8 and returns a list of integers (essentially
UTF-32).  Then, all of the string work is done on these lists
of integers instead.  (The character set would also just be a set of
integers representing the unique Unicode code points.)  After you find
the sublists that are interesting to measure, you'll then
need to convert each list back into a string.  This means encoding back
into UTF-8 and emitting a string.
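
The strategy just described can be sketched as follows. This is a Python illustration under stated assumptions, not the Guile patch from this thread: all function names are hypothetical, the decoder checks lead and continuation bytes but does not reject overlong forms or surrogates, and the punctuation set is only an example.

```python
def utf8_to_code_points(data: bytes) -> list:
    """Decode UTF-8 bytes into a list of integer code points."""
    code_points = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:                 # 0xxxxxxx: one byte (ASCII)
            length, value = 1, lead
        elif lead >> 5 == 0b110:        # 110xxxxx: two bytes
            length, value = 2, lead & 0x1F
        elif lead >> 4 == 0b1110:       # 1110xxxx: three bytes
            length, value = 3, lead & 0x0F
        elif lead >> 3 == 0b11110:      # 11110xxx: four bytes
            length, value = 4, lead & 0x07
        else:
            raise ValueError("invalid UTF-8 lead byte at offset %d" % i)
        if i + length > len(data):
            raise ValueError("truncated UTF-8 sequence at offset %d" % i)
        for cont in data[i + 1:i + length]:
            if cont >> 6 != 0b10:       # continuation bytes are 10xxxxxx
                raise ValueError("invalid UTF-8 continuation byte")
            value = (value << 6) | (cont & 0x3F)
        code_points.append(value)
        i += length
    return code_points


def code_points_to_utf8(code_points) -> bytes:
    """Encode a list of integer code points back into UTF-8 bytes."""
    out = bytearray()
    for cp in code_points:
        if cp < 0x80:
            out.append(cp)
        elif cp < 0x800:
            out += bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
        elif cp < 0x10000:
            out += bytes([0xE0 | (cp >> 12),
                          0x80 | ((cp >> 6) & 0x3F),
                          0x80 | (cp & 0x3F)])
        elif cp <= 0x10FFFF:
            out += bytes([0xF0 | (cp >> 18),
                          0x80 | ((cp >> 12) & 0x3F),
                          0x80 | ((cp >> 6) & 0x3F),
                          0x80 | (cp & 0x3F)])
        else:
            raise ValueError("code point out of range: %d" % cp)
    return bytes(out)


def strip_punctuation(code_points, punctuation):
    """Drop leading and trailing code points that are in `punctuation`."""
    start, end = 0, len(code_points)
    while start < end and code_points[start] in punctuation:
        start += 1
    while end > start and code_points[end - 1] in punctuation:
        end -= 1                        # the backwards search, per code point
    return code_points[start:end]


# Example punctuation set, stored as integers (an assumption for this sketch).
PUNCTUATION = {ord(c) for c in "«»…†"}

syllable = "«Ḉ…»".encode("utf-8")
core = strip_punctuation(utf8_to_code_points(syllable), PUNCTUATION)
print(code_points_to_utf8(core).decode("utf-8"))
```

Because the search runs over whole code points rather than bytes, a multi-byte character such as "…" is never split in the middle, which is the failure mode of a byte-oriented backwards search.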

Here's a quick-n-dirty patch to address the issue.

%%%%
\version "2.19.82"
\include "center-lyrics-ignoring-punctuation.ily"
{ d'4 4 4 }
\addlyrics { Å Ɓ† «Ḉ…» }
%%%%

-- Aaron Hill


