Re: utf-8-strings

lilypond-devel

From:	Thomas Morley
Subject:	Re: utf-8-strings
Date:	Sun, 8 Jul 2012 16:16:09 +0200
2012/7/8 David Kastrup <address@hidden>:
> Thomas Morley <address@hidden> writes:
>
>> 2012/7/8 David Kastrup <address@hidden>:
>>>
>>> This can basically be done using charsets.  I tried doing this with
>>> regexps, but curiously enough, in contrast to Guile proper, those appear
>>> to be already utf-8 aware, so
>>>
>>> #(use-modules (ice-9 regex))
>>>
>>> #(define (utf8-substrings str)
>>>    (define char-pat (make-regexp "."))
>>>    (map match:substring (list-matches char-pat str)))
>>>
>>> #(write (utf8-substrings "áèçäöüテスト"))
>>>
>>> works just fine (if you overlook the fact that write misbehaves, writing
>>> some byte codes quoted as \xhh inside of a string and others literally).
>>
>> Wow!
>> Following your suggestion I managed to drop about 300 lines, reducing
>> it to a quarter of the original.
>
> It is likely that the regexp library takes a look at the current locale,
> so this is likely not going to work in non-utf8 locales (LilyPond's
> utf-8 only nature is likely going to lead to other problems
> additionally).  Also I think that you treated combining characters, and
> "." most likely does not.

Well, I don't understand the consequences. This is beyond my knowledge.
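
A rough sketch of the combining-character point, in case it helps: a letter followed by a separate combining accent is two characters, so a split that works one character at a time (which is presumably what "." does) separates the accent from its letter. The sketch assumes a Guile with Unicode strings (2.0 or later), which is not necessarily the Guile that runs the code in this thread. The names combining-mark? and string->clusters are made up for illustration.

```scheme
;; Sketch only: assumes Guile 2.0+ (Unicode strings).
;; combining-mark? and string->clusters are hypothetical helper names.

(define (combining-mark? ch)
  ;; Unicode general categories Mn, Mc and Me are the combining marks.
  (memq (char-general-category ch) '(Mn Mc Me)))

(define (string->clusters str)
  ;; Split STR into short strings, one per character, but keep each
  ;; combining mark attached to the character before it.
  (let loop ((chars (string->list str)) (acc '()))
    (cond
     ((null? chars) (reverse acc))
     ((and (pair? acc) (combining-mark? (car chars)))
      (loop (cdr chars)
            (cons (string-append (car acc) (string (car chars)))
                  (cdr acc))))
     (else
      (loop (cdr chars) (cons (string (car chars)) acc))))))

;; "e" followed by U+0301 COMBINING ACUTE ACCENT, then "a".
(define sample (string #\e (integer->char #x0301) #\a))

(display (length (map string (string->list sample)))) ; prints 3
(newline)
(display (length (string->clusters sample)))          ; prints 2
(newline)
```

With the per-character split the accent ends up as an element of its own, so a markup command that spaces out single characters would place it apart from its base letter.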

>> You definitely should earn more money!!
>>
>> Of course I had to redefine `string-list->string'. I used recursion,
>> which was the best I could think of.
>> (`string-list->string' isn't used here, but I need it elsewhere)
>>
>> Do you agree if I turn it into a patch?
>> I think `string->string-list' and `string-list->string' are very
>> useful tools and `char-space' might be of interest, too.
>>
>>
>> Thanks a lot,
>>   Harm
>>
>> \version "2.15.39"
>>
>> #(use-modules (ice-9 regex))
>>
>> #(define (utf8-substrings str)
>>    (define char-pat (make-regexp "."))
>>    (map match:substring (list-matches char-pat str)))
>>
>> #(define (string->string-list strg)
>>   (utf8-substrings strg))
>>
>> #(define (string-list->string ls)
>>   (define (helper ls strg)
>>     "
>>     Converts a list of strings into a new string.
>>     e.g '("1" "2" "3" "4") -> "1234"
>>     "
>>       (if (null? ls)
>>         strg
>>         (begin
>>         (set! strg (string-append strg (car ls)))
>>         (if (null? (cdr ls))
>>           strg
>>           (helper (cdr ls) strg)))))
>>  (helper ls ""))
>
> Uh, something like
>
> #(define string-list->string string-concatenate)
>
> is likely shorter...

It is.
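
For comparison, a small check in plain Guile that string-concatenate gives the same result as a recursive version. Inside a .ly file each top-level form would need the usual # prefix. string-list->string-rec is a name invented here for a set!-free variant of the original helper.

```scheme
;; Recursive variant without set!, for comparison only (hypothetical name).
(define (string-list->string-rec ls)
  (if (null? ls)
      ""
      (string-append (car ls) (string-list->string-rec (cdr ls)))))

;; The short form suggested above.
(define string-list->string string-concatenate)

(display (string-list->string '("1" "2" "3" "4")))      ; prints 1234
(newline)
(display (equal? (string-list->string '("1" "2" "3" "4"))
                 (string-list->string-rec '("1" "2" "3" "4")))) ; prints #t
(newline)
```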

>
>>
>> #(define (stack-chars stencil stils kern)
>>   (set! stencil (ly:stencil-combine-at-edge stencil X RIGHT (car stils) kern))
>>    (if (null? (cdr stils))
>>      stencil
>>      (stack-chars stencil (cdr stils) kern)))
>
> This can be written better as
>
> #(define (stack-chars stencil stils kern)
>    (if (null? stils)
>        stencil
>        (stack-chars
>           (ly:stencil-combine-at-edge stencil X RIGHT (car stils) kern)
>           (cdr stils)
>           kern)))
>
> as this can also deal with the degenerate case where stils is already
> null.
>
> Another rather Schemeish rendition would be
>
> #(define (stack-chars stencil stils kern)
>   (fold
>      (lambda (stil stencil)
>         (ly:stencil-combine-at-edge stencil X RIGHT stil kern))
>      stencil
>      stils))
>
> but in this case it is probably not worth creating a closure here
> (without use of kern, it would be a proper function).
>
>
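
To see that the recursive and the fold renditions agree, here is a sketch in plain Guile that uses strings as a stand-in for stencils. The procedure combine is a hypothetical replacement for ly:stencil-combine-at-edge, and stack-rec and stack-fold are illustrative names only.

```scheme
(use-modules (srfi srfi-1))   ; fold

;; Stand-in for ly:stencil-combine-at-edge: glue RIGHT-PART onto ACC,
;; with KERN copies of "_" in between. Purely illustrative.
(define (combine acc right-part kern)
  (string-append acc (make-string kern #\_) right-part))

;; Recursive form, same shape as the suggestion above.
(define (stack-rec acc parts kern)
  (if (null? parts)
      acc
      (stack-rec (combine acc (car parts) kern) (cdr parts) kern)))

;; fold form: the lambda receives (element accumulator), in that order.
(define (stack-fold acc parts kern)
  (fold (lambda (part acc) (combine acc part kern)) acc parts))

(display (stack-rec "a" '("b" "c") 1))   ; prints a_b_c
(newline)
(display (stack-fold "a" '("b" "c") 1))  ; prints a_b_c
(newline)
(display (stack-fold "a" '() 1))         ; prints a (an empty list is fine)
(newline)
```
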
>> #(define (make-one-character-strings l1 l2)
>>     "
>>     l1 is supposed to be a list of strings.
>>     make-one-character-strings will return a new list l2,
>>     built from the elements of l1.
>>     Every string of l2 is a one character-string
>>     e.g '("12" "34") -> '("1" "2" "3" "4")
>>     "
>>   (if (null? l1)
>>      l2
>>      (begin
>>        (set! l2 (append l2 (string->string-list (car l1))))
>>        (make-one-character-strings (cdr l1) l2))))
>
> There is no point for this function to get two arguments as far as I can
> see.  Just write
>
> #(define (make-one-character-strings l1)
>    (append-map string->string-list l1))

Seems I have to extend my scheme-vocabulary.
I didn't know/remember `append-map' and `string-concatenate'.
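
A quick check of append-map in plain Guile, as a sketch. ascii-string->string-list is a hypothetical ASCII-only stand-in for string->string-list, so that the example runs without the regexp version.

```scheme
(use-modules (srfi srfi-1))   ; append-map

;; ASCII-only stand-in for string->string-list (hypothetical name);
;; the regexp-based version in this thread is what handles UTF-8.
(define (ascii-string->string-list s)
  (map string (string->list s)))

;; Map over the list and append the resulting lists in one step.
(define (make-one-character-strings l1)
  (append-map ascii-string->string-list l1))

(write (make-one-character-strings '("12" "34")))  ; prints ("1" "2" "3" "4")
(newline)
```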

>
>> #(define-markup-command (char-space layout props nmbr args)
>>   (number? markup-list?)
>>   #:category font
>>   #:properties ((word-space 0.6) (word-space-left #f) (word-space-right #f))
>>   "
>> @cindex changing the space between single characters
>>
>> Inserts @var{nmbr} to the space between every character of @var{args}.
>> If @var{nmbr} is positive some additional space is created to the left and
>> the right of @var{args}.
>> @code{word-space-left} and @code{word-space-right} may be used to determine
>> the space on the left or right side of @var{args}, @code{word-space} to
>> determine it at both sides.
>>
>> @lilypond[verbatim,quote]
>> \\markup \\override #'(line-width . 66) \\wordwrap {
>>   This text contains some \\bold \\char-space #-0.3 { compressed } parts and
>>   some parts which are \\bold \\char-space #1 { stretched. } If a part is
>>   \\bold \\char-space #1 { stretched } some additional space will be inserted
>>   to the left and to the right. If you want to change the default use the
>>   properties \\italic word-space, \\italic word-space-left and
>>   \\italic word-space-right. If a part is
>>   \\bold \\char-space #-0.3 { compressed } the space to the left and to the
>>   right is not affected.
>>   For example: \\override #'(word-space . 2)
>>   \\bold \\char-space #1 { This little text }
>>   has more space on both sides.
>>   This will work with accented letters, German Umlaute etc:
>>   \\bold \\char-space #1 { áéçäöüテスト }
>> }
>> @end lilypond"
>>   (let* ((args-rev (remove (lambda (x) (string=? "" x)) args))
>            (args-rev (delete "" args))

I really have to learn more of the predefined guile-procedures.

>>          (new-args (list-join args-rev " "))
>
> list-join?

I wanted to insert " " between every element of new-args. Otherwise
there would be no space between the words of, e.g., \markup \char-space
#1 { This little text }.
list-join seems to be a nice function to do so. Is there anything problematic with it?
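
To make the intent concrete, a sketch in plain Guile of what I expect from list-join. To my understanding list-join comes from LilyPond's own Scheme library rather than from Guile, so intersperse below is a hypothetical stand-in and not LilyPond's actual definition.

```scheme
;; Hypothetical stand-in for list-join: put SEP between the elements of LST.
(define (intersperse lst sep)
  (if (or (null? lst) (null? (cdr lst)))
      lst
      (cons (car lst)
            (cons sep (intersperse (cdr lst) sep)))))

(write (intersperse '("This" "little" "text") " "))
;; prints ("This" " " "little" " " "text")
(newline)
```

After splitting into one-character strings, the " " elements stay in the list as characters of their own and so get their own stencils.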

>
>>          (argls (make-one-character-strings new-args '()))
>>          (pos-nmbr (max nmbr 0.0)) ; 'nmbr' limited to be not below 0.0
>>          (stils (map (lambda (x)(interpret-markup layout props x)) argls))
>>          (first-stil (if (eq? argls '()) point-stencil (car stils))))
>
> (eq? argls '()) is usually written as (null? argls).  Quite equivalent,
> but null? is the classical predicate to use and expect.

Of course. Sorry.

>
>>
>>  (ly:stencil-combine-at-edge
>>    (ly:stencil-combine-at-edge
>>       (ly:make-stencil ""
>>              (cons 0 (abs (* pos-nmbr 3 (if (number? word-space-left)
>>    word-space-left word-space))))
>
> word-space-left has been given a default in the function definition.  So
> if it is not a number, this is an error case I would not try to catch.

Do you mean writing

 (ly:stencil-combine-at-edge
   (ly:stencil-combine-at-edge
      (ly:make-stencil ""
             (cons 0 (abs (* pos-nmbr 3 (if word-space-left
                                            word-space-left word-space))))

instead?
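
If that reading is right, the fallback could also be spelled with or, since (if x x y) and (or x y) give the same value. A sketch in plain Guile; side-space is a hypothetical helper name, not something from the patch.

```scheme
;; Hypothetical helper: use the side-specific value when it is set,
;; otherwise fall back to the general word-space.
(define (side-space side word-space)
  (or side word-space))

(display (side-space #f 0.6))  ; prints 0.6
(newline)
(display (side-space 2 0.6))   ; prints 2
(newline)
(display (side-space 0 0.6))   ; prints 0, because 0 counts as true in Scheme
(newline)
```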

>
>
> --
> David Kastrup



-Harm
utf-8-strings-rev-03.ly
Description: Binary data