Re: utf-8-strings

lilypond-devel

From:	David Kastrup
Subject:	Re: utf-8-strings
Date:	Sun, 08 Jul 2012 14:57:01 +0200
User-agent:	Gnus/5.13 (Gnus v5.13) Emacs/24.1.50 (gnu/linux)
Thomas Morley <address@hidden> writes:

> 2012/7/8 David Kastrup <address@hidden>:
>>
>> This can basically be done using charsets.  I tried doing this with
>> regexps, but curiously enough, in contrast to Guile proper, those appear
>> to be already utf-8 aware, so
>>
>> #(use-modules (ice-9 regex))
>>
>> #(define (utf8-substrings str)
>>    (define char-pat (make-regexp "."))
>>    (map match:substring (list-matches char-pat str)))
>>
>> #(write (utf8-substrings "áèçäöüテスト"))
>>
>> works just fine (if you overlook the fact that write misbehaves, writing
>> some byte codes quoted as \xhh inside of a string and others literally).
>
> Wow!
> Following your suggestion I managed to drop about 300 lines, reducing
> it to a quarter of the original.

The regexp library probably consults the current locale, so this will
probably not work in non-UTF-8 locales (LilyPond's UTF-8-only nature is
likely to cause other problems there as well).  Also, I think your
version handled combining characters, and "." most likely does not.
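
To illustrate the combining-character concern, here is a minimal sketch in plain Guile (so without LilyPond's leading #). It assumes Guile 2.0 or later, where strings are sequences of Unicode code points; that is an assumption and not necessarily the Guile that LilyPond 2.15 embeds. The names combining-mark? and base+marks-substrings are hypothetical, and the grouping rule (attach every mark to the preceding base character) is a simplification, not full Unicode grapheme segmentation.

```scheme
;; Assumption: Guile >= 2.0 (Unicode strings, char-general-category).

(define (combining-mark? ch)
  ;; Mn, Mc and Me are the Unicode general categories for combining marks.
  (memq (char-general-category ch) '(Mn Mc Me)))

(define (base+marks-substrings str)
  ;; Walk the characters from the end, collecting marks in PENDING until
  ;; the base character they belong to is reached.
  (let loop ((chars (reverse (string->list str)))
             (pending '())
             (acc '()))
    (cond ((null? chars)
           ;; Marks without a preceding base become a string of their own.
           (if (null? pending) acc (cons (list->string pending) acc)))
          ((combining-mark? (car chars))
           (loop (cdr chars) (cons (car chars) pending) acc))
          (else
           (loop (cdr chars)
                 '()
                 (cons (list->string (cons (car chars) pending)) acc))))))

;; "e" followed by U+0301 (combining acute accent), then "x".
(define sample (string #\e (integer->char #x301) #\x))

(write (length (string->list sample)))           ; 3 code points
(newline)
(write (length (base+marks-substrings sample)))  ; 2 substrings
(newline)
```

Splitting per code point would separate the accent from its base letter, which is the behaviour to expect from a "." regexp even when it is UTF-8 aware.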

> You definitely should earn more money!!
>
> Of course I had to redefine `string-list->string'. I used recursion,
> which was the best I could think of.
> (`string-list->string' isn't used here, but I need it elsewhere)
>
> Do you agree if I turn it into a patch?
> I think `string->string-list' and `string-list->string' are very
> useful tools and `char-space' might be of interest, too.
>
>
> Thanks a lot,
>   Harm
>
> \version "2.15.39"
>
> #(use-modules (ice-9 regex))
>
> #(define (utf8-substrings str)
>    (define char-pat (make-regexp "."))
>    (map match:substring (list-matches char-pat str)))
>   
> #(define (string->string-list strg)
>   (utf8-substrings strg))
>   
> #(define (string-list->string ls)
>   (define (helper ls strg)
>     "
>     Converts a list of strings into a new string.
>     e.g '("1" "2" "3" "4") -> "1234"
>     "
>       (if (null? ls)
>         strg 
>         (begin
>         (set! strg (string-append strg (car ls)))
>         (if (null? (cdr ls))
>           strg
>           (helper (cdr ls) strg)))))     
>  (helper ls ""))

Uh, something like

#(define string-list->string string-concatenate)

is likely shorter...
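
As a quick check of that suggestion, a minimal sketch in plain Guile (outside a LilyPond file, hence no leading #). string-concatenate comes from Guile's built-in SRFI-13 string procedures; the sample lists are made up for illustration.

```scheme
;; string-concatenate appends all strings of a list into one new string.
(define string-list->string string-concatenate)

(write (string-list->string '("1" "2" "3" "4")))  ; "1234"
(newline)
(write (string-list->string '()))                 ; "" for the empty list
(newline)
```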

>   
> #(define (stack-chars stencil stils kern)
>   (set! stencil (ly:stencil-combine-at-edge stencil X RIGHT (car stils) kern))
>    (if (null? (cdr stils))
>      stencil
>      (stack-chars stencil (cdr stils) kern)))

This can be written better as

#(define (stack-chars stencil stils kern)
   (if (null? stils)
       stencil
       (stack-chars
          (ly:stencil-combine-at-edge stencil X RIGHT (car stils) kern)
          (cdr stils)
          kern)))

as this can also deal with the degenerate case where stils is already
null.

Another rather Schemeish rendition would be

#(define (stack-chars stencil stils kern)
  (fold
     (lambda (stil stencil)
        (ly:stencil-combine-at-edge stencil X RIGHT stil kern))
     stencil
     stils))

but it is probably not worth creating a closure here (if the lambda did
not use kern, it would be a proper function).
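
The two renditions can be compared outside LilyPond with a stand-in for the stencil operation. The sketch below is plain Guile; combine is a hypothetical replacement for ly:stencil-combine-at-edge that joins strings instead of stencils, and stack-recursive and stack-fold are illustrative names, so only the control flow carries over.

```scheme
(use-modules (srfi srfi-1))   ; provides fold

;; Hypothetical stand-in for ly:stencil-combine-at-edge:
;; joins two strings with KERN between them.
(define (combine left right kern)
  (string-append left kern right))

;; Recursive form: checks ITEMS first, so an empty list is handled.
(define (stack-recursive acc items kern)
  (if (null? items)
      acc
      (stack-recursive (combine acc (car items) kern) (cdr items) kern)))

;; fold form: the lambda closes over KERN.
(define (stack-fold acc items kern)
  (fold (lambda (item acc) (combine acc item kern)) acc items))

(display (stack-recursive "a" '("b" "c") "-")) (newline)  ; a-b-c
(display (stack-fold "a" '("b" "c") "-")) (newline)       ; a-b-c
(display (stack-recursive "a" '() "-")) (newline)         ; a
```

Both forms return the initial value unchanged for an empty list, whereas a version that takes (car items) before testing for null signals an error there.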


> #(define (make-one-character-strings l1 l2)
>     "
>     l1 is supposed to be a list of strings.
>     make-one-character-strings will return a new list l2,
>     built from the elements of l1.
>     Every string of l2 is a one-character string
>     e.g '("12" "34") -> '("1" "2" "3" "4")
>     "
>   (if (null? l1)
>      l2
>      (begin
>        (set! l2 (append l2 (string->string-list (car l1))))
>        (make-one-character-strings (cdr l1) l2))))

As far as I can see, there is no point in this function taking two
arguments.  Just write

#(define (make-one-character-strings l1)
   (append-map string->string-list l1))
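
A minimal sketch of how this behaves, in plain Guile rather than inside a LilyPond file. The string->string-list defined here is a hypothetical per-character stand-in for the regexp-based version, used only to keep the example self-contained; append-map is provided by SRFI-1.

```scheme
(use-modules (srfi srfi-1))   ; provides append-map

;; Hypothetical stand-in: one string per character of STR.
(define (string->string-list str)
  (map string (string->list str)))

;; append-map maps over the list and appends the resulting lists.
(define (make-one-character-strings l1)
  (append-map string->string-list l1))

(write (make-one-character-strings '("12" "34")))  ; ("1" "2" "3" "4")
(newline)
```

With this definition, the call site passes only the list of strings and no longer needs an initial empty accumulator.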

> #(define-markup-command (char-space layout props nmbr args) (number? markup-list?)
>   #:category font
>   #:properties ((word-space 0.6) (word-space-left #f) (word-space-right #f))
>   "
> @cindex changing the space between single characters
>
> Inserts @var{nmbr} to the space between every character of @var{args}.
> If @var{nmbr} is positive some additional space is created to the left and
> the right of @var{args}.
> @code{word-space-left} and @code{word-space-right} may be used to determine
> the space on the left or right side of @var{args}, @code{word-space} to
> determine it at both sides.
>
> @lilypond[verbatim,quote]
> \\markup \\override #'(line-width . 66) \\wordwrap {
>   This text contains some \\bold \\char-space #-0.3 { compressed } parts and
>   some parts which are \\bold \\char-space #1 { stretched. } If a part is 
>   \\bold \\char-space #1 { stretched } some additional space will be inserted
>   to the left and to the right. If you want to change the default use the properties
>   \\italic word-space, \\italic word-space-left and \\italic word-space-right. If a part is
>   \\bold \\char-space #-0.3 { compressed } the space to the left and to the right
>   is not affected.
>   For example: \\override #'(word-space . 2) \\bold \\char-space #1 { This little text }
>   has more space on both sides.
>   This will work with accented letters, German Umlaute etc:
>   \\bold \\char-space #1 { áéçäöü...ã }
> }
> @end lilypond"
>   (let* ((args-rev (remove (lambda (x) (string=? "" x)) args))
           (args-rev (delete "" args))
>          (new-args (list-join args-rev " "))

list-join?

>          (argls (make-one-character-strings new-args '()))
>          (pos-nmbr (max nmbr 0.0)) ; 'nmbr' limited to be not below 0.0
>          (stils (map (lambda (x)(interpret-markup layout props x)) argls))
>          (first-stil (if (eq? argls '()) point-stencil (car stils))))

(eq? argls '()) is usually written as (null? argls).  Quite equivalent,
but null? is the classical predicate to use and expect.

>
>  (ly:stencil-combine-at-edge  
>    (ly:stencil-combine-at-edge
>       (ly:make-stencil "" 
>              (cons 0 (abs (* pos-nmbr 3 (if (number? word-space-left)
>                                             word-space-left word-space))))

word-space-left has been given a default in the function definition.  So
if it is not a number, this is an error case I would not try to catch.


-- 
David Kastrup