Re: utf-8-strings
From: Thomas Morley
Subject: Re: utf-8-strings
Date: Sun, 8 Jul 2012 16:16:09 +0200
2012/7/8 David Kastrup <address@hidden>:
> Thomas Morley <address@hidden> writes:
>
>> 2012/7/8 David Kastrup <address@hidden>:
>>>
>>> This can basically be done using charsets. I tried doing this with
>>> regexps, but curiously enough, in contrast to Guile proper, those appear
>>> to be already utf-8 aware, so
>>>
>>> #(use-modules (ice-9 regex))
>>>
>>> #(define (utf8-substrings str)
>>> (define char-pat (make-regexp "."))
>>> (map match:substring (list-matches char-pat str)))
>>>
>>> #(write (utf8-substrings "áèçäöüテスト"))
>>>
>>> works just fine (if you overlook the fact that write misbehaves, writing
>>> some byte codes quoted as \xhh inside of a string and others literally).
>>
>> Wow!
>> Following your suggestion I managed to drop about 300 lines, reducing
>> it to a quarter of the original.
>
> It is likely that the regexp library takes a look at the current locale,
> so this is likely not going to work in non-utf8 locales (LilyPond's
> utf-8 only nature is likely going to lead to other problems
> additionally). Also I think that you treated combining characters, and
> "." most likely does not.
Well, I don't understand the consequences. This is beyond my knowledge.
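To make the combining-character caveat a bit more concrete, here is a minimal sketch. It assumes Guile 2.0 or later, where a string is a sequence of Unicode code points (an assumption; the Guile embedded in the LilyPond version used in this thread may be older, and the snippet would not run there as written). The name `split-code-points` is made up for illustration. The sketch only shows that a per-code-point split separates a base letter from its combining accent; it does not claim anything about what the regexp "." actually matches.

```scheme
;; Hypothetical helper: one string per code point (Guile 2.0+ assumed).
(define (split-code-points str)
  (map string (string->list str)))

;; "e" followed by U+0301 COMBINING ACUTE ACCENT:
;; one visible glyph, but two code points.
(define e-acute (string #\e (integer->char #x301)))

(write (string-length e-acute))               ; 2
(newline)
(write (length (split-code-points e-acute)))  ; 2
(newline)
```

If each piece is then typeset as its own stencil, the accent would end up detached from its letter, which is presumably the problem being hinted at.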
>> You definitely should earn more money!!
>>
>> Of course I had to redefine `string-list->string'. I used recursion,
>> which was the best I could think of.
>> (`string-list->string' isn't used here, but I need it elsewhere)
>>
>> Do you agree if I turn it into a patch?
>> I think `string->string-list' and `string-list->string' are very
>> useful tools and `char-space' might be of interest, too.
>>
>>
>> Thanks a lot,
>> Harm
>>
>> \version "2.15.39"
>>
>> #(use-modules (ice-9 regex))
>>
>> #(define (utf8-substrings str)
>> (define char-pat (make-regexp "."))
>> (map match:substring (list-matches char-pat str)))
>>
>> #(define (string->string-list strg)
>> (utf8-substrings strg))
>>
>> #(define (string-list->string ls)
>> (define (helper ls strg)
>> "
>> Converts a list of strings into a new string.
>> e.g '("1" "2" "3" "4") -> "1234"
>> "
>> (if (null? ls)
>> strg
>> (begin
>> (set! strg (string-append strg (car ls)))
>> (if (null? (cdr ls))
>> strg
>> (helper (cdr ls) strg)))))
>> (helper ls ""))
>
> Uh, something like
>
> #(define string-list->string string-concatenate)
>
> is likely shorter...
It is.
>
>>
>> #(define (stack-chars stencil stils kern)
>> (set! stencil (ly:stencil-combine-at-edge stencil X RIGHT (car stils)
>> kern))
>> (if (null? (cdr stils))
>> stencil
>> (stack-chars stencil (cdr stils) kern)))
>
> This can be written better as
>
> #(define (stack-chars stencil stils kern)
> (if (null? stils)
> stencil
> (stack-chars
> (ly:stencil-combine-at-edge stencil X RIGHT (car stils) kern)
> (cdr stils)
> kern)))
>
> as this can also deal with the degenerate case where stils is already
> null.
>
> Another rather Schemeish rendition would be
>
> #(define (stack-chars stencil stils kern)
> (fold
> (lambda (stil stencil)
> (ly:stencil-combine-at-edge stencil X RIGHT stil kern))
> stencil
> stils))
>
> but in this case it is probably not worth creating a closure here
> (without use of kern, it would be a proper function).
>
>
>> #(define (make-one-character-strings l1 l2)
>> "
>> l1 is supposed to be a list of strings.
>> make-one-character-strings will return a new list l2,
>> built from the elements of l1.
>> Every string of l2 is a one-character string
>> e.g '("12" "34") -> '("1" "2" "3" "4")
>> "
>> (if (null? l1)
>> l2
>> (begin
>> (set! l2 (append l2 (string->string-list (car l1))))
>> (make-one-character-strings (cdr l1) l2))))
>
> There is no point for this function to get two arguments as far as I can
> see. Just write
>
> #(define (make-one-character-strings l1)
> (append-map string->string-list l1))
It seems I have to extend my Scheme vocabulary.
I didn't know (or remember) `append-map' and `string-concatenate'.
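For the record, a minimal sketch of the two procedures in plain Guile, outside LilyPond. `split-chars` is a made-up, ASCII-only stand-in for `string->string-list' so that the example is self-contained; it is not the regexp-based version discussed above.

```scheme
(use-modules (srfi srfi-1))   ; provides append-map

;; Hypothetical ASCII-only stand-in for string->string-list.
(define (split-chars str)
  (map string (string->list str)))

;; Map split-chars over the list and append the resulting lists.
(define (make-one-character-strings strings)
  (append-map split-chars strings))

(write (make-one-character-strings '("12" "34")))   ; ("1" "2" "3" "4")
(newline)

;; string-concatenate joins a list of strings into one string.
(write (string-concatenate '("1" "2" "3" "4")))     ; "1234"
(newline)
```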
>
>> #(define-markup-command (char-space layout props nmbr args)(number?
>> markup-list?)
>> #:category font
>> #:properties ((word-space 0.6) (word-space-left #f) (word-space-right #f))
>> "
>> @cindex changing the space between single characters
>>
>> Adds @var{nmbr} to the space between every character of @var{args}.
>> If @var{nmbr} is positive, some additional space is created to the left
>> and the right of @var{args}.
>> @code{word-space-left} and @code{word-space-right} may be used to set the
>> space on the left or right side of @var{args}, @code{word-space} to set it
>> on both sides.
>>
>> @lilypond[verbatim,quote]
>> \\markup \\override #'(line-width . 66) \\wordwrap {
>> This text contains some \\bold \\char-space #-0.3 { compressed } parts and
>> some parts which are \\bold \\char-space #1 { stretched. } If a part is
>> \\bold \\char-space #1 { stretched } some additional space will be inserted
>> to the left and to the right. If you want to change the default use the
>> properties
>> \\italic word-space, \\italic word-space-left and \\italic
>> word-space-right. If a part is
>> \\bold \\char-space #-0.3 { compressed } the space to the left and to the
>> right
>> is not affected.
>> For example: \\override #'(word-space . 2) \\bold \\char-space #1 { This
>> little text }
>> has more space on both sides.
>> This will work with accented letters, German umlauts etc.:
>> \\bold \\char-space #1 { áéçäöüテスト }
>> }
>> @end lilypond"
>> (let* ((args-rev (remove (lambda (x) (string=? "" x)) args))
> (args-rev (delete "" args))
I really have to learn more of the predefined Guile procedures.
>> (new-args (list-join args-rev " "))
>
> list-join?
I wanted to insert " " between the elements of args-rev. Otherwise
there would be no space between the words of e.g. \markup \char-space
#1 { This little text }
list-join seems to be a nice function for this. Is there anything problematic with it?
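A minimal sketch of the behaviour relied on here, in plain Guile. `join-with` is a hypothetical stand-in written for illustration, since `list-join' itself comes with LilyPond and is not available in a bare Guile session; it is assumed to behave the same way for this use.

```scheme
;; Hypothetical stand-in for LilyPond's list-join:
;; put SEP between all elements of LST.
(define (join-with lst sep)
  (if (or (null? lst) (null? (cdr lst)))
      lst
      (cons (car lst)
            (cons sep (join-with (cdr lst) sep)))))

(write (join-with '("This" "little" "text") " "))
;; ("This" " " "little" " " "text")
(newline)
```

Each " " then becomes its own one-character string, so it gets its own stencil and the words stay apart.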
>
>> (argls (make-one-character-strings new-args '()))
>> (pos-nmbr (max nmbr 0.0)) ; 'nmbr' limited to be not below 0.0
>> (stils (map (lambda (x)(interpret-markup layout props x)) argls))
>> (first-stil (if (eq? argls '()) point-stencil (car stils))))
>
> (eq? argls '()) is usually written as (null? argls). Quite equivalent,
> but null? is the classical predicate to use and expect.
Of course. Sorry.
>
>>
>> (ly:stencil-combine-at-edge
>> (ly:stencil-combine-at-edge
>> (ly:make-stencil ""
>> (cons 0 (abs (* pos-nmbr 3 (if (number? word-space-left)
>> word-space-left word-space))))
>
> word-space-left has been given a default in the function definition. So
> if it is not a number, this is an error case I would not try to catch.
Do you mean writing
(ly:stencil-combine-at-edge
(ly:stencil-combine-at-edge
(ly:make-stencil ""
(cons 0 (abs (* pos-nmbr 3 (if word-space-left
word-space-left word-space))))
instead?
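If that reading is right, the same fallback could also be written with `or', which returns its first argument that is not #f. A minimal sketch, with `left-space` as a made-up name and plain numbers standing in for the property values:

```scheme
;; Hypothetical helper: use word-space-left if it is set (not #f),
;; otherwise fall back to word-space.
(define (left-space word-space-left word-space)
  (or word-space-left word-space))

(write (left-space #f 0.6))   ; 0.6
(newline)
(write (left-space 2 0.6))    ; 2
(newline)
```

Note that 0 counts as true in Scheme, so an explicit word-space-left of 0 is kept rather than replaced by word-space.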
>
>
> --
> David Kastrup
-Harm
Attachment: utf-8-strings-rev-03.ly