Re: utf-8-strings
From: David Kastrup
Subject: Re: utf-8-strings
Date: Sun, 08 Jul 2012 14:57:01 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.1.50 (gnu/linux)

Thomas Morley <address@hidden> writes:
> 2012/7/8 David Kastrup <address@hidden>:
>>
>> This can basically be done using charsets. I tried doing this with
>> regexps, but curiously enough, in contrast to Guile proper, those appear
>> to be already utf-8 aware, so
>>
>> #(use-modules (ice-9 regex))
>>
>> #(define (utf8-substrings str)
>> (define char-pat (make-regexp "."))
>> (map match:substring (list-matches char-pat str)))
>>
>> #(write (utf8-substrings "áèçäöüãã¹ã"))
>>
>> works just fine (if you overlook the fact that write misbehaves, writing
>> some byte codes quoted as \xhh inside of a string and others literally).
>
> Wow!
> Following your suggestion I managed to drop about 300 lines, reducing
> it to a quarter of the original.
The regexp library probably consults the current locale, so this is
unlikely to work in non-utf8 locales (and LilyPond's utf-8 only nature
will likely cause further problems there). Also, I think your original
code handled combining characters, and "." most likely does not.
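To illustrate the locale concern, here is a minimal sketch of a split that does not depend on the locale at all: it works on a list of byte values and starts a new group at every byte that is not a UTF-8 continuation byte. The names continuation-byte? and utf8-byte-groups are made up for this example and are not part of Guile or LilyPond, and the code assumes well-formed UTF-8 input.

```scheme
;; Hypothetical helper: a UTF-8 continuation byte has the bit pattern 10xxxxxx.
(define (continuation-byte? b)
  (= (logand b #xC0) #x80))

;; Hypothetical helper: group a list of byte values into sublists,
;; one sublist per UTF-8 encoded character.
(define (utf8-byte-groups bytes)
  (let loop ((rest bytes) (current '()) (done '()))
    (cond
     ;; End of input: flush the group being collected, if any.
     ((null? rest)
      (reverse (if (null? current)
                   done
                   (cons (reverse current) done))))
     ;; Continuation byte: it belongs to the group being collected.
     ((and (continuation-byte? (car rest)) (pair? current))
      (loop (cdr rest) (cons (car rest) current) done))
     ;; Any other byte starts a new group.
     (else
      (loop (cdr rest)
            (list (car rest))
            (if (null? current)
                done
                (cons (reverse current) done)))))))

;; "a", "ä" (C3 A4), "b" come out as three groups.
(utf8-byte-groups '(#x61 #xC3 #xA4 #x62))
```

Like ".", this splits per encoded character only: a base letter followed by a combining accent, such as the bytes 65 CC 81, still comes out as two groups.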
> You definitely should earn more money!!
>
> Of course I had to redefine `string-list->string'. I used recursion,
> which was the best I could think of.
> (`string-list->string' isn't used here, but I need it elsewhere)
>
> Do you agree if I turn it into a patch?
> I think `string->string-list' and `string-list->string' are very
> useful tools and `char-space' might be of interest, too.
>
>
> Thanks a lot,
> Harm
>
> \version "2.15.39"
>
> #(use-modules (ice-9 regex))
>
> #(define (utf8-substrings str)
> (define char-pat (make-regexp "."))
> (map match:substring (list-matches char-pat str)))
>
> #(define (string->string-list strg)
> (utf8-substrings strg))
>
> #(define (string-list->string ls)
> (define (helper ls strg)
> "
> Converts a list of strings into a new string.
> e.g '("1" "2" "3" "4") -> "1234"
> "
> (if (null? ls)
> strg
> (begin
> (set! strg (string-append strg (car ls)))
> (if (null? (cdr ls))
> strg
> (helper (cdr ls) strg)))))
> (helper ls ""))
Uh, something like
#(define string-list->string string-concatenate)
is likely shorter...
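For comparison, a small sketch in plain Guile (outside LilyPond, so without the leading #) showing that the one-line definition gives the result described in the docstring above, and that it also copes with an empty list:

```scheme
;; string-concatenate takes a single list of strings and joins them.
(define string-list->string string-concatenate)

(string-list->string '("1" "2" "3" "4"))   ; evaluates to "1234"
(string-list->string '())                  ; evaluates to ""

;; The same result via apply, for lists of moderate length.
(apply string-append '("1" "2" "3" "4"))
```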
>
> #(define (stack-chars stencil stils kern)
> (set! stencil (ly:stencil-combine-at-edge stencil X RIGHT (car stils) kern))
> (if (null? (cdr stils))
> stencil
> (stack-chars stencil (cdr stils) kern)))
This can be written better as
#(define (stack-chars stencil stils kern)
   (if (null? stils)
       stencil
       (stack-chars
        (ly:stencil-combine-at-edge stencil X RIGHT (car stils) kern)
        (cdr stils)
        kern)))
as this can also deal with the degenerate case where stils is already
null.
Another rather Schemeish rendition would be
#(define (stack-chars stencil stils kern)
   (fold
    (lambda (stil stencil)
      (ly:stencil-combine-at-edge stencil X RIGHT stil kern))
    stencil
    stils))
but it is probably not worth creating a closure here (without the
reference to kern, the lambda would be a proper function rather than
a closure).
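The argument order of fold is easy to get wrong, so here is a sketch of the same pattern in plain Guile with strings standing in for stencils. stack-strings is a made-up name, and string-append merely plays the role of ly:stencil-combine-at-edge:

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical stand-in for stack-chars: append each piece to the
;; accumulated string, separated by sep.  fold calls the lambda with
;; the list element first and the accumulator second.
(define (stack-strings start pieces sep)
  (fold
   (lambda (piece acc)
     (string-append acc sep piece))
   start
   pieces))

(stack-strings "a" '("b" "c") "-")   ; evaluates to "a-b-c"
(stack-strings "a" '() "-")          ; evaluates to "a"
```

The lambda refers to sep from the enclosing definition, which is what makes it a closure, just as the reference to kern does in stack-chars.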
> #(define (make-one-character-strings l1 l2)
> "
> l1 is supposed to be a list of strings.
> make-one-character-strings will return a new list l2,
> build of the elements of l1.
> Every string of l2 is a one character-string
> e.g '("12" "34") -> '("1" "2" "3" "4")
> "
> (if (null? l1)
> l2
> (begin
> (set! l2 (append l2 (string->string-list (car l1))))
> (make-one-character-strings (cdr l1) l2))))
As far as I can see, there is no point in this function taking two
arguments. Just write
#(define (make-one-character-strings l1)
   (append-map string->string-list l1))
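A sketch of how this behaves in plain Guile. The splitter below, ascii-string->string-list, is a made-up stand-in for string->string-list that simply splits per Scheme character, so in a Guile with byte strings it is only adequate for ASCII input:

```scheme
(use-modules (srfi srfi-1))

;; Hypothetical stand-in for string->string-list: one string per
;; Scheme character.
(define (ascii-string->string-list s)
  (map string (string->list s)))

;; append-map applies the splitter to every element and appends the
;; resulting lists, so no accumulator argument is needed.
(define (make-one-character-strings l1)
  (append-map ascii-string->string-list l1))

(make-one-character-strings '("12" "34"))   ; evaluates to ("1" "2" "3" "4")
```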
> #(define-markup-command (char-space layout props nmbr args)(number?
> markup-list?)
> #:category font
> #:properties ((word-space 0.6) (word-space-left #f) (word-space-right #f))
> "
> @cindex changing the space between single characters
>
> Inserts @var{nmbr} to the space between every character of @var{args}.
> If @var{nmbr} is positive some additional space is created to the left and
> the
> right of @var{args}.
> @code{word-space-left} and @code{word-space-right} may be used to determine
> the
> space on the left or right side of @var{args}, @code{word-space} to determine
> it
> at both sides.
>
> @lilypond[verbatim,quote]
> \\markup \\override #'(line-width . 66) \\wordwrap {
> This text contains some \\bold \\char-space #-0.3 { compressed } parts and
> some parts which are \\bold \\char-space #1 { stretched. } If a part is
> \\bold \\char-space #1 { stretched } some additional space will be inserted
> to the left and to the right. If you want to change the default use the
> properties
> \\italic word-space, \\italic word-space-left and \\italic
> word-space-right. If a part is
> \\bold \\char-space #-0.3 { compressed } the space to the left and to the
> right
> is not affected.
> Par example: \\override #'(word-space . 2) \\bold \\char-space #1 { This
> little text }
> has more space on both sides.
> This will work with accented letters, german Umlaute etc:
> \\bold \\char-space #1 { .¡.©.§.¤.¶.¼...ã }
> }
> @end lilypond"
> (let* ((args-rev (remove (lambda (x) (string=? "" x)) args))
(args-rev (delete "" args))
> (new-args (list-join args-rev " "))
list-join?
> (argls (make-one-character-strings new-args '()))
> (pos-nmbr (max nmbr 0.0)) ; 'nmbr' limited to be not below 0.0
> (stils (map (lambda (x)(interpret-markup layout props x)) argls))
> (first-stil (if (eq? argls '()) point-stencil (car stils))))
(eq? argls '()) is usually written as (null? argls). The two are
equivalent, but null? is the classical predicate to use and expect.
>
> (ly:stencil-combine-at-edge
> (ly:stencil-combine-at-edge
> (ly:make-stencil ""
> (cons 0 (abs (* pos-nmbr 3 (if (number? word-space-left)
> word-space-left word-space))))
word-space-left has been given a default in the function definition. So
if it is not a number, this is an error case I would not try to catch.
--
David Kastrup