Re: How to create a derived encoding?


From: Oliver Scholz
Subject: Re: How to create a derived encoding?
Date: Thu, 14 Oct 2004 13:12:45 +0200
User-agent: Gnus/5.1006 (Gnus v5.10.6) Emacs/21.3.50 (windows-nt)

Very interesting question.  There are, of course, people on this list
who know more about coding systems than I do, but I thought I might
as well give it a try.

David Kastrup <address@hidden> writes:

> Stefan Monnier <address@hidden> writes:
>
>>>> 1 - assume the raw TeX output with its funny quoted bytes is in the
>>>> current temp buffer.   The buffer is in unibyte mode.
>>
>>> No good.  We are talking about process output that is accumulating in
>>> a buffer.  We can't just let everything trickle in in raw mode since
>>> the buffer may be interactive and so we need to have more or less
>>> accurate stuff at each point of time.
>>
>> That's OK.  This assumption is not important.  You can do the
>> decoding in the process filter, or anywhere else.
>>
>>>> 3 - call decode-coding-region with the appropriate coding system.
>>>> 4 - set the buffer to multibyte.
>>
>>> The buffer comes into being incrementally.
>>
>> There can be several buffers.  Remember in point 1 I said "temp buffer".
>> And I'm sure it can be all done within a multibyte buffer if necessary.
>>
>>>> If the step number 2 is too slow, you can most likely implement a
>>>> CCL program that does it faster.
>>
>>> Well, that was what I was asking about.  And how to let this CCL
>>> program run prefixed to the normal process output decoding program.
>>
>> You can run a CCL program independently from any coding system.

Is there any way to do this other than `ccl-execute-on-string'?  Using
it would imply string allocation (twice, if I read the code
correctly), which is not the case for coding systems, AFAICS.  It
would be nice to have a `ccl-execute-on-region'.

> Well, I can hardly run it manually _before_ the process decoding
> stuff.  And if I run it in the filter function, it has to deal with
> partial characters at the end of the string.  And the utf-8 decoding
> after it also has to deal with partial characters at the end of the
> string, which is normally done by the process filter.

The best I can think of without changing the C code is to write a CCL
program that returns the number of octets at the end that are
suspected to be incomplete control words.  Run that in a filter and
frob the process mark or whatever.  (I am largely ignorant of process
issues, so please bear with me if that happens to be nonsense.)  The
example below shows what I mean:

(defun example-ccl-test-hex-to-byte (reg)
  "Return CCL code to convert hex char to byte.
REG is the CCL register where the character is stored.  This only
deals with lowercase hex chars.  If the character is not a hex
digit, the generated code writes it out unchanged and sets REG to -1."
  `(if (,reg < ?0)
       ((write ,reg)
        (,reg = -1))
     (if (,reg <= ?9)
         (,reg -= 48)                   ; ?0 is 48, so ?0..?9 -> 0..9
       (if (,reg < ?a)
           ((write ,reg)
            (,reg = -1))
         (if (,reg > ?f)
             ((write ,reg)
              (,reg = -1))
           (,reg -= 87))))))            ; ?a is 97, so ?a..?f -> 10..15
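
For comparison, here is the same arithmetic in plain Lisp.  This is
only a sketch to show what the generated CCL code computes; the
function names are made up for illustration, and unlike the CCL
version it returns nil for a non-hex character instead of writing it
out.

```elisp
(defun example-hex-char-to-value (char)
  "Return the value 0..15 of lowercase hex digit CHAR, or nil."
  (cond ((<= ?0 char ?9) (- char ?0))          ; same as subtracting 48
        ((<= ?a char ?f) (- char (- ?a 10)))   ; same as subtracting 87
        (t nil)))

(defun example-hex-pair-to-byte (high low)
  "Combine hex digit chars HIGH and LOW into one byte, or return nil."
  (let ((h (example-hex-char-to-value high))
        (l (example-hex-char-to-value low)))
    ;; Mirrors (r1 *= 16) (r1 += r2) in the CCL program.
    (and h l (+ (* 16 h) l))))
```

So (example-hex-pair-to-byte ?c ?3) gives 195, i.e. the byte #xc3.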


(define-ccl-program example-ccl-program
  `(1
    (loop
     (r1 = -1)
     (r2 = -1)
     (r0 = 0)
     (read r1)
     (if (r1 != ?^)
         (write r1)
       ;; We update r0 accordingly whenever we read some character in.
       ((r0 += 1)
        (read r1)
        (if (r1 != ?^)
            ((write ?^)
             (write r1))
          ;; We have found the sequence ^^ so far. Let's for now just
          ;; /assume/ that the following two chars are a valid hex
          ;; number.
          ((r0 += 1)
           (read r1)
           (r0 += 1)
           (read r2)
           ,(example-ccl-test-hex-to-byte 'r1)
           ,(example-ccl-test-hex-to-byte 'r2)
           (r1 *= 16)
           (r1 += r2)
           (write r1)))))
     (repeat))
    (if (r1 != -1)
        ((write ?^)
         (write ?^)
         (write r1)))))

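(defun example-decode-region (from to)
  "Decode the region between FROM and TO with `example-ccl-program'.
Return the number of characters at the end that are suspected to
be part of a not yet complete control word."
  (let ((str (buffer-substring-no-properties from to))
        ;; Status vector: registers r0..r7 plus the instruction counter.
        (vect (make-vector 9 nil)))
    (delete-region from to)
    ;; Make sure the decoded text replaces the region, wherever point was.
    (goto-char from)
    (insert (ccl-execute-on-string 'example-ccl-program
                                   vect
                                   str))
    ;; r0 holds the count maintained by the CCL program.
    (aref vect 0)))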
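
The filter side could look roughly like the following.  This is an
untested sketch and rests on several assumptions: instead of reading
the count from the CCL register, it finds a possibly unfinished ^^xx
at the end of the chunk with a regexp, it keeps that tail in a process
property that I named `example-pending', and all the function names
are hypothetical.  The held-back tail is prepended to the next chunk,
so the decoder only ever sees complete sequences.

```elisp
(defun example-incomplete-tail-length (string)
  "Return how many chars at the end of STRING may begin an unfinished ^^xx.
Like the CCL program, this only recognizes lowercase hex digits."
  (let ((case-fold-search nil))
    (cond ((string-match "\\^\\^[0-9a-f]?\\'" string)
           (- (length string) (match-beginning 0)))
          ((string-match "\\^\\'" string) 1)
          (t 0))))

(defun example-split-complete (string)
  "Split STRING into (COMPLETE . TAIL); TAIL may be an unfinished ^^xx."
  (let ((cut (- (length string) (example-incomplete-tail-length string))))
    (cons (substring string 0 cut) (substring string cut))))

(defun example-process-filter (proc chunk)
  "Insert CHUNK from PROC, holding back an unfinished ^^xx at its end."
  (let* ((input (concat (or (process-get proc 'example-pending) "") chunk))
         (parts (example-split-complete input)))
    ;; Remember the tail; it is prepended to the next chunk.
    (process-put proc 'example-pending (cdr parts))
    (when (buffer-live-p (process-buffer proc))
      (with-current-buffer (process-buffer proc)
        (save-excursion
          (goto-char (process-mark proc))
          (let ((start (point)))
            (insert (car parts))
            ;; `example-decode-region' is the function defined above.
            (example-decode-region start (point)))
          (set-marker (process-mark proc) (point)))))))
```

It would be installed with (set-process-filter PROC
#'example-process-filter).  The same trick would still be needed for
partial utf-8 sequences after this step; the sketch does not address
that.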


    Oliver
-- 
Oliver Scholz               23 Vendémiaire an 213 de la Révolution
Ostendstr. 61               Liberté, Egalité, Fraternité!
60314 Frankfurt a. M.       



