[Chicken-users] code snippet (suggested for the wiki) - and several newc


From: F. Wittenberger
Subject: [Chicken-users] code snippet (suggested for the wiki) - and several newcomer questions
Date: Sun, 17 Aug 2008 01:11:04 +0200

Hi all,

in this message
http://lists.gnu.org/archive/html/chicken-users/2008-08/msg00066.html
I posted a snippet, which someone suggested as an explanatory example
for the wiki.

Looking closer, I'd rather have it torn apart in discussion before it
goes there, since it touches a lot of issues chicken beginners might
want to learn about.

make-internal-pipe returns two values, an input- and an output-port,
connected to each other.  (So it's only useful in multithreaded
contexts, which is why it's interesting, though simple.)

The first version - in the above referenced message - uses only the
documented interface.  Here I'll discuss a more elaborate version, which
uses the currently undocumented interface for read-string and read-line
too.

First lesson.  Pros and cons of user level threading.
-------------

We shall soon need to read a single character from a given input string
buffer and update the buffer index atomically.  Within a typical POSIX
thread system, we would need to lock some mutex, read the character,
update the offset and unlock the mutex.
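For comparison, the lock-based variant just described could be sketched
as follows.  This is only an illustration: SRFI-18 mutexes stand in for
the POSIX calls, and the names demo-buf, demo-off, demo-mutex and
locked-string-ref++ are made up for this example.

```scheme
(use srfi-18)   ; newer chicken releases spell this (import srfi-18)

;; Hypothetical buffer state, for illustration only.
(define demo-buf "abc")
(define demo-off 0)
(define demo-mutex (make-mutex 'demo-buf))

;; Read one character and advance the offset inside a critical section.
(define (locked-string-ref++)
  (mutex-lock! demo-mutex)
  (let ((c (string-ref demo-buf demo-off)))
    (set! demo-off (+ demo-off 1))
    (mutex-unlock! demo-mutex)
    c))
```

Every call pays for one lock and one unlock around two trivial
operations, which is the overhead the rest of this lesson tries to
avoid.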

Now let me tell you a secret behind the success of a lot of games and
databases, and my own experience from several years as a professional
computer scientist: the worst thing you can do with operating systems is
actually use them.  No matter how careful you are, switching to kernel
mode and back is expensive in comparison to register arithmetic.

User level thread systems, such as the one chicken provides, schedule in
a way which could be understood as cooperative under the hood.  In
chicken, a C expression is never interrupted.

[[[ Is this actually true or just my understanding? ]]]

To get the character out of the string buffer, we need a C snippet no
longer than the one we would put inside the critical section.  No need
to use locks at all!

(define string-ref++
  (foreign-lambda*
   char
   ((scheme-pointer buf) ((c-pointer integer) i))
   "char *p=(char *)buf; return(p[(*i)++]);"))

Now this could have been the first trap already.

Note that the "buf" parameter is declared as a scheme-pointer.  A
beginner could easily have chosen the c-string type instead.  After
all, it's a string which is going to be passed in.  If we declared the
"buf" parameter as c-string, no visible harm would be done.  But behind
the scenes chicken would copy the whole input string into a fresh
location, then run the C fragment and leave the copy for the garbage
collector.  We would have been better off using locks.

Now the "usual" header:

(define make-internal-pipe
  (let ([make-input-port make-input-port]
        [make-output-port make-output-port]
        [make-mutex make-mutex]
        [make-queue make-queue]
        [make-condition-variable make-condition-variable]
        [string-length string-length]
        [string-ref string-ref]
        [string-append string-append])
    (lambda args
      (define name (or (and (pair? args) (car args))
                       'internal-pipe))

Beginners have asked: why all these [x x]-bindings?  We keep a local
reference in case the global one gets redefined.

This is a questionable practice.  While it's just what the doctor
ordered if you want to provide a definition which is guaranteed to be
immune against overwrites, it's difficult to use.  If you were - for
example - to (use 'utf-8) *before* this unit is initialised, the
captured string-ref would already be utf-8 aware and no longer what we
need here.

[[[ Again this is what I understand.  Is it true?  ]]]

Also, if you want, say for debugging, to rely on Scheme's ability to
overwrite existing bindings, you are lost.
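The effect of such [x x]-bindings can be seen in isolation with a small
sketch.  The procedure names greet, call-captured and call-global are
hypothetical and have nothing to do with the pipe itself.

```scheme
;; Hypothetical names, for illustration only.
(define (greet) "hello")

;; The let takes a snapshot of whatever greet is bound to right now.
(define call-captured
  (let ((greet greet))
    (lambda () (greet))))

;; This one looks up the global binding on every call.
(define (call-global) (greet))

;; Somebody redefines the global later on.
(set! greet (lambda () "redefined"))

(define captured-result (call-captured)) ; still runs the original greet
(define global-result (call-global))     ; runs the redefinition
```

Which of the two behaviours you get depends only on what the global was
bound to at the moment the let was evaluated, which is why the load
order matters.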

      (let-location
       ((off integer 0))

Here we declare "off", the offset into the string buffer, as an integer
location, so that it can be passed in the form "(location off)" as the
second parameter to string-ref++.  Otherwise it's nothing but a new
variable.

Now the structure and some predicates for convenience.

       (let ((mutex (make-mutex name))
             (condition (make-condition-variable name))
             (queue (make-queue))
             (buf #f))
         (define (eof?) (eq? #!eof buf))
         (define (buf-empty?) (or (not buf) (fx>= off (string-length buf))))

read-input! will either update "buf" and "off" or wait for input.  We
use plain SRFI-18 here.  A better version would use the mailbox egg and
a make-mailbox instead of make-queue.  This would save both the
"condition" and the "mutex".  (read-input!) would become a simple
mailbox-receive! on the queue.

[[[ correct? ]]]

         (define (read-input!)
           (mutex-lock! mutex)
           (if (buf-empty?)
               (if (queue-empty? queue)
                   (begin
                     (mutex-unlock! mutex condition)
                     (read-input!))
                   (begin
                     (set! buf #f)
                     (set! buf (queue-remove! queue))
                     (set! off 0)
                     (mutex-unlock! mutex)))
               (mutex-unlock! mutex)))
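The two-argument form "(mutex-unlock! mutex condition)" used above
releases the mutex and blocks on the condition variable until it is
signalled.  A stripped-down sketch of the same wait-and-retry pattern,
using a plain list instead of a queue and the hypothetical names m, cv,
items, put! and take!, might look like this:

```scheme
(use srfi-18)   ; newer chicken releases spell this (import srfi-18)

;; Hypothetical shared state, for illustration only.
(define m (make-mutex 'demo))
(define cv (make-condition-variable 'demo))
(define items '())

;; Producer: append an item and wake up one waiting consumer.
(define (put! x)
  (mutex-lock! m)
  (set! items (append items (list x)))
  (condition-variable-signal! cv)
  (mutex-unlock! m))

;; Consumer: wait until an item is available, then remove it.
(define (take!)
  (mutex-lock! m)
  (if (null? items)
      (begin
        (mutex-unlock! m cv)  ; release the mutex and wait for a signal
        (take!))              ; re-check, somebody else may have been faster
      (let ((x (car items)))
        (set! items (cdr items))
        (mutex-unlock! m)
        x)))

(define consumer (make-thread (lambda () (take!))))
(thread-start! consumer)
(put! 'hello)
(define result (thread-join! consumer)) ; the value returned by take!
```

The retry after waking up matters: being signalled only means the
state may have changed, not that an item is still there.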

(read!) returns the current character and advances the offset.

         (define (read!)
           (if (eof?) buf
               (if (buf-empty?)
                   (begin (read-input!) (read!))
                   (string-ref++ buf (location off)))))

GOTCHA!  We have our first race condition!

Worse: this may result in a segfault.  You have been warned:
if (buf-empty?) returns #f, there is no way to know that buf has not
already become #!eof by the time the thread comes to execute our
string-ref++.

This will arise *only* if more than one thread accesses the port.
Certainly few people will have use for concurrent reads from the same
port.  But some Schemes provide guarantees wrt. atomicity of port
operations, and in another thread I recall having read a suggestion to
use ports for inter-thread communication.  So our Scheme system had
better be safe against such cases.  At least it should raise useful
exceptions, not run into system level race conditions.  And it should
deal with ports being closed by other threads (supervisors), since this
is a common idiom to terminate unwanted network connections.

Now it's time to learn about a second way to avoid the race.  (The first
would be to check in string-ref++ that a string is passed, otherwise
return #f, and check the result.  But this would soon lose its point,
since we shall have more uses where we want to update both "buf" and
"off" atomically.)

If the whole unit is compiled with the declaration
"(disable-interrupts)", no thread switch will interrupt us and
everything is fine.  But be careful with that declaration.  Never use it
for potentially long running procedures, since it will result in unfair
scheduling.

         (define (ready?)
           (and (not (eof?))
                (or (not (buf-empty?))
                    (not (queue-empty? queue)))))

No questions about that one.

Let's prepare procedures for the undocumented interface now.

read-string reads up to "n" characters from port "p" (the one we are
about to define) into the string "dest" starting at position "start".

         (define (read-string p n dest start)
           (let loop ((n n) (m 0) (start start))
             (cond ((eq? n 0) m)
                   ((eof?) m)
                   ((buf-empty?) (read-input!) (loop n m start))
                   (else
                    (let* ((rest (fx- (string-length buf) off))
                           (n2 (if (fx< n rest) n rest)))
                      (##core#inline "C_substring_copy" buf dest off (fx+ off n2) start)

So far I've been unable to make head or tail of those
##namespace#symbols.  Some appear to be always available, others only
sometimes.  Please, chicken hackers, enlighten me.

Basically, this inserts a plain "C" call to copy the substring.

                      (set! off (fx+ off n2))
                      (loop (fx- n n2) (fx+ m n2) (fx+ start n2)) ) ) ) ))

         (define (read-line p limit)
           (let loop ((str #f))
             (cond ((eof?) (or str ""))
                   ((buf-empty?) (read-input!) (loop str))
                   (else
                    (##sys#scan-buffer-line
                     buf 
                     (string-length buf)
                     off
                     (lambda (pos2 next)
                       (let ((dest (##sys#make-string (fx- pos2 off))))
                         (##core#inline "C_substring_copy" buf dest off pos2 0)
                         (set! off next)
                         (cond ((eq? pos2 next) ; no line-terminator encountered
                                (read-input!)
                                (loop (if str (##sys#string-append str dest) dest)) )
                               (else
                                (##sys#setislot p 4 (fx+ (##sys#slot p 4) 1))
                                (if str (##sys#string-append str dest) dest)) ) ) ) ) ) ) ) )

This one needs documentation even more.  I copied it more or less
verbatim from the "tcp.scm" of chicken.

         (define (close) (set! buf #!eof))
         (define (write! s)
           (if (or (and (string? s) (fx> (string-length s) 0))
                   (eof-object? s))
               (begin
                 (mutex-lock! mutex)
                 (queue-add! queue s)
                 (condition-variable-signal! condition)
                 (mutex-unlock! mutex) )))

Here's one more caveat to know about.  For most code this will never be
an issue: the string which is written to the port is NOT copied here,
since we want to avoid such copying.  However, if the writer later
modifies the string, the receiver will see the modified data, not the
data as it was written.  So be sure what you do, or go the costly route
and insert a string-copy here.
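The difference between the two routes can be shown with a small sketch
outside the pipe.  The names pending, write-shared! and write-copied!
are hypothetical; a plain list stands in for the queue.

```scheme
;; Hypothetical stand-in for the pipe's queue, for illustration only.
(define pending '())

;; Enqueue the very same string object the writer holds.
(define (write-shared! s)
  (set! pending (append pending (list s))))

;; Enqueue a private copy, at the cost of copying the data.
(define (write-copied! s)
  (set! pending (append pending (list (string-copy s)))))

(define msg (string-copy "abc"))  ; a fresh, mutable string
(write-shared! msg)
(write-copied! msg)
(string-set! msg 0 #\X)           ; the writer changes its string afterwards

(define shared-entry (car pending))   ; same object as msg, sees the change
(define copied-entry (cadr pending))  ; independent copy, still the old data
```

Only the copied entry still holds the data as it was at the time of the
write.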

         (values
          (make-input-port read! ready? close #f read-string read-line)
          (make-output-port write! (lambda () (write! #!eof)))))))))

Well, we're done.

/Jörg

Attachment: make-internal-pipe.scm
Description: Text Data

