guile-commits
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Guile-commits] GNU Guile branch, master, updated. release_1-9-14-96-ga2


From: Ludovic Courtès
Subject: [Guile-commits] GNU Guile branch, master, updated. release_1-9-14-96-ga2c3637
Date: Tue, 25 Jan 2011 23:30:05 +0000

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU Guile".

http://git.savannah.gnu.org/cgit/guile.git/commit/?id=a2c36371ce3de246bdb892afd50915ecf450df47

The branch, master has been updated
       via  a2c36371ce3de246bdb892afd50915ecf450df47 (commit)
       via  cc540d0bbd4595be32d760bfd6e9e5ec3cdbd3d3 (commit)
       via  2e59af2100f05c1461ca2913027cc499959b67f8 (commit)
       via  647dc1ac2370cd00ae0996ddc9c4659d96ee77a2 (commit)
       via  e9a35a965bb52d22a5093b168013a0a2adb24440 (commit)
      from  c32f0d6b87b8a7ec024b3d6d54c9b03e9dfd73a4 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit a2c36371ce3de246bdb892afd50915ecf450df47
Author: Ludovic Courtès <address@hidden>
Date:   Wed Jan 26 00:24:34 2011 +0100

    Rewrite `read-line' in terms of `scm_getc'.
    
    As a result `read-line' handles decoding and decoding errors the same
    way as `scm_getc'.  It's also simpler and free of `malloc' calls.
    
    * libguile/rdelim.c (scm_do_read_line): Remove.
      (scm_read_line): Rewrite as a loop that calls `scm_getc'.
    
    * test-suite/tests/rdelim.test: New file.
    * test-suite/Makefile.am (SCM_TESTS): Add `tests/rdelim.test'.

commit cc540d0bbd4595be32d760bfd6e9e5ec3cdbd3d3
Author: Ludovic Courtès <address@hidden>
Date:   Wed Jan 26 00:16:10 2011 +0100

    Have `scm_getc' honor the port's conversion strategy.
    
    * libguile/ports.c (get_codepoint): Reset `pt->input_cd' upon failure.
      If `pt->ilseq_handler' is `SCM_ICONVEH_QUESTION_MARK', then return a
      question mark.
      [failure]: Use `scm_encoding_error' when raising an error.
    
    * test-suite/lib.scm (exception:encoding-error): Adjust regexp.
    
    * test-suite/tests/ports.test ("string ports")["read-char, wrong
      encoding, error", "read-char, wrong encoding, escape", "read-char,
      wrong encoding, substitute"]: New tests.

commit 2e59af2100f05c1461ca2913027cc499959b67f8
Author: Ludovic Courtès <address@hidden>
Date:   Wed Jan 26 00:05:27 2011 +0100

    Test substitution and escaping on port output.
    
    * test-suite/tests/ports.test ("string ports")["wrong encoding,
      substitute", "wrong encoding, escape"]: New tests.

commit 647dc1ac2370cd00ae0996ddc9c4659d96ee77a2
Author: Ludovic Courtès <address@hidden>
Date:   Tue Jan 25 23:36:35 2011 +0100

    Add `scm_{to,from}_utf32_string'.
    
    * libguile/strings.c (scm_from_utf32_string, scm_from_utf32_stringn,
      scm_to_utf32_string, scm_to_utf32_stringn): New functions.
    
    * libguile/strings.h (scm_from_utf32_string, scm_from_utf32_stringn,
      scm_to_utf32_string, scm_to_utf32_stringn): New declarations.
    
    * doc/ref/api-data.texi (Conversion to/from C): Document
      `scm_{to,from}_{utf8,utf32}_stringn'.

commit e9a35a965bb52d22a5093b168013a0a2adb24440
Author: Ludovic Courtès <address@hidden>
Date:   Tue Jan 25 23:24:36 2011 +0100

    Optimize `scm_{to,from}_latin1_string'.
    
    * libguile/strings.c (scm_from_latin1_stringn): Directly return a narrow
      string instead of going through `scm_from_stringn'.
      (scm_to_latin1_stringn): Directly return a copy of STR's raw bytes when
      it's narrow.

-----------------------------------------------------------------------

Summary of changes:
 doc/ref/api-data.texi        |   19 ++++--
 libguile/ports.c             |   38 ++++++----
 libguile/rdelim.c            |  164 +++++++++++++-----------------------------
 libguile/strings.c           |   89 ++++++++++++++++++++++-
 libguile/strings.h           |    5 ++
 test-suite/Makefile.am       |    4 +-
 test-suite/lib.scm           |    5 +-
 test-suite/tests/ports.test  |   62 ++++++++++++++--
 test-suite/tests/rdelim.test |   76 +++++++++++++++++++
 9 files changed, 313 insertions(+), 149 deletions(-)
 create mode 100644 test-suite/tests/rdelim.test

diff --git a/doc/ref/api-data.texi b/doc/ref/api-data.texi
index 17b32bb..e9d40bf 100755
--- a/doc/ref/api-data.texi
+++ b/doc/ref/api-data.texi
@@ -4040,14 +4040,21 @@ conversion functions are provided to convert between 
Latin-1 C strings
 and Guile strings.
 
 @deftypefn {C Function} SCM scm_from_latin1_stringn (const char *str, size_t 
len)
-This function returns a scheme string from an ISO-8859-1-encoded C
-string @var{str} of length @var{len}.
address@hidden {C Function} SCM scm_from_utf8_stringn (const char *str, size_t 
len)
address@hidden {C Function} SCM scm_from_utf32_stringn (const scm_t_wchar *str, 
size_t len)
+Return a scheme string from C string @var{str}, which is ISO-8859-1-,
+UTF-8-, or UTF-32-encoded, of length @var{len}.  @var{len} is the number
+of bytes pointed to by @var{str} for @code{scm_from_latin1_stringn} and
address@hidden; it is the number of elements (code points)
+in @var{str} in the case of @code{scm_from_utf32_stringn}.
 @end deftypefn
 
address@hidden {C function} char * scm_to_latin1_stringn (SCM str, size_t *lenp)
-This function returns a newly allocated, ISO-8859-1-encoded C string
-from the scheme string @var{str}.  An error will be thrown if the scheme
-string cannot be converted to the ISO-8859-1 encoding.  If @var{lenp} is
address@hidden {C function} char *scm_to_latin1_stringn (SCM str, size_t *lenp)
address@hidden {C function} char *scm_to_utf8_stringn (SCM str, size_t *lenp)
address@hidden {C function} scm_t_wchar *scm_to_utf32_stringn (SCM str, size_t 
*lenp)
+Return a newly allocated, ISO-8859-1-, UTF-8-, or UTF-32-encoded C string
+from Scheme string @var{str}.  An error is thrown when @var{str}
+string cannot be converted to the specified encoding.  If @var{lenp} is
 @code{NULL}, the returned C string will be null terminated, and an error
 will be thrown if the C string would otherwise contain null
 characters. If @var{lenp} is not NULL, the length of the string is
diff --git a/libguile/ports.c b/libguile/ports.c
index 36f4b88..1cfcba0 100644
--- a/libguile/ports.c
+++ b/libguile/ports.c
@@ -1114,6 +1114,7 @@ utf8_to_codepoint (const scm_t_uint8 *utf8_buf, size_t 
size)
    failure.  */
 static scm_t_wchar
 get_codepoint (SCM port, char buf[SCM_MBCHAR_BUF_SIZE], size_t *len)
+#define FUNC_NAME "scm_getc"
 {
   int err, byte_read;
   size_t bytes_consumed, output_size;
@@ -1164,10 +1165,22 @@ get_codepoint (SCM port, char buf[SCM_MBCHAR_BUF_SIZE], 
size_t *len)
     }
 
   if (err != 0)
-    goto failure;
+    {
+      /* Reset the `iconv' state.  */
+      iconv (pt->input_cd, NULL, NULL, NULL, NULL);
+
+      if (pt->ilseq_handler == SCM_ICONVEH_QUESTION_MARK)
+       codepoint = '?';
+      else
+       /* Fail when the strategy is SCM_ICONVEH_ERROR or
+          SCM_ICONVEH_ESCAPE_SEQUENCE (the latter doesn't make sense
+          for input encoding errors.)  */
+       goto failure;
+    }
+  else
+    /* Convert the UTF8_BUF sequence to a Unicode code point.  */
+    codepoint = utf8_to_codepoint (utf8_buf, output_size);
 
-  /* Convert the UTF8_BUF sequence to a Unicode code point.  */
-  codepoint = utf8_to_codepoint (utf8_buf, output_size);
   update_port_lf (codepoint, port);
 
   *len = bytes_consumed;
@@ -1176,23 +1189,18 @@ get_codepoint (SCM port, char buf[SCM_MBCHAR_BUF_SIZE], 
size_t *len)
 
  failure:
   {
-    char *err_buf;
-    SCM err_str = scm_i_make_string (bytes_consumed, &err_buf);
-    memcpy (err_buf, buf, bytes_consumed);
-
-    if (err == EILSEQ)
-      scm_misc_error (NULL, "input encoding error for ~s: ~s",
-                     scm_list_2 (scm_from_locale_string 
(scm_i_get_port_encoding (port)),
-                                 err_str));
-    else
-      scm_misc_error (NULL, "input encoding error (invalid) for ~s: ~s\n", 
-                     scm_list_2 (scm_from_locale_string 
(scm_i_get_port_encoding (port)),
-                                 err_str));
+    SCM bv;
+
+    bv = scm_c_make_bytevector (bytes_consumed);
+    memcpy (SCM_BYTEVECTOR_CONTENTS (bv), buf, bytes_consumed);
+    scm_encoding_error (FUNC_NAME, err, "input decoding error",
+                       pt->encoding, "UTF-8", bv);
   }
 
   /* Never gets here.  */
   return 0;
 }
+#undef FUNC_NAME
 
 /* Read a codepoint from PORT and return it.  */
 scm_t_wchar
diff --git a/libguile/rdelim.c b/libguile/rdelim.c
index 1340c62..0fa0b8f 100644
--- a/libguile/rdelim.c
+++ b/libguile/rdelim.c
@@ -1,5 +1,6 @@
-/* Copyright (C) 1995,1996,1997,1998,1999,2000,2001, 2006 Free Software 
Foundation, Inc.
- * 
+/* Copyright (C) 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2006,
+ *   2011 Free Software Foundation, Inc.
+ *
  * This library is free software; you can redistribute it and/or
  * modify it under the terms of the GNU Lesser General Public License
  * as published by the Free Software Foundation; either version 3 of
@@ -100,88 +101,6 @@ SCM_DEFINE (scm_read_delimited_x, "%read-delimited!", 3, 
3, 0,
 }
 #undef FUNC_NAME
 
-static unsigned char *
-scm_do_read_line (SCM port, size_t *len_p)
-{
-  scm_t_port *pt = SCM_PTAB_ENTRY (port);
-  unsigned char *end;
-
-  /* I thought reading lines was simple.  Mercy me.  */
-
-  /* The common case: the buffer contains a complete line. 
-     This needs to be fast.  */
-  if ((end = memchr (pt->read_pos, '\n', (pt->read_end - pt->read_pos)))
-          != 0)
-    {
-      size_t buf_len = (end + 1) - pt->read_pos;
-      /* Allocate a buffer of the perfect size.  */
-      unsigned char *buf = scm_malloc (buf_len + 1);
-
-      memcpy (buf, pt->read_pos, buf_len);
-      pt->read_pos += buf_len;
-
-      buf[buf_len] = '\0';
-
-      *len_p = buf_len;
-      return buf;
-    }
-
-  /* The buffer contains no newlines.  */
-  {
-    /* When live, len is always the number of characters in the
-       current buffer that are part of the current line.  */
-    size_t len = (pt->read_end - pt->read_pos);
-    size_t buf_size = (len < 50) ? 60 : len * 2;
-    /* Invariant: buf always has buf_size + 1 characters allocated;
-       the `+ 1' is for the final '\0'.  */
-    unsigned char *buf = scm_malloc (buf_size + 1);
-    size_t buf_len = 0;
-
-    for (;;)
-      {
-       if (buf_len + len > buf_size)
-         {
-           size_t new_size = (buf_len + len) * 2;
-           buf = scm_realloc (buf, new_size + 1);
-           buf_size = new_size;
-         }
-
-       /* Copy what we've got out of the port, into our buffer.  */
-       memcpy (buf + buf_len, pt->read_pos, len);
-       buf_len += len;
-       pt->read_pos += len;
-
-       /* If we had seen a newline, we're done now.  */
-       if (end)
-         break;
-
-       /* Get more characters.  */
-       if (scm_fill_input (port) == EOF)
-         {
-           /* If we're missing a final newline in the file, return
-              what we did get, sans newline.  */
-           if (buf_len > 0)
-             break;
-
-           free (buf);
-           return 0;
-         }
-
-       /* Search the buffer for newlines.  */
-       if ((end = memchr (pt->read_pos, '\n',
-                          (len = (pt->read_end - pt->read_pos))))
-           != 0)
-         len = (end - pt->read_pos) + 1;
-      }
-
-    /* I wonder how expensive this realloc is.  */
-    buf = scm_realloc (buf, buf_len + 1);
-    buf[buf_len] = '\0';
-    *len_p = buf_len;
-    return buf;
-  }
-}
-
 
 /*
  * %read-line 
@@ -201,52 +120,67 @@ SCM_DEFINE (scm_read_line, "%read-line", 0, 1, 0,
            "@code{(#<eof> . #<eof>)}.")
 #define FUNC_NAME s_scm_read_line
 {
-  scm_t_port *pt;
-  char *s;
-  size_t slen = 0;
-  SCM line, term;
-  const char *enc;
-  scm_t_string_failed_conversion_handler hndl;
+/* Threshold under which the only allocation performed is that of the
+   resulting string and pair.  */
+#define LINE_BUFFER_SIZE 1024
+
+  SCM line, strings, result;
+  scm_t_wchar buf[LINE_BUFFER_SIZE], delim;
+  size_t index;
 
   if (SCM_UNBNDP (port))
     port = scm_current_input_port ();
-  SCM_VALIDATE_OPINPORT (1,port);
 
-  pt = SCM_PTAB_ENTRY (port);
-  enc = pt->encoding;
-  hndl = pt->ilseq_handler;
-  if (pt->rw_active == SCM_PORT_WRITE)
-    scm_ptobs[SCM_PTOBNUM (port)].flush (port);
+  SCM_VALIDATE_OPINPORT (1,port);
 
-  s = (char *) scm_do_read_line (port, &slen);
+  index = 0;
+  delim = 0;
+  strings = SCM_EOL;
 
-  if (s == NULL)
-    term = line = SCM_EOF_VAL;
-  else
+  do
     {
-      if (s[slen - 1] == '\n')
+      if (index >= sizeof (buf))
        {
-         term = SCM_MAKE_CHAR ('\n');
-         s[slen - 1] = '\0';
-
-         line = scm_from_stringn (s, slen - 1, enc, hndl);
-         free (s);
-         SCM_INCLINE (port);
+         /* The line is getting longer than BUF so store its current
+            contents in STRINGS.  */
+         strings = scm_cons (scm_from_utf32_stringn (buf, index),
+                             scm_is_false (strings) ? SCM_EOL : strings);
+         index = 0;
        }
       else
        {
-         /* Fix: we should check for eof on the port before assuming this. */
-         term = SCM_EOF_VAL;
-         line = scm_from_stringn (s, slen, enc, hndl);
-         free (s);
-         SCM_COL (port) += scm_i_string_length (line);
+         buf[index] = scm_getc (port);
+         switch (buf[index])
+           {
+           case EOF:
+           case '\n':
+             delim = buf[index];
+             break;
+
+           default:
+             index++;
+           }
        }
     }
+  while (delim == 0);
+
+  if (scm_is_false (strings))
+    line = scm_from_utf32_stringn (buf, index);
+  else
+    {
+      /* Aggregate the intermediary results.  */
+      strings = scm_cons (scm_from_utf32_stringn (buf, index), strings);
+      line = scm_string_concatenate (scm_reverse (strings));
+    }
 
-  if (pt->rw_random)
-    pt->rw_active = SCM_PORT_READ;
+  if (delim == EOF && scm_i_string_length (line) == 0)
+    result = scm_cons (SCM_EOF_VAL, SCM_EOF_VAL);
+  else
+    result = scm_cons (line,
+                      delim == EOF ? SCM_EOF_VAL : SCM_MAKE_CHAR (delim));
 
-  return scm_cons (line, term);
+  return result;
+#undef LINE_BUFFER_SIZE
 }
 #undef FUNC_NAME
 
diff --git a/libguile/strings.c b/libguile/strings.c
index 93dd6a7..41998a9 100644
--- a/libguile/strings.c
+++ b/libguile/strings.c
@@ -1545,7 +1545,17 @@ scm_from_latin1_string (const char *str)
 SCM
 scm_from_latin1_stringn (const char *str, size_t len)
 {
-  return scm_from_stringn (str, len, NULL, SCM_FAILED_CONVERSION_ERROR);
+  char *buf;
+  SCM result;
+
+  if (len == (size_t) -1)
+    len = strlen (str);
+
+  /* Make a narrow string and copy STR as is.  */
+  result = scm_i_make_string (len, &buf);
+  memcpy (buf, str, len);
+
+  return result;
 }
 
 SCM
@@ -1560,6 +1570,28 @@ scm_from_utf8_stringn (const char *str, size_t len)
   return scm_from_stringn (str, len, "UTF-8", SCM_FAILED_CONVERSION_ERROR);
 }
 
+SCM
+scm_from_utf32_string (const scm_t_wchar *str)
+{
+  return scm_from_utf32_stringn (str, -1);
+}
+
+SCM
+scm_from_utf32_stringn (const scm_t_wchar *str, size_t len)
+{
+  SCM result;
+  scm_t_wchar *buf;
+
+  if (len == (size_t) -1)
+    len = u32_strlen ((uint32_t *) str);
+
+  result = scm_i_make_wide_string (len, &buf);
+  memcpy (buf, str, len * sizeof (scm_t_wchar));
+  scm_i_try_narrow_string (result);
+
+  return result;
+}
+
 /* Create a new scheme string from the C string STR.  The memory of
    STR may be used directly as storage for the new string.  */
 /* FIXME: GC-wise, the only way to use the memory area pointed to by STR
@@ -1752,9 +1784,26 @@ scm_to_latin1_string (SCM str)
 
 char *
 scm_to_latin1_stringn (SCM str, size_t *lenp)
+#define FUNC_NAME "scm_to_latin1_stringn"
 {
-  return scm_to_stringn (str, lenp, NULL, SCM_FAILED_CONVERSION_ERROR);
+  char *result;
+
+  SCM_VALIDATE_STRING (1, str);
+
+  if (scm_i_is_narrow_string (str))
+    {
+      if (lenp)
+       *lenp = scm_i_string_length (str);
+
+      result = scm_strdup (scm_i_string_data (str));
+    }
+  else
+    result = scm_to_stringn (str, lenp, NULL,
+                            SCM_FAILED_CONVERSION_ERROR);
+
+  return result;
 }
+#undef FUNC_NAME
 
 char *
 scm_to_utf8_string (SCM str)
@@ -1768,6 +1817,42 @@ scm_to_utf8_stringn (SCM str, size_t *lenp)
   return scm_to_stringn (str, lenp, "UTF-8", SCM_FAILED_CONVERSION_ERROR);
 }
 
+scm_t_wchar *
+scm_to_utf32_string (SCM str)
+{
+  return scm_to_utf32_stringn (str, NULL);
+}
+
+scm_t_wchar *
+scm_to_utf32_stringn (SCM str, size_t *lenp)
+#define FUNC_NAME "scm_to_utf32_stringn"
+{
+  scm_t_wchar *result;
+
+  SCM_VALIDATE_STRING (1, str);
+
+  if (scm_i_is_narrow_string (str))
+    result = (scm_t_wchar *)
+      scm_to_stringn (str, lenp, "UTF-32",
+                     SCM_FAILED_CONVERSION_ERROR);
+  else
+    {
+      size_t len;
+
+      len = scm_i_string_length (str);
+      if (lenp)
+       *lenp = len;
+
+      result = scm_malloc ((len + 1) * sizeof (scm_t_wchar));
+      memcpy (result, scm_i_string_wide_chars (str),
+             len * sizeof (scm_t_wchar));
+      result[len] = 0;
+    }
+
+  return result;
+}
+#undef FUNC_NAME
+
 /* Return a malloc(3)-allocated buffer containing the contents of STR encoded
    according to ENCODING.  If LENP is non-NULL, set it to the size in bytes of
    the returned buffer.  If the conversion to ENCODING fails, apply the 
strategy
diff --git a/libguile/strings.h b/libguile/strings.h
index 168fcb7..b9e901b 100644
--- a/libguile/strings.h
+++ b/libguile/strings.h
@@ -148,6 +148,11 @@ SCM_API char *scm_to_utf8_stringn (SCM str, size_t *lenp);
 SCM_API SCM scm_from_utf8_string (const char *str);
 SCM_API SCM scm_from_utf8_stringn (const char *str, size_t len);
 
+SCM_API scm_t_wchar *scm_to_utf32_string (SCM str);
+SCM_API scm_t_wchar *scm_to_utf32_stringn (SCM str, size_t *lenp);
+SCM_API SCM scm_from_utf32_string (const scm_t_wchar *str);
+SCM_API SCM scm_from_utf32_stringn (const scm_t_wchar *str, size_t len);
+
 SCM_API char *scm_to_stringn (SCM str, size_t *lenp, const char *encoding,
                               scm_t_string_failed_conversion_handler handler);
 SCM_API size_t scm_to_locale_stringbuf (SCM str, char *buf, size_t max_len);
diff --git a/test-suite/Makefile.am b/test-suite/Makefile.am
index b1f184e..4d4b250 100644
--- a/test-suite/Makefile.am
+++ b/test-suite/Makefile.am
@@ -1,6 +1,7 @@
 ## Process this file with automake to produce Makefile.in.
 ##
-## Copyright 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010 
Software Foundation, Inc.
+## Copyright 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009,
+##   2010, 2011 Software Foundation, Inc.
 ##
 ## This file is part of GUILE.
 ##
@@ -100,6 +101,7 @@ SCM_TESTS = tests/00-initial-env.test               \
            tests/r6rs-unicode.test             \
            tests/rnrs-libraries.test           \
            tests/ramap.test                    \
+           tests/rdelim.test                   \
            tests/reader.test                   \
            tests/receive.test                  \
            tests/regexp.test                   \
diff --git a/test-suite/lib.scm b/test-suite/lib.scm
index 8ebcb01..f3cbfd7 100644
--- a/test-suite/lib.scm
+++ b/test-suite/lib.scm
@@ -1,5 +1,6 @@
 ;;;; test-suite/lib.scm --- generic support for testing
-;;;; Copyright (C) 1999, 2000, 2001, 2004, 2006, 2007, 2009, 2010 Free 
Software Foundation, Inc.
+;;;; Copyright (C) 1999, 2000, 2001, 2004, 2006, 2007, 2009, 2010,
+;;;;   2011 Free Software Foundation, Inc.
 ;;;;
 ;;;; This program is free software; you can redistribute it and/or
 ;;;; modify it under the terms of the GNU Lesser General Public
@@ -279,7 +280,7 @@
 (define exception:system-error
   (cons 'system-error ".*"))
 (define exception:encoding-error
-  (cons 'encoding-error "(cannot convert to output locale|input locale 
conversion error)"))
+  (cons 'encoding-error "(cannot convert to output locale|input (locale 
conversion|decoding) error)"))
 (define exception:miscellaneous-error
   (cons 'misc-error "^.*"))
 (define exception:read-error
diff --git a/test-suite/tests/ports.test b/test-suite/tests/ports.test
index 4edd531..8d3f672 100644
--- a/test-suite/tests/ports.test
+++ b/test-suite/tests/ports.test
@@ -1,8 +1,9 @@
 ;;;; ports.test --- Guile I/O ports.    -*- coding: utf-8; mode: scheme; -*-
 ;;;; Jim Blandy <address@hidden> --- May 1999
 ;;;;
-;;;;   Copyright (C) 1999, 2001, 2004, 2006, 2007, 2009, 2010 Free Software 
Foundation, Inc.
-;;;; 
+;;;;   Copyright (C) 1999, 2001, 2004, 2006, 2007, 2009, 2010,
+;;;;      2011 Free Software Foundation, Inc.
+;;;;
 ;;;; This library is free software; you can redistribute it and/or
 ;;;; modify it under the terms of the GNU Lesser General Public
 ;;;; License as published by the Free Software Foundation; either
@@ -18,11 +19,12 @@
 ;;;; Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 
USA
 
 (define-module (test-suite test-ports)
-  :use-module (test-suite lib)
-  :use-module (test-suite guile-test)
-  :use-module (ice-9 popen)
-  :use-module (ice-9 rdelim)
-  :use-module (rnrs bytevectors))
+  #:use-module (test-suite lib)
+  #:use-module (test-suite guile-test)
+  #:use-module (ice-9 popen)
+  #:use-module (ice-9 rdelim)
+  #:use-module (rnrs bytevectors)
+  #:use-module ((rnrs io ports) #:select (open-bytevector-input-port)))
 
 (define (display-line . args)
   (for-each display args)
@@ -424,6 +426,26 @@
                (string=? to "ISO-8859-1")
                (string? (strerror errno)))))))
 
+  (pass-if "wrong encoding, substitute"
+    (let ((str "ĉu bone?"))
+      (with-fluids ((%default-port-encoding "ISO-8859-1"))
+        (string=? (with-output-to-string
+                    (lambda ()
+                      (set-port-conversion-strategy! (current-output-port)
+                                                     'substitute)
+                      (display str)))
+                  "?u bone?"))))
+
+  (pass-if "wrong encoding, escape"
+    (let ((str "ĉu bone?"))
+      (with-fluids ((%default-port-encoding "ISO-8859-1"))
+        (string=? (with-output-to-string
+                    (lambda ()
+                      (set-port-conversion-strategy! (current-output-port)
+                                                     'escape)
+                      (display str)))
+                  "\\u0109u bone?"))))
+
   (pass-if "peek-char [latin-1]"
     (let ((p (with-fluids ((%default-port-encoding #f))
                (open-input-string "hello, world"))))
@@ -440,7 +462,31 @@
            (char=? (peek-char p) #\안)
            (char=? (peek-char p) #\안)
            (= (port-line p) 0)
-           (= (port-column p) 0)))))
+           (= (port-column p) 0))))
+
+  (pass-if-exception "read-char, wrong encoding, error"
+    exception:encoding-error
+    (let ((p (with-fluids ((%default-port-encoding "UTF-8"))
+               (open-bytevector-input-port #vu8(255 1 2 3)))))
+      (set-port-conversion-strategy! p 'error)
+      (read-char p)
+      #t))
+
+  (pass-if-exception "read-char, wrong encoding, escape"
+    exception:encoding-error
+    ;; `escape' should behave like `error'.
+    (let ((p (with-fluids ((%default-port-encoding "UTF-8"))
+               (open-bytevector-input-port #vu8(255 1 2 3)))))
+      (set-port-conversion-strategy! p 'escape)
+      (read-char p)
+      #t))
+
+  (pass-if "read-char, wrong encoding, substitute"
+    (let ((p (with-fluids ((%default-port-encoding "UTF-8"))
+               (open-bytevector-input-port #vu8(255 206 187 206 188)))))
+      (set-port-conversion-strategy! p 'substitute)
+      (equal? (list (read-char p) (read-char p) (read-char p))
+              '(#\? #\λ #\μ)))))
 
 (with-test-prefix "call-with-output-string"
 
diff --git a/test-suite/tests/rdelim.test b/test-suite/tests/rdelim.test
new file mode 100644
index 0000000..ec7e5b4
--- /dev/null
+++ b/test-suite/tests/rdelim.test
@@ -0,0 +1,76 @@
+;;;; rdelim.test --- Delimited I/O.      -*- mode: scheme; coding: utf-8; -*-
+;;;; Ludovic Courtès <address@hidden>
+;;;;
+;;;;   Copyright (C) 2011 Free Software Foundation, Inc.
+;;;;
+;;;; This library is free software; you can redistribute it and/or
+;;;; modify it under the terms of the GNU Lesser General Public
+;;;; License as published by the Free Software Foundation; either
+;;;; version 3 of the License, or (at your option) any later version.
+;;;;
+;;;; This library is distributed in the hope that it will be useful,
+;;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+;;;; Lesser General Public License for more details.
+;;;;
+;;;; You should have received a copy of the GNU Lesser General Public
+;;;; License along with this library; if not, write to the Free Software
+;;;; Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 
USA
+
+(define-module (test-suite test-rdelim)
+  #:use-module (ice-9 rdelim)
+  #:use-module (test-suite lib))
+
+(with-fluids ((%default-port-encoding "UTF-8"))
+
+  (with-test-prefix "read-line"
+
+    (pass-if "one line"
+      (let* ((s "hello, world")
+             (p (open-input-string s)))
+        (and (string=? s (read-line p))
+             (eof-object? (read-line p)))))
+
+    (pass-if "two lines, trim"
+      (let* ((s "foo\nbar\n")
+             (p (open-input-string s)))
+        (and (equal? (string-tokenize s)
+                     (list (read-line p) (read-line p)))
+             (eof-object? (read-line p)))))
+
+    (pass-if "two lines, concat"
+      (let* ((s "foo\nbar\n")
+             (p (open-input-string s)))
+        (and (equal? '("foo\n" "bar\n")
+                     (list (read-line p 'concat)
+                           (read-line p 'concat)))
+             (eof-object? (read-line p)))))
+
+    (pass-if "two lines, peek"
+      (let* ((s "foo\nbar\n")
+             (p (open-input-string s)))
+        (and (equal? '("foo" #\newline "bar" #\newline)
+                     (list (read-line p 'peek) (read-char p)
+                           (read-line p 'peek) (read-char p)))
+             (eof-object? (read-line p)))))
+
+    (pass-if "two lines, split"
+      (let* ((s "foo\nbar\n")
+             (p (open-input-string s)))
+        (and (equal? '(("foo" . #\newline)
+                       ("bar" . #\newline))
+                     (list (read-line p 'split)
+                           (read-line p 'split)))
+             (eof-object? (read-line p)))))
+
+    (pass-if  "two Greek lines, trim"
+      (let* ((s "λαμβδα\nμυ\n")
+             (p (open-input-string s)))
+        (and (equal? (string-tokenize s)
+                     (list (read-line p) (read-line p)))
+             (eof-object? (read-line p)))))))
+
+;;; Local Variables:
+;;; eval: (put 'with-test-prefix 'scheme-indent-function 1)
+;;; eval: (put 'pass-if 'scheme-indent-function 1)
+;;; End:


hooks/post-receive
-- 
GNU Guile



reply via email to

[Prev in Thread] Current Thread [Next in Thread]