--- Begin Message ---
Subject: GUILE 2.0.11: open-bytevector-input-port fails to open in binary mode
Date: Wed, 25 Mar 2015 15:31:32 +0100
Run the following code in a UTF-8 capable locale:
(setlocale LC_ALL "")
(use-modules (rnrs io ports) (rnrs bytevectors) (ice-9 format))
(let ((p (open-bytevector-input-port
          (u8-list->bytevector '(#xc3 #x9f #xc3 #x9f)))))
  (format #t "~a ~a\n" (port-encoding p) (binary-port? p))
  (format #t "#x~x\n" (char->integer (read-char p)))
  (format #t "~a ~a\n" (port-encoding p) (binary-port? p))
  (set-port-encoding! p "ISO-8859-1")
  (format #t "~a ~a\n" (port-encoding p) (binary-port? p))
  (format #t "#x~x\n" (char->integer (read-char p)))
  (format #t "~a ~a\n" (port-encoding p) (binary-port? p)))
This results in the output
#f #t
#xdf
#f #t
ISO-8859-1 #f
#xc3
ISO-8859-1 #f
The manual, however, states:
-- Scheme Procedure: port-encoding port
-- C Function: scm_port_encoding (port)
Returns, as a string, the character encoding that PORT uses to
interpret its input and output. The value ‘#f’ is equivalent to
‘"ISO-8859-1"’.
That would appear to be false since the value #f here is treated as
equivalent to "UTF-8" rather than "ISO-8859-1".
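The difference between the two readings can be checked on the bytes themselves, without involving a port. A minimal sketch, assuming only `utf8->string` and `bytevector->u8-list` from (rnrs bytevectors); the names `bytes`, `utf8-code-points` and `latin1-code-points` are illustrative:

```scheme
(use-modules (rnrs bytevectors))

;; The first two bytes of the test input.
(define bytes (u8-list->bytevector '(#xc3 #x9f)))

;; Decoded as UTF-8, the two bytes form the single character U+00DF.
(define utf8-code-points
  (map char->integer (string->list (utf8->string bytes))))

;; Under ISO-8859-1 every byte is its own code point, so the same
;; bytes read as U+00C3 followed by U+009F.
(define latin1-code-points (bytevector->u8-list bytes))

(format #t "UTF-8: ~s  ISO-8859-1: ~s\n"
        utf8-code-points latin1-code-points)
```

The UTF-8 reading yields the single code point #xdf, which is what the first read-char above returned while port-encoding still reported #f.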
In addition, the manual states
-- Scheme Procedure: binary-port? port
Return ‘#t’ if PORT is a "binary port", suitable for binary data
input/output.
Note that internally Guile does not differentiate between binary
and textual ports, unlike the R6RS. Thus, this procedure returns
true when PORT does not have an associated encoding—i.e., when
‘(port-encoding PORT)’ is ‘#f’ (*note port-encoding: Ports.). This
is the case for ports returned by R6RS procedures such as
‘open-bytevector-input-port’ and ‘make-custom-binary-output-port’.
However, Guile currently does not prevent use of textual I/O
procedures such as ‘display’ or ‘read-char’ with binary ports.
Doing so “upgrades” the port from binary to textual, under the
ISO-8859-1 encoding. Likewise, Guile does not prevent use of
‘set-port-encoding!’ on a binary port, which also turns it into a
“textual” port.
But it would appear that the only way to actually get binary-encoded
read-char behavior is to switch the port to textual. While the port is
in "binary" mode, it will decode as UTF-8 rather than deliver binary
data. Also it will not automagically switch itself away from the
nominal #f encoding which is not actually present.
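When raw bytes are what is wanted, the binary input procedures of (rnrs io ports) sidestep character decoding altogether. A sketch, assuming `get-u8` reads one octet regardless of the port encoding; `port->byte-list` is a hypothetical helper written for this example, not part of Guile:

```scheme
(use-modules (rnrs io ports) (rnrs bytevectors))

;; Hypothetical helper: collect every remaining byte of PORT in a list.
(define (port->byte-list port)
  (let loop ((acc '()))
    (let ((b (get-u8 port)))            ; one octet, no decoding step
      (if (eof-object? b)
          (reverse acc)
          (loop (cons b acc))))))

(define raw-bytes
  (port->byte-list
   (open-bytevector-input-port
    (u8-list->bytevector '(#xc3 #x9f #xc3 #x9f)))))

(format #t "~s\n" raw-bytes)
```

This does not address the read-char behavior reported here; it only shows a way to get at the bytes without depending on it.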
Putting (with-fluids ((%default-port-encoding #f)) ...) around the
open-bytevector-input-port call results in the output
#f #t
#xc3
ISO-8859-1 #f
ISO-8859-1 #f
#x9f
ISO-8859-1 #f
which actually corresponds to the documentation.
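Spelled out, the wrapped call might look like the following sketch. Only the open-bytevector-input-port call sits inside with-fluids; `open-binary-bytevector-port` is a hypothetical wrapper name introduced here, and the (setlocale LC_ALL "") call from the original script is left out for brevity:

```scheme
(use-modules (rnrs io ports) (rnrs bytevectors))

;; Hypothetical wrapper: open a bytevector input port over OCTETS while
;; the default port encoding fluid is bound to #f, so the new port does
;; not inherit a locale-derived default such as UTF-8.
(define (open-binary-bytevector-port octets)
  (with-fluids ((%default-port-encoding #f))
    (open-bytevector-input-port (u8-list->bytevector octets))))

(define first-code-point
  (char->integer
   (read-char (open-binary-bytevector-port '(#xc3 #x9f #xc3 #x9f)))))

;; Print the code point in hexadecimal, as the original script does.
(format #t "#x~a\n" (number->string first-code-point 16))
```

With the fluid bound to #f, the first read-char returns the first byte as a character, in line with the #xc3 shown in the output above.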
--
David Kastrup
--- End Message ---
--- Begin Message ---
Subject: Re: bug#20200: GUILE 2.0.11: open-bytevector-input-port fails to open in binary mode
Date: Sat, 28 Mar 2015 16:13:56 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)
Fixed in d574d96f879c147c6c14df43f2e4ff9e8a6876b9, which will be in
Guile 2.0.12. I'm closing this bug now.
Thanks,
Mark
--- End Message ---