octave-bug-tracker

From: Philip Nienhuis
Subject: [Octave-bug-tracker] [bug #46855] [OF] io package OCT interface: XML/Unicode issues
Date: Fri, 08 Jan 2016 22:42:49 +0000
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:41.0) Gecko/20100101 Firefox/41.0 SeaMonkey/2.38

URL:
  <http://savannah.gnu.org/bugs/?46855>

                 Summary: [OF] io package OCT interface: XML/Unicode issues
                 Project: GNU Octave
            Submitted by: philipnienhuis
            Submitted on: Fri 08 Jan 2016 11:42:48 PM CET
                Category: Octave Forge Package
                Severity: 3 - Normal
                Priority: 5 - Normal
              Item Group: Incorrect Result
                  Status: None
             Assigned to: None
         Originator Name: Philip Nienhuis
        Originator Email: 
             Open/Closed: Open
         Discussion Lock: Any
                 Release: other
        Operating System: Any

    _______________________________________________________

Details:

Writing text strings containing non-alphanumeric characters to xlsx/ods files
using the OCT interface seems to go fine, but neither Excel nor LibreOffice can
read the produced files without errors.
LibreOffice flat-out refuses to read the file; Excel deletes the entire string
container (xl/sharedStrings.xml) from the zipped xlsx file so no strings
appear in the opened file.

I've just applied a first fix, pertaining to an (in hindsight obvious)
oversight when developing the XML string I/O code:

> < & ' "

(which are reserved characters in XML and must not appear unescaped in text or
attribute values) are now translated back and forth into XML escape sequences
(fix to be in future io-2.4.1+).
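
For illustration only, the back-and-forth translation described above could look roughly like the sketch below. It is written in Python rather than Octave, the names xml_escape and xml_unescape are hypothetical, and it is not the code that went into the io package. It only shows the five predefined XML entities and why the order of replacement matters.

```python
# Minimal sketch (hypothetical helpers, not the io package's actual code).
# Pairs of (raw character, predefined XML entity). '&' comes first on purpose.
_XML_ESCAPES = [
    ("&", "&amp;"),
    ("<", "&lt;"),
    (">", "&gt;"),
    ("'", "&apos;"),
    ('"', "&quot;"),
]


def xml_escape(text):
    """Replace the five reserved characters by their entities before writing."""
    # '&' is handled first; otherwise the '&' of entities produced by the
    # other replacements would be escaped a second time.
    for raw, entity in _XML_ESCAPES:
        text = text.replace(raw, entity)
    return text


def xml_unescape(text):
    """Translate the entities back to plain characters after reading."""
    # Reverse order, so '&amp;' is decoded last and '&amp;lt;' ends up as the
    # literal text '&lt;' rather than '<'.
    for raw, entity in reversed(_XML_ESCAPES):
        text = text.replace(entity, raw)
    return text
```

A round trip such as xml_unescape(xml_escape(s)) should give back s unchanged for any string s.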
But most other non-alphanumeric characters still give problems. When
non-alphanumeric chars are input directly in Excel or LibreOffice themselves
and saved to file from those programs, reading the files with the OCT
interface returns double-byte sequences.
A workaround could be filtering out any non-alphanumeric characters before
writing and after reading respectively, but that'll require costly regexprep
calls :-(  and when writing it may screw up files produced by other programs.
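
One possible explanation for the double-byte sequences, stated here as an assumption and not verified against the io package: the XML inside xlsx/ods files is normally UTF-8 encoded, and a reader that treats every byte as one character sees two characters for a single accented letter. The Python sketch below only demonstrates that effect; it does not reproduce the OCT interface.

```python
# Assumption: the file stores text as UTF-8 and the reader maps each byte
# to one character (here mimicked by decoding as Latin-1).
text = "\u00e9"                    # 'e' with acute accent, a single character
raw = text.encode("utf-8")         # stored as two bytes: 0xC3 0xA9
misread = raw.decode("latin-1")    # byte-per-character view: two characters
decoded = raw.decode("utf-8")      # proper decoding restores one character

print(len(text), len(raw), len(misread), len(decoded))
```

If that assumption holds, decoding the bytes as UTF-8 when reading (and encoding when writing) would avoid the need to filter characters out at all.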

Does Octave have any binary function, or is there an easy and fast procedure,
for filtering out non-alphanumeric characters?
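
To make the question concrete, here is a regex-free filter sketched in Python. The helper name keep_alnum and its extra parameter are made up for this example. It keeps ASCII letters and digits only, plus whatever additional characters the caller explicitly allows.

```python
import string

# Hypothetical helper for illustration: the set of characters that survive.
_ALNUM = frozenset(string.ascii_letters + string.digits)


def keep_alnum(text, extra=""):
    """Drop every character that is not an ASCII letter or digit.

    Characters listed in 'extra' (for example a space) are kept as well.
    """
    allowed = _ALNUM | set(extra)
    return "".join(ch for ch in text if ch in allowed)
```

For example, keep_alnum("a&b<c>1") gives "abc1". In Octave itself the same idea can probably be expressed without regexprep by logical indexing with isalnum, e.g. str(isalnum(str)), though how that treats multi-byte characters would need checking.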





    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?46855>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/



