octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: 3.0.4 RC3 (mingw 3.4.5)-2


From: Benjamin Lindner
Subject: Re: 3.0.4 RC3 (mingw 3.4.5)-2
Date: Tue, 03 Mar 2009 19:49:52 +0100
User-agent: Thunderbird 2.0.0.18 (Windows/20081105)

John W. Eaton wrote:
On 30-Jan-2009, Benjamin Lindner wrote:

| To work around it in octave we need to rethink octave's way to save | anonymous function handles in ascii format. | I have tried something for in the development branch, which allows this | test to pass, but it means a small but significant change in the format | of saving anonymous function handles. This breaks backwards | compatibility, so I'm not really happy about my approach.

We can open text files for reading in binary mode.  It just means that
we have to handle the CRLF, CR, and LF line endings ourselves.  We
already do that (more or less) for .m files.  But it would maybe be a
good idea to come up with a unified way of doing it for all of Octave.


I have put some thought into this problem and have changed the behaviour of reading ascii mode data by doing the following *) files get always opened in binary mode when loading data (saving to ascii is still done in text mode) *) walked through the load_ascii( ) method of the various octave_value derived classes and moved all newline/cr/lf related code into common functions *) created ls-ascii-helper.h and ls-ascii-helper.cc and stored the common functions there

With this, I get rid of the failing test in ov-fcn-handle.cc on mingw. I have no additional failed tests. Saving and loading in "-text" format seems to work, I have not encountered problems yet.

I don't have access to matlab, so I could not check what happens with data files in matlab's text format. I'd appreciate cross-checks.

benjamin
# HG changeset patch
# User Benjamin Lindner <address@hidden>
# Date 1236071396 -3600
# Node ID 1315d5c25fa983805808e7334971fe1830362fbe
# Parent  97991a9e7a18ce7605851e9e65c98e1185bf8615
fix CRLF issues with text-mode reading in windows when loading ascii data

diff -r 97991a9e7a18 -r 1315d5c25fa9 src/ChangeLog
--- a/src/ChangeLog     Tue Mar 03 19:28:03 2009 +0100
+++ b/src/ChangeLog     Tue Mar 03 10:09:56 2009 +0100
@@ -1,3 +1,15 @@
+2009-03-03  Benjamin Lindner  <address@hidden>
+
+       * ls-ascii-helper.h ls-ascii-helper.cc: New files, provide helper 
+       functions skip_until_newline(), skip_preceeding_newline() and
+       read_until_newline() that take care of CR/LF handling.
+       * Makefile.in: add new files
+       * load-save.cc: Open files always in binary mode in Fload
+       * ls-mat-ascii.cc (get_mat_data_input_line), ls-oct-ascii.cc 
+       (extract_keyword, read_ascii_data), ls-oct-ascii.h (extract_keyword), 
+       ov-fcn-handle.cc, ov-fcn-inline.cc, ov-range.cc, ov-str-mat.cc 
+       (load_ascii): Use helper functions 
+       
 2009-03-01  Jaroslav Hajek  <address@hidden>
 
        * ov-perm.cc (octave_perm_matrix::print_raw): Call
diff -r 97991a9e7a18 -r 1315d5c25fa9 src/Makefile.in
--- a/src/Makefile.in   Tue Mar 03 19:28:03 2009 +0100
+++ b/src/Makefile.in   Tue Mar 03 10:09:56 2009 +0100
@@ -123,7 +123,7 @@
        comment-list.h debug.h defun-dld.h defun-int.h defun.h \
        dirfns.h display.h dynamic-ld.h error.h file-io.h gl-render.h \
        gripes.h help.h input.h lex.h load-path.h load-save.h ls-hdf5.h \
-       ls-mat-ascii.h ls-mat4.h ls-mat5.h ls-oct-ascii.h \
+       ls-mat-ascii.h ls-mat4.h ls-mat5.h ls-oct-ascii.h ls-ascii-helper.h \
        ls-oct-binary.h ls-utils.h mex.h mexproto.h oct-errno.h \
        oct-fstrm.h oct-hdf5.h oct-hist.h oct-iostrm.h oct-map.h oct-obj.h \
        oct-prcstrm.h oct-procbuf.h oct-stdstrm.h oct-stream.h \
@@ -219,7 +219,7 @@
        cutils.c data.cc debug.cc defaults.cc defun.cc dirfns.cc \
        display.cc dynamic-ld.cc error.cc file-io.cc gl-render.cc graphics.cc \
        gripes.cc help.cc input.cc lex.l load-path.cc load-save.cc \
-       ls-hdf5.cc ls-mat-ascii.cc ls-mat4.cc ls-mat5.cc ls-oct-ascii.cc \
+       ls-hdf5.cc ls-mat-ascii.cc ls-mat4.cc ls-mat5.cc ls-oct-ascii.cc 
ls-ascii-helper.cc \
        ls-oct-binary.cc ls-utils.cc main.c mappers.cc matherr.c \
        mex.cc oct-fstrm.cc oct-hist.cc oct-iostrm.cc oct-map.cc \
        oct-obj.cc oct-prcstrm.cc oct-procbuf.cc oct-stream.cc \
diff -r 97991a9e7a18 -r 1315d5c25fa9 src/load-save.cc
--- a/src/load-save.cc  Tue Mar 03 19:28:03 2009 +0100
+++ b/src/load-save.cc  Tue Mar 03 10:09:56 2009 +0100
@@ -793,15 +793,12 @@
 
          std::ios::openmode mode = std::ios::in;
 
-         if (format == LS_BINARY
-#ifdef HAVE_HDF5
-             || format == LS_HDF5
-#endif
-             || format == LS_MAT_BINARY
-             || format == LS_MAT5_BINARY
-             || format == LS_MAT7_BINARY)
-           mode |= std::ios::binary;
-
+         // Open in binary mode in any case, to fix annoying bug that
+         // text-mode opened streams cannot be seekg'ed/tellg'ed with
+         // mingw32 (See http://oldwiki.mingw.org/index.php/Known%20Problems )
+         // The CR/LF issues are handled in ls-ascii-helper.cc
+         mode |= std::ios::binary;
+         
 #ifdef HAVE_ZLIB
          if (use_zlib)
            {
diff -r 97991a9e7a18 -r 1315d5c25fa9 src/ls-ascii-helper.cc
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/src/ls-ascii-helper.cc    Tue Mar 03 10:09:56 2009 +0100
@@ -0,0 +1,160 @@
+/*
+
+Copyright (C) 2003, 2005, 2006, 2007 John W. Eaton
+
+This file is part of Octave.
+
+Octave is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3 of the License, or (at your
+option) any later version.
+
+Octave is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with Octave; see the file COPYING.  If not, see
+<http://www.gnu.org/licenses/>.
+
+*/
+
+
+#include "ls-ascii-helper.h"
+
+#include <iostream>
+#include <sstream>
+
+// Helper functions when reading from ascii files.
+// These function take care of CR/LF issues when files are opened in text-mode 
for reading 
+
+// Skip characters from stream IS until a newline is reached.
+// Depending on KEEP_NEWLINE, either eat newline from stream or
+// keep it unread
+
+void
+skip_until_newline( std::istream& is, bool keep_newline )
+{
+  if (!is)
+    return;
+  
+  char c,d;
+  
+  while (is)
+  {
+      c = is.peek();
+      if (c == '\n' || c == '\r')
+      {
+         // reached newline
+         if (keep_newline == false)
+         {
+             // eat the CR or LF character
+             is.get(d);
+             
+             // make sure that for binary-mode opened ascii files containing 
CRLF line endings
+             // we skip the LF after CR...
+             if (c == '\r' && is.peek()=='\n')
+             {
+                 // yes, LF following CR, eat it...
+                 is.get(d);
+             }
+         }
+         
+         // Newline was found, and read from stream if keep_newline==true, so 
exit loop
+         break;
+      }
+      else
+         // no newline charater peeked, so read it and proceed to next 
character
+         is.get(d);
+  }
+  
+  return;
+}
+
+
+// If stream IS currently points to a newline (a leftover from a previous read)
+// then eat newline(s) until a non-newline character is found
+
+void
+skip_preceeding_newline( std::istream& is )
+{
+  if (!is)
+    return;
+  
+  char c,d;
+  
+  // Check if IS currently points to newline character
+  c = is.peek();
+  if (c == '\n' || c == '\r')
+  {
+      // Yes, at newline
+      do {
+         // eat the CR or LF character
+         is.get(d);
+         
+         // make sure that for binary-mode opened ascii files containing CRLF 
line endings
+         // we skip the LF after CR...
+         if (c == '\r' && is.peek() == '\n')
+         {
+             // yes, LF following CR, eat it...
+             is.get(d);
+         }
+         
+         // Peek into next character
+         c = is.peek();
+      // Loop while still a newline ahead
+      } while( c == '\n' || c == '\r' );
+  }
+  
+  return;
+}
+
+
+// Read charaters from stream IS until a newline is reached.
+// Depending on KEEP_NEWLINE, either eat newline from stream or
+// keep it unread
+// Characters read are stored and returned as std::string
+
+std::string
+read_until_newline( std::istream& is, bool keep_newline )
+{
+  if (!is)
+    return std::string();
+  
+  char c,d;
+  std::ostringstream buf;
+  
+  while (is)
+  {
+      c = is.peek();
+      if (c == '\n' || c == '\r')
+      {
+         // reached newline
+         if (keep_newline == false)
+         {
+             // eat the CR or LF character
+             is.get(d);
+             
+             // make sure that for binary-mode opened ascii files containing 
CRLF line endings
+             // we skip the LF after CR...
+             if (c == '\r' && is.peek() == '\n')
+             {
+                 // yes, LF following CR, eat it...
+                 is.get(d);
+             }
+         }
+         
+         // Newline was found, and read from stream if keep_newline==true, so 
exit loop
+         break;
+      }
+      else
+      {
+         // no newline charater peeked, so read it, store it, and proceed to 
next
+         is.get(d);
+         buf << d;
+      }
+  }
+  
+  return buf.str();
+}
diff -r 97991a9e7a18 -r 1315d5c25fa9 src/ls-ascii-helper.h
--- /dev/null   Thu Jan 01 00:00:00 1970 +0000
+++ b/src/ls-ascii-helper.h     Tue Mar 03 10:09:56 2009 +0100
@@ -0,0 +1,40 @@
+/*
+
+Copyright (C) 2003, 2005, 2006, 2007 John W. Eaton
+
+This file is part of Octave.
+
+Octave is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by the
+Free Software Foundation; either version 3 of the License, or (at your
+option) any later version.
+
+Octave is distributed in the hope that it will be useful, but WITHOUT
+ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with Octave; see the file COPYING.  If not, see
+<http://www.gnu.org/licenses/>.
+
+*/
+
+#if !defined (octave_ls_ascii_helper_h)
+#define octave_ls_ascii_helper_h 1
+
+#include <iosfwd>
+#include <string>
+
+#include "oct-dlldefs.h"
+
+extern OCTINTERP_API void
+skip_until_newline( std::istream& is, bool keep_newline = false );
+
+extern OCTINTERP_API void
+skip_preceeding_newline( std::istream& is );
+
+extern OCTINTERP_API std::string
+read_until_newline( std::istream& is, bool keep_newline = false );
+
+#endif  // !defined (octave_ls_ascii_helper_h)
diff -r 97991a9e7a18 -r 1315d5c25fa9 src/ls-mat-ascii.cc
--- a/src/ls-mat-ascii.cc       Tue Mar 03 19:28:03 2009 +0100
+++ b/src/ls-mat-ascii.cc       Tue Mar 03 10:09:56 2009 +0100
@@ -64,6 +64,7 @@
 #include "dMatrix.h"
 
 #include "ls-mat-ascii.h"
+#include "ls-ascii-helper.h"
 
 static std::string
 get_mat_data_input_line (std::istream& is)
@@ -80,14 +81,16 @@
       while (is.get (c))
        {
          if (c == '\n' || c == '\r')
-           break;
+           {
+             // Let skip_until_newline handle CR/LF issues...
+             skip_until_newline (is, false);
+             break;
+           }
 
          if (c == '%' || c == '#')
            {
              // skip to end of line
-             while (is.get (c))
-               if (c == '\n' || c == '\r')
-                 break;
+             skip_until_newline (is, false);
 
              break;
            }
diff -r 97991a9e7a18 -r 1315d5c25fa9 src/ls-oct-ascii.cc
--- a/src/ls-oct-ascii.cc       Tue Mar 03 19:28:03 2009 +0100
+++ b/src/ls-oct-ascii.cc       Tue Mar 03 10:09:56 2009 +0100
@@ -63,6 +63,7 @@
 #include "dMatrix.h"
 
 #include "ls-oct-ascii.h"
+#include "ls-ascii-helper.h"
 
 // The number of decimal digits to use when writing ascii data.
 static int Vsave_precision = 16;
@@ -118,14 +119,15 @@
                }
 
              retval = value.str ();
+             skip_until_newline (is, false);
              break;
            }
          else if (next_only)
            break;
          else
            {
-             while (is.get (c) && c != '\n' && c != '\r')
-               ; // Skip to end of line.
+             // Skip to end of line
+             skip_until_newline (is, false);
            }
        }
     }
diff -r 97991a9e7a18 -r 1315d5c25fa9 src/ls-oct-ascii.h
--- a/src/ls-oct-ascii.h        Tue Mar 03 19:28:03 2009 +0100
+++ b/src/ls-oct-ascii.h        Tue Mar 03 10:09:56 2009 +0100
@@ -29,6 +29,7 @@
 #include <string>
 
 #include "str-vec.h"
+#include "ls-ascii-helper.h"
 
 // Flag for cell elements
 #define CELL_ELT_TAG "<cell-element>"
@@ -103,8 +104,8 @@
                is >> value;
              if (is)
                status = true;
-             while (is.get (c) && c != '\n' && c != '\r')
-               ; // Skip to beginning of next line;
+             // Skip to beginning of next line;
+             skip_until_newline (is, false);
              break;
            }
          else if (next_only)
@@ -165,8 +166,8 @@
                    is >> value;
                  if (is)
                    status = true;
-                 while (is.get (c) && c != '\n' && c != '\r')
-                   ; // Skip to beginning of next line;
+                 // Skip to beginning of next line;
+                 skip_until_newline (is, false);
                  return status;
                }
            }
diff -r 97991a9e7a18 -r 1315d5c25fa9 src/ov-fcn-handle.cc
--- a/src/ov-fcn-handle.cc      Tue Mar 03 19:28:03 2009 +0100
+++ b/src/ov-fcn-handle.cc      Tue Mar 03 10:09:56 2009 +0100
@@ -59,6 +59,7 @@
 #include "ls-oct-binary.h"
 #include "ls-hdf5.h"
 #include "ls-utils.h"
+#include "ls-ascii-helper.h"
 
 DEFINE_OCTAVE_ALLOCATOR (octave_fcn_handle);
 
@@ -299,26 +300,18 @@
     {
       octave_idx_type len = 0;
       char c;
-      std::ostringstream buf;
+      std::string buf;
 
       // Skip preceeding newline(s).
-      while (is.get (c) && c == '\n')
-       /* do nothing */;
+      skip_preceeding_newline (is);
 
       if (is)
        {
-         buf << c;
 
          // Get a line of text whitespace characters included, leaving
          // newline in the stream.
+         buf = read_until_newline (is, true);
 
-         while (is.peek () != '\n')
-           {
-             is.get (c);
-             if (! is)
-               break;
-             buf << c;
-           }
        }
 
       pos = is.tellg ();
@@ -363,7 +356,7 @@
 
          int parse_status;
          octave_value anon_fcn_handle = 
-           eval_string (buf.str (), true, parse_status);
+           eval_string (buf, true, parse_status);
 
          if (parse_status == 0)
            {
diff -r 97991a9e7a18 -r 1315d5c25fa9 src/ov-fcn-inline.cc
--- a/src/ov-fcn-inline.cc      Tue Mar 03 19:28:03 2009 +0100
+++ b/src/ov-fcn-inline.cc      Tue Mar 03 10:09:56 2009 +0100
@@ -51,6 +51,7 @@
 #include "ls-oct-ascii.h"
 #include "ls-hdf5.h"
 #include "ls-utils.h"
+#include "ls-ascii-helper.h"
 
 DEFINE_OCTAVE_ALLOCATOR (octave_fcn_inline);
 
@@ -163,28 +164,20 @@
        nm = "";
 
       char c;
-      std::ostringstream buf;
+      std::string buf;
 
       // Skip preceeding newline(s)
-      while (is.get (c) && c == '\n')
-       /* do nothing */;
+      skip_preceeding_newline (is);
 
       if (is)
        {
-         buf << c;
 
          // Get a line of text whitespace characters included, leaving
          // newline in the stream
-         while (is.peek () != '\n')
-           {
-             is.get (c);
-             if (! is)
-               break;
-             buf << c;
-           }
+         buf = read_until_newline (is, true);
        }
 
-      iftext = buf.str ();
+      iftext = buf;
 
       octave_fcn_inline tmp (iftext, ifargs, nm);
       fcn = tmp.fcn;
diff -r 97991a9e7a18 -r 1315d5c25fa9 src/ov-range.cc
--- a/src/ov-range.cc   Tue Mar 03 19:28:03 2009 +0100
+++ b/src/ov-range.cc   Tue Mar 03 10:09:56 2009 +0100
@@ -41,6 +41,7 @@
 #include "byte-swap.h"
 #include "ls-hdf5.h"
 #include "ls-utils.h"
+#include "ls-ascii-helper.h"
 
 DEFINE_OCTAVE_ALLOCATOR (octave_range);
 
@@ -326,14 +327,9 @@
        break;
     }
 
-  for (;;)
-    {
-      if (is && (c == '%' || c == '#'))
-       while (is.get (c) && c != '\n')
-         ; // Skip to beginning of next line, ignoring everything.
-      else
-       break;
-    }
+  // Skip to beginning of next line, ignoring everything.
+  skip_until_newline (is, false);
+  
 }
 
 bool 
diff -r 97991a9e7a18 -r 1315d5c25fa9 src/ov-str-mat.cc
--- a/src/ov-str-mat.cc Tue Mar 03 19:28:03 2009 +0100
+++ b/src/ov-str-mat.cc Tue Mar 03 10:09:56 2009 +0100
@@ -51,6 +51,7 @@
 #include "pr-output.h"
 #include "pt-mat.h"
 #include "utils.h"
+#include "ls-ascii-helper.h"
 
 DEFINE_OCTAVE_ALLOCATOR (octave_char_matrix_str);
 DEFINE_OCTAVE_ALLOCATOR (octave_char_matrix_sq_str);
@@ -317,8 +318,7 @@
                      char *ftmp = tmp.fortran_vec ();
 
                      // Skip the return line
-                     if (! is.read (ftmp, 1))
-                       return false;
+                     skip_preceeding_newline (is);
 
                      if (! is.read (ftmp, dv.numel ()) || !is)
                        {

reply via email to

[Prev in Thread] Current Thread [Next in Thread]