[Octave-bug-tracker] [bug #60138] Dataframe package incorrectly parses csv entries containing commas inside quotes
From: Tasos Papastylianou
Subject: [Octave-bug-tracker] [bug #60138] Dataframe package incorrectly parses csv entries containing commas inside quotes
Date: Sat, 27 Feb 2021 11:21:04 -0500 (EST)
User-agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:86.0) Gecko/20100101 Firefox/86.0
URL:
<https://savannah.gnu.org/bugs/?60138>
Summary: Dataframe package incorrectly parses csv entries
containing commas inside quotes
Project: GNU Octave
Submitted by: tpapastylianou
Submitted on: Sat 27 Feb 2021 04:21:02 PM UTC
Category: Octave Forge Package
Severity: 3 - Normal
Priority: 5 - Normal
Item Group: Incorrect Result
Status: None
Assigned to: None
Originator Name: Tasos Papastylianou
Originator Email:
Open/Closed: Open
Release: other
Discussion Lock: Any
Operating System: Any
_______________________________________________________
Details:
This bug report relates to the following Stack Overflow post:
https://stackoverflow.com/q/66389166/4183191
That question also exposes another bug, already tracked as #56263: the second
column is inappropriately truncated if it contains only strings. Apparently
this was fixed in the dev repo a few years back, but the fix has still not
been released. This report is not about that bug.
This report concerns a different bug triggered by the same CSV data, which I
reproduce below:
"TIME","GEO","UNIT","S_ADJ","NA_ITEM","Value","Flag and Footnotes"
"1995Q1","Greece","Chain linked volumes, index 2010=100","Seasonally and calendar adjusted data","Gross domestic product at market prices","72.5",""
"1995Q2","Greece","Chain linked volumes, index 2010=100","Seasonally and calendar adjusted data","Gross domestic product at market prices","73.2",""
Suppose one attempts to load this file, e.g. as
D = dataframe( 'data.csv' );
The above should result in a field UNIT with the value "Chain linked volumes,
index 2010=100".
However, dataframe incorrectly treats the comma inside the quotes as a
delimiter, and assigns the second part of this element to the S_ADJ field
instead.
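To illustrate the difference, here is a minimal Octave sketch. It is not the dataframe package's actual code; the variable names are hypothetical, and the quote-aware pattern assumes every field is quoted and contains no escaped ("") quotes. It contrasts splitting a row on every comma with splitting only on commas outside quotes:

```octave
% First data row from the report, held as one line of text.
row = ['"1995Q1","Greece","Chain linked volumes, index 2010=100",' ...
       '"Seasonally and calendar adjusted data",' ...
       '"Gross domestic product at market prices","72.5",""'];

% Naive approach: split on every comma, ignoring quotes.
% The UNIT field is cut in two, so later fields shift one column right.
naive = strsplit (row, ',', 'CollapseDelimiters', false);

% Quote-aware approach: take each "..." run as one field,
% then strip the surrounding quotes.
quoted = regexp (row, '"[^"]*"', 'match');
fields = cellfun (@(s) s(2:end-1), quoted, 'UniformOutput', false);

printf ('naive split:       %d pieces\n', numel (naive));
printf ('quote-aware split: %d fields\n', numel (fields));
printf ('UNIT = %s\n', fields{3});
```

The header has 7 columns; the naive split yields 8 pieces for this row, whereas the quote-aware split yields 7 fields with the UNIT value kept intact.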
As a workaround, I note that the csv2cell function from the io package handles
this case correctly.
Therefore:
dataframe( csv2cell( 'data.csv' ) );
works as expected (barring bug #56263, that is).
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?60138>