Re: [Bug-datamash] Regarding 'transpose'


From: Sanjeev Kumar Sharma
Subject: Re: [Bug-datamash] Regarding 'transpose'
Date: Mon, 26 Sep 2016 11:28:26 +0000

Dear Assaf,

Many thanks for fixing these issues so promptly and explaining everything in great detail. I haven't checked it on my data yet but, as you demonstrate, it should hopefully work fine.

Regards,
Sanjeev


-----Original Message-----
From: Assaf Gordon [mailto:address@hidden]
Sent: 23 September 2016 03:46
To: Sanjeev Kumar Sharma
Cc: address@hidden
Subject: Re: [Bug-datamash] Regarding 'transpose'

Hello Sanjeev,

Thanks again for reporting this and providing easily reproducible examples.

I've added two bug fixes for problems related to this issue.
Available in the git repository:
http://git.savannah.gnu.org/cgit/datamash.git/

And an unofficial tarball here:
http://download-mirror.savannah.gnu.org/releases/datamash/src/datamash-1.1.0.3-3741.tar.gz

With these fixes, 'datamash transpose' (and also 'datamash check') works on your input file, without requiring "--no-strict".

However, I'd like to expand on the "--filler" option after these fixes.
The fixes accept lines that end in a delimiter, which indicates an empty last field (previously the trailing delimiter was ignored).
This means that the field is no longer missing (it is there, just empty) - thus "filler" isn't used at all.

To illustrate, I will use ':' as the delimiter.

Using this input:

$ printf "a::c\n1:2:\nX:Y:Z\n"
a::c
1:2:
X:Y:Z

The middle field on the first line is empty, and the last field on the second line is empty.
The current datamash (1.1.0, before this bug fix) would reject the second line, thus requiring "--no-strict":

$ printf "a::c\n1:2:\nX:Y:Z\n" | datamash-1.1.0 -t: transpose
datamash: transpose input error: line 2 has 2 fields (previous lines had 3);
see --help to disable strict mode

$ printf "a::c\n1:2:\nX:Y:Z\n" | datamash-1.1.0 -t: --no-strict --filler '*' transpose
a:1:X
:2:Y
c:*:Z

With this bug fix, datamash treats the 3rd field of the second line as valid but empty, and 'just works':

$ printf "a::c\n1:2:\nX:Y:Z\n" | datamash -t: transpose
a:1:X
:2:Y
c::Z
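
The distinction between an empty field and a missing field can be cross-checked without datamash. The following is only a rough sketch: it relies on awk's field splitting, which happens to count a trailing delimiter as an empty last field, and the helper name count_fields is made up for this illustration. The third input line is deliberately one field short, to contrast with the trailing-delimiter line above it.

```shell
# Hypothetical helper (not part of datamash): print the number of
# ':'-separated fields on each input line, using awk's field splitting.
count_fields() {
  awk -F: '{ print NF }'
}

# Line 1 has an empty middle field, line 2 an empty last field,
# and line 3 is genuinely missing its last field.
printf "a::c\n1:2:\nX:Y\n" | count_fields
# prints:
# 3
# 3
# 2
```

Only a line like the third one is actually short; presumably that is the case where "--no-strict" and "--filler" remain relevant.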

This works the same for TAB (i.e. two consecutive tab characters indicate an empty field between them, and a tab at the end of the line indicates an empty last field).
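
The same counting rule for TAB can be sketched with awk, again as an approximation rather than datamash itself; count_tab_fields is a made-up helper name.

```shell
# Hypothetical helper (not part of datamash): count TAB-separated fields
# per line. Setting FS to a single tab makes awk treat every tab as a
# separator, so consecutive or trailing tabs produce empty fields.
count_tab_fields() {
  awk 'BEGIN { FS = "\t" } { print NF }'
}

# Line 1 has an empty middle field; line 2 ends in a tab (empty last field).
printf "a\t\tc\n1\t2\t\n" | count_tab_fields
# prints:
# 3
# 3
```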

I hope this addresses the issue.
If not, or if you spot other issues, please let me know.

regards,
- assaf



