problems with 'join' command
From: Samir Wadhawan
Subject: problems with 'join' command
Date: Thu, 31 Jan 2008 16:41:11 -0500
Dear Mike Haertel,
We would like to call your attention to what we find to be unusual behaviour
of the join command on Ubuntu distributions (Dapper and above). Attached to
this mail are two sample files (file1 and file2) which produce incorrect
output when joining column 5 of file1 with column 1 of file2. We used the
following command to produce the join:
join -a1 -15 -21 file1.srt file2.srt
As indicated in the join manpage, we ensured that both files were sorted on
the join columns by running these commands before the join was conducted:
sort -k 5 file1 > file1.srt
sort -k 1 file2 > file2.srt
Surprisingly we notice that join proceeds WITHOUT errors when we use this
variant of sort:
sort -k 5,5 file1 > file1.srt
sort -k 1,1 file2 > file2.srt
Clearly, the only difference between the two variants of the sort command is
that the first also orders lines by the columns following the one on which the
join is performed. This behaviour puzzles us, as join seems to produce
different (inconsistent) outputs and appears to be sensitive to the sorting
order of other columns in the file.
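
For readers trying to reproduce this, the following is a minimal sketch of the working variant, not a confirmed diagnosis. With `sort -k 5` the sort key runs from field 5 to the end of the line, whereas join compares only the join field, so limiting the key with `-k 5,5` keeps the two tools looking at the same text. Setting LC_ALL=C for both commands is an assumption added here so that sort and join use the same collation order; the message above does not say which locale was in use. The sample rows are hypothetical and stand in for the attached files.

```shell
# Hypothetical sample data; the attached file1 and file2 are not reproduced here.
tmp=$(mktemp -d)
printf '%s\n' 'r1 x x x b 9' 'r2 x x x a 1' > "$tmp/file1"
printf '%s\n' 'b right-b' 'a right-a' > "$tmp/file2"

# Assumption: use one locale for both sort and join so they agree on ordering.
export LC_ALL=C

# Restrict each sort key to exactly the field that join will compare.
sort -k 5,5 "$tmp/file1" > "$tmp/file1.srt"
sort -k 1,1 "$tmp/file2" > "$tmp/file2.srt"

# sort -c exits non-zero if a file is not ordered on the given key.
sort -c -k 5,5 "$tmp/file1.srt" && sort -c -k 1,1 "$tmp/file2.srt"

# Same join as in the report, written with separated option arguments.
result=$(join -a 1 -1 5 -2 1 "$tmp/file1.srt" "$tmp/file2.srt")
printf '%s\n' "$result"

rm -rf "$tmp"
```

Each output line starts with the join field, followed by the remaining fields of the first file and then those of the second.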
We tried to reproduce this behaviour on an AIX machine, but found that
both variants of the sorted files produce consistent join results.
Please let us know if we are missing something.
Best Regards,
Samir Wadhawan.
************************************************************
Samir Wadhawan
PhD. Candidate
Dept. of Biochemistry, Microbiology and Molecular Biology
Centre for Comparative Genomics and Bioinformatics
505 Wartik Lab
The Pennsylvania State University
E-mail: address@hidden; address@hidden
Ph#:(814)865-4754
************************************************************
file1
Description: Binary data
file2
Description: Binary data