bug-coreutils

Re: comm: summary patch


From: Paul Eggert
Subject: Re: comm: summary patch
Date: Tue, 12 Jul 2005 09:18:25 -0700
User-agent: Gnus/5.1007 (Gnus v5.10.7) Emacs/21.4 (gnu/linux)

Andrew Stribblehill <address@hidden> writes:

> It can sometimes be coded with awk, sure:
>
> #! /bin/sh
> # usage: commsum <file(s)>
>
> awk '
> BEGIN {t[0]=0; t[1]=0; t[2]=0}
>       {match($0,/^\t*/); t[RLENGTH]++}
>   END {printf "%d\t%d\t%d\n",t[0],t[1],t[2]}
> ' "$@"
>
> However, this presumes that the input has no leading tabs in it.

Yes, that's a problem: the output of comm is ambiguous.  But how about
solving this more general problem instead of your particular one?
That will let "comm" be useful in other situations.
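
To make the ambiguity concrete, here is a small sketch.  The temporary
directory and the file contents are made up for illustration; it
assumes a POSIX shell plus mktemp, and forces the C locale so that a
tab sorts before "x".

```shell
# Demonstration only: two one-line files whose comm output collides.
dir=$(mktemp -d)
printf '\tx\n' > "$dir/f1"   # f1: a tab followed by "x"
printf 'x\n'   > "$dir/f2"   # f2: just "x"

# f1's line lands in column 1 and is printed unchanged.
# f2's line lands in column 2 and gets one tab prepended by comm.
out=$(LC_ALL=C comm "$dir/f1" "$dir/f2")
printf '%s\n' "$out"
rm -rf "$dir"
```

Both output lines are a tab followed by "x", so a reader of the output
cannot tell which column either line belongs to.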

One way to solve the problem is by having an option that lets "comm"
quote its output in some way, so that the output is not ambiguous.
For example, it might quote leading tabs using "\t" and backslashes
using "\\".  Or perhaps you can think of a better approach.
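
Since comm has no such option today, the following is only a sketch of
what the quoting could look like if done as preprocessing.  The helper
name quote_lines is hypothetical.  It doubles backslashes, rewrites
each leading tab as the two characters "\t", and re-sorts, because
quoting changes the sort order.  It assumes a POSIX sh, sed and sort.

```shell
# Hypothetical helper, not part of coreutils.
TAB=$(printf '\t')

quote_lines() {
    # First expression: double every backslash.
    # Loop (:a ... ta): replace one leading tab per pass with backslash + "t",
    # skipping over leading tabs that were already rewritten.
    sed -e 's/\\/\\\\/g' \
        -e ':a' \
        -e "s/^\(\(\\\\t\)*\)$TAB/\1\\\\t/" \
        -e 'ta' "$1" | LC_ALL=C sort
}
```

After "quote_lines f1 > q1" and "quote_lines f2 > q2", no line in q1
or q2 starts with a literal tab, so in the output of
"LC_ALL=C comm q1 q2" every leading tab is a column separator.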

> there's no way to avoid that, short of preprocessing:

How about this?

  echo $(comm -23 f1 f2 | wc -l) \
       $(comm -13 f1 f2 | wc -l) \
       $(comm -12 f1 f2 | wc -l)

Admittedly it's not as efficient as one might like, but is there
really much of an efficiency issue here?
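
Wrapped up in the spirit of the commsum script quoted earlier, the
idea might look like the sketch below.  The function name is borrowed
from that script; piping each comm call through "wc -l" to get counts,
and the tab-separated output format, are assumptions about what the
summary should print.

```shell
# Hypothetical wrapper: prints the number of lines only in file 1,
# only in file 2, and in both.  Both files must already be sorted.
commsum() {
    # Each call suppresses two of comm's three columns, so only the
    # number of output lines matters and leading tabs in the data are harmless.
    only1=$(comm -23 "$1" "$2" | wc -l)
    only2=$(comm -13 "$1" "$2" | wc -l)
    both=$(comm -12 "$1" "$2" | wc -l)
    # Left unquoted on purpose: word splitting drops the padding that
    # some wc implementations put in front of the number.
    printf '%s\t%s\t%s\n' $only1 $only2 $both
}
```

This reads each file three times, which is the efficiency cost
mentioned above.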

> Does anyone else agree with me, or shall I just crawl back under my
> rock? ;)

Let's see whether anyone else chimes in.

This email exchange is archived, so perhaps someone will read it in
2010 and say "Hey, Andrew was right!"  and fix things....
