Re: uniq without sort <-------------- GURU NEEDED
From: Thierry Volpiatto
Subject: Re: uniq without sort <-------------- GURU NEEDED
Date: Fri, 25 Jan 2008 08:56:10 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.60 (gnu/linux)

gnuist006@gmail.com writes:
> This is a tough problem, and needs a guru.
>
> I know it is very easy to find uniq or non-uniq lines if you scramble
> all of them and sort them. It's trivial:
>
> echo -e "a\nc\nd\nb\nc\nd" | sort | uniq
>
> $ echo -e "a\nc\nd\nb\nc\nd"
> a
> c
> d
> b
> c
> d
>
> $ echo -e "a\nc\nd\nb\nc\nd"|sort|uniq
> a
> b
> c
> d
>
>
> So it is TRIVIAL with sort.
>
> I want uniq without sorting the initial order.
>
> The algorithm is this. For every line, look above if there is another
> line like it. If so, then ignore it. If not, then output it. I am
> sure, I can spend some time to write this in C. But what is the
> solution using shell ? This way I can get an output that preserves the
> order of first occurrence. It is needed in many problems.

Here it is in Python, but the same can be done in Lisp or shell:

In [13]: B = ["a", "c", "d", "b", "e", "a", "d", "e"]
In [14]: A = []
In [15]: for i in B:
   ....:     if i not in A: A.append(i)
In [16]: A
Out[16]: ['a', 'c', 'd', 'b', 'e']
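
For long inputs the list lookup above gets slow, because `i not in A` scans the whole list on every line. Here is a minimal sketch of the same idea that keeps a set next to the output for the membership test. The name unique_in_order is only illustrative, and it assumes the items are hashable (strings are):

```python
def unique_in_order(items):
    """Yield each item the first time it appears, keeping the input order."""
    seen = set()  # set membership is O(1) on average, unlike scanning a list
    for item in items:
        if item not in seen:
            seen.add(item)
            yield item


# Same data as the echo example in the quoted message.
lines = ["a", "c", "d", "b", "c", "d"]
print(list(unique_in_order(lines)))  # ['a', 'c', 'd', 'b']
```

Since it accepts any iterable, it should also work on an open file or on sys.stdin, which yield one line at a time.
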
--
A + Thierry
Pub key: http://pgp.mit.edu