Re: [bug-gawk] Apparently buggy associative array behaviour
From: Blaise LI
Subject: Re: [bug-gawk] Apparently buggy associative array behaviour
Date: Tue, 1 Mar 2016 16:52:50 +0000
Sorry, someone showed me my mistake: an extraneous semicolon after the
for (...) in the END action.
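
For the record, here is a minimal sketch of what went wrong, using a
hypothetical four-line sample in place of the real file (the sample data and
the variable names sample, buggy and fixed are made up for illustration). The
semicolon directly after the for (...) header is an empty loop body, so the
print runs only once, after the loop has finished, with whichever key was
visited last. The order in which awk visits array keys is not specified, so
the fixed version is piped through sort to make its output predictable.

```shell
# Hypothetical sample: only the 5th field matters here.
sample='a b c d 0
a b c d 50
a b c d 0
a b c d 3'

# Buggy: "for (score in hist);" loops over an empty statement, then the
# print executes once with the last key assigned to score.
buggy=$(printf '%s\n' "$sample" |
    awk '{hist[$5]++} END {for (score in hist); print hist[score], score}')

# Fixed: without the semicolon, print is the loop body and runs per key.
fixed=$(printf '%s\n' "$sample" |
    awk '{hist[$5]++} END {for (score in hist) print hist[score], score}' |
    sort -k2,2n)

printf 'buggy:\n%s\nfixed:\n%s\n' "$buggy" "$fixed"
```

The buggy variant prints a single line, which matches the single
"74186425 50" line seen in the original report.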
On 01/03/16 16:37, Blaise Li wrote:
> Making the histogram of one column in a large file, I came across a case
> that looks like a bug with awk.
>
> Counting using sort | uniq -c gives several values for the 5th column of my
> file:
>
> $ awk '{print $5}' awk_bug_test.txt | sort | uniq -c
> 60906306 0
> 6342558 1
> 16874518 3
> 74186425 50
>
> But using an associative array within awk only reports the counts for
> one of the values:
>
> $ awk '{hist[$5]++} END {for (score in hist); print hist[score],score}' awk_bug_test.txt
> 74186425 50
>
> Am I misusing awk, or is this really a bug?
>
> I'm using awk version "GNU Awk 4.1.1, API: 1.1 (GNU MPFR 3.1.2-p3, GNU
> MP 6.0.0)" on debian.
>
> The file is huge (53G). So I cannot attach it to this mail.
>
>