bug-datamash
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Failure of expensive test "datamash-valgrind.sh" for git HEAD


From: Erik Auerswald
Subject: Re: Failure of expensive test "datamash-valgrind.sh" for git HEAD
Date: Wed, 1 Jun 2022 09:24:13 +0200

Hi,

On Tue, May 31, 2022 at 02:03:30PM -0700, Shawn Wagner wrote:
> I am finding some memory leaks using valgrind/asan, btw. Will try to look
> more into fixing them tonight.

Thanks.

Would it make sense to set the pointers to NULL after they have been freed
in order to guard against introducing use-after-free type bugs later?

Thanks,
Erik

> On Tue, May 31, 2022 at 10:51 AM Erik Auerswald <auerswal@unix-ag.uni-kl.de>
> wrote:
> 
> > Hi,
> >
> > the "expensive" test "datamash-valgrind.sh" fails on my Ubuntu 18.04
> > GNU/Linux x86-64 laptop with current git HEAD:
> >
> > ```
> > $ make check-expensive TESTS=tests/datamash-valgrind.sh
> > [...]
> > custom-format failed
> > FAIL: tests/datamash-valgrind.sh
> > [...]
> > ```
> >
> > This is the last sub-test in the shell script.  The pertinent log
> > can be extracted with `sed -n '/--format/,$p' test-suite.log`.  The
> > test checks if the command `datamash --format "%05000.5000f" sum 1`
> > with input from file "wide" via standard input leaks memory.
> >
> > At least one input field of the generated data file "wide" contains
> > a number that is too large to be represented as an 80 bit extended
> > precision IEEE 754.  Thus GNU Datamash fails without creating output
> > in the specified custom format with the error message:
> >
> > ```
> > datamash: invalid numeric value in line 9 field 1: '123[...]545'
> > ```
> >
> > Well, there are three such lines:
> >
> > ```
> > $ for i in $(seq 1 193 2000) $(seq 500 -7 100); do \
> >  >   seq $i | paste -s -d "" - ; \
> >  > done \
> >  > | awk 'length($0)>4932 {
> >           print "line="NR "fields="NF" length="length($0)
> >           }'
> > line=9 fields=1 length=5073
> > line=10 fields=1 length=5845
> > line=11 fields=1 length=6617
> > ```
> >
> > Since datamash fails and exits with exit code 1, valgrind reports
> > exit code 1 as well, and the test fails.  I do not think this is
> > intended.
> >
> > Additionally, valgrind reports reachable memory:
> >
> > ```
> > ==6391== HEAP SUMMARY:
> > ==6391==     in use at exit: 13,400 bytes in 9 blocks
> > ==6391==   total heap usage: 25 allocs, 16 frees, 32,247 bytes allocated
> > [...]
> > ==6391== LEAK SUMMARY:
> > ==6391==    definitely lost: 0 bytes in 0 blocks
> > ==6391==    indirectly lost: 0 bytes in 0 blocks
> > ==6391==      possibly lost: 0 bytes in 0 blocks
> > ==6391==    still reachable: 13,400 bytes in 9 blocks
> > ==6391==         suppressed: 0 bytes in 0 blocks
> > ==6391==
> > ==6391== For counts of detected and suppressed errors, rerun with: -v
> > ==6391== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
> > ```
> >
> > I do not know if the valgrind report indicates a problem or not.
> >
> > I think that we should adjust this test.  We need to either make
> > datamash exit successfully, e.g., by adjusting the input, or to ignore
> > the datamash failure, but still detect a valgrind failure.  The former
> > might be possible by removing the problematic lines from the input via
> > sed.  The latter might be possible via valgrind's --error-exitcode=N
> > option.
> >
> > I have successfully tested the former using the following patch:
> >
> > ```
> > diff --git a/tests/datamash-valgrind.sh b/tests/datamash-valgrind.sh
> > index 6bb74af..43e9fd9 100755
> > --- a/tests/datamash-valgrind.sh
> > +++ b/tests/datamash-valgrind.sh
> > @@ -188,7 +188,7 @@ cmp wide wide_orig ||
> >       fail=1 ; }
> >
> >   ## Test large output formats
> > -cat wide | valgrind --track-origins=yes  --leak-check=full \
> > +sed 9,11d wide | valgrind --track-origins=yes  --leak-check=full \
> >                     --show-reachable=yes  --error-exitcode=1 \
> >                     datamash --format "%05000.5000f" sum 1 > /dev/null ||
> >     { warn_ "custom-format failed" ; fail=1 ; }
> > ```
> >
> > The valgrind report for this contains:
> >
> > ```
> > [...]
> > ==10177==    still reachable: 14 bytes in 1 blocks
> > [...]
> > ==10177== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
> > ```
> >
> > Again I am not sure about the meaning of the report.
> >
> > What do you think?
> >
> > Cheers,
> > Erik



reply via email to

[Prev in Thread] Current Thread [Next in Thread]