
Re: [PATCH] parallel-tests: new recognized test result 'ERROR'


From: Stefano Lattarini
Subject: Re: [PATCH] parallel-tests: new recognized test result 'ERROR'
Date: Fri, 1 Jul 2011 10:06:15 +0200
User-agent: KMail/1.13.3 (Linux/2.6.30-2-686; KDE/4.4.4; i686; ; )

On Thursday 30 June 2011, Ralf Wildenhues wrote:
> * Stefano Lattarini wrote on Thu, Jun 30, 2011 at 04:58:21PM CEST:
> > I've applied the attached patch to the 'GSoC/experimental/test-results-work'
> > branch.  Note that this is not an FYI, since that branch is temporary, and
> > can thus be amended and modified.  Reviews are welcome!
> 
> Just a couple of quick thoughts:
> - is ERROR the same term other testing frameworks use for similar
>   semantics?  Let's strive to be consistent.
>
py.test and xUnit-style frameworks seem to use similar terminology: a
failure is when a test assertion fails, while an error is when an
unhandled exception occurs:

 <http://www.cis.upenn.edu:80/~matuszek/cit594-2004/Pages/eclipse-faq.html#failure_vs_error>
 <http://www.m-takagi.org/docs/php/pocket_guide/2.3/en/textui.html>
 <http://docs.python.org/library/unittest.html#organizing-test-code>
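
For instance, with Python's stock unittest module the distinction is
exactly that one (a minimal sketch; file, class and test names are mine):

  $ cat > demo.py <<'END'
  import unittest

  class Demo(unittest.TestCase):
      def test_assertion(self):
          self.assertEqual(1, 0)      # failed assertion -> a failure
      def test_exception(self):
          raise RuntimeError("oops")  # unhandled exception -> an error

  if __name__ == '__main__':
      unittest.main()
  END
  $ python demo.py 2>&1 | tail -1
  FAILED (failures=1, errors=1)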

Well, in the case of py.test, an exception has to prevent the loading of
the tests or the completion of their execution in order to be flagged as
an error (but that's very much in line with the py.test philosophy of
trying to be as unobtrusive and "API-less" as possible).  Example:

  $ cat > foo.py <<'END'
  def test_foo(): assert 1 == 0
  def test_bar(): assert 1 == 1
  END
  $ cat > bar.py <<'END'
  a = 0.0 / 0.0
  def test_baz(): assert 0 == 0
  END
  $ echo 'if 1:' > baz.py
  $ py.test -v foo.py bar.py baz.py
  ========================= test session starts ==========================
  platform linux2 -- Python 2.6.6 -- pytest-1.3.3 -- /usr/bin/python
  test path 1: foo.py
  test path 2: bar.py
  test path 3: baz.py
  
  foo.py:1: test_foo FAIL
  foo.py:2: test_bar PASS
  bar.py E
  baz.py E
  
  ================================ ERRORS ================================
  ________________ ERROR collecting /home/stefano/bar.py _________________
  bar.py:1: in <module>
  >   a = 0.0 / 0.0
  E   ZeroDivisionError: float division
  ________________ ERROR collecting /home/stefano/baz.py _________________
  /usr/lib/pymodules/python2.6/py/_test/pycollect.py:145: in _importtestmodule
  >           mod = self.fspath.pyimport(ensuresyspath=True)
  /usr/lib/pymodules/python2.6/py/_path/local.py:528: in pyimport
  >           mod = __import__(modname, None, None, ['__doc__'])
  E             File "/home/stefano/baz.py", line 2
  E               
  E              ^
  E           IndentationError: expected an indented block
  =============================== FAILURES ===============================
  _______________________________ test_foo _______________________________
  
  >   def test_foo(): assert 1 == 0
  E   AssertionError
  
  foo.py:1: AssertionError
  ============= 1 failed, 1 passed, 2 error in 0.28 seconds ==============

> - will users confuse FAIL and ERROR and try to use the latter where the
>   former is appropriate (or vice versa)?
>
ERROR and FAIL are mostly internal details; at this point, the user is only
exposed to them in the console progress output.  Currently, ERRORs are only
produced by "hard errors" (a.k.a. exit statuses of 99) in test scripts, and
these are nothing new.  In the future, they will also be produced by, e.g.,
"Bail out!" TAP directives, or by malformed TAP or SubUnit input (so, again,
the user doesn't have to produce them directly).  The only people who have
to worry about and understand in detail the FAIL/ERROR distinction are the
potential authors of third-party test drivers -- and requiring that from
them is not asking too much, IMHO.
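
For instance, a test script that cannot even set up its scenario can
already force a hard error -- now reported as an ERROR -- simply by
exiting with the magic status 99, regardless of whether it is listed
in XFAIL_TESTS (a sketch; the script name and setup step are made up):

  $ cat > setup.test <<'END'
  #! /bin/sh
  # Failing to create the fixtures is neither a PASS nor a genuine
  # FAIL of the feature under test: signal a hard error instead.
  mkdir fixtures || { echo "cannot create fixtures" >&2; exit 99; }
  END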
 
> Let's try to avoid inventing different terms where and when we can.
>
Well, the terminology I've chosen derives from my experience.  Still,
my experience is admittedly limited and unbalanced -- so, if you have
any suggestions, shoot.

> Changing semantics requires a NEWS entry.  It's actually an
> incompatibility, but a very minor one, so it's not a big problem.
>
Still, a NEWS entry is a good idea anyway; what about what I've
done in the attached squash-in?

> Thanks,
> Ralf
> 
> > Subject: [PATCH] parallel-tests: new recognized test result 'ERROR'
> > 
> > * lib/am/check.am ($(TEST_SUITE_LOG)): Recognize a new test result
> > `ERROR'.  Use it when encountering unreadable test logs (previously
> > a simple `FAIL' was used in these situations).
> > * lib/test-driver: Set the global test result to `ERROR' when the
> > test exit status is 99.  When doing colorized output, color `ERROR'
> > results in magenta.
> > * doc/automake.texi (Log files generation and test results
> > recording): Update by also listing `ERROR' among the list of valid
> > `:test-results:' arguments.
> > * tests/trivial-test-driver: Update.
> > * tests/parallel-tests.test: Likewise.
> > * tests/parallel-tests-harderror.test: Likewise.
> > * tests/parallel-tests-no-spurious-summary.test: Likewise.
> > * tests/test-driver-global-log.test: Likewise.
> > * tests/test-driver-recheck.test: Likewise.
> > * tests/test-driver-custom-multitest-recheck.test: Likewise.
> > * tests/test-driver-custom-multitest-recheck2.test: Likewise.
> > * tests/test-driver-custom-multitest.test: Likewise.
> > * tests/test-driver-custom-no-html.test: Likewise.
> > * tests/test-driver-end-test-results.test: Likewise.
> > * tests/color.test: Likewise, and make stricter.
> > * tests/color2.test: Likewise, and improve syncing with color.test.
> > * tests/parallel-tests-exit-statuses.test: New test.
> > * tests/parallel-tests-console-output.test: New test.
> > * tests/Makefile.am (TESTS): Update.
> 
> > --- a/doc/automake.texi
> > +++ b/doc/automake.texi
> > @@ -9248,10 +9248,10 @@ leading whitespace will not be ignored.
> >  
> >  @c Keep this in sync with lib/am/check-am:$(TEST_SUITE_LOG).
> >  The only recognized test results are currently @code{PASS}, @code{XFAIL},
> > -@code{SKIP}, @code{FAIL} and @code{XPASS}.  These results, when declared
> > -with @code{:test-result:}, can be optionally followed by text holding
> > -the name and/or a brief description of the corresponding test; the
> > -@option{parallel-tests} harness will ignore such extra text when
> > +@code{SKIP}, @code{FAIL}, @code{XPASS} and @code{ERROR}.  These results,
> > +when declared with @code{:test-result:}, can be optionally followed by
> > +text holding the name and/or a brief description of the corresponding
> > +test; the @option{parallel-tests} harness will ignore such extra text when
> 
> ERROR semantics are not actually described.
>
Ah yes; well, that's tricky.  The problem is that the semantics for PASS,
FAIL, SKIP, XFAIL and XPASS are only documented in the "Simple Tests"
subsection, and we can't document ERROR in there, since we have not added
support for it in the older serial testsuite harness; and again, we can't
trivially add such support, since the serial testsuite harness doesn't
grok hard errors presently.  The right fix would probably be a revamping
and reorganization of the whole "Support for test suites" chapter, but I'm
a bit wary of attempting it now ...
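
For reference, the @code{:test-result:} directive whose arguments are
being documented above looks like this inside a test log scanned by the
parallel-tests harness (a sketch, with a made-up test name and
description):

  :test-result: ERROR mu.test (could not set up the test fixtures)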

> >  generating @file{test-suite.log} and preparing the testsuite summary.
> >  Also, @code{:test-result:} can be used with a special ``pseudo-result''
> >  @code{END}, that will instruct the testsuite harness to stop scanning
> 
> > --- a/lib/am/check.am
> > +++ b/lib/am/check.am
> > @@ -66,7 +66,7 @@ include inst-vars.am
> >  ## appended.
> >  ##
> >  ## In addition to the magic "exit 77 means SKIP" feature (which was
> > -## imported from automake), there is a magic "exit 99 means FAIL" feature
> > +## imported from automake), there is a magic "exit 99 means ERROR" feature
> >  ## which is useful if you need to issue a hard error no matter whether the
> >  ## test is XFAIL or not.  You can disable this feature by setting the
> >  ## variable DISABLE_HARD_ERRORS to a nonempty value.
> > @@ -153,7 +153,7 @@ $(TEST_SUITE_LOG): $(TEST_LOGS)
> >  ## Readable test logs.
> >     list2=`for f in $$list; do test ! -r $$f || echo $$f; done`; \
> >  ## Each unreadable test log counts as a failed test.
> > -   results1=`for f in $$list; do test -r $$f || echo FAIL; done`; \
> > +   results1=`for f in $$list; do test -r $$f || echo ERROR; done`; \
> >  ## Now we're going to extract the outcome of all the testcases from the
> >  ## test logs.
> >     results2=''; \
> > @@ -187,6 +187,10 @@ $(TEST_SUITE_LOG): $(TEST_LOGS)
> >     skip=`echo "$$results" | grep -c '^SKIP'`;                      \
> >     xfail=`echo "$$results" | grep -c '^XFAIL'`;                    \
> >     xpass=`echo "$$results" | grep -c '^XPASS'`;                    \
> > +   error=`echo "$$results" | grep -c '^ERROR'`;                    \
> > +## FIXME: for the moment, we count errors as failures, otherwise the code
> > +## that display the testsuite summary will become too much complicated.
> 
> grammar nits: s/display/&s/; s/much //
>
Fixed.

> > +   fail=`expr $$fail + $$error`;                                   \
> >     failures=`expr $$fail + $$xpass`;                               \
> >     all=`expr $$all - $$skip`;                                      \
> >     if test "$$all" -eq 1; then tests=test; All=;                   \
> 
> > --- /dev/null
> > +++ b/tests/parallel-tests-console-output.test
> 
> > +# parallel-tests: some checks on console output about testsuite
> > +# advancement.
> 
> What do you mean by "advancement"?  Maybe "progress" or "output order"!?
>
Yes, "progress"; sorry.  Fixed. now.

> > +parallel_tests=yes
> > +. ./defs || Exit 1
> > +
> > +cat >> configure.in << 'END'
> > +AC_OUTPUT
> > +END
> > +
> > +cat > Makefile.am << 'END'
> > +XFAIL_TESTS = sub/xpass.test xfail.test error.test
> > +TESTS = $(XFAIL_TESTS) fail.test pass.test a/b/skip.test sub/error2.test
> > +pass.log: fail.log
> > +error.log: pass.log
> > +sub/xpass.log: error.log
> > +sub/error2.log: xfail.log
> > +a/b/skip.log: sub/error2.log
> > +END
> >
> > [SNIP]
> >
> > +cat > xfail.test << 'END'
> > +#!/bin/sh
> > +# The sleep should ensure expected execution order of tests
> > +# even when make is run in parallel mode.
> > +# FIXME: quotes below required by maintainer-check
> > +sleep '10'
> 
> This looks like it would take long and/or be racy?
>
Yes, but I wanted to try another "code path", i.e., introducing a temporal
delay, rather than forcing the output order explicitly through the use of
inter-test dependencies...  Is this a bad idea after all?

> > [SNIP]

> > --- /dev/null
> > +++ b/tests/parallel-tests-exit-statuses.test
> 
> > +# Check parallel-tests features: normal and special exit statuses
> > +# in the test scripts.
> > +
> > +parallel_tests=yes
> > +. ./defs || Exit 1
> > +
> > +cat >> configure.in << 'END'
> > +AC_OUTPUT
> > +END
> > +
> > +# $failure_statuses should be defined to the list of all integers between
> > +# 1 and 255 (inclusive), excluding 77 and 99.
> 
> Actually, I wouldn't go over 125 for portability reasons.
>
But going over that limit was meant to verify behaviours with, e.g.,
failures due to signals, or to commands not found or not executable.
What are the portability problems you're referring to?
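
For reference, the statuses above 125 are exactly the "special" ones I
had in mind here (a sketch; the precise values can vary among shells,
which I guess may be the portability issue):

  $ sh -c 'no-such-command'; echo $?   # command not found
  127
  $ sh -c '/etc/passwd'; echo $?       # file found, but not executable
  126
  $ sh -c 'kill -TERM $$'; echo $?     # killed by a signal (128 + 15)
  143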

> > +# Let's use `seq' if available, it's faster than the loop.
> > +failure_statuses=`seq 1 255 \
> 
> `(seq 255) 2>/dev/null \
> 
> so that the error from nonpresent seq does not escape even with
> ksh/dash.
>
But why should we care if it escapes?  On the contrary, I'd like to see
the failure message -- our logs are already quite verbose anyway ...
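
For context, Ralf's suggested spelling would amount to something like this
(the subshell plus the redirection keep a possible "seq: not found" message
out of the output even with ksh/dash):

  failure_statuses=`(seq 255) 2>/dev/null \
    || { i=1; while test $i -le 255; do echo $i; i=\`expr $i + 1\`; done; }`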

> > +  || { i=1; while test $i -le 255; do echo $i; i=\`expr $i + 1\`; done; }`
> > +failure_statuses=`
> > +  for i in $failure_statuses; do
> > +    test $i -eq 77 || test $i -eq 99 || echo $i
> > +  done | tr "$nl" ' '`
> > +# For debugging.
> > +echo "failure_statuses: $failure_statuses"
> >
> > [MEGA-SNIP]

> > +  rm -f *.run *.err *.ok
> > +
> > +  : Nothing should be rerun anymore, as all tests have been evenatually
> 
> eventually
> 
> > +  : succesfull.
> 
> successful
> 
Both fixed.

> > [MEGA-SNIP]

Thanks,
  Stefano
diff --git a/ChangeLog b/ChangeLog
index 22e7087..05630cf 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,4 +1,4 @@
-2011-06-30  Stefano Lattarini  <address@hidden>
+2011-07-01  Stefano Lattarini  <address@hidden>
 
        parallel-tests: new recognized test result 'ERROR'
        * lib/am/check.am ($(TEST_SUITE_LOG)): Recognize a new test result
@@ -10,6 +10,7 @@
        * doc/automake.texi (Log files generation and test results
        recording): Update by listing `ERROR' too among the list of valid
        `:test-results:' arguments.
+       * NEWS: Update.
        * tests/trivial-test-driver: Update.
        * tests/parallel-tests.test: Likewise.
        * tests/parallel-tests-harderror.test: Likewise.
@@ -24,7 +25,7 @@
        * tests/color.test: Likewise, and make stricter.
        * tests/color2.test: Likewise, and improve syncing with color.test.
        * tests/parallel-tests-exit-statuses.test: New test.
-       * tests/parallel-tests-console-output.test: New test.
+       * tests/parallel-tests-console-output.test: Likewise.
        * tests/Makefile.am (TESTS): Update.
 
 2011-06-29  Stefano Lattarini  <address@hidden>
diff --git a/NEWS b/NEWS
index 37af53a..56250ab 100644
--- a/NEWS
+++ b/NEWS
@@ -13,6 +13,10 @@ New in 1.11a:
 
 * Changes to Automake-generated testsuite harnesses:
 
+  - The parallel-tests harness has a new test result 'ERROR', which can
+    be used to signal, e.g., unexpected or internal errors, or failures
+    to set up test case scenarios.
+
   - The default testsuite driver offered by the 'parallel-tests' option is
     now implemented (partly at least) with the help of automake-provided
     auxiliary scripts (e.g., `test-driver'), instead of relying entirely
diff --git a/lib/am/check.am b/lib/am/check.am
index 76609b5..3cedd46 100644
--- a/lib/am/check.am
+++ b/lib/am/check.am
@@ -189,7 +189,7 @@ $(TEST_SUITE_LOG): $(TEST_LOGS)
        xpass=`echo "$$results" | grep -c '^XPASS'`;                    \
        error=`echo "$$results" | grep -c '^ERROR'`;                    \
 ## FIXME: for the moment, we count errors as failures, otherwise the code
-## that display the testsuite summary will become too much complicated.
+## that displays the testsuite summary will become too complicated.
        fail=`expr $$fail + $$error`;                                   \
        failures=`expr $$fail + $$xpass`;                               \
        all=`expr $$all - $$skip`;                                      \
diff --git a/tests/parallel-tests-console-output.test b/tests/parallel-tests-console-output.test
index 5c59b72..8122bfd 100755
--- a/tests/parallel-tests-console-output.test
+++ b/tests/parallel-tests-console-output.test
@@ -15,7 +15,7 @@
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.
 
 # parallel-tests: some checks on console output about testsuite
-# advancement.
+# progress.
 
 parallel_tests=yes
 . ./defs || Exit 1
@@ -61,7 +61,7 @@ cat > xfail.test << 'END'
 #!/bin/sh
 # The sleep should ensure expected execution order of tests
 # even when make is run in parallel mode.
-# FIXME: quotes below required by maintainer-check
+# FIXME: quotes below required by maintainer-check.
 sleep '10'
 exit 1
 END
diff --git a/tests/test-driver-custom-multitest-recheck2.test b/tests/test-driver-custom-multitest-recheck2.test
index d2e979e..36cb302 100755
--- a/tests/test-driver-custom-multitest-recheck2.test
+++ b/tests/test-driver-custom-multitest-recheck2.test
@@ -182,8 +182,8 @@ for vpath in : false; do
 
   rm -f *.run *.err *.ok
 
-  : Nothing should be rerun anymore, as all tests have been evenatually
-  : succesfull.
+  : Nothing should be rerun anymore, as all tests have been eventually
+  : successful.
   $MAKE recheck >stdout || { cat stdout; Exit 1; }
   cat stdout
   do_count pass=0 fail=0 xpass=0 xfail=0 skip=0 error=0
