Re: Testing approach (was Re: [Gnu-arch-users] more on the merge-fest)
From: Samuel A. Falvo II
Subject: Re: Testing approach (was Re: [Gnu-arch-users] more on the merge-fest)
Date: Wed, 26 Nov 2003 09:00:09 -0800
User-agent: KMail/1.5
On Wednesday 26 November 2003 02:27 am, Misha Dorman wrote:
> Test-then-code is an attempt to avoid this problem/temptation, as well
> as a recognition that (as zander said) in many cases (though not all)
> writing a test is a good way to codify the required functionality. Of
> course, the risk is then (as Mark noted) that the code is written from
> the test, which (and this is the point that Samuel may have missed)
> can only be a _partial_ codification of the specification (how many
> test cases do you need to _exhaustively_ test even something as simple
> as A + B?). Test-first advocates would say that this risk is lower
> than the risk of omitted or "cheated" tests in code-first approaches.
I need to clarify my position with respect to unit testing.
First, it is important to realize that unit tests permit one to reason
about the code under test. But the exhaustiveness problem raised above
extends to various mathematical disciplines too: how many tests does one
need to check y=mx+b for continuity? To use the above logic precisely,
the answer is an infinite number of tests. So we don't bother in the
case of a line; we *know* it's continuous over the set of real numbers.
No further testing is truly required. Consequently, if writing a program
that evaluates the above function, there is NO reason to test it. We
know it already works.
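To make the point concrete, here is a minimal sketch (the `line` helper
is hypothetical, purely for illustration): because the function is a
straight composition of arithmetic primitives we already trust, a couple
of sanity checks are all that is worthwhile; exhaustively testing every
real x is both impossible and unnecessary.

```python
def line(m, x, b):
    """Evaluate y = m*x + b."""
    return m * x + b

# One or two sanity checks suffice; the primitives (* and +) are trusted.
assert line(2, 3, 1) == 7
assert line(0, 100, 5) == 5
```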
When you write unit tests, you must recognize that there are reasonable
limitations to what you can do. Therefore, code and tests are written
primarily to exercise *border cases*, plus a few non-border cases just
to make sure. It is the border cases that will most likely result in a
bug. The more "discontinuous" the function, the more likely it'll break
in actual use, and therefore the more unit tests you need to make sure
said implementation's discontinuities occur where you *expect* them to
be, not where you *don't* expect them to be.
(Yes, unit tests are often used for both positive AND negative testing.)
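As a sketch of what I mean (the `safe_div` function below is a made-up
example, not from any real codebase): the function has a deliberate
discontinuity at a zero divisor, so the border-case test targets exactly
that point, with a couple of non-border cases thrown in to make sure.

```python
def safe_div(a, b):
    """Return a/b, or 0 when b is 0 -- a deliberate discontinuity."""
    if b == 0:
        return 0
    return a / b

# The border case: the discontinuity itself (negative testing).
assert safe_div(10, 0) == 0
# A few non-border cases, just to make sure (positive testing).
assert safe_div(10, 2) == 5
assert safe_div(-9, 3) == -3
```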
Also, you write tests only for what can break. Obviously, with addition
implemented in the hardware, we know that it has to work. Likewise,
since fopen() is, ultimately, implemented in the operating system, we
can safely *assume* that it works. By the definition (i.e., the
specification) of its API, it works a certain way. If it does not, then
the OS is known to be at fault. (Or libc, or whatever. You folks can
think for yourselves, so you get the idea.)
Likewise, in your software, you write unit tests for as many things as
you possibly can get away with. From major program features, to things
like doubly-linked list implementations, maybe even down to certain
pointer-arithmetic functions. Just as a program is built from smaller
sub-programs, so tests can be composed from lower-level tests. It is
this principle that enables one to reason logically about the code
under test, and to find critical bugs early, often, and with a precision
that alleviates the need for a single-stepping debugger. Yes, you spend
more time writing code than you otherwise would; but that time would
have been spent in the debugger anyway.
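A sketch of that composition (the doubly-linked list here is a minimal
made-up implementation, just to illustrate the layering): a low-level
invariant check is written once, and the higher-level test reuses it
rather than re-verifying the links itself.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None

class DList:
    """Minimal doubly-linked list, for illustration only."""
    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, value):
        node = Node(value)
        if self.tail is None:
            self.head = self.tail = node
        else:
            node.prev = self.tail
            self.tail.next = node
            self.tail = node
        return node

def check_links(lst):
    """Lower-level test: every forward link has a matching back link."""
    node = lst.head
    while node is not None and node.next is not None:
        assert node.next.prev is node
        node = node.next

def test_append():
    """Higher-level test, composed from the lower-level check."""
    lst = DList()
    for v in (1, 2, 3):
        lst.append(v)
    check_links(lst)              # reuse the low-level invariant
    assert lst.head.value == 1
    assert lst.tail.value == 3

test_append()
```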
Consequently, if we implement a new abstract data type in the software,
then it needs unit tests which exercise how that data type will be
used. The unit tests for that abstract data type cover all the border
cases: given a new ADT instance, how does it respond to a search for a
node? Given a partially full ADT, how does it respond to the insertion
of a NULL node? A node with invalid fields? Given a full ADT, how does
it respond to deletion of head, tail, and *some* middle element? If it
deletes some arbitrary middle element correctly, then we can *safely*
assume that it works for *all* middle elements. If you can't, then you
might want to consider re-implementing the code so that you can. This
ties somewhat into the Taguchi method of making robust systems.
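The border cases above can be sketched like so (a plain Python list
stands in for the hypothetical ADT; the cases mirror the ones just
described):

```python
def test_adt_border_cases():
    adt = []                      # a new, empty ADT instance
    assert 42 not in adt          # search on an empty instance

    adt = [1, 2, 3, 4, 5]         # a "full" instance
    adt.remove(1)                 # delete head
    assert adt == [2, 3, 4, 5]
    adt.remove(5)                 # delete tail
    assert adt == [2, 3, 4]
    adt.remove(3)                 # delete *some* middle element; if this
    assert adt == [2, 4]          # works, we assume all middles work

test_adt_border_cases()
```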
This is what I mean when I say unit tests are largely, if not entirely,
the same basic thing as Formal Methods. I've read documentation on
them, and I see no critical differences between FMs and unit tests.
They both exercise software under test. They both establish well-known
inputs. They both compare results. They both flag errors to the
programmer.
If a test case doesn't cover a border case, AND the test was a direct
codification of the specification, then it follows that the
specification is broken. Fix the spec, fix the test, and continue.
Otherwise, if it was NOT a direct codification from the specification,
then the programmer might be at fault (e.g., lack of experience with
unit testing is the biggest cause of this).
You can't just try unit testing once or twice and expect to know
everything about it. It takes practice, just like any other human
endeavor.
However, it has also been stated that the spec is oftentimes
sufficiently vague to make unit testing not worthwhile. This is false,
because the coder STILL has to use wetware to "compile" the vague spec
into some concrete description of the software. That software's
reliability must be enforced regardless of how vague the specification
is. The same fill-in-the-gaps reasoning the programmer exercises to
write production code from a vague specification can still be used to
implement unit tests for that filled-in code. Obviously.
Unit tests are primarily for the benefit of the programmer(s), to know
that a hunk of code is certified to work against some set of
specifications. Any modifications to the code must be verified again
against the unit tests. If the tests fail, then either the
specification is in error (relatively unlikely), or the change caused
some side effect which moved a 'discontinuity' into the expected
'working interval' of the function (this is much more likely, based on
my observations of my own coding efforts and those of my peers). Just
suck it up, and fix it. But in ALL cases, whenever a discontinuity
occurs, you should (of course!) verify the test cases are in fact
correct as well.
Remember, the purpose of a unit test is NOT to be the end-all
specification of the project. It's intended to double-check the result
of the production code, without changing the production code in the
process.
--
Samuel A. Falvo II