Re: Testing approach (was Re: [Gnu-arch-users] more on the merge-fest)
From: Samuel A. Falvo II
Subject: Re: Testing approach (was Re: [Gnu-arch-users] more on the merge-fest)
Date: Wed, 26 Nov 2003 09:00:09 -0800
User-agent: KMail/1.5
On Wednesday 26 November 2003 02:27 am, Misha Dorman wrote:
> Test-then-code is an attempt to avoid this problem/temptation, as well
> as a recognition that (as zander said) in many cases (though not all)
> writing a test is a good way to codify the required functionality. Of
> course, the risk is then (as Mark noted) that the code is written from
> the test, which (and this is the point that Samuel may have missed)
> can only be a _partial_ codification of the specification (how many
> test cases do you need to _exhaustively_ test even something as simple
> as A + B?). Test-first advocates would say that this risk is lower
> than the risk of omitted or "cheated" tests in code-first approaches.
I need to clarify my position with respect to unit testing.
First, it is important to realize that unit tests permit one to reason
about the code under test. But the exhaustiveness problem raised above
extends to various mathematical disciplines too: how many tests does one
need to check y=mx+b for continuity? To use the above logic precisely,
the answer is an infinite number of tests. So we don't bother in the
case of a line; we *know* it's continuous over the set of real numbers.
No further testing is truly required. Consequently, if writing a program
that evaluates the above function, there is NO reason to test it. We
know it already works.
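To make the point concrete, here is a minimal sketch (the `line` helper
is hypothetical, purely for illustration): because the function is a
straight composition of arithmetic primitives we already trust, a couple
of sanity checks are all that is worthwhile; exhaustively testing every
real x is both impossible and unnecessary.

```python
def line(m, x, b):
    """Evaluate y = m*x + b."""
    return m * x + b

# One or two sanity checks suffice; the primitives (* and +) are trusted.
assert line(2, 3, 1) == 7
assert line(0, 100, 5) == 5
```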
When you write unit tests, you must recognize that there are reasonable
limitations to what you can do. Therefore, code and tests are written
primarily to exercise *border cases*, plus a few non-border cases just
to make sure. It is the border cases that will most likely result in a
bug. The more "discontinuous" the function, the more likely it'll break
in actual use, and therefore the more unit tests you need to make sure
said implementation's discontinuities occur where you *expect* them to
be, not where you *don't* expect them to be.
(Yes, unit tests are often used for both positive AND negative testing.)
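As a sketch of what I mean (the `safe_div` function below is a made-up
example, not from any real codebase): the function has a deliberate
discontinuity at a zero divisor, so the border-case test targets exactly
that point, with a couple of non-border cases thrown in to make sure.

```python
def safe_div(a, b):
    """Return a/b, or 0 when b is 0 -- a deliberate discontinuity."""
    if b == 0:
        return 0
    return a / b

# The border case: the discontinuity itself (negative testing).
assert safe_div(10, 0) == 0
# A few non-border cases, just to make sure (positive testing).
assert safe_div(10, 2) == 5
assert safe_div(-9, 3) == -3
```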
Also, you write tests only for what can break. Obviously, with addition
implemented in the hardware, we know that it has to work. Likewise,
since fopen() is, ultimately, implemented in the operating system, we
can safely *assume* that it works. By the definition (i.e., the
specification) of its API, it works a certain way. If it does not, then
the OS is known to be at fault. (Or libc, or whatever. You folks can
think for yourselves, so you get the idea.)
Likewise, in your software, you write unit tests for as many things as
you possibly can get away with. From major program features, to things
like doubly-linked list implementations, maybe even down to certain
pointer-arithmetic functions. Just as a program is built from smaller
sub-programs, so tests can be composed from lower-level tests. It is
this principle that enables one to reason logically about the code
under test, and to find critical bugs early, often, and with a precision
that alleviates the need for a single-stepping debugger. Yes, you spend
more time writing code than you otherwise would; but that time would
have been spent in the debugger anyway.
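A sketch of that composition (the doubly-linked list here is a minimal
made-up implementation, just to illustrate the layering): a low-level
invariant check is written once, and the higher-level test reuses it
rather than re-verifying the links itself.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None

class DList:
    """Minimal doubly-linked list, for illustration only."""
    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, value):
        node = Node(value)
        if self.tail is None:
            self.head = self.tail = node
        else:
            node.prev = self.tail
            self.tail.next = node
            self.tail = node
        return node

def check_links(lst):
    """Lower-level test: every forward link has a matching back link."""
    node = lst.head
    while node is not None and node.next is not None:
        assert node.next.prev is node
        node = node.next

def test_append():
    """Higher-level test, composed from the lower-level check."""
    lst = DList()
    for v in (1, 2, 3):
        lst.append(v)
    check_links(lst)              # reuse the low-level invariant
    assert lst.head.value == 1
    assert lst.tail.value == 3

test_append()
```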
Consequently, if we implement a new abstract data type in the software,
then it needs unit tests which exercise how that data type will be
used. The unit tests for that abstract data type cover all the border
cases: given a new ADT instance, how does it respond to a search for a
node? Given a partially full ADT, how does it respond to the insertion
of a NULL node? A node with invalid fields? Given a full ADT, how does
it respond to deletion of head, tail, and *some* middle element? If it
deletes some arbitrary middle element correctly, then we can *safely*
assume that it works for *all* middle elements. If you can't, then you
might want to consider re-implementing the code so that you can. This
ties somewhat into the Taguchi method of making robust systems.
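The border cases above can be sketched like so (a plain Python list
stands in for the hypothetical ADT; the cases mirror the ones just
described):

```python
def test_adt_border_cases():
    adt = []                      # a new, empty ADT instance
    assert 42 not in adt          # search on an empty instance

    adt = [1, 2, 3, 4, 5]         # a "full" instance
    adt.remove(1)                 # delete head
    assert adt == [2, 3, 4, 5]
    adt.remove(5)                 # delete tail
    assert adt == [2, 3, 4]
    adt.remove(3)                 # delete *some* middle element; if this
    assert adt == [2, 4]          # works, we assume all middles work

test_adt_border_cases()
```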
This is what I mean when I say unit tests are largely, if not entirely,
the same basic thing as Formal Methods. I've read documentation on
them, and I see no critical differences between FMs and unit tests.
They both exercise software under test. They both establish well-known
inputs. They both compare results. They both flag errors to the
programmer.
If a test case doesn't cover a border case, AND the test was a direct
codification of the specification, then it follows that the
specification is broken. Fix the spec, fix the test, and continue.
Otherwise, if it was NOT a direct codification from the specification,
then the programmer might be at fault (e.g., lack of experience with
unit testing is the biggest cause of this).
You can't just try unit testing once or twice and expect to know
everything about it. It takes practice, just like any other human
endeavor.
However, it has also been stated that the spec is oftentimes
sufficiently vague to make unit testing not worthwhile. This is false,
because the coder STILL has to use wetware to "compile" the vague spec
into some concrete description of the software. That software's
reliability must be enforced regardless of how vague the specification
is. The same fill-in-the-gaps reasoning the programmer exercises to
write production code from a vague specification can still be used to
implement unit tests for that filled-in code. Obviously.
Unit tests are primarily for the benefit of the programmer(s), to know
that a hunk of code is certified to work against some set of
specifications. Any modifications to the code must be verified again
against the unit tests. If the tests fail, then either the
specification is in error (relatively unlikely), or the change caused
some side effect which moved a 'discontinuity' into the expected
'working interval' of the function (this is much more likely, based on
my observations of my own coding efforts and those of my peers). Just
suck it up, and fix it. But in ALL cases, whenever a discontinuity
occurs, you should (of course!) verify the test cases are in fact
correct as well.
Remember, the purpose of a unit test is NOT to be the end-all
specification of the project. It's intended to double-check the result
of the production code, without changing the production code in the
process.
--
Samuel A. Falvo II