Software-Quality Discussion List
Digest # 012

      S O F T W A R E - Q U A L I T Y   D I G E S T

      "Cost-Effective Quality Techniques that Work"
List Moderator:                      Supported by:
Terry Colligan                       Tenberry Software, Inc.     
March 26, 1998                       Digest # 012



    Status Report

    ==== CONTINUING ====

    Comments
      David Bennett 

    Re:write testing logic right into the product
      John McGrail 

    Clever Bugs
      Kevin Arthur Mitchell 

    Re: Software-Quality Digest # 011
      Mark Wiley

    Re:  Self diagnostics
      "Bill Hensley" 

    Re: attempts at controversial statements
      John McGrail 

    ===== NEW POST(S) =====

    QAI Certification

    Timers, Events and Error Trapping
      "Phillip Senn" 

    ==== BOOK REVIEW ====



  Status Report

  Due to the volume of postings, we have another full issue.
  Keep up the good work!

  SQ 011 went to 330 people, so we are still growing.  As
  soon as I finish my current mini-death-march (and deal with
  the micro-death-marches that are piling up), I will start
  posting Requests for Participation in the various programming
  news groups.  In the meantime, I'm really pleased with the
  quantity, and particularly the quality, of the posts in
  Software-Quality.  Many thanks!  I'm learning a lot -- I
  hope you are!

==== CONTINUING ====

++++ New Post, New Topic ++++

From: David Bennett 
Subject: Comments

Hi Terry

Responses on two points.

Point 1.  (From: "Phillip Senn" )

* I used to work on an accounting package (called FACTS),
* that was overall a pretty good bookkeeping system.  Although
* it was complete in its ability to regurgitate the same
* information that you have so tediously typed in, it had the
* annoying habit of locking up every once in a while when
* something that should NEVER happen actually did happen.

* Let's say for instance you're reading along through an
* index, and the record that the index is pointing to just
* isn't there.  This should NEVER happen, right?  Well,
* their conclusion was "If you've got a corrupted file, then
* you need to fix it before the system can finish this report"
* (My wording). I guess they were consumed with the fear of
* producing a report that had a wrong total or something, so
* their solution was to abend the program by branching to a
* stmt that said X=1/0 REM Corrupted Index.

A little more thought would convince you that this is not in
the category of "can't happen".  It deserves better

In my experience, errors (abnormal status, unexpected
condition, call it what you will) in production software
fall into 3 broad types.

1. Expected.  The user typed something silly, or the batch
total doesn't add up, or there are not enough widgets to
fill the order.

2. Unexpected, but foreseeable.  The modem line disconnected,
or the disk is full, or the index does not match the data
file, or an operating service returned an error code which
it usually doesn't.

3. Can't happen.  The program added 2+2 and got 5.

Everyone should be able to handle type 1.  They are bread
and butter stuff, and they need to be tested (and regression
tested) thoroughly.

Type 2 is tough.  Sometimes you can recover, other times
not.  They are hard to trigger when you need them, hard to
test, impossible to be certain that you have full coverage.
The one thing you can be sure of is that your system is
internally consistent (for now), so you should be able to
send detailed information about the problem to the user, an
operator, the log or wherever.

Type 3 is easy.  Your system is internally corrupt.  You
need to crash, NOW!  Before you do any more damage.  Issue
whatever error message you safely can, and get the hell out.
These are virtually impossible to test, defeat most
automated testing strategies and (fortunately) don't happen
often.  This type is what you see as a panic, or BSOD, or
ABEND, depending on your system.

The problem is, the designers of FACTS treated a type 2 as a
type 3.  That was not a good idea.
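
A minimal sketch of how the three types might be kept apart in
code, using the missing-record case above.  Python is used for
illustration only; the exception names, read_via_index,
run_report and the dictionary layout of the index and records
are all hypothetical, and the "total is never negative"
invariant is an assumption made just for this sketch.

```python
import logging

log = logging.getLogger("report")


class ExpectedError(Exception):
    """Type 1: bad input or a failed business rule; tell the user and carry on."""


class ForeseeableError(Exception):
    """Type 2: the environment let us down, but our own state is still consistent."""


class InternalCorruption(Exception):
    """Type 3: an invariant of our own code is broken; stop at once."""


def read_via_index(index, records, key):
    """Fetch the record an index entry points to (hypothetical data layout)."""
    if key not in index:
        raise ExpectedError(f"no such key: {key!r}")
    position = index[key]
    if position not in records:
        # The index and the data file disagree: this is type 2, not type 3.
        raise ForeseeableError(f"index entry {key!r} -> {position} has no record")
    return records[position]


def run_report(index, records, keys):
    """Total the requested records; log type 1 and 2 problems and return them."""
    total, problems = 0, []
    for key in keys:
        try:
            total += read_via_index(index, records, key)
        except (ExpectedError, ForeseeableError) as exc:
            log.error("%s: %s", type(exc).__name__, exc)
            problems.append(str(exc))
    if total < 0:
        # Assumed invariant for this sketch: amounts are never negative.
        # Type 3 is deliberately not caught anywhere in this function.
        raise InternalCorruption("negative report total")
    return total, problems
```

The caller receives the list of problems along with the total,
so a report with a missing record can be marked as incomplete
instead of either locking up or silently printing a wrong
figure.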

Point 2.  (From: your moderator)

* We do about 10% internal checking code.  We also budget
* 3-5% for record, playback and logging code to support fully
* automated QA.  We would never consider removing it from a
* production build, since we feel like we wouldn't have been
* testing what we ship!  Since we don't remove it, we use
* run-time checks to enable, rather than preprocessor.

You've got me interested, but not yet convinced.  Our budget
in a development system is more like 10-30% overhead for
code, same overhead for data and 200% overhead for
processor.  That's right - we are quite happy to do all unit
testing with a heap of additional self-checking code which
checks and cross-checks internal data structures and runs 3
times slower than production.  We really couldn't ship that

* 4.    We have both the programmers *and* the QA team
* create tests.  I find that they think sufficiently
* differently that you don't really get much duplication.
* Since we push full automation, too many tests is not a
* problem.

Actually it seems we do that too.  I agree.  And yes, we run
literally millions of tests in some places, but who cares?
CPU cycles are for burning!!!

David Bennett
[ POWERflex Corporation     Developers of PFXplus ]
[ Tel:  +61-3-9888-5833     Fax:  +61-3-9888-5451 ]
[ E-mail: ]
[ Web home:   me: ]

++++ Moderator Comment ++++

  I'm not quite sure of what I'm supposed to be convincing
  you of, but here goes:

  Our strategy is to build record, playback and logging into
  every application -- to support very thorough test
  automation.  This code typically costs 3-5% (more if the
  application is small), and is always in the application,
  production or otherwise.

  We have asserts and other self-checking code; my estimate
  was that this is less than 10% of the whole system.  Some
  of these checks add more than a 200% CPU overhead.  We have
  all of these tests connected to some sort of run-time
  environment control, so that they can be enabled or disabled
  dynamically.  In the disabled state, they don't add much
  CPU (maybe 1%), and rarely as much as 10% data space.
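
  A minimal sketch of a self-check that ships in the product
  and is switched on at run time, not compiled out.  Python is
  used for illustration only; the APP_CHECKS variable, the
  "structures" group name and both function names are
  hypothetical, not anything taken from a real product.

```python
import os


def checks_enabled(name, environ=os.environ):
    """Return True if the named check group is switched on at run time.

    APP_CHECKS is a hypothetical variable holding a comma-separated list of
    check groups, or "all".  It is read on every call, so a change made to
    the process environment takes effect without a rebuild.
    """
    wanted = {part.strip() for part in environ.get("APP_CHECKS", "").split(",")}
    return "all" in wanted or name in wanted


def insert_sorted(items, value, environ=os.environ):
    """Insert value into an ascending list, with an optional expensive self-check."""
    position = 0
    while position < len(items) and items[position] < value:
        position += 1
    items.insert(position, value)
    if checks_enabled("structures", environ):
        # Expensive cross-check: a full scan on every insert, so off by default.
        # An explicit raise is used because Python drops `assert` statements
        # under -O, which would be the build-time removal this sketch avoids.
        if not all(a <= b for a, b in zip(items, items[1:])):
            raise AssertionError("list out of order after insert")
    return items
```

  In the disabled state the cost is one environment lookup and
  a set membership test per call; the full scan runs only when
  the group is enabled.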

  While I realize that some of us are still living under
  severe memory constraints (I once spent an entire summer
  trying to fit a RPG compiler into 4K 12-bit words!), my
  point is that for a strong and growing majority of systems,
  a strong and growing majority of the time, 10% overhead in
  code, data and CPU time is well worth the resulting
  improvement in quality and customer support.

  In particular, our programmers are using the *same* system
  that our customers use;  testing doesn't have to be done
  twice;  you can field-debug those hard-to-reproduce
  problems;  the release process is *much* simpler; ...

  We find these benefits overwhelming.

  No, everybody can't leave the checking in all of the time.
  Yes, almost everyone can do so almost all of the time.

++++ New Post, Same Topic ++++

From: John McGrail 
Subject: Re:write testing logic right into the product

>PFXplus (our major product) contains about 10% of the source
>code volume in testing, logging, assertions, validation
>hooks, etc.
>Comments anyone?  Is this too much?  Not enough?  About
>right?

Depends ;-)  I think the nature of the product space and the
product requirements will drive a lot of the need for these
types of code. Consider the product I test now and the
previous product I tested ...

Most mature subsystems in our product have assertions.
Although I can't put a number on the percentage of asserts
to real code, most procedures have several assertions in
them.  On the flip side, newer subsystems tend not to be as
robust (learn as you go  syndrome).  At some point in the
release cycle, we decide the code has stabilized enough that
assertions aren't firing anymore so we turn them off.  This
works well for us.

Our system also has quite a bit of logging built in.
Probably 1/3 of the code is logging related, none of which
is turned off for production releases.  Instead,
configuration variables control the level of logging (0 to
10) and each log msg has a level.

My experience has been that the overhead of a function call
& a check of a variable is not that much.  The overhead of
writing to a file is much higher.  When the product is run in
a production environment, logging is left off, except that
errors are always logged.  If the customer encounters a
problem, we instruct them to turn logging on, reproduce the
problem and send us the log file.  This doesn't help in all
cases but it helps many.
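
A minimal sketch of level-controlled logging along these
lines: a configured level from 0 to 10, a level on every
message, and errors written regardless.  Python is used for
illustration only; the LeveledLog class and its method names
are hypothetical.

```python
import sys


class LeveledLog:
    """Write a message only if its level is within the configured verbosity.

    Levels run from 0 (logging off) to 10 (everything); ordinary messages
    are assumed to carry a level from 1 to 10.
    """

    def __init__(self, level=0, stream=sys.stderr):
        self.level = level      # configuration variable, changeable at run time
        self.stream = stream

    def log(self, msg_level, message):
        # Cheap path: one comparison when the message is filtered out.
        if msg_level > self.level:
            return False
        self.stream.write(f"[{msg_level}] {message}\n")
        return True

    def error(self, message):
        # Errors bypass the level check, so they appear even at level 0.
        self.stream.write(f"[ERROR] {message}\n")
        return True
```

A filtered-out message costs a function call and one
comparison; the file write happens only when the customer has
raised the level or an error occurs.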

I would say probably 50% of our code is related to testing,
logging, assertions, and validation.  Interestingly enough,
less than (about) 5% is testing specific.

Contrast that with my last company where there were zero
assertions, zero logging, and maybe 1% or 2% testing related
code.  Probably 1/3 or more of the product was validation

The current company makes high performance infrastructure
software for large IP networks.  The previous company makes
a cross-platform, 3Tier, GUI application building tool.  In
the case of testing and validation, I think the nature of
the products contributes to the numbers I've mentioned.  Both
products are very flexible and have high user exposure and
configuration, so we have to validate their actions.  The
"doing" is easy if the configuration and
actions-to-be-performed are valid.  And, both products are
testable with external tools with a minimum of testing
functionality built in to the products.

John McGrail
American Internet Corporation -
1 (781) 276-4568

++++ Moderator Comment ++++

  I'm curious -- do you see any benefit from having so much
  logging code?  I thought you were saying so up until the
  last paragraph, where you imply that both companies have a
  similar level of quality.

  Can you contrast the two companies, from a quality and
  reliability point of view?

++++ New Post, New Topic ++++

From: Kevin Arthur Mitchell 
Subject: Clever Bugs

>  Have you ever experienced a smart or clever bug?
>  If so, would you please share it with us?

How about the ones built into our programming languages?

[Give us an example, Kevin!]

OK, in C/C++, the number 0 tests as false and all others
test as true.

while (0) {/* nothing ever happens */}
while (1) {/* an endless loop        */}

This is nice and clever but, IMHO,  an invitation to bugs.
Return codes are particularly difficult for me. I frequently
want my functions to return a value indicating "was this
done successfully" [ex: if InitialConditionsSet (do
something) else (why not?)].

But... if your function succeeds, who ever wants to know why?
Wouldn't you prefer to return a single value for True and
anything else for False? I would.

Now I could, of course, reverse things and say if
InitialConditionsFailed (why?) else (do something) but I
find this counter-intuitive and an impediment to my


++++ Moderator Comment ++++

  While I agree with your desire for the opposite meanings
  of true and false in C/C++, I don't think that:

   - it is a bug

   - or that it is clever

  I'm still searching for that clever bug...

++++ New Post, New Topic ++++

From: (Mark Wiley)
Subject: Re: Software-Quality Digest # 011

> ++++ Moderator Comment ++++
>   Mark's doing OS testing "in the large."

I'm not familiar with this term, can you elaborate?

>   I'm not sure I agree with the idea that you can't leave
>   most, if not all, of the instrumentation in today's OS's.
>   I doubt the extra code is more than a megabyte, and I
>   really doubt that anyone other than you, Mark, would notice.

Most everyone here would notice. Many of our nodes have only
32 MB and leaving a MB of memory in the kernel that wasn't
used during normal operation would be a crime. The system is
dedicated to running a video server application and memory
(for buffers and such) is not to be 'wasted'.

>   (I presume you can't buy a nCUBE with only 4MB of RAM? ;-)

No, I don't think you can :-).

>   Do you use a torture test, like the one that seemed to
>   bring so much stability to Linux?

I am not familiar with the Linux test you are referring to,
can you give me a pointer?

But, most of my tests involve high stress, so I suppose you
could call them 'torture tests'.


++++ Moderator Comment ++++

  Although performance curves with "knees" or "elbows" in
  them can provide counter-examples, can you really say that
  anyone would notice if your system ran 3% slower?  If you
  could achieve 10X reliability, wouldn't all your users
  sign up in a flash?

  I understand that a megabyte is a terrible thing to waste,
  or at least to talk about wasting, but you get something
  for spending that megabyte.  (Besides, it's probably much
  less than a megabyte!)

  By "in the large", I simply meant that you are working
  on large, or very large, systems, while John Cameron is
  working on much smaller systems.  Nothing more subtle than
  that.

  In spite of the size differences, I suspect you and John
  and everyone else who deals with hardware interrupt stuff
  have a lot in common.

  As to the Linux test, I am only aware of it via second-
  hand stories and postings in the Linux newsgroups.  As I
  understand the story, Linux was growing rapidly, in code,
  in features, and in users.  Although very well written, the
  volume and frequency of updates and upgrades were causing
  reliability problems.  To solve this problem, one or more
  stress testing programs were created, which, as rapidly as
  possible, exercised all of the "hard" parts of the system
  API (spawning, forks, large memory, shared memory, process
  kills, etc.)  I think the ideas came from current crashes,
  but I'm not sure.  In any event, it pokes the dark corners
  of the system, and actively tries to crash the system
  and/or to break security.

  As the story was told to me, when this torture test was
  first unleashed on Linux, the system only lasted an average
  of 10 seconds.  Now, each new version has to run multiple
  torture testers for more than a day before the version is
  allowed out.

  An interesting, and possibly true, addendum to the story
  is that some soul at Sun tried the torture test on Solaris.
  That version of Solaris lasted 10 seconds before crashing!

++++ New Post, New Topic ++++

From: "Bill Hensley" 
Subject: Re:  Self diagnostics

From: "Phillip Senn" 
>I can hear what you're thinking.  "You give users that
>ability and they're going to be rebuilding indices every day
>and you'll never know that there's a bug in the changes you
>made."  Really?  I don't think so.  But let's say they've
>done it twice this past month.  That's where you've got to
>log these errors.  Nothing fancy, just a plain ASCII file
>with date, time, user, and error condition. This log file
>makes you look like a cybergod.  At your next visit, as soon
>as you sit down at your client's desk, you will say
>something like, "I see you had an error on the 28th.  What
>happened?" Your client won't remember, trust me, but in
>their eyes you've grown in stature.

This is one of the most powerful capabilities that you can
add to a project; I routinely do.  I log program
starts/ends, configuration changes (for example, a default
port change from COM1: to COM2:), any anomalous conditions
detected, and all sorts of things that I think I might need
to solve a problem.  I've even been known to log record
additions and deletions (not the data, just the fact that
record X with key K was affected).  For communications
programs, I've logged modem configs at startup and shutdown
to account for users playing with them.  This works for
databases, also; run a checksum of the database at shutdown,
and another at startup, and you can tell if the database was
changed by some software other than your own.  Lots of stuff
can be put in, case-by-case.  Most of this is transparent to
the users.
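
A minimal sketch of the shutdown/startup checksum idea.
Python is used for illustration only; SHA-256 is just one
choice of checksum, and the function names and the side file
holding the stored value are hypothetical.

```python
import hashlib


def file_checksum(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_shutdown(db_path, stamp_path):
    """At shutdown, store the database checksum in a small side file."""
    with open(stamp_path, "w", encoding="ascii") as stamp:
        stamp.write(file_checksum(db_path))


def changed_since_shutdown(db_path, stamp_path):
    """At startup, report whether the database differs from the stored checksum."""
    try:
        with open(stamp_path, encoding="ascii") as stamp:
            previous = stamp.read().strip()
    except FileNotFoundError:
        return True  # no stored checksum, so the file cannot be vouched for
    return previous != file_checksum(db_path)
```

A True result at startup would be one more anomalous condition
worth writing to the log.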

A user can email the file to you for examination, saving
travel time.  One drawback is that the file can be very
large over time, depending on how long your software runs,
how many users you have, and how much data you collect.
Depending on the situation, I've done FIFO limiting of the
log file (deleting the oldest data); an automated delete of
the log file after a fixed time period; or a manual function
that I perform when I visit the site regularly.
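
A minimal sketch of a plain ASCII log with FIFO limiting: one
line per event holding date, time, user and condition, with
the oldest lines dropped once a limit is passed.  Python is
used for illustration only; append_event and its parameters
are hypothetical, and max_lines is assumed to be at least 1.

```python
import datetime


def append_event(path, user, condition, max_lines=1000, now=None):
    """Append one log line and keep only the newest max_lines lines."""
    now = now or datetime.datetime.now()
    entry = f"{now:%Y-%m-%d %H:%M:%S} {user} {condition}\n"
    try:
        with open(path, encoding="ascii", errors="replace") as handle:
            lines = handle.readlines()
    except FileNotFoundError:
        lines = []  # first event: the log does not exist yet
    lines.append(entry)
    # Rewrite the file with the oldest entries dropped (FIFO limiting).
    with open(path, "w", encoding="ascii", errors="replace") as handle:
        handle.writelines(lines[-max_lines:])
    return entry
```

Rewriting the whole file on every event is simple but slow
for a large limit; it is a reasonable trade only when events
are rare, as with the anomalous conditions described above.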

Cheers, Bill

++++ New Post, New Topic ++++

From: John McGrail 
Subject: Re: attempts at controversial statements


>1. Every production program is granted a budget of at least
>5% of code volume, 5% of data volume and 5% of execution
>time reserved for internal testing, consistency checks,
>post-production debugging and similar functions (which are
>not required to contribute at all to the primary purpose of
>the software).

I disagree with the "which are not required to contribute at
all to the primary purpose of the software" part.  I don't
consider the primary purpose of a product to be what is
written into the spec or protocol.  I consider the primary
purpose of a product to be solving a customer's problem or
need.

To put it simply - an unhappy customer is an unsuccessful

Internal testing and consistency checks provide robustness
by allowing a product to recover from and correct errors and
problems, making the customer happier.  Post-production
debugging and similar functions contribute to customer
satisfaction by allowing the developers to quickly solve

>2. Any defect detected in QA testing *must* be incorporated
>into a process of automated testing in such a way that the
>chances of that same defect appearing and not being detected
>in a future QA testing of that product is less than 1 in

I disagree with the word "*must*".  While it is a lofty
goal, it is unachievable.  For one thing, the cost of
automating certain tests can be prohibitive.  For example, a
small company might not be able to afford all the X.10
equipment needed to automate a test that power-cycles
several machines to verify a distributed database recovers
properly from a brownout just as it is beginning the 2nd
commit in a two phase commit.

For another: certain defects fall into a class of
interesting but unimportant.  For example: a logging
facility that only prints 2 of 5 significant digits of
your-favorite-floating-point-variable.  It is more important
to automate high visibility and high impact bugs than
unimportant or low-visibility bugs.

>3. Any defect notified by a customer and successfully
>reproduced and corrected must be treated as in item (2).

Again I disagree.  See my response to number 2.

>4. A programmer (or team) responsible for designing and
>implementing a function (or program or module) must also be
>responsible for designing and implementing test procedures.
>QA is responsible both for measuring the accuracy and
>coverage of test procedures, and (by running the test
>procedures) for measuring the degree to which the function
>is in compliance.

I disagree.  Both look at the code|feature|function with
different perspectives.   Read Testing Computer Software by
Cem Kaner, Jack Falk, and Hung Quoc Nguyen.  I'd rather not
repeat them.  Personally, if I wasn't writing automated
testing code, I'd be developing again.  I consider the
running of the automated tests I develop a fairly rote and
boring part of my job (unless they fail and I get to figure
out whether there's a bug in my tests or the product ;-> )

>5. Every function or feature that has ever been in a product
>and is still part of its specification, and every defect
>which has ever been corrected, *must* be validated in every
>release of the product.  This requirement demands automated
>testing.

While I agree with the last sentence, I disagree with
"*must*" again for pretty much the same reasons as numbers 2
and 3.  It is a lofty goal.  But, time, money, and resources
will prevent it from ever being achieved. Assuming there is
software between the button and the bomb, how do you truly
automate the test where pushing the right button in the
control room triggers the nuclear bomb to take off?

John McGrail
American Internet Corporation -
1 (781) 276-4568

++++ Moderator Comment ++++

  John is technically correct, but I wonder what you recommend

  I find this attitude (which John may not have) very common
  among QA people and some programmers.  They concentrate on
  the theoretical limit, which in this kind of argument can
  be shown to be too expensive in some dimension.  The
  argument seems to be "If it's theoretically impossible,
  we should reject it as a goal."

  To me this seems like saying "Since it can be shown that
  no one can be completely happy, no one should try for
  happiness", or "Since it can be shown that no one lives
  forever, medicine, diet and exercise are useless."

  I find that the journey (to defect-free quality software)
  is worth the effort, even if I never get there!

  It all boils down to the interpretation of "*must*":
  If "*must*" means "anyone violating this rule will be
  summarily terminated", then I agree with John McGrail.
  If, as I suspect, "*must*" means "you have to get a waiver
  from your manager to do this, but we'll be business-like
  in deciding", then I agree with David Bennett.

===== NEW POST(S) =====

++++ New Post, New Topic ++++

Subject: QAI Certification

Has anyone taken the QAI exam for CQA - certified quality
assurance? The study guide appears to be rather thick. I've
also applied for the CSTE certificate they offer - but at
this time that is not an exam but based on a resume and

My consulting firm (Interim Technologies) is paying for the
certification as they are on the QAI.

Leslie Pearson (for now)

++++ New Post, New Topic ++++

From: "Phillip Senn" 
Subject: Timers, Events and Error Trapping

I've recently learned a new programming language called
OVAL. It's really great because it looks just like Visual
Basic in terms of programming.  The only problem I've had is
that all the work is done using timers, events and error
trapping.

In the old days (good? Bad?), you could print out a program
and take it with you to study for debugging.  We had the
bane of programming called the 'GOTO' stmt, but at least it
was all there on the piece of paper.  You could follow the
program flow if you thought logically.

Nowadays, it's very hard to debug a program without the aid
of the interpreter!  Let's take error trapping first,
because it's been around the longest.  At the beginning of
your (BASIC) program, you can place a stmt such as 'ON
ERROR GOTO ERR_TRAP', after which the ERR_TRAP routine can
examine the type of error and take appropriate action to
handle the error or exit the program as gracefully as it
can.  So you're going along in a program when you write to a
file, and blamo!
Suddenly you're in the error routine because the file's
write-protected.  This type of 'event' is tolerable because
it's only handling errors, thus exceptions, and doesn't
interfere with the logic of the program.  I have seen some
programs however, that use the error trapping subroutine to
handle normal housekeeping type stuff, which drives me
crazy.  Some programmer must be thinking: "Why bother
checking to see if the file is there before opening it, when
I can just let the open stmt bomb off and exit the
subroutine?" Well, this drives me crazy because you're using
the error trap not for errors, but for normal program flow.
Part of having a self-documenting program is to not rely on
these little tricks.  It may take an extra line or two to
check for the existence of a file before opening it, but
that keeps you from having to write 5 lines of comments
like, "Normally the file isn't there, so this error routine
is how I exit the procedure".
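
A minimal sketch of the "extra line or two" approach, where
the normal case of a missing file is tested for directly and
the error path is left for real errors.  Python is used for
illustration only; read_optional_file is a hypothetical name.

```python
import os


def read_optional_file(path):
    """Return the file's text, or None when the file is simply not there.

    The missing-file case is normal flow here, so it is tested for directly
    instead of being routed through the error handler.
    """
    if not os.path.isfile(path):
        return None
    # Anything that goes wrong from here on (permissions, or the file being
    # removed between the check and the open) is a real error and is left
    # to propagate to the caller's error handling.
    with open(path, encoding="utf-8") as handle:
        return handle.read()
```

The existence check documents the normal case in the code
itself, and whatever still reaches the error handler is a
genuine exception.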

The next big nuisance is the 'event'.  Event driven
programming kinda forces programs to be written in smaller
chunks.  I think a considerable amount of conversation has
already occurred in the archives of this listserv about
keeping your subroutines manageable. Programming for events
such as when the mouse clicks over a certain area, or when a
field's value has changed, has cut our procedural style of
answer question A, then question B, then question C into a
bunch of little snips.  In essence, "Whenever the user
answers question A, then do this.  Whenever the user answers
question B...."

These events are tolerable because you can plainly see when
they will be run.  In other words, the command1_click()
event is run - anybody? Anybody?  That's right, when the
user clicks down on the command1 button (directly after the
command1_mousedown event).

Which leads us to timers.  The timer1_timer() event is fired
every X number of milliseconds.  This one is a bit trickier
to watch out for if you're debugging your program from a
printout.  In fact, it's a downright nuisance when you're
debugging with the interpreter as well, because chances are
the timer is going to fire every chance it gets.  But timers
are essential, and it's just a necessary evil that we
programmers have to deal with.

Why am I bothering to write about these events?  I can hear
what you're thinking, "What's the point in all this?"  I
guess I just wanted to say that some of the old paradigms
no longer hold true in the newer languages.  For
instance, when everything is visual, such as in MS ACCESS,
how can you limit your code to one page?

If your programs look more like flowcharts than
algorithms, maybe it's time to re-evaluate things.

==== BOOK REVIEW ====

  (Coming soon to a Software-Quality Digest near you!)

The Software-Quality Digest is edited by:
Terry Colligan, Moderator.

And published by:
Tenberry Software, Inc.     

Information about this list is at our web site,
maintained by Tenberry Software:

To post a message ==>

To subscribe ==>

To unsubscribe ==>

Suggestions and comments ==>

=============  End of Software-Quality Digest ===============

Last modified 1998.3.27. Your questions, comments, and feedback are welcome.