Sunday, June 20, 2010

4 Absence of Evidence is Not Evidence of Absence – The System Testing Conundrum

Jerry’s discussion of the pitfalls and psychological traps surrounding testing struck home, especially bringing to mind one episode I have always referred to as the “No Show Stopper Case.”

I had recently been promoted to Vice President of Quality Management in GTE’s Telenet division.  The division had just launched the first commercial e-mail system, TeleMail, and was poised to release the full-featured version 2—with much fanfare.  Some of the maintenance releases of version 1 had not gone well, generating much bad will among early customers and a spreading reputation for poor quality.

It was in this atmosphere that the CEO directed me to conduct an independent review of the recommendation to immediately release the new version into production.  This was not a product that could be rolled out to isolated beta customer sites, but rather a network-based service, so any problems would affect all customers and further erode our fragile industry reputation.

The director of the development shop and his staff were apparently unaware that I had spent three years of my GTE career heading the corporate-wide effort to standardize methods and tools for large system development.  I suspect they thought an hour or so of dense technical slides would make the “Quality Control” guy glaze over and retreat, in awe of their brilliance.  Then they could get on with their release.

The meeting started to deteriorate (at least from the project team’s perspective) when I started probing for such practices as test case generation and test coverage metrics.  By the time I got around to asking to see their development process standards, the room had developed a noticeable chill.  When the director stated that the project was under too tight a deadline for such “overhead”, I asked what was, then, the criterion on which he was recommending to go into production.  His answer was, “We tested until there were no more show stoppers.”

It was at this point that I channeled my high school Jesuit teachers and applied the principle of “reductio ad absurdum” to state, “Well, then, I plan to recommend that we hold the release until your staff delivers to me the exhaustive list of ‘show stoppers’ for which you tested, thereby proving they are all absent.”  The director began sputtering at the absurdity of my request, at which point I suggested that maybe the absurdity was lodged in the claim that there were none in the program, given that there was no definition and no test to which he could point to prove the assertion.

After things settled down, we agreed that the delicacy of the situation with our customers and our historically poor industry reputation probably did justify a reasonably short delay to apply some additional rigor to the testing and test results analysis.  When we did, it turned out that the system functioned as specified, but its performance was dismally inadequate for full production use, so performance optimization was undertaken in parallel with the upgrading of the testing discipline.

The blind spot exposed in this case was hardly unique to this team; it seems to be rather common, as continuing industry experience with software project failures indicates.

By the way, over the three years’ time that included this incident, Telenet went from last place to first place in industry ratings for our products and services.
