Well, it's a new start for me. I have a new job with Xobni as their Quality Jedi, and it's a nice change. The startup life has been something I've been talking about for a while, and the time for action presented itself in the form of a ripe opportunity.
I've been thinking a lot more about product quality lately, in the general sense. This is probably because product quality will be much more on my shoulders than before, as I'm working with a relatively new product that has already been launched (in the form of a beta). I've been thinking about how one can attribute the status of "high quality" to any product, and I've realized that the only person who can bestow that status is the consumer.
Ultimately, no matter how much a product is tested and put through the wringer of QA, the consumer is the one who decides whether your product is of high quality. It is the perception of quality that makes something high quality. Everyone I know who has owned a BMW has moaned a little about how often it has to be taken into the shop for repairs. And yet BMW retains its reputation for high-quality cars, because they are "German-engineered".
The perception of quality is powerful, and can directly contribute to product success. The first product put out by Xobni is called "Insight", and is a plug-in for Outlook that gives you a "people-centric" view of email. Email is data that is important to the consumer (vitally important to some), and anything that builds on top of that data must provide value while not corrupting or interfering with the basic tasks (emailing) in any way.
That is the quality challenge with this product, as I see it from a high level. More important than verifying that the product functionality works as expected is making sure that the consumer's current tasks and environment aren't disturbed by the product.
That being said, the question now is: how can we shape the user's perception of quality? This is a question I'll have to ponder more. The value we're building into Outlook will allow users to accomplish their tasks faster and more efficiently. I think this increase in efficiency is the key to the quality perception for our product: users get more done by using our product, without having their current email environment disturbed. This may sound quite basic and obvious, but I think it's good to reinforce these basic points about why we're building what we're building.
Interesting times are ahead.
Tuesday, November 13, 2007
Thursday, August 23, 2007
GTAC!
Sorry for the lack of posts -- I've been out of town recently.
I'm currently at the Google Test Automation Conference, at Google's New York office. Thus far, it's been a great experience. There are two things I've learned about (this morning) that I wanted to blog about: how to design a conference, and Google's testing philosophy.
Allen Hutchinson, an engineering manager at Google, gave some good insight into how they designed the conference and why they made the decisions they did. I pulled several useful points from this talk:
*Keep the conference to under 150 people. Sociological research has shown this is an upper bound (and a magic number) that allows a group to keep a sense of community.
*Provide outlets for people to discuss the conference. Their high-profile testing blog is open for unmoderated comments on this conference, and a Google group was created as well.
*Make the talks available online after the conference.
*Try to reduce the number of repeat speakers from past conferences. They want fresh blood and new ideas introduced in their conference.
*Ask old speakers to help select new speakers - or just have people outside your organization help select the speakers.
Allen also mentioned that they kept the conference to a single track so that you can see all the talks, but to be honest I rather like the multiple-track system. It allows more speakers, and it lets you skip talks that don't interest you.
Pat Copeland, a Director at Google, then spoke about the testing philosophy that Google maintains, which I found quite interesting:
*Measure everything you can. More data allows for more analysis.
*Continuous builds and faster builds allow more time for testing.
*Focus on improvement of the system, and not the bits and pieces.
*Testing goal: fully test a product in one day.
Pat also mentioned some challenges that they face as an organization, which I think apply to pretty much everyone:
1. Simulating real-world load, scale, and chaos is difficult
2. Deciding what to measure is difficult
3. Complex failure scenarios are expensive to test
The talks (and more importantly, the people) have been quite interesting thus far. I'll hopefully have more to update after tomorrow.
Tuesday, July 24, 2007
My Favorite Laws of Software Development
There was a great blog post on the Laws of Software Development, and a few of them struck me as related to testing, so I thought I'd share them.
Brooks' Law: Adding manpower to a late software project makes it later. Or testers, in this case.
Conway's Law: Any piece of software reflects the organizational structure that produced it. Good to keep in mind when testing.
Heisenbug Uncertainty Principle: Most production software bugs are soft: they go away when you look at them.
Hoare's Law of Large Programs: Inside every large problem is a small problem struggling to get out. Those small problems are worth searching for.
Lister's Law: People under time pressure don’t think faster. Or test faster.
Nathan's First Law: Software is a gas; it expands to fill its container.
Sunday, July 15, 2007
Scalable Test Selection
A few weeks ago I gave a presentation at the Google Scalability Conference on the idea of Scalable Test Selection (video can be found here). Now that the video is up, I thought I'd share with you all this idea developed within my team. I must preface this by saying that this was not my idea -- the credit belongs to Amit and Fiona.
As Cem Kaner has said before me, there is never enough time to do all the testing. Time, in essence, is your scarce resource, and it must be allocated intelligently. There are many test strategies one can choose from when testing a product, but the question must be raised: how do you know whether you're testing the right things, given the amount of time you have to test? This is an impossible question to answer in general, but there are situations where it becomes especially pertinent. For instance: when a change is introduced to your product at the last minute before release and you have to decide what to test, how do you choose? There are probably some intelligent guesses you can make about what to test based on what has changed, but how can you know that this change hasn't broken a distant dependency?
This is the situation where we believe Scalable Test Selection can help. Given a source code change, what test cases are associated with that source code? Essentially, how can we link test cases to source code using test artifacts?
We have identified (and presented on) three ways to associate test cases with source code:
- Requirements: If source code is checked in to satisfy requirements, and test cases are checked in to satisfy requirements, then a connection can be made.
- Defects: A test case fails, which is then associated with a new defect. When source code is checked in to fix that defect, an association is made between the code and the test case.
- Build Correlation: For a single build, you can associate a set of source code changes with a set of test case failures. Now iterate that over successive builds, and you have a large set of source code and test case associations.
What we're doing is applying simple data-mining techniques to the test data. There is much more to the idea that I'm not covering here (prioritization of test cases, implementation ideas, etc.), but I hope you get the gist. I fully recommend you watch the video if this topic interests you, and feel free to email me if you want the slides :).
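To make the Build Correlation idea a bit more concrete, here is a minimal sketch of the kind of data-mining involved. This is an illustration only, not our actual implementation, and the names (BuildCorrelation, recordBuild, testsFor) are hypothetical: it simply counts how often a changed file and a failing test show up in the same build, and ranks the tests for a file by that count.
import java.util.*;

// Hypothetical sketch (not our production code): count how often each
// (changed file, failed test) pair co-occurs across builds. A high count
// suggests the test is relevant to that file, so it is worth selecting
// when the file changes again.
class BuildCorrelation {
    // file -> ( test -> number of builds where the file changed AND the test failed )
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();

    void recordBuild( Set<String> changedFiles, Set<String> failedTests ){
        for( String file : changedFiles ){
            Map<String, Integer> perFile = counts.computeIfAbsent( file, f -> new HashMap<>() );
            for( String test : failedTests ){
                perFile.merge( test, 1, Integer::sum );
            }
        }
    }

    // Tests most strongly correlated with a change to this file, best first.
    List<String> testsFor( String changedFile ){
        Map<String, Integer> perFile = counts.getOrDefault( changedFile, new HashMap<>() );
        List<String> ranked = new ArrayList<>( perFile.keySet() );
        ranked.sort( Comparator.comparingInt( perFile::get ).reversed() );
        return ranked;
    }
}
Given a last-minute change, you would ask testsFor(changedFile) and run the top of that list first; the requirements and defect associations could be folded into the same ranking.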
Monday, July 9, 2007
CAST Conference
I just finished Day 1 of the AST Conference (CAST), and boy did I have a good time. The talks were good, but the people were better. There was definitely a general feeling of community there, which is something I haven't seen much at conferences I've been to lately. The crowd was relatively small (~180), but all good people.
Highlights for me:
* Hearing Lee Copeland talk about the role of QA as a services group (this makes almost too much sense to me)
* Talking to Jon Bach about the future of AST, CAST, and the other conferences that AST is hosting
* Watching a small group gather after my talk on Meta-Frameworks to share their experience with Meta-Frameworks
Anyway, I'm leaving tomorrow, and I wish I were staying another day.
Saturday, July 7, 2007
The Time Trade-Off
I started prepping for the CAST conference next week by reading up on some test patterns that the AST group has produced in the past. I was reading some great stuff this weekend by Cem Kaner on Scenario Testing, when I came across a fantastic quote:
"The fundamental challenge of all software testing is the time tradeoff. There is never enough time to do all of the testing, test planning, test documentation, test result reporting, and other test-related work that you rationally want to do. Any minute you spend on one task is a minute that cannot be spent on the other tasks. Once a program has become reasonably stable, you have the potential to put it through complex, challenging tests. It can take a lot of time to learn enough about the customers, the environment, the risks, the subject matter of the program, etc. in order to write truly challenging and informative tests."
It's so true, it's painful. Deciding on what to test is becoming increasingly important in my own work, as the amount of work stacks up, and the amount of time to test it decreases. It's an interesting balancing act.
Friday, June 29, 2007
An interesting bug in Java binarySearch
Josh Bloch, author of Effective Java and one of the best in the biz, blogged last summer about an error in his own implementation of binary search within Sun's JDK. It turns out that the most straightforward implementation admits a flaw so subtle that it lay dormant in java.util.Arrays for almost a decade. It is one of the most interesting bugs I have seen in a long time, and in fact I didn't notice it myself. Simplified slightly from the original:
// locate an index of key k in sorted array a[],
// or if k is not present, return -1
int binarySearch( int[] a, int k ){
    int i = 0;
    int j = a.length - 1;
    while( i <= j ){
        int m = (i + j) / 2;
        if( k < a[m] ){         // left half
            j = m - 1;
        } else if( a[m] < k ){  // right half
            i = m + 1;
        } else {                // found
            return m;
        }
    }
    return -1; // not present
}
The culprit is sly: operating on a large enough array, the sum in m = (i + j) / 2 overflows, wrapping around to a negative value, e.g.
int M = 0x7fffffff; // 2^31 - 1
int[] a = new int[ M ];
a[ M - 1 ] = 1;
// the call...
int r = binarySearch( a, 1 );
// ...throws an ArrayIndexOutOfBoundsException !
Fortunately the midpoint can be computed safely via m = i + (j-i)/2. Already there are many blog posts on this blemish, but none concerning quality. What might a software engineer learn about testing from this long-hidden oversight?
1. A lesson from numerical analysis: time-honored formulas aren't necessarily robust when carried over to finite arithmetic.
2. Vary your test data. Surely the Sun test battery for binarySearch passed, again and again, on the same sets of arrays and keys. Fine for 9 years -- but static lists of tests become stale. Which arrays should we select, or search for? Read on...
3. Specialized conditions are required to pinpoint rare defects. This flaw cannot be exposed for sizes below 2^30, about 1 billion; no wonder nobody noticed. The truth is that the endpoints of anything almost always provide interesting insights. Here, searching far to the left always passes, and only searching far enough to the right can throw the Exception. Aha. This leads to a favorite maxim...
4. Design tests specific to the implementation. The largest Java int value is M -- nothing to do with binary search, but a boundary in the language itself! Therefore, an array of that length makes a worthwhile test case in Java (cf. Mathematica, Python). Something to remember when writing your own test suites.
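Returning to the safe midpoint mentioned above, here is a minimal sketch of two overflow-safe ways to compute it (the method names are mine; as far as I know, the unsigned-shift form is the one that ended up in java.util.Arrays):
// Both forms assume 0 <= i <= j, which always holds inside the search loop.
static int midpoint( int i, int j ){
    return i + (j - i) / 2;  // the difference j - i cannot overflow when 0 <= i <= j
}

static int midpointShift( int i, int j ){
    return (i + j) >>> 1;    // unsigned shift reinterprets the wrapped sum as positive
}
Either one, dropped into the loop above in place of (i + j) / 2, makes the ArrayIndexOutOfBoundsException go away.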
Professor Kahan at Berkeley once summed up his lessons about testing (from the infamous Pentium divide glitch) into a fitting quote:
Seek singularities! There lie all the errors.
Wisely put, as are Josh's remarks about programming defensively, vigilance, code reviews, static analysis, solid proofs, and formal verification. (I would disagree with his view that there will always be bugs, but opposing the formidable Dr. Bloch isn't really something one ought to do.)
The paradox is that although testing can't exterminate our errors, it still needs to be done to provide corroborative evidence of correctness, even for well-proved algorithms! Later, I'll augment this post with test strategies which might have exposed the flaw. With test methods in mind, we might be able to expose ever more errors, and add to our QA repertoire.
An exercise: Using 32-bit two's complement integer arithmetic, compute the average of i and j, correctly rounding to even. Test it with i and j at the extremes. My solution is not short.
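For what it's worth, here is one attempt at the exercise. Treat it as a sketch only (the method name is mine, and there are certainly other ways to do it): it computes the floor of the average with the classic bit identity i + j == 2*(i & j) + (i ^ j), then nudges the exact-half case toward the even neighbor.
// One attempt: average of i and j in 32-bit two's complement arithmetic,
// rounding an exact .5 to the even neighbor. No intermediate value overflows.
static int averageRoundHalfToEven( int i, int j ){
    // floor( (i + j) / 2 ) without overflow, using i + j == 2*(i & j) + (i ^ j)
    int floorAvg = (i & j) + ((i ^ j) >> 1);
    // the true average ends in .5 exactly when i and j differ in their low bit
    boolean half = ((i ^ j) & 1) != 0;
    if( half && (floorAvg & 1) != 0 ){
        floorAvg += 1;  // step up to the even neighbor
    }
    return floorAvg;
}
As a spot check at the extremes, i = Integer.MIN_VALUE and j = Integer.MAX_VALUE has a true average of -0.5, and the sketch returns 0, the even neighbor.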
Labels: binary search, test strategy, testing, verification