Sunday, December 7, 2008

Measuring the Value of Testing

One of the best and most obvious ways to excel in any company is to show how your work is making the company money, or providing value that directly leads to making the company money. The more directly you can show your connection to how much money is being made, the better position you're in. For instance, if you can easily show that the feature you developed, or the feature you designed and brought to market, made the company $X over a time frame of Y years, you're doing well. If you can optimize the function such that X is large and Y is small, you're on a fast track to upper management.

Showing this sort of result in testing is a much harder proposition. It's easy to say that you tested a feature or product that made $X over a time frame of Y years, but let's be honest here: being the product manager or lead developer and making the same statement carries a lot more weight on a resume. The person generating the idea and the person implementing it will always be seen as closer to the end result and the positive cash flow than the person ensuring the value of the idea, and ensuring that the mass market can actually use it as expected. I'm not saying that this is "the way things should be"; I'm just saying it's the way things are.

That being said, here's why I think demonstrating the value you've directly provided to a product is harder for testers: you can't show the diff between what the product actually made and what it would have made had the bugs you found shipped to the public. That alternate reality is one we don't have access to; it's an A/B test that can't be run.
If it were possible to conclusively show that the company made $Z more because of the bugs that were found and fixed, I think testing would be taken more seriously across the industry.

Friday, December 5, 2008

Testing in Layers

My old boss and mentor was chatting with me over Facebook recently about a concept he termed "testing in layers". It's the basic idea of slowly testing out a new release with the people closest to the code, then providing the release to people a little further away (friends/family/investors), then a little further out (active members on your forums), eventually moving out toward your normal everyday users. The idea struck me because it provides a simple mental model for something I think many in the testing community are already doing at a basic level, and it leaves plenty of room for complexity if need be. It's nice to have a name for an existing process, so that it can be better codified and thought out.

In my world, I use this idea in the following context:
  • Test internally with engineering (QA)
  • Test internally with non-engineering (Marketing, etc.)
  • Test by releasing to trusted users on our forums
  • Test by sending the release to particular people who email into Support, whose problems may be fixed with this release
  • Test by releasing to everyone on our forums
  • Test by releasing to a small number of people who download through the website
One of the issues with this idea is that a feedback loop needs to be in place at each layer. If we release a build to the forums and don't listen to the complaints and issues coming from that layer, then the testing is pointless. However, the further out you get from the center, the harder it is to get solid feedback. When we release to a small number of people through the website, they probably don't know that they're getting a brand-spanking-new release. In that case, how do we get their feedback? Right now, we simply watch to see whether they uninstall, and if the stars align, they might even leave us feedback. It's not perfect and could use improvement, but so far it's a process that is working adequately to help us determine how the release is doing.
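
If it helps to see the idea written down, here's a toy sketch of the layered rollout in Python, with a feedback check gating each step outward. The layer names mirror the list above; the feedback hooks and function names are hypothetical placeholders, not our actual tooling.

# A toy sketch of "testing in layers": an ordered list of audiences, each
# with its own feedback channel, and a rollout that only moves outward when
# the current layer hasn't surfaced any blocking issues.
# NOTE: the feedback channels are stand-ins for whatever bug tracker, forum
# thread, or uninstall metric you actually watch.

LAYERS = [
    {"name": "engineering (QA)",             "feedback": "bug tracker"},
    {"name": "internal non-engineering",     "feedback": "hallway chats and email"},
    {"name": "trusted forum users",          "feedback": "forum thread"},
    {"name": "support correspondents",       "feedback": "support replies"},
    {"name": "everyone on the forums",       "feedback": "forum thread"},
    {"name": "small slice of web downloads", "feedback": "uninstall rate"},
]

def roll_out(build, has_blocking_issues):
    """Push `build` outward one layer at a time.

    `has_blocking_issues(build, layer)` is whatever check you run against
    that layer's feedback channel; the rollout halts as soon as it reports
    trouble.
    """
    for layer in LAYERS:
        print("Releasing %s to %s (watching: %s)"
              % (build, layer["name"], layer["feedback"]))
        if has_blocking_issues(build, layer):
            print("Blocking issue found at '%s'; halting rollout." % layer["name"])
            return False
    return True

if __name__ == "__main__":
    # Pretend nothing blocks; in real life this would query each feedback channel.
    roll_out("1.5.2-beta", lambda build, layer: False)
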

The idea of quality feedback loops for your product will be saved for another blog post. Let's just say that Twitter has been proving quite useful for getting product feedback lately.

Thursday, December 4, 2008

When has Outlook "Started"?

I have recently had the opportunity to work with a small team to define and run some basic performance tests for Xobni. If you've done serious performance testing before, I probably don't need to tell you what a peculiar beast it can be. The fun part about this project is that we're measuring areas where our customers are having pain points, and building tools to automate those measurements. On the other side of the coin, creating a consistent, controlled environment where measurements can be taken without fear that something external is skewing them keeps me up at night. No, not literally, but environmental control is one of our biggest problems.

One particular measurement we've been struggling with for some time is simply known as "Outlook Startup Time". How do you know when Outlook has fully, really, finally finished starting? Most importantly, when do our users think that Outlook has really "started"? This is an important question for us: if we're going to improve Outlook startup time with Xobni installed, we have to know what that means. Well, here are a few ideas for measuring Outlook startup that we've implemented in a tool of ours:
  • When the "Reading Pane" is visible and has text
  • When the Xobni sidebar appears AND the Reading Pane is visible
  • When the Application.Startup event fires in Outlook
  • When Outlook has finished syncing with Exchange
  • When you are able to move to the next mail and have it load within a certain period of time
  • When the CPU usage drops back to a level on par with usage before you started Outlook
As you can see, we're measuring a lot and trying to see what sticks. What are your thoughts? When is Outlook usable, by your definition?
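
For anyone curious what that last idea looks like in practice, here's a rough sketch of the CPU-settling heuristic in Python. The Outlook path, thresholds, and sample counts are placeholder assumptions for illustration, not the values from our actual tool.

# A minimal sketch of the "CPU settles back to baseline" heuristic: sample
# system CPU before launching Outlook, then time how long it takes for usage
# to stay near that baseline for several consecutive samples.
# NOTE: OUTLOOK_EXE and the thresholds below are assumptions, not real values.
import subprocess
import time

import psutil  # third-party: pip install psutil

OUTLOOK_EXE = r"C:\Program Files\Microsoft Office\Office12\OUTLOOK.EXE"  # assumed path
QUIET_DELTA = 5.0      # percentage points above baseline we still call "quiet"
QUIET_SAMPLES = 5      # consecutive quiet samples required to declare "started"
SAMPLE_INTERVAL = 1.0  # seconds between CPU samples

def measure_outlook_startup():
    # Sample system-wide CPU for a few seconds before launch to get a baseline.
    baseline = sum(psutil.cpu_percent(interval=SAMPLE_INTERVAL) for _ in range(5)) / 5

    start = time.time()
    subprocess.Popen([OUTLOOK_EXE])

    # Wait until CPU usage has been near the baseline for QUIET_SAMPLES in a row.
    # (A real tool would also want a timeout here in case Outlook never settles.)
    quiet = 0
    while quiet < QUIET_SAMPLES:
        usage = psutil.cpu_percent(interval=SAMPLE_INTERVAL)
        quiet = quiet + 1 if usage <= baseline + QUIET_DELTA else 0

    # Subtract the quiet window itself; "startup" ended when the CPU first settled.
    return time.time() - start - QUIET_SAMPLES * SAMPLE_INTERVAL

if __name__ == "__main__":
    print("Outlook startup took about %.1f seconds" % measure_outlook_startup())
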