Tuesday, January 23, 2007

Ethical Hacking

A common question I get from people is, "What do you like about QA?" Actually, it's also a common question I ask people during interviews, but I digress. It's a question I've thought about a lot during my time in the industry, and I have a few answers, but my favorite one is that I consider testing to be a form of ethical hacking.

Putting the negative connotations aside, hacking (in my opinion) is understanding a system so thoroughly that you can discover weaknesses to exploit in order to gain some sort of advantage. That advantage could be an escalation of privileges, using the system in ways that weren't intended or devised, or simply disturbing the data in the system. Not to be too picky, but the actual term for this is "cracking". The idea of hacking extends well beyond computers, as I'm sure many of you know. Using the correct combination of coupons to get a good deal at Best Buy (as seen frequently on Techbargains) is using the system in a way that wasn't intended, and is a form of hacking.

Since hacking unfortunately already carries negative connotations, I like to use the phrase "ethical hacking". As part of my job, I must understand a system well enough to exploit it. There is truly an infinite number of combinations of actions one can try in order to exploit a system. And because any application is a member of a larger system, you must understand more than just the application under test. You must also have knowledge of the underlying operating system(s) it may run on, any network protocols that may be in use, the pitfalls of the underlying architecture, the caveats of the language it was written in, etc., etc. It is the external world housing the application that you must come to understand, and there is a great deal to learn in that. Which keeps life interesting.

Friday, January 19, 2007

Inducing Low Memory Conditions

Testing under stress and load conditions is important for every application. Having empirical knowledge of the breaking point of any computer hosting an application helps you plan for such an occasion, and gives you an idea of what sort of load the machine can handle before it buckles. If you know that 1000 simultaneous users is the tipping point for failure on a web application you've built, and you're seeing daily peaks of 1200 simultaneous users, you know that it's time to either get another machine or tweak the code to allow for more users on the machine.

Recently I did some searching for a program that would induce low memory conditions on my Linux box. I wanted to do some stress testing on a web app while the machine was low on memory. Unfortunately I couldn't find a decent program, so I wrote one! Since I've found that "tutorial" sites are very popular (I visit them myself daily), I thought posting this might be good for the blog. And what's good for the blog is good for Ryan.


#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MEGABYTE 1024*1024
#define LOWMEMORY 30000

/*
 * SUMMARY: Induce low memory conditions by allocating memory in 1 MB chunks and checking free memory until there is only LOWMEMORY KB of free memory left
 * REQUIREMENTS: Must be run on a Linux machine, and the LOWMEMORY number should be in KB
 * AUTHOR: Ryan Gerard, ryan dot gerard at gmail dot com
 */

/* Forward declaration so main() can call getFreeMem() */
int getFreeMem(void);

int main(int argc, char *argv[]) {
     void *myblock = NULL;
     int count = 0;
     int freeMem = 0;

     //Allocate memory in 1 MB chunks as long as the free mem can be found and the free mem is greater than LOWMEMORY KB
     while( (freeMem = getFreeMem()) != -1 && freeMem > LOWMEMORY)
     {
          myblock = (void *) malloc(MEGABYTE);
          if (!myblock) break;
          //Touch every byte so the OS commits physical pages rather than just address space
          memset(myblock,1, MEGABYTE);
          printf("Currently allocating %d MB\n",++count);
          printf("Free mem is %d KB\n", freeMem);
     }

     //Print out amount of free memory left
     printf("%d KB of free memory left\n", getFreeMem());
     exit(0);
}

/*
 * Parse the output of 'vmstat' to determine the amount of free memory left on the system
 */
int getFreeMem(void)
{
     FILE *readFile;
     char string[256];
     int cnt = 0;
     int freeMem = -1;

     //Output memory information to file tmpfile
     system("vmstat -a 1 1 > tmpfile");

     //Open the file for reading
     if ( (readFile = fopen("tmpfile", "r")) != NULL )
     {
          //Get the third line of text from the file
          while( fgets(string, 256, readFile) && cnt < 2 )
               cnt++;

          cnt = 0;
          char* token = strtok(string, " ");
          while(token != NULL && cnt < 3)
          {
               token = strtok(NULL, " ");
               cnt++;
          }

          //Convert to an integer to return (token may be NULL if the line had fewer fields than expected)
          if (token != NULL)
               freeMem = atoi(token);

          //Close the file
          fclose(readFile);
     }

     return freeMem;
}
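
If you want to try this yourself, it compiles with a plain gcc (something like "gcc lowmem.c -o lowmem", if you save the file as lowmem.c) and needs no special privileges. A word of caution: driving a box down to its last 30 MB or so of free memory will likely trigger heavy swapping, and if you set LOWMEMORY too aggressively the kernel's out-of-memory killer may start reaping processes, so run it on a test machine rather than anything you care about.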

Monday, January 15, 2007

A Timeout for Amazon

I wanted to take a timeout from testing to discuss an area I find interesting: the web services platforms being built by Amazon. I am, of course, referring to S3, EC2, SQS, and Mechanical Turk. I was reminded of these services by Marc Hedlund's post on the O'Reilly Radar today.

The game is afoot in the race to develop the next best thing, and Amazon has released these rather interesting technologies that enable developers to build that next best thing cheaply. Jeff Bezos strikes again with services that leverage the idle parts of Amazon's computing infrastructure and provide them to the public at extremely reasonable prices. Here is a quick overview of each of these technologies.

Mechanical Turk

"Developers use the Amazon Mechanical Turk web services API to submit tasks to the Amazon Mechanical Turk web site, approve completed tasks, and incorporate the answers into their software applications."

This web service is in reality a masked human-computer interface. Instead of humans asking computers to perform actions, computers ask humans to perform actions and use the results in whatever computations are being performed. The whole developing area of funnelling off work that is easy for humans but hard for computers (visual pattern recognition, reading obscured text, etc.) is interesting, and efficient, assuming you can assemble enough people to actually do the work.

S3: Simple Storage Service

This is truly a simple web service that allows you to store and retrieve any amount of data from the Amazon infrastructure, and you only pay for what you use.

"It gives any developer access to the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites."

This is a very simple idea, and its significance is easy to grasp. Storage is cheap, so make it available anywhere through standardized methods. The first obvious idea around this technology, online backup, has been done by a few companies already. The cost, you ask? $0.15 per GB-month of storage used, and $0.20 per GB of data transferred.
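
To make that concrete, here is a minimal sketch of what retrieving an object through S3's REST interface can look like from C using libcurl. The bucket and key names below are made up for illustration, the object is assumed to be publicly readable, and a real application would also sign its requests so it could work with private data.

#include <stdio.h>
#include <curl/curl.h>

/*
 * Minimal sketch: fetch a publicly readable S3 object over REST using libcurl.
 * The bucket and key are hypothetical; private objects require signed requests.
 */
int main(void)
{
     CURL *curl;
     CURLcode res;

     curl_global_init(CURL_GLOBAL_DEFAULT);
     curl = curl_easy_init();
     if (!curl)
          return 1;

     //Path-style S3 URL: http://s3.amazonaws.com/<bucket>/<key>
     curl_easy_setopt(curl, CURLOPT_URL, "http://s3.amazonaws.com/my-example-bucket/backups/notes.txt");

     //libcurl writes the response body to stdout by default
     res = curl_easy_perform(curl);
     if (res != CURLE_OK)
          fprintf(stderr, "Request failed: %s\n", curl_easy_strerror(res));

     curl_easy_cleanup(curl);
     curl_global_cleanup();
     return 0;
}

Build it with the libcurl development headers installed and link with -lcurl.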

EC2: Elastic Compute Cloud

EC2 is to computing what S3 is to storage. EC2 provides a web service that allows you to increase your computing capacity on the fly.

"Amazon EC2 reduces the time required to obtain and boot new server instances to minutes, allowing you to quickly scale capacity, both up and down, as your computing requirements change."

The first obvious use I can see for this is reducing the Digg / Slashdot / reddit effect on smaller sites. I hate clicking on a link from one of those sites only to find out that it maxed out its bandwidth limits two hours ago.

Virtualizing all computing resources like this is quite interesting - I'd be interested in reading a story by someone who runs a site entirely on Amazon's infrastructure. Requests made to the domain would use the EC2 computing cloud to retrieve the page requested from the S3 storage service. Traffic would be monitored, and in the case of a statistical spike, more computing resources could be called up within minutes to serve your new readers.

Again, the cost is very reasonable. $0.10 per instance-hour consumed, $0.20 per GB of data transferred, and any associated S3 costs.
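
To put that in perspective, a single instance left running around the clock for a 30-day month comes to 720 instance-hours, or about $72, before data transfer and any associated S3 charges.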

SQS: Simple Queue Service

"Amazon SQS offers a reliable, highly scalable hosted queue for storing messages as they travel between computers."

SQS is a giant distributed queue built on Amazon's messaging infrastructure. This service-based asynchronous messaging technology is a little less obvious in terms of realistic uses, but very geeky, and hence, very interesting. It reminds me of when Google released Google Sets. My first thought was, "What sane person would use this?", and my second thought was, "I wish I could think of a reason to use this".

Friday, January 12, 2007

Extreme Testing

I'm sure most of you are familiar with the principles of extreme programming. It can be quickly characterized by practices like writing unit tests before coding and pair programming, but you can read more about it on this webpage.

I was thinking on the way home today about modifying the rules of engagement into something called Extreme Testing. It would focus more on the relationship between the two people in the pair. In particular, one of the two would be a QA engineer. They would work closely together in the planning and design phase - not much would change there. During the coding phase, the two would spend half their time together in a modified pair programming mode; most likely the main developer would do most of the coding, and the QA person would be there making recommendations and understanding what's going on. The other half of the time would be spent on their own: the developer doing his thing, and the QA person developing test plans, documenting the design, and creating process flows and timing analyses. Then in the end there is a big celebration with much rejoicing.

Why would one think of such a scheme? What is the possible value of this to you? A more effective method of knowledge transfer. Very loosely, here is how the software development process works today: the developer designs the component and hands you a design doc. That developer then builds his component and hands you (the QA person) a finished product to test. Meanwhile, you must learn the intricacies of this component on your own, for the most part. You have a design document and checked-in code to sift through to understand the inner magic.

Being in QA, understanding the component better than the author is your quest. You must know it so well that you can break it with your left pinky. However, the process of transferring that understanding and knowledge is somewhat broken. Here is the current process: developer's head --> design document --> QA's head. Stuff is lost in between, I assure you. Developers stereotypically dislike documenting their components, making the knowledge transfer process more difficult. Seriously, I love a well-written design doc. It's like crack-cocaine for me. I would mainline a well-written design doc if I could only fit it on that little spoon.

So what is the solution? Well, how about some extreme testing? Working side by side with the developer while he's working, and documenting the component and the processes involved, is in my opinion a better way of guaranteeing that the knowledge is transferred effectively.

Tuesday, January 9, 2007

Quality Problem Solving

All good engineers are at their roots good problem solvers. Despite the fact that my degree is in computer science, I feel that I should have received a bachelor's in "Technical Problem Solving". This may be obvious to some of those reading this blog, but this point was definitely not emphasized going through school. Until you're pretty deep into the program, you are led to believe that this education is in how to program, and nothing else. I'll have to write later about the issues I have with computer science education before I digress any further, as this post is about a book.

I've started a book that has turned out to be quite interesting. The book is called Quality Problem Solving by Gerald Smith. Now before you laugh at the fact that I found this interesting, hear me out. Being that I'm a member of the QA community, it is problems of quality that currently concern me. I find the book to be very realistic in its view of quality and problem solving in general, without any hokey aphorisms or a-b-c methods of solving problems. For instance, take this gem:

"The most common weakness in practical reasoning, as in problem solving, is incompleteness. Poor outcomes result not so much from the mistakes we make as from the possibilities we overlook."

Gerald starts the book with a very sober view of the word "quality" and how it has been manipulated and had its meaning raided by corporate vandals, citing TQM and other quality fads frequently. He rightly states that these movements had good principles and ideas behind them, and if used in the right context were very useful. However, many were applied to solve every problem known to man and beast, and hence fell out of favor after failing in those contexts.

He continues on to lay down a philosophical yet clear foundation for solving problems, discussing problem identification, problem definition, diagnosis, and alternative generation. He discusses these topics in general terms, and provides good examples for each.

The last part of the book applies the problem solving foundations to specific problems related to quality. I can't evaluate that part of the book, as I have not yet read it, but it looks promising. The TOC lists the topics of conformance, efficiency, product design, process design, and unstructured performance problems.

Basically what I'm saying is this book is good, and I recommend it.

Saturday, January 6, 2007

More learning materials

Continuing the last post's theme of highlighting good learning materials, I saw on Michael Howard's blog a post about online security sessions from the Microsoft TechEd IT Forum. It looks like pretty awesome material. There are videos demonstrating "real" hacking techniques live, as well as sessions on Vista UAC internals and Vista kernel changes.

For anyone in QA that will be working on Vista, these look like good resources.

Google London Test Automation Conference Videos

I realize that this is old news (September 2006), but I wanted to remind people that there are some fantastic videos available through Google Video. Specifically, the London Test Automation Conference videos being hosted there. Some great stuff is covered if you have the time, including a video on Selenium (a programmatic cross-browser web GUI test tool) and distributed testing with SmartFrog.

These videos are highly recommended if you have the time and inclination.

Wednesday, January 3, 2007

Exploratory Eating

My girlfriend and I sometimes perform these food experiments. For one month, we do or eat something differently just to see what happens. For instance, we were vegetarians for one month. This month of January, we're cutting all sweets and desserts out of our diet.

I guess these are more like body experiments. The point is just to see what happens when we change our diet. You can call it "exploratory eating". Do we feel more energized? Healthier? More tired? Is there no change?

The point I'm trying to make is that the testing mindset can (and should) be applied outside of software. Applying your testing skills to your finances, health, and mental well-being could be an interesting (and possibly rewarding) experience.