Automated Unit Tests and the Dreaded Random Data

Recently I’ve been reviewing some unit tests for one of our major projects, and a coding practice I noticed in these tests has started to annoy me. Sadly I am the one to blame for it, as I started it, but it is one that I hope to fix for all future tests (and some of the old ones). It is the use of random data in an automated test case. The practice emerged because our tests needed to be repeatable on a system that has unique constraints on certain tables in the database, such as the one on Username: the test cases would randomly generate a valid username and register the user as part of the test. OK, this is fine, but then the practice spread to all data being put into the system, including measurements, dates, company names, even phone numbers. We wrote functions to create valid random data, and even to randomly pick codes.

This gives great-looking test data, not just for your automated tests but also for any user-interactive testing. The problem for me is that in a Unit Test you want to keep your variables to a minimum. A well-designed Unit Test should be repeatable, now or 10 years from now, giving exactly the same results every time you run it. It should be able to check specifically that certain things do or do not happen. When random data is used for values that the test depends on, tests can give false negatives or, worse, false positives (but only some of the time).

Example

    [Test]
    public void UpdateUserAge()   // NUnit only runs public test methods
    {
        Random rand = new Random();
        User user = UserData.GetUserByID(1);

        // Random new age from 0 to 99; nothing stops it matching the stored age.
        user.Age = rand.Next(100);
        UserData.UpdateUser(user);

        // Re-read the user and check that the age we wrote is the age that was saved.
        User updatedUser = UserData.GetUserByID(1);
        Assert.AreEqual(user.Age, updatedUser.Age);
    }

In this example, a simple test gets the User with ID 1 from a database (please ignore the fact that the ID is hard-coded), updates the Age in the database with a random number from 0 to 99 (rand.Next(100) excludes its upper bound), and checks that the UserData.UpdateUser() function properly performed the update.

So what’s wrong with this code?

Well, imagine that the UpdateUser function accidentally omitted the Age property in its updates (this actually happened to us a while back, when a field was added to a database but left out of the update logic). Roughly 1 in 100 times the random age would equal the age already stored, and the test would give a false positive that updatedUser.Age was correctly changed. If you were unlucky enough to have this occur during initial development, you might never have noticed that UserData.UpdateUser() did nothing.
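
To make the 1-in-100 figure concrete, here is a minimal sketch. It assumes the test compares the age it reads back with the age it wrote, and it swaps the database for a hypothetical in-memory BrokenUserStore whose UpdateUser forgets the Age field. Neither class is part of our real code. Instead of rolling a random number, it tries every value rand.Next(100) could return and counts how many let the check pass even though nothing was saved.

```csharp
using System;

public class User
{
    public int Id { get; set; }
    public int Age { get; set; }
}

// Hypothetical stand-in for UserData: the update "forgets" the Age field.
public class BrokenUserStore
{
    private readonly int storedAge;

    public BrokenUserStore(int initialAge)
    {
        storedAge = initialAge;
    }

    public User GetUserByID(int id)
    {
        return new User { Id = id, Age = storedAge };
    }

    public void UpdateUser(User user)
    {
        // Bug under test: Age is never written back.
    }
}

public static class FalsePositiveDemo
{
    // Tries every value rand.Next(100) can return (0 to 99) and counts
    // how many make the check pass although the update did nothing.
    public static int CountFalsePasses(int initialAge)
    {
        int passes = 0;
        for (int newAge = 0; newAge < 100; newAge++)
        {
            var store = new BrokenUserStore(initialAge);
            User user = store.GetUserByID(1);
            user.Age = newAge;
            store.UpdateUser(user);

            User updatedUser = store.GetUserByID(1);
            if (updatedUser.Age == user.Age)
            {
                passes++;
            }
        }
        return passes;
    }

    public static void Main()
    {
        Console.WriteLine(CountFalsePasses(42)); // prints 1
    }
}
```

Exactly one of the 100 possible values slips through, which is where the "roughly 1 in 100" comes from (as long as the stored age is itself between 0 and 99).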

OK, this is a simple example, but it shows how your Unit Tests can create (or hide) bugs of their own. Here’s the point I’m trying to make: go to the effort of setting up a known scenario for your test, then change that known scenario, and double-check that everything equals what you expected. In the above example I’m also assuming an existing User. What if two developers were running this test at once? Create the data from scratch. It makes your tests larger (much larger), and if this becomes a problem, create an ordered test where the scenario is created once for a whole suite of test cases that are run in sequence.
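
Here is a rough sketch of what a known-scenario version could look like. It assumes NUnit (which the [Test] attribute suggests) and uses a hypothetical InMemoryUserStore in place of UserData so that it stands on its own. The store, its CreateUser method, and the username and ages are all invented for illustration; a real test would create the user through whatever your own data layer provides.

```csharp
using System.Collections.Generic;
using NUnit.Framework;

public class User
{
    public int Id { get; set; }
    public string Username { get; set; }
    public int Age { get; set; }
}

// Hypothetical in-memory stand-in for the UserData layer.
public class InMemoryUserStore
{
    private readonly Dictionary<int, User> users = new Dictionary<int, User>();
    private int nextId = 1;

    public int CreateUser(string username, int age)
    {
        int id = nextId++;
        users[id] = new User { Id = id, Username = username, Age = age };
        return id;
    }

    // Returns a copy, so callers cannot change stored data without UpdateUser.
    public User GetUserByID(int id)
    {
        User stored = users[id];
        return new User { Id = stored.Id, Username = stored.Username, Age = stored.Age };
    }

    public void UpdateUser(User user)
    {
        users[user.Id] = new User { Id = user.Id, Username = user.Username, Age = user.Age };
    }
}

[TestFixture]
public class UpdateUserTests
{
    [Test]
    public void UpdateUser_ChangesKnownAgeToKnownAge()
    {
        // Known scenario, created from scratch: no reliance on an existing user.
        var store = new InMemoryUserStore();
        int id = store.CreateUser("known.user", 30);

        // Change the scenario to a value that is known to differ from the original.
        User user = store.GetUserByID(id);
        user.Age = 31;
        store.UpdateUser(user);

        // Check the exact expected values, including the field we did not touch.
        User updatedUser = store.GetUserByID(id);
        Assert.That(updatedUser.Age, Is.EqualTo(31));
        Assert.That(updatedUser.Username, Is.EqualTo("known.user"));
    }
}
```

Because the starting age (30) and the new age (31) are both fixed, an update that silently skips Age fails this test on every run, not just on 99 runs out of 100.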
