
You Got Unit Testing in My Database




January 1, 2007
Unit testing can transform database programming as much as it has transformed application development. I learned this on a recent project that avoided a bog only when I decided to apply unit-testing techniques as a last resort.

Generally, I don’t worry about databases too much. With personal, departmental and many Web applications, the data structure is so simple and the volume is so low that virtually any DBMS suffices. For enterprise applications, there is generally so much complexity and investment in the existing infrastructure that it would be irresponsible to contemplate switching technologies (database portability via SQL being last century’s version of the current WS-* debacle). With the major DBMSes, there are enough tools and specialized knowledge that I defer to the opinion of the DBA. Typically, I’ll be involved in discussions of the object-relational strategy, but I leave the data side of that equation to others.

But ya gotta pay the bills. A client approached me with a project integrating an enterprise database from one company with a data store from another and mixing in some secret sauce to precalculate some values that would…well, you know the drill. It’s not uncommon to see an opportunity where a judicious restructuring of the data and precalculated intermediate values can make a tremendous performance difference.

The problem is that, in the real world, someone else has had the same realization before you. Outside of the textbooks, the structural and mathematical elegance of normal form is vanishingly rare. Whether for good reason (performance) or poor (“Do we need that data anymore?” “I dunno. Let’s put it in a table called x_Employees.”), databases in the real world have redundancies, sparseness and undocumented relationships. Stored procedures, triggers and views create cascades of unexpected updates and deletes, but seemingly never in a manner comprehensive enough to avoid the changes wrought by nightly, weekly and monthly batch scripts.

I took a few runs at the enterprise database using trial editions of a number of commercial tools seemingly well suited to the task. They didn’t help much. Since I don’t maintain professional-level DBA skills, I’m going to refrain from naming names (they may be sharp scalpels in the hands of a specialist), but for an application developer performing a somewhat speculative project, the tools only added to the complexity.

One day, battling paralysis and the time-killing temptations of Digg, I decided to follow the advice of the always reliable Scott Ambler (www.agiledata.org) and wrote a few unit tests against the database. One reason that unit testing is such an important contribution to application development is that you can always write a unit test, even during those times when your brain is otherwise filled with cotton candy. My first unit tests were just this kind of brain-dead makework: testing connection strings, confirming that a specific table had records, etc. There’s no embarrassment in writing silly tests. Unit tests run outside of normal execution, are automated and fast, and tests of basic assumptions (i.e., “silly tests”) often end up quickly isolating snafus.

After a few hours, I managed to move to testing data integrity on some joins. By the end of the day, I was walking over most of the troublesome tables. I wasn’t changing anything yet, and to some it might not appear that I had come any closer to cracking the problem, but I’ve found that one of the great advantages of a unit-testing suite is that it serves as a safety net, emboldening the developer to attempt approaches that he or she otherwise would not, for fear of “breaking something.”
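To make the idea of “silly tests” concrete, here is a minimal sketch in Python rather than the Ruby the column mentions. Everything in it is an assumption for illustration: the in-memory SQLite database, the hypothetical `departments` and `employees` tables, and the helper names `make_demo_db` and `run_suite`. It shows the three kinds of checks described above: the connection answers, a table has records, and a join turns up no orphaned rows.

```python
import sqlite3
import unittest


def make_demo_db():
    """Build a tiny in-memory database standing in for the real one (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE employees (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            department_id INTEGER
        );
        INSERT INTO departments VALUES (1, 'Engineering'), (2, 'Sales');
        INSERT INTO employees VALUES
            (1, 'Ada', 1), (2, 'Grace', 2), (3, 'Linus', 1);
        """
    )
    return conn


class SillyDatabaseTests(unittest.TestCase):
    def setUp(self):
        self.conn = make_demo_db()

    def tearDown(self):
        self.conn.close()

    def test_connection_answers(self):
        # The most basic assumption: we can talk to the database at all.
        self.assertEqual(self.conn.execute("SELECT 1").fetchone()[0], 1)

    def test_employees_has_records(self):
        count = self.conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
        self.assertGreater(count, 0)

    def test_no_orphaned_employees(self):
        # Join integrity: every employee should point at an existing department.
        orphans = self.conn.execute(
            """
            SELECT e.id FROM employees e
            LEFT JOIN departments d ON d.id = e.department_id
            WHERE d.id IS NULL
            """
        ).fetchall()
        self.assertEqual(orphans, [])


def run_suite():
    """Run the tests programmatically and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SillyDatabaseTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite()` runs all three tests. Against a real database, only `make_demo_db` would change, returning a connection to the system under test instead.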

Database refactoring sometimes seems like trying to solve a Rubik’s Cube: You know the solution is many steps away, but you also know that what looks like an attractive intermediate step may very well prove to be a mistake. An automated testing suite gives you the ability to flush out the unintended consequences and, during the inevitable backtracking, assures you that you’re back to a known configuration.
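One way to check that backtracking really has returned the database to a known configuration is to fingerprint it before and after. The sketch below is an illustration under stated assumptions, not the approach used on the project: it uses Python with an in-memory SQLite database, a hypothetical `fingerprint` helper, and a made-up `x_employees` table, and it hashes the schema plus every row so that two states can be compared.

```python
import hashlib
import sqlite3


def fingerprint(conn, tables):
    """Return a hash of the schema and every row in the named tables."""
    digest = hashlib.sha256()
    for table in sorted(tables):
        schema = conn.execute(
            "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
            (table,),
        ).fetchone()
        digest.update(repr(schema).encode())
        # Table names come from our own list, not from user input.
        # Rows are sorted in Python so the hash ignores storage order.
        rows = sorted(conn.execute(f"SELECT * FROM {table}").fetchall(), key=repr)
        digest.update(repr(rows).encode())
    return digest.hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x_employees (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO x_employees VALUES (1, 'Ada'), (2, 'Grace')")
conn.commit()

known_good = fingerprint(conn, ["x_employees"])

# An attractive intermediate step that turns out to be a mistake...
conn.execute("UPDATE x_employees SET name = UPPER(name)")
after_change = fingerprint(conn, ["x_employees"])

# ...so back out of it and confirm the database is where it started.
conn.rollback()
after_rollback = fingerprint(conn, ["x_employees"])
```

Here `after_change` differs from `known_good`, while `after_rollback` matches it. Hashing every row is only practical on small or filtered data; on large tables, row counts or per-table checksums would be a cheaper substitute.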

As the test suite grew, and I began folding in the data from the third party and performing the precalculations, the “clean build” transform and test grew to eight hours, and then 16, and then 32 or more. (I used Ruby to write the test suite and have to wonder whether the running time might have been less in C# or Java.)

I’m not a test-driven purist, so I had no compunction about doing my development against a dramatically filtered subset, leaving the “clean build” for the weekends. In the end, the number of test cases was not all that large, but running them over the database resulted in several million assertions. Along the way, I ultimately turned up hundreds of data discrepancies tangential to the task for which I was hired. If I were a more resolute person, perhaps I would have called a failure a failure and addressed each of them, but instead I simply documented them for a future effort (which, I know, is unlikely to occur).
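The pattern in this paragraph, a handful of checks applied row by row, run against a filtered subset during development, with discrepancies recorded rather than treated as hard failures, might look roughly like the following. This is a sketch in Python with an in-memory SQLite database; the `orders` table, its columns, and the `check_orders` helper are all hypothetical.

```python
import sqlite3


def check_orders(conn, where="1 = 1"):
    """Run row-level checks; return (assertions_run, discrepancies).

    `where` is a developer-supplied filter for working on a subset,
    not user input, so it is interpolated directly into the query.
    """
    assertions = 0
    discrepancies = []
    query = (
        "SELECT id, quantity, unit_price_cents, total_cents "
        f"FROM orders WHERE {where} ORDER BY id"
    )
    for order_id, quantity, unit_price, total in conn.execute(query):
        assertions += 1
        if quantity is None or quantity <= 0:
            discrepancies.append((order_id, "non-positive quantity"))
            continue  # the total check is meaningless for this row
        assertions += 1
        if total != quantity * unit_price:
            discrepancies.append((order_id, "precalculated total is stale"))
    return assertions, discrepancies


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, quantity INTEGER, "
    "unit_price_cents INTEGER, total_cents INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(1, 2, 500, 1000), (2, 1, 250, 250), (3, 0, 100, 0), (4, 3, 100, 200)],
)

# Day-to-day development: a dramatically filtered subset.
subset_assertions, subset_problems = check_orders(conn, where="id <= 2")

# The weekend "clean build": every row.
full_assertions, full_problems = check_orders(conn)
```

Two checks per row is how a small number of test cases turns into millions of assertions on a large database. Returning the discrepancies as a list, instead of raising on the first one, is what makes it possible to document them for later.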

Aside from those discrepancies, after the database test suite was in place, the data model didn’t break once when being tested against both the new and legacy applications. Not bad for several million records in more than a hundred tables.

Larry O’Brien is a technology consultant, analyst and writer. Read his blog at www.knowing.net.


Share this link: http://www.sdtimes.com/link/29932
 
