
I've posted a few bits already on the next big thing in software development, but none of them have been as obvious or as deserved as Hadoop. Named after one of creator Doug Cutting's progeny's stuffed animals, Hadoop is, in my opinion, the killer app for the cloud. Or, at the very least, it's the infrastructure upon which the cloud's killer apps will be built.
You've all been writing systems just like Hadoop since the dawn of computing: It's the infrastructure for building massive data-crunching applications. Your nightly batches. Your monthly customer survey results. Your hourly data sift. As clustered data processing solutions go, Hadoop is a fairly painless one to use, relatively speaking. Still, processing large amounts of data in Hadoop remains a non-trivial task at present: The project is only at version 0.20.0. It sounds like security is a big issue right now, and it currently takes around 20 to 30 minutes to get a crashed NameNode back up, which matters because the NameNode is the single point of failure for the entire cluster. But, already, there seems to be a vibrant community of people building the ecosystem that will eventually make Hadoop a must-have platform in your data center and in your external clouds.

At its core, Hadoop is an implementation of map/reduce, coupled with a distributed file system. That means Hadoop manages the cluster, spreading both the data and the work across its nodes; you just write the code needed for the actual data exploration. Yahoo is a big backer of the project and employs Cutting full-time. They're said to have a 4,000-node Hadoop cluster up and running, with ZooKeeper acting as the sheriff when things go awry.
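For anyone who hasn't met map/reduce before, here is a rough, single-process sketch of the idea in Python. It is not Hadoop's API, and the function names (`map_words`, `shuffle`, `reduce_counts`, `run_job`) are my own inventions for illustration. It only shows the division of labor: you supply the map and reduce steps, and the framework does the grouping in between, which on a real cluster also means moving data between machines.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map step (user code): emit a (word, 1) pair for every word in a line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Grouping step: collect all values emitted under the same key.

    In a real cluster the framework does this for you, across nodes.
    """
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce step (user code): collapse one word's values into a total."""
    return word, sum(counts)


def run_job(lines: Iterable[str]) -> Dict[str, int]:
    """Hypothetical driver that chains map, shuffle and reduce in one process."""
    pairs = (pair for line in lines for pair in map_words(line))
    grouped = shuffle(pairs)
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["nightly batch", "hourly data sift", "nightly data load"]
    for word, total in sorted(run_job(sample).items()):
        print(word, total)
```

Only `map_words` and `reduce_counts` are specific to the word-count problem; swap in different ones and the same skeleton would tally survey answers or sift log lines instead.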
The data loads I've heard about so far have people using Hadoop to crunch anywhere from 40 terabytes to almost a petabyte at once. That's a lot of customer data to sift through. I'm sure your business analytics people will be salivating when they start to play with Hive, Facebook's framework for querying data stored in Hadoop clusters via a SQL-like language.