SD TIMES BLOG

Hadoop everywhere

by Alex Handy 06/22/2009 06:12 PM EST

I've posted a few bits already on the next big thing in software development, but none of them have been as obvious or as deserved as Hadoop. Named after a stuffed animal belonging to one of creator Doug Cutting's children, Hadoop is, in my opinion, the killer app for the cloud. Or, at the very least, it's the infrastructure upon which the cloud's killer apps will be built.

You've all been writing systems just like Hadoop since the dawn of computing: It's the infrastructure for building massive data-crunching applications. Your nightly batches. Your monthly customer survey results. Your hourly data sift. As clustered data processing solutions go, Hadoop is a fairly painless one to use, relatively speaking. Naturally, processing large amounts of data in Hadoop is still a non-trivial task at present: It is only at version 0.20.0. It sounds like security is a big issue right now, and it currently takes around 20 to 30 minutes to get a crashed NameNode back up. That matters because the NameNode is the single point of failure for the entire cluster. But, already, there seems to be a vibrant community of people building the ecosystem that will eventually make Hadoop a must-have platform in your data center and in your external clouds.

At its core, Hadoop is an implementation of MapReduce, coupled with a distributed file system. That means Hadoop manages the cluster; you just write the code needed for the actual data exploration. Yahoo is a big backer of the project and employs Cutting full-time. The company is said to have a 4,000-node Hadoop cluster up and running, with ZooKeeper acting as the sheriff when things go awry.
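
For anyone who hasn't played with map/reduce yet, here is a rough, single-machine sketch of the programming model in Python. To be clear, this is not Hadoop's API (Hadoop itself is written in Java), and the names map_words, reduce_counts and run_job are made up for illustration. The point is only to show the split described above: you write the two small functions, and something else (here a toy loop, in Hadoop the cluster) handles the grouping and plumbing.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(line: str) -> Iterator[Tuple[str, int]]:
    """Map step (hypothetical example): emit a (word, 1) pair per word."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: Iterable[int]) -> Tuple[str, int]:
    """Reduce step (hypothetical example): sum all counts seen for one word."""
    return word, sum(counts)


def run_job(
    records: Iterable[str],
    mapper: Callable[[str], Iterable[Tuple[str, int]]],
    reducer: Callable[[str, Iterable[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Toy stand-in for the framework: map, group by key, then reduce.

    Assumption: everything fits in memory on one machine. A real cluster
    would spread these steps across many nodes.
    """
    grouped: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            grouped[key].append(value)  # the "shuffle": collect values by key
    return dict(reducer(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    lines = ["the cat sat", "the dog sat"]
    print(run_job(lines, map_words, reduce_counts))
```

Swap in a different mapper and reducer and the same run_job loop will compute something else entirely, which is more or less the appeal of the model.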

The data loads I have heard about so far have people using Hadoop to crunch anywhere from 40 terabytes to almost a petabyte at once. That's a lot of customer data to sift through. I'm sure your business analytics people will be salivating when they start to play with Hive, Facebook's framework for querying data in Hadoop clusters via a SQL-like language.
