IBM’s M2 corrals massive data sets with Hadoop



October 2, 2009
With 1,386 members making up the two houses of the Parliament of the United Kingdom, there is certainly no shortage of government data flowing from the territories of Great Britain and Northern Ireland. Bills must be voted upon, elections must be carried out, and many other actions must be tracked.

That is one of the reasons why IBM created M2, an enterprise data analysis platform. M2, announced today at Hadoop World in New York, aims to help organizations better gather important government and business data. It was built using Apache Hadoop, an open-source Java framework that enables applications to work with large sets of data.

M2 is IBM’s latest Web 2.0 technology, joining the ranks of the Mashup Center mashup platform and the WebSphere sMash Web application development environment.

Rod Smith, vice president of IBM’s emerging technologies unit, said M2 is different from other data analyzers because it is flexible and able to scale to large data sets. It can also integrate with other visualization and analytic engines, such as IBM’s Cognos business intelligence software.

Smith said customers reported that they didn’t know how to properly harvest vast amounts of data for business intelligence and analytics. “We scratched our heads about it for a while, and then when the Hadoop project got started up, it looked like a good foundation to build on where we could explore the idea of doing do-it-yourself analytics,” he said.

“It’s about deeper intelligence that’s more exploratory than what you’d think about from a data warehouse.”

In a demo with SD Times, IBM showed a BBC data mashup called “Digital Democracy,” which sifts through government-published data and makes that information easier to access for BBC journalists. The mashup can show which members of Parliament are working on what bills, as well as voting records, demographic trends and many other data points.

M2 has a spider that crawls the Internet to retrieve content, but content can also come from other sources, such as internal databases. In the case of the “Digital Democracy” mashup, the spider collected a few million pages of content over four days, according to Stewart Nickolas, a distinguished engineer for IBM’s emerging technologies unit. For a crawl, a user identifies the URLs to begin with and how broad a search to conduct.
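IBM did not describe how the M2 spider works internally, but the general idea of a crawl bounded by starting URLs and a search limit can be illustrated with a generic sketch. The Python below is not IBM's code: the function `crawl`, the `max_depth` parameter, the `FAKE_WEB` link graph and its example.org addresses are all hypothetical, and the "pages" live in memory rather than on the Internet.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List


def crawl(seeds: Iterable[str],
          max_depth: int,
          get_links: Callable[[str], List[str]]) -> Dict[str, int]:
    """Breadth-first crawl from the seed URLs, at most max_depth hops away.

    Returns a dict mapping each visited URL to the depth where it was found.
    """
    visited: Dict[str, int] = {}
    queue = deque((url, 0) for url in seeds)
    while queue:
        url, depth = queue.popleft()
        if url in visited:
            continue  # already reached by a path that was no longer
        visited[url] = depth
        if depth == max_depth:
            continue  # at the user's limit, so do not follow its links
        for link in get_links(url):
            if link not in visited:
                queue.append((link, depth + 1))
    return visited


# Hypothetical link graph standing in for real Web pages (assumption).
FAKE_WEB = {
    "http://example.org/bills": ["http://example.org/bills/1",
                                 "http://example.org/members"],
    "http://example.org/bills/1": ["http://example.org/members/a"],
    "http://example.org/members": ["http://example.org/members/a"],
    "http://example.org/members/a": ["http://example.org/votes"],
}


def fake_get_links(url: str) -> List[str]:
    """Return the outgoing links of a page in the in-memory graph."""
    return FAKE_WEB.get(url, [])


found = crawl(["http://example.org/bills"], max_depth=2,
              get_links=fake_get_links)
for page, depth in found.items():
    print(depth, page)
```

In this sketch, the starting page and a depth limit of two are the only inputs the user supplies. The page three links away from the start is never fetched, which is how the limit keeps the size of a crawl in check.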


