
Database 'renaissance' gives developers choices




April 22, 2009 — 
Over the past two years, new database projects have become increasingly common. From document databases, such as Damien Katz's CouchDB (written in Erlang) and 10gen's MongoDB, to the recent MySQL fork known as Drizzle and the growing popularity of the Apache Hadoop project, database developers have been working hard to improve the state of data storage and MapReduce-style processing.

Geir Magnusson is a member of the Apache Software Foundation, and he has been working with 10gen on MongoDB, a document database similar in function to CouchDB. Originally a cloud platform company, 10gen discovered that the database portion of its project was holding its own. Thus, the company ditched the platform and focused on MongoDB, which stores JSON elements in binary form to provide persistent storage for Web-based applications.

"It's being referred to as this renaissance of databases," said Magnusson. "All of the sudden, from out of the Dark Ages we've got all these ideas that people are willing to try and use. It's a fantastic time for databases and data storage. The problem is [that] you say 'database' and people automatically think of a relational database."

He said that much of the new work in databases is outside of the relational domain.

"My position is that we have two things happening," said Magnusson. "One is…because of the way applications are moving towards the Web, and Internet developers are starting to have to deal with scale of datasets for activity. They traditionally haven't had to do that when they were producing department-level applications. They've been taking the LAMP stack and starting to find that when the things get big, it falls apart.

"Second, it appears that work over the last 30 years in distributed systems is really coming together and bearing fruit in terms of the kind of systems we're seeing.

"You see a lot more work around the key-value stores: Tokyo Cabinet and MemcacheDB, for example. There are a couple of implementations of Google BigTable, such as Apache Hadoop. Again, there's a realization that a lot of distributed systems research was correct."

These key-value storage mechanisms speed Web application performance and development by cutting out the fancy footwork and sticking to what databases do best: store information that can be indexed, searched and maintained quickly and easily. That means eliminating ERP-like functions and staying away from adding new features.
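To make the key-value idea concrete, here is a minimal Python sketch of a store that does little more than put, get, delete and look up records through a single index. It is an illustration only: the class name `KeyValueStore`, its methods and the sample records are hypothetical, and it does not reflect the API of Tokyo Cabinet, MemcacheDB or any other product named in this article.

```python
from collections import defaultdict


class KeyValueStore:
    """Toy in-memory key-value store with one secondary index (illustrative only)."""

    def __init__(self, index_field):
        self._data = {}                   # key -> record (a dict)
        self._index_field = index_field   # the one record field we index
        self._index = defaultdict(set)    # field value -> set of keys

    def put(self, key, value):
        # Remove any previous record first so the index never goes stale.
        self.delete(key)
        self._data[key] = value
        if self._index_field in value:
            self._index[value[self._index_field]].add(key)

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        old = self._data.pop(key, None)
        if old is not None and self._index_field in old:
            self._index[old[self._index_field]].discard(key)

    def find(self, field_value):
        # Look up keys through the index instead of scanning every record.
        return sorted(self._index.get(field_value, ()))


store = KeyValueStore(index_field="city")
store.put("user:1", {"name": "Ada", "city": "London"})
store.put("user:2", {"name": "Linus", "city": "Helsinki"})
store.put("user:3", {"name": "Alan", "city": "London"})
print(store.find("London"))
```

There are no joins, stored procedures or access controls here; anything beyond storing and finding records is left to the application, which is the trade-off these projects make.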

Katz has been working on CouchDB for over a year now, and he's convinced that his database can solve a number of problems in organizations that haven't yet been addressed in an open-source, scalable way.

"We get massive read scalability [because of Erlang]. The number of concurrent reads we can get is up to 20,000 clients hitting a server," said Katz, describing the best use case for CouchDB.

"The only reason we haven't had more clients hitting a single server is the test tools couldn't generate more than that. The types of stuff that Microsoft SharePoint does would be a very good fit for CouchDB. Things that, if you weren't using a computer in the real world, you'd basically have a bunch of papers stacked and filed and pushed around."

Forking up
Then there are the new updates to old database ideas. Brian Aker has been working on MySQL since it was created, and is now an employee of Sun Microsystems. Recently, he's spent his time working on a fork of MySQL named Drizzle. The project focuses on the 20% of the MySQL functionality that is used by 80% of the users. The goal is to create and offer what he describes as a more scalable, reliable and agile version of MySQL.

"With MySQL 5.0 and 5.1, we've created a database people want to use, but in the end that's not the bread and butter," said Aker. "What they were interested in was performance. So in April of last year we said, 'What can we pull out?' You can run 5.0 on an Amiga or VMS. We decided to not live in the past. We wanted to go back into MySQL and figure out what the majority of our users actually use. We wanted to remove every single feature that doesn't matter unless it matters specifically to the Web environment."

Drizzle forgoes many of the trappings of modern relational databases, said Aker. There is no support for stored procedures, and all user access controls have been removed as well. All of these things, he said, can be moved into the client and only slow down the database.

Amazon, on the other hand, offers two different types of hosted database storage within its Amazon Web Services Elastic Compute Cloud. The company unveiled SimpleDB last year, a stripped-down database service for use with EC2 applications. In March, the company also announced the availability of Apache Hadoop in its cloud, giving users access to a powerful MapReduce function.
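For readers unfamiliar with MapReduce, the pattern can be sketched in a few lines of Python as a word count. This is a single-process illustration of the map, shuffle and reduce steps under simplifying assumptions; the function names are hypothetical, and it is not Hadoop's API, which distributes the same steps across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def map_reduce(documents):
    # Each document could be mapped on a different machine; here we simply loop.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = map_reduce(["the database", "The renaissance"])
print(counts)
```

Because each map call sees only one document and each reduce call sees only one key, both steps can be spread over as many machines as the data requires.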

Adam Selipsky, vice president of product management and developer relations for Amazon Web Services, said that databases always take work to maintain, but SimpleDB aims to remove some of the headaches.

"Running a relational database, irrespective of where you do it, takes a certain amount of work and administration," said Selipsky.

"There are a lot of use cases where people don't need that full functionality of a relational database. SimpleDB is not really meant to be the Swiss Army knife of databases. You're not going to do joins, you're not going to do complex math procedures. If you want to do data indexing and querying, that's 20% of the functionality, and 80% of all the scaling hassles go away from you."

