
DataRush tackles multicore woes




September 15, 2008
In late August, Pervasive Software made available a release candidate of DataRush, its data flow-based multicore library for Java, which is meant to ease the problems associated with re-architecting programs for parallel processing.

Steve Hochschild, manager of business development for DataRush, said that Pervasive believes the solution to multicore woes is the use of data flow rather than von Neumann programming models. “It's a fascinating approach. Unlike traditional von Neumann programming, where you have a program counter and the application marches down through the code line by line, in a data flow architecture, each component is unique and independent, and is only connected to others through input and output queues,” said Hochschild.

“When a piece of data arrives at the input queue, it gets taken in and immediately processed, then pushed out through the output queue. A lot of people call this pipelining. Spreadsheets are data flow implementations. If you type a value into a cell, that cell does what it's supposed to do, then as soon as that's done, any other cell that's using that cell as an input also crunches it down, and so on.”
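The queue-connected components Hochschild describes can be sketched in plain Java. The example below is only an illustration built on the standard java.util.concurrent classes; it does not use or imitate the DataRush API, and the class name, the stage helper and the END sentinel are all hypothetical. Each stage runs on its own thread, takes a value from its input queue, processes it, and pushes the result to its output queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

public class QueuePipelineSketch {
    // Hypothetical end-of-stream marker; this value is reserved and must not appear in the data.
    private static final Integer END = Integer.MIN_VALUE;

    // Starts one independent component: it only knows its input queue and its output queue.
    static Thread stage(BlockingQueue<Integer> in, BlockingQueue<Integer> out,
                        Function<Integer, Integer> op) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    Integer v = in.take();          // wait for data to arrive
                    if (v.equals(END)) {            // forward the marker and stop
                        out.put(END);
                        return;
                    }
                    out.put(op.apply(v));           // process, then push downstream
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        return t;
    }

    // Wires up feeder -> (x * 2) -> (x + 1) -> collector and returns the results in order.
    static List<Integer> run(List<Integer> input) throws InterruptedException {
        BlockingQueue<Integer> q1 = new ArrayBlockingQueue<>(16);
        BlockingQueue<Integer> q2 = new ArrayBlockingQueue<>(16);
        BlockingQueue<Integer> q3 = new ArrayBlockingQueue<>(16);

        stage(q1, q2, x -> x * 2);
        stage(q2, q3, x -> x + 1);

        // The feeder runs on its own thread so bounded queues cannot stall the collector.
        Thread feeder = new Thread(() -> {
            try {
                for (Integer v : input) q1.put(v);
                q1.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        feeder.start();

        List<Integer> results = new ArrayList<>();
        while (true) {
            Integer v = q3.take();
            if (v.equals(END)) break;
            results.add(v);
        }
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(java.util.Arrays.asList(1, 2, 3)));
    }
}
```

Because each stage has a single thread and the queues are first-in, first-out, results come out in input order, while the two stages can work on different values at the same time.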

Originally designed to help programmers with signal processing applications, data flow has been a programming model for decades. Only now, however, are programmers beginning to see that it's a great method for overcoming parallelism problems.

How it works with Java
DataRush comes in the form of Java libraries and a runtime component that decides where each component process will live and work. Hochschild said that retrofitting an existing application to take advantage of DataRush is not likely to be effective. But for projects starting from scratch and requiring intensive operations on a single large chunk of data, he said, the potential time savings are substantial.

“You really need to look at your application in a different way,” said Hochschild about architecting for DataRush. “You look at your application and decide what each piece of data needs to change. As this record is going through the system, what has to happen? You lay out the operators, and it's all 100% Java. If you can take that time and study the application in that way, we've seen huge productivity gains from the design-time benefits.”

Hochschild said that the included operators can be expanded upon, but they already have many of the basic functions developers need for complex data-intensive applications. These include text readers, positioners, sorters, aggregators and other common functions. Once the initial design is complete, said Hochschild, the time saved on development and on the actual running of the applications in production is significant.
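To give a feel for what "laying out the operators" might look like, here is a minimal sketch that chains a text reader, a sorter and an aggregator. It is not DataRush code: the operator names, the Row class and the "key,amount" input format are assumptions made for illustration, and the chain is built with the standard java.util.function.Function interface. It runs sequentially and shows only the design-time layout, not how a runtime would spread the operators across cores.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class OperatorLayoutSketch {
    /** One parsed record: a key and a numeric amount (hypothetical record shape). */
    static final class Row {
        final String key;
        final int amount;
        Row(String key, int amount) { this.key = key; this.amount = amount; }
    }

    // Text reader: turns "key,amount" lines into rows, skipping blank lines.
    static final Function<String, List<Row>> READER = text -> {
        List<Row> rows = new ArrayList<>();
        for (String line : text.split("\n")) {
            if (line.trim().isEmpty()) continue;
            String[] parts = line.split(",");
            rows.add(new Row(parts[0].trim(), Integer.parseInt(parts[1].trim())));
        }
        return rows;
    };

    // Sorter: returns a copy of the rows ordered by key.
    static final Function<List<Row>, List<Row>> SORTER = rows -> {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparing((Row r) -> r.key));
        return sorted;
    };

    // Aggregator: sums the amounts per key, keeping the incoming key order.
    static final Function<List<Row>, Map<String, Integer>> AGGREGATOR = rows -> {
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (Row r : rows) totals.merge(r.key, r.amount, Integer::sum);
        return totals;
    };

    // The "layout": reader feeds sorter, sorter feeds aggregator.
    static Map<String, Integer> run(String text) {
        return READER.andThen(SORTER).andThen(AGGREGATOR).apply(text);
    }

    public static void main(String[] args) {
        System.out.println(run("b,2\na,1\nb,5\na,10")); // prints {a=11, b=7}
    }
}
```

Each operator depends only on what flows in and what flows out, which is the property that lets a data flow runtime decide where each one executes.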

Pervasive CTO Mike Hoskins said that the data flow approach to parallelism is gaining a following among the gurus who speak at JavaOne. He also said that the method hasn't become popular yet because it is so much more effective for bringing parallelism to inherently non-parallel applications.

“I don't think people have thought hard about data-intensive parallelism,” said Hoskins. “There's one area of software parallelism where the solution is at hand. It's the sort of low-hanging fruit of parallelism that is around virtualization. You can scale out by having multiple Web servers. It becomes fairly easy to scale the hardware. Your software isn't parallel, you're just running more copies of the same task.

“The difference in data-intensive applications, where we play, is you can't do that,” said Hoskins. “You have to move to a world of parallelism. I don't think many people are looking at that problem. To make a single job run faster is a less common area of research, and it's a much harder problem.”
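The distinction Hoskins draws, between running more copies of a task and making one job finish sooner, can be illustrated with a small sketch. This is not how DataRush works internally; it is a generic partitioning example using the standard ExecutorService, and the class name, method name and choice of computation (a sum of squares) are assumptions. A single large array is split into chunks, each chunk is processed on its own thread, and the partial results are combined.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SingleJobPartitionSketch {
    // Splits one job (sum of squares over data) across the given number of workers.
    // Assumes workers >= 1.
    static long sumOfSquares(int[] data, int workers)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int chunk = (data.length + workers - 1) / workers; // ceiling division
            List<Future<Long>> parts = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int from = Math.min(data.length, w * chunk);
                final int to = Math.min(data.length, from + chunk);
                Callable<Long> task = () -> {
                    long s = 0;
                    for (int i = from; i < to; i++) s += (long) data[i] * data[i];
                    return s;
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : parts) total += f.get(); // combine partial sums
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int[] data = new int[1000];
        for (int i = 0; i < data.length; i++) data[i] = i + 1;
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println(sumOfSquares(data, cores)); // prints 333833500
    }
}
```

A sum is easy to partition because the pieces are independent. Jobs whose steps depend on one another need more structure than this, and that is the harder case Hoskins describes.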

Pervasive hopes to have DataRush available as a full commercial product before the end of the year. Until then, developers can try out the first release candidate of the software at www.pervasivedatarush.com.




Share this link: http://www.sdtimes.com/link/32792
 
