
Infosolve Unveils Data Quality Platform


OpenDQ V2.0 offers data cleanup with a dash of ETL



November 15, 2007
It may not seem possible for data to rot, but it does. The problem isn’t nearly as acute with neatly structured data, which is relatively easy to keep fresh; solutions for that sort of problem have existed for years. Instead, unstructured data is the challenge facing enterprises today. Free-text data in the form of documents and messages is the missing link for many organizations, foiling attempts to categorize it and integrate it into a company’s knowledge assets.

Infosolve Technologies on Oct. 23 released OpenDQ version 2.0, a Java-based open-source data quality and ETL (extract, transform and load) platform licensed under GPLv2. It includes reporting tools with dashboards for monitoring data quality, and custom rule development for flagging nonconforming records.

The company’s objective with OpenDQ V2.0 was to “provide businesses of all sizes innovative, cost-effective data integration and data quality solutions that maximize the quality of their data,” noted company vice president Subbu Manchiraju.

OpenDQ V2.0 offers users a choice of batch or interactive processing, workflow and scheduling capabilities, and options for integrating into other enterprise data processes using drag-and-drop components.

The data profiling tools allow users to develop standards while discovering hidden formats, patterns and rules. OpenDQ can generate reports on column minimums, maximums and averages, measure rule compliance across data sets and provide point-in-time data profiling history. Reports are available in common formats including CSV, HTML and PDF.
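To make the profiling idea concrete, here is a minimal Python sketch of column statistics and a rule-compliance measure. It is not OpenDQ code and does not use OpenDQ's API; the sample records, the function names `profile_column` and `rule_compliance`, and the five-digit ZIP rule are all assumptions made for illustration.

```python
import re
from statistics import mean

# Hypothetical sample records; not taken from OpenDQ or the article.
records = [
    {"age": 34, "zip": "07095"},
    {"age": 51, "zip": "7095"},   # nonconforming: only four digits
    {"age": 27, "zip": "10001"},
]

def profile_column(rows, column):
    """Return the minimum, maximum and average of a numeric column."""
    values = [row[column] for row in rows]
    return {"min": min(values), "max": max(values), "avg": mean(values)}

def rule_compliance(rows, column, pattern):
    """Return the fraction of rows whose value fully matches a regex rule."""
    regex = re.compile(pattern)
    passing = sum(1 for row in rows if regex.fullmatch(str(row[column])))
    return passing / len(rows)

age_profile = profile_column(records, "age")
zip_compliance = rule_compliance(records, "zip", r"\d{5}")
```

A real profiling tool would also store these figures with a timestamp, which is roughly what a point-in-time profiling history implies.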

OpenDQ allows users to develop data dictionaries, and the company offers external data enhancement along demographic, firmographic and psychographic lines through a subscription service. The company uses grid computing technology from Sun Microsystems to support its services infrastructure.

‘Fuzzy Matching’
Infosolve includes “fuzzy matching” in OpenDQ, identifying relationships with inclusive algorithms. It enables master data and reference data comparisons, and offers matching capabilities across multiple data attributes and multiple sources.
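The article does not say which algorithms OpenDQ uses for fuzzy matching. As a rough illustration of matching across multiple attributes, the Python sketch below averages a per-field string similarity from the standard library's `difflib.SequenceMatcher` and keeps master records above a threshold. The field names, the sample records and the 0.85 threshold are assumptions, not details from the product.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Case-insensitive similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def fuzzy_match(record, master, fields, threshold=0.85):
    """Return (candidate, score) pairs from master whose average
    similarity to record across the given fields meets the threshold."""
    hits = []
    for candidate in master:
        score = sum(similarity(record[f], candidate[f]) for f in fields) / len(fields)
        if score >= threshold:
            hits.append((candidate, score))
    # Best match first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

# Hypothetical master data and an incoming record with a typo.
master = [
    {"name": "Acme Data Systems", "city": "Trenton"},
    {"name": "Globex Corporation", "city": "Springfield"},
]
incoming = {"name": "Acme Data Sytems", "city": "trenton"}

hits = fuzzy_match(incoming, master, fields=["name", "city"])
```

Averaging the fields treats every attribute as equally important; a production matcher would more likely weight them, for example trusting a postal code more than a free-text name.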

Standardization and de-duplication are also within the purview of OpenDQ V2.0, which can compare records against the master data set and offers auditing and logging of the de-duplication process. It also allows for removal of nonconforming data and so-called “noise.”
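The general pattern of standardizing records, comparing them against a master set and logging each decision can be sketched in a few lines of Python. This is an illustrative assumption about how such a step might look, not a description of OpenDQ's implementation; `normalize`, `deduplicate` and the sample tuples are hypothetical.

```python
def normalize(record):
    """Standardize a record: collapse whitespace and uppercase every field."""
    return tuple(" ".join(str(value).split()).upper() for value in record)

def deduplicate(incoming, master):
    """Drop incoming records that duplicate the master set or each other.

    Returns the kept records plus an audit log of (action, record) pairs.
    """
    seen = {normalize(record) for record in master}
    kept, audit_log = [], []
    for record in incoming:
        key = normalize(record)
        if key in seen:
            audit_log.append(("dropped", record))
        else:
            seen.add(key)
            kept.append(record)
            audit_log.append(("kept", record))
    return kept, audit_log

# Hypothetical data: one master record and three incoming records.
master = [("Jane Doe", "NJ")]
incoming = [("jane  doe ", "nj"), ("John Roe", "NY"), ("JOHN ROE", "ny")]

kept, audit_log = deduplicate(incoming, master)
```

Here the first incoming record is dropped because it matches the master set after standardization, and the third is dropped as a duplicate of the second; the audit log records all three decisions in order.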

The unstructured data interface of OpenDQ uses natural language processing to capture key data elements from sources, and provides what the company claims is “seamless” integration of structured and unstructured data.

Manchiraju noted, “We see this expansion of Infosolve’s solutions as another step in our commitment to help businesses achieve maximum profitability, with comprehensive data integration and data quality management solutions.”

