Index Engines, an enterprise data management and archiving company, has announced the release of the Catalyst Data Profiling Engine, designed as a cost- and time-effective way to manage Big Data.

The Catalyst Data Profiling Engine processes all forms of unstructured files and e-mail databases, creating a searchable index of what exists, where it is located, who owns it, when it was last accessed and what key terms it contains. High-level summary reports give immediate insight into enterprise storage, providing knowledge of data assets that was not previously available. Through this process, unknown or lost data is found and decisions can be made on its disposition.

Data profiling relies on an enterprise-class index of metadata from user files and e-mail, such as last modified or accessed time, number of duplicates, size, owner, location and file type. Using summary reports combined with filters, users can view content on specific servers or locations and see charts of top owners by capacity, age of data, files by type and more.
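As a rough illustration of the kind of metadata index and summary report described above, the following Python sketch walks a directory tree, records basic metadata for each file and totals capacity by a chosen field. It is not Index Engines' implementation; the names FileRecord, profile_tree and capacity_by are hypothetical, and the owner is represented only by the numeric user ID that the operating system reports.

```python
import os
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FileRecord:
    """Hypothetical metadata record for one profiled file."""
    path: str
    size: int          # bytes
    owner_id: int      # numeric user ID (0 on platforms without owners)
    modified: float    # last modified time, seconds since the epoch
    accessed: float    # last accessed time, seconds since the epoch
    file_type: str     # lower-cased extension, or "(none)"


def profile_tree(root):
    """Walk root and collect one metadata record per readable file."""
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # unreadable or vanished file: skip it
            ext = os.path.splitext(name)[1].lower() or "(none)"
            records.append(
                FileRecord(path, st.st_size, st.st_uid,
                           st.st_mtime, st.st_atime, ext)
            )
    return records


def capacity_by(records, key):
    """Total file size per value of key, largest total first."""
    totals = defaultdict(int)
    for rec in records:
        totals[getattr(rec, key)] += rec.size
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```

Calling capacity_by(records, "owner_id") would give a simple "top owners by capacity" list, and capacity_by(records, "file_type") a "files by type" view; a production index would also need to handle e-mail stores, duplicates and far larger volumes.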

Optionally, data profiling can look beyond metadata and go deep within documents and e-mail to find content. This supports e-Discovery keyword searches as well as compliance audits for confidential information, such as sensitive content misplaced behind the firewall in PSTs or on the wrong server. Once the data is located, it can be remediated, archived or migrated to a different storage platform. Validation checkpoints and detailed disposition reports are also built in.

The Catalyst Data Profiling Engine is designed for large enterprise-class environments, allowing organizations to uncover and analyze unstructured and mystery data. The index it creates has a footprint of only one percent of the information profiled, which allows the engine to scale to very large environments. From there, the indexing engine, Version 5.0, allows action to be taken on the data.

Features include:

Deletion with Validation: Manage the defensible deletion of unstructured data, using validation to ensure the content has not changed since it was profiled. Validation checks the modified date or, optionally, the signature of the document.

Defensible Audit Logs: As disposition of the data is performed, including deletion, logs are maintained that detail the date and disposition of each document, including the user who executed the disposition.

Expanded Duplicate Reports: Summary reports include duplicates by file type, owner, age, location and more. These reports allow for deeper profiling of redundant content.

Report Scheduling and History: Stored reports can be scheduled to run periodically, and the results can be stored to provide a historical view of the data environment. Comparing historical reports shows how content in the data center changes incrementally over time.

Increased Capacity: This version reaches the 1PB mark, supporting metadata profiling of up to 1PB of unstructured data using a single engine. According to the company, this scale and efficiency are unmatched in the market and allow enterprise-class data profiles to be achieved.
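To make the Deletion with Validation and Defensible Audit Logs features more concrete, here is a minimal Python sketch of the general idea: a file is deleted only if it still matches what was recorded at profiling time, and every decision is appended to a log. It is an illustration under stated assumptions, not the product's behavior. The function names, the JSON-lines log format and the choice of SHA-256 as the document "signature" are all assumptions; the announcement does not say which signature the engine uses.

```python
import getpass
import hashlib
import json
import os
from datetime import datetime, timezone


def file_signature(path, chunk_size=1 << 20):
    """SHA-256 of the file contents (assumed stand-in for the 'signature')."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def delete_if_unchanged(path, profiled_mtime, audit_log,
                        profiled_signature=None, user=None):
    """Delete path only if it still matches its profiled state.

    Always compares the modified time; also compares the content
    signature when one was recorded. Appends one JSON line to
    audit_log whether or not the file was deleted, and returns
    True if the file was deleted.
    """
    unchanged = os.stat(path).st_mtime == profiled_mtime
    if unchanged and profiled_signature is not None:
        unchanged = file_signature(path) == profiled_signature
    if unchanged:
        os.remove(path)

    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "path": path,
        "disposition": "deleted" if unchanged else "skipped_changed",
        "user": user or getpass.getuser(),
    }
    with open(audit_log, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return unchanged
```

Logging skipped files as well as deleted ones is a design choice made here so that the log accounts for every attempted disposition; the announcement only states that the date, the disposition and the executing user are recorded.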