To support its research activities, the Automated Learning Group (ALG) has, over the last few years, developed the D2K application environment for data mining. D2K (Data to Knowledge) is a rapid, flexible data mining and machine learning system that integrates analytical data mining methods for prediction, discovery, and deviation detection with data and information visualization tools. It offers a visual programming environment in which users connect programming modules to build data mining applications, and it supplies a core set of modules, application templates, and a standard API for software component development. All D2K components are written in Java for maximum flexibility and portability.
Features and Functionality
Major features that D2K provides to an application developer include:
Visual Programming System Employing a Scalable Framework
Robust Computational Infrastructure
 Enables processor-intensive applications
 Supports distributed computing
 Enables data-intensive applications
 Provides low overhead for module execution
Flexible and Extensible Architecture
 Provides plug-and-play subsystem architectures and standard APIs
 Promotes code reuse and sharing
 Expedites custom software development
 Relieves distributed computing burden
Rapid Application Development (RAD) Environment
Integrated Environment for Models and Visualization
D2K Module Development
The Automated Learning Group has developed hundreds of modules that address every part of the knowledge discovery in databases (KDD) process. The data mining algorithms implemented include Naive Bayesian, Decision Trees, and Apriori, along with visualizations for the results of each of these approaches. In addition, we have developed modules for cleaning and transforming data sets and a number of visualization modules for deviation detection problems. Modules have also been created for specific projects and collaborations.
We are continuing module development with the short-term goals of enhancing our cleaning and transformation modules, improving the data mining algorithms, and extending the feature subset selection modules. In the long term, we plan to continue developing modules for predictive modeling, image analysis, and textual analysis, particularly toward enabling them for distributed and parallel computing. This work lets us make the latest research developments available for use in real-world applications.
D2K-driven Applications
D2K can be used as a standalone application for developing data mining applications, or developers can take advantage of the D2K infrastructure and D2K modules to build D2K-driven applications, such as the ALG applications ThemeWeaver and I2K (Image to Knowledge). These applications employ D2K functionality in the background, using modules dynamically to construct applications, and present their own specialized user interfaces specific to the tasks being performed. More information on ThemeWeaver and I2K can be found in the Tools section of this website. Coupling with D2K to build highly functional data mining applications such as these reduces development time through module reuse and sharing, and gives access to D2K's distributed computing and parallel processing capabilities.