T2K - Text to Knowledge Overview

Knowledge discovery is the process of uncovering previously unknown relationships in data and extracting that knowledge. Even with current data mining methods, understanding these relationships can be difficult. Data stores in any given problem area are often huge, forcing decision-makers to construct complex queries that reflect the multiple dimensions of their problem domain. These decision-makers would benefit from tools that highlight potential "information nuggets" and help formulate those complex queries.

Often, a large percentage of these data stores is in the form of text. The T2K (Text to Knowledge) tool provides text mining and analysis capabilities specially designed to operate in, and capitalize on, the complexity of rich natural-language domains spanning very large stores of text and multimedia documents.

T2K is a library of D2K modules that implements sophisticated algorithms for text analysis. Available functionality includes:

- Automated Document Clustering
- Automated Document Classification
- Integration with GATE (http://gate.ac.uk)
- Part-of-Speech Tagging
- Information Extraction
- Building Models for Very Large Document Stores
- Realtime Clustering of Very Large Document Stores
- Cluster Visualizations
- Automated Document Cleaning and Preparation
- Term Stemming, Phrase Extraction, Tokenization, and Parsing
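
To make the tokenization and term stemming items above concrete, here is a minimal sketch in plain Python. It does not use T2K, D2K, or GATE, and it is not how those tools are implemented; the function names (`tokenize`, `naive_stem`, `term_counts`) and the suffix rule are hypothetical and chosen only for illustration. It shows the general idea of reducing a document to counts of normalized terms, which is a common first step before clustering or classification.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def naive_stem(token):
    """Strip a few common English suffixes.

    This is a toy rule for illustration, not a real stemming algorithm
    such as Porter's, and not the stemmer used by T2K.
    """
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def term_counts(text):
    """Tokenize, stem, and count the terms of one document."""
    return Counter(naive_stem(t) for t in tokenize(text))


counts = term_counts("Clustering documents clusters the document store")
print(counts.most_common(2))
```

With this rule, "Clustering" and "clusters" both reduce to the term "cluster", and "documents" and "document" both reduce to "document", so each of those terms is counted twice. A production stemmer handles far more cases; the sketch only shows why stemming lets related word forms contribute to the same term.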


T2K Screen Shot

T2K Theme Graph Visualization

T2K Records View





Copyright © 2004 The Board of Trustees of the University of Illinois, All Rights Reserved