Welcome to the CAT - Content Analysis Toolkit website. We are currently in the public beta phase of CAT v1, working towards a production version for the second half of 2009. In our internal beta testing phase, we worked with a focus group of key users to test various deployment scenarios and screen for undiscovered bugs. It is now time to stretch the borders again and further involve you, the user. We want as many people as possible to download CAT and give us feedback in the form of suggestions and bug reports. Get involved in our forum discussions and test as many scenarios as you can think of!
What is the Content Analysis Toolkit?
CAT (short for Content Analysis Toolkit) is a program that extracts value from large quantities of electronic text written in almost any language. Its ultimate goal is to discover the topics embedded in a collection of electronic documents, giving the user an overview of what the collection is about without requiring prior knowledge of its contents. In addition, CAT helps you explore a document collection in a topic-guided way by building a model that captures the essence of the collection.
Topics in CAT consist of a number of words ordered by descending probability (i.e. words that best describe the topic precede words that describe it to a lesser extent). Associated documents and other closely related topics further characterise a topic. The user can navigate the model by focusing on a given element (i.e. a topic, document or word) and using its inter-relationships to move to associated elements (i.e. highly relevant topics, documents and words). The user, as an expert or a student in the field, may supplement each topic with a label to describe it better. The user may also open any document in the analysed collection directly from the CAT interface to examine its content. As part of exploring and understanding the content of a collection, users may rate documents according to their perceived usefulness or quality.
CAT also returns a vocabulary for each analysis, consisting of all significant words and terms encountered in that analysis, each with a weight indicating how general or specific the word or term is. Users can use the integrated look-up facilities to establish the meaning of a given word, acronym or term using Google, on-line dictionaries, gazetteers or Wikipedia. Words or terms may be supplemented with a definition or description to capture the newly discovered meaning of an unknown word or term.
In summary, users can explore the model generated by CAT for a given document collection by:
- Navigating from a given topic to associated documents
- Navigating from a given topic to associated topics
- Navigating from a given topic to associated words/terms
- Navigating from a given document to associated topics
- Navigating from a given document to associated documents
- Navigating from a given document to associated words/terms
- Navigating from a given word/term to associated topics
- Navigating from a given word/term to associated documents
- Navigating from a given word/term to associated words/terms
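The navigation paths listed above can be pictured as look-ups over weighted links between topics, documents and words. The following is only a minimal sketch of that idea and does not show CAT's actual data model or API: the dictionaries `topic_words` and `doc_topics`, the function names, and all weights are hypothetical values chosen for illustration.

```python
# Hypothetical toy model (illustrative values only, not produced by CAT):
# each topic maps words to probabilities, each document maps topics to weights.
topic_words = {
    "topic_0": {"bank": 0.30, "loan": 0.25, "interest": 0.20},
    "topic_1": {"river": 0.35, "water": 0.30, "bank": 0.15},
}
doc_topics = {
    "doc_a.txt": {"topic_0": 0.9, "topic_1": 0.1},
    "doc_b.txt": {"topic_0": 0.2, "topic_1": 0.8},
}


def ranked(weights):
    """Return the keys of a {key: weight} mapping, highest weight first."""
    return [key for key, _ in sorted(weights.items(), key=lambda kv: kv[1], reverse=True)]


def words_for_topic(topic):
    """Topic -> words, ordered by descending probability."""
    return ranked(topic_words[topic])


def topics_for_word(word):
    """Word -> topics containing it, strongest association first."""
    return ranked({t: ws[word] for t, ws in topic_words.items() if word in ws})


def documents_for_topic(topic):
    """Topic -> documents, ordered by how strongly they express the topic."""
    return ranked({d: ts[topic] for d, ts in doc_topics.items() if topic in ts})


def topics_for_document(doc):
    """Document -> topics, ordered by descending weight."""
    return ranked(doc_topics[doc])


if __name__ == "__main__":
    print(words_for_topic("topic_1"))
    print(topics_for_word("bank"))
    print(documents_for_topic("topic_1"))
```

The remaining paths (for example document to documents, or word to words) could be composed from these look-ups by passing through shared topics, although how CAT itself computes such associations is not described here.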
Users can also enrich the model generated by CAT by:
- Adding labels and descriptions to topics
- Assigning ratings to topics
- Assigning ratings to documents
- Assigning ratings to words/terms
- Adding descriptions to documents
- Adding descriptions to words/terms
Although not designed primarily as a search engine, CAT offers the user unique search capabilities that combine full-text search with topic-based search functionality.