Puneet Varma (Editor)

RetrievalWare

Written in: C, C++, Java
Type: Search and Index
Operating system: Cross-platform
Developer(s): Fast Search & Transfer, Convera, Excalibur Technologies, ConQuest Software, Microsoft
Stable release: 8.2 (October 13, 2006)

RetrievalWare is an enterprise search engine that emphasizes natural language processing and semantic networks. It was commercially available from 1992 to 2007 and is especially known for its use by government intelligence agencies.

History

RetrievalWare was initially created by Paul Nelson, Kenneth Clark, and Edwin Addison as part of ConQuest Software. Development began in 1989, but the software was not commercially available on a wide scale until 1992. Early funding was provided by Rome Laboratory via a Small Business Innovation Research grant.

On July 6, 1995, ConQuest Software merged with the NASDAQ-listed company Excalibur Technologies, and the product was rebranded as RetrievalWare. On December 21, 2000, Excalibur Technologies was combined with Intel Corporation's Interactive Media Services division to form the Convera Corporation. Finally, on April 9, 2007, the RetrievalWare software and business were purchased by Fast Search & Transfer, at which point the product was officially retired. Microsoft Corporation continues to maintain the product for its existing customer base.

Annual revenues for RetrievalWare peaked in 2001 at around US$40 million.

Use of natural language techniques

RetrievalWare is a relevancy-ranking text search system with processing enhancements drawn from the fields of natural language processing (NLP) and semantic networks. Its NLP algorithms include dictionary-based stemming (also known as lemmatisation) and dictionary-based phrase identification. RetrievalWare uses semantic networks to expand the query words entered by the user to related terms, with term weights determined by the distance from the user's original terms. In addition to automatic expansion, a feedback mode was available in which users could choose the intended meaning of a word before the expansion was performed. The first semantic networks were built using WordNet.
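The distance-weighted expansion described above can be illustrated with a small sketch. This is not RetrievalWare's actual algorithm: the toy network, the function name `expand_query`, the `max_distance` cutoff, and the exponential `decay` weighting are all assumptions chosen for illustration.

```python
from collections import deque

# Hypothetical toy semantic network: each word maps to directly related words.
TOY_NETWORK = {
    "car": ["automobile", "vehicle"],
    "automobile": ["car", "vehicle"],
    "vehicle": ["car", "automobile", "truck"],
    "truck": ["vehicle"],
}

def expand_query(terms, network, max_distance=2, decay=0.5):
    """Return {term: weight}, where weight = decay ** (link distance
    from the nearest original query term). The decay rule is an assumption."""
    weights = {}
    queue = deque((term, 0) for term in terms)
    while queue:
        term, dist = queue.popleft()
        if term in weights:
            continue  # breadth-first order: the first visit is the shortest path
        weights[term] = decay ** dist
        if dist < max_distance:
            for neighbor in network.get(term, []):
                if neighbor not in weights:
                    queue.append((neighbor, dist + 1))
    return weights

expanded = expand_query(["car"], TOY_NETWORK)
# "car" keeps full weight; terms one link away get 0.5, two links away 0.25.
```

In this sketch the original query term always carries the highest weight, so documents matching only distant related terms rank below documents matching the user's own words.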

In addition, RetrievalWare implemented a form of n-gram search (branded as Adaptive Pattern Recognition Processing, or APRP), designed to search over documents with OCR errors. Query terms are divided into sets of 2-grams, which are used to locate similar terms in the inverted index. The resulting matches are weighted based on similarity measures and then used to search for documents.
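A minimal sketch of 2-gram term matching follows. The article does not say which similarity measure the product used, so the Dice coefficient here is an assumption, as are the function names, the `threshold` parameter, and the sample vocabulary (in which "retrieva1" stands in for an OCR misreading).

```python
def bigrams(word):
    """Set of overlapping 2-character substrings of a word."""
    return {word[i:i + 2] for i in range(len(word) - 1)}

def build_bigram_index(vocabulary):
    """Map each 2-gram to the set of indexed terms that contain it."""
    index = {}
    for term in vocabulary:
        for gram in bigrams(term):
            index.setdefault(gram, set()).add(term)
    return index

def similar_terms(query_term, index, threshold=0.5):
    """Return {term: score} for indexed terms sharing 2-grams with the query.
    Score is the Dice coefficient over 2-gram sets (an assumed measure)."""
    query_grams = bigrams(query_term)
    candidates = set()
    for gram in query_grams:
        candidates |= index.get(gram, set())
    scored = {}
    for term in candidates:
        term_grams = bigrams(term)
        score = 2 * len(query_grams & term_grams) / (len(query_grams) + len(term_grams))
        if score >= threshold:
            scored[term] = score
    return scored

index = build_bigram_index(["retrieval", "retrieva1", "search"])
matches = similar_terms("retrieval", index)
# The OCR-damaged "retrieva1" shares 7 of 8 2-grams with the query and is
# still matched, with a lower weight than the exact term.
```

The weighted matches could then be used as query terms against the document index, so that a document containing only the misrecognized spelling is still retrievable.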

All of these features were available no later than 1993, and ConQuest Software claimed to be the first commercial text-search system to implement these techniques.

Other notable features

Other notable features of RetrievalWare include distributed search servers, synchronizers for indexing external content management systems and relational databases, a heterogeneous security model, document categorization, real-time document-query matching (profiling), multi-lingual searches (queries containing terms from multiple languages searching for documents containing terms from multiple languages), and cross-lingual searches (queries in one language searching for documents in a different language).

Participation in TREC

RetrievalWare participated in the Text REtrieval Conference in 1992 (TREC-1), 1993 (TREC-2), and 1995 (TREC-4).

In TREC-1 and TREC-4, the RetrievalWare runs for manually entered queries produced the best results, as measured by 11-point averages, among all search engines that participated in the ad hoc category, in which search engines are allowed a single opportunity to process previously unknown queries against an existing database.
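For readers unfamiliar with the measure, the sketch below assumes that "11-point averages" refers to the standard 11-point interpolated average precision: precision is interpolated at recall levels 0.0, 0.1, ..., 1.0 and the eleven values are averaged. The function name and the sample ranking are hypothetical and are not taken from any TREC run.

```python
def eleven_point_average(ranked_ids, relevant_ids):
    """11-point interpolated average precision for one query."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    # Record (recall, precision) each time a relevant document is retrieved.
    points = []
    hits = 0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / rank))
    total = 0.0
    for i in range(11):
        level = i / 10
        # Interpolated precision: best precision at any recall >= this level.
        total += max((p for r, p in points if r >= level), default=0.0)
    return total / 11

# Hypothetical ranking: relevant documents appear at ranks 1 and 3.
score = eleven_point_average(["d1", "d2", "d3", "d4"], {"d1", "d3"})
```

In an evaluation over many queries, the per-query values would be averaged across the query set to give a single figure per system.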

References

RetrievalWare Wikipedia