Suvarna Garge (Editor)

LGTE

Development status: Active
Operating system: Cross-platform
Written in: Java
Platform: Java

Lucene Geographic and Temporal (LGTE) is an information retrieval tool developed at the Technical University of Lisbon. It can be used as a search engine or, for research purposes, as an evaluation system for information retrieval techniques. The first implementation powered by LGTE was the search engine of DIGMAP, a project co-funded by the community programme eContentplus between 2006 and 2008. DIGMAP aimed to provide web services over old digitized maps held by a group of partners across Europe, including several national libraries.

LGTE is written in the Java programming language around the Lucene library for full-text search and adds several extensions for handling geographical and temporal information. The package also includes utilities for information retrieval evaluation, such as classes for handling CLEF/TREC (Cross Language Evaluation Forum/Text Retrieval Conference) topics and document collections.

Technically, LGTE is a layer on top of Lucene that provides an extended Lucene API integrating several services, such as snippet generation and query expansion. LGTE also makes it possible to implement new probabilistic models. The API depends on a set of modifications at the Lucene level, originally created by researchers at the University of Amsterdam in a software tool named Lucene-lm, developed by the Information and Language Processing Systems (ILPS) group. At the time, the tool was tested successfully with the Okapi BM25 model and a multinomial language model; it also includes divergence from randomness models.
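To make the Okapi BM25 model mentioned above concrete, the following is a minimal sketch of the common textbook formulation, computed over pre-counted statistics. It is not taken from LGTE or Lucene-lm: the class name `Bm25Sketch`, its method names, and the smoothed IDF variant are assumptions made purely for illustration.

```java
import java.util.List;
import java.util.Map;

/** Textbook Okapi BM25 over pre-computed statistics (illustrative; not LGTE's code). */
final class Bm25Sketch {
    private final double k1; // term-frequency saturation parameter
    private final double b;  // document-length normalization strength, in [0, 1]

    Bm25Sketch(double k1, double b) {
        this.k1 = k1;
        this.b = b;
    }

    /** Smoothed inverse document frequency; this variant is always positive. */
    static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    /** BM25 contribution of one query term to one document. */
    double termScore(int termFreq, int docLength, double avgDocLength,
                     long docCount, long docFreq) {
        if (termFreq == 0) {
            return 0.0; // term absent from the document
        }
        double norm = k1 * (1.0 - b + b * docLength / avgDocLength);
        return idf(docCount, docFreq) * termFreq * (k1 + 1.0) / (termFreq + norm);
    }

    /** Sums the per-term contributions over all query terms. */
    double score(List<String> queryTerms, Map<String, Integer> docTermFreqs,
                 int docLength, double avgDocLength, long docCount,
                 Map<String, Integer> docFreqs) {
        double total = 0.0;
        for (String term : queryTerms) {
            int tf = docTermFreqs.getOrDefault(term, 0);
            int df = docFreqs.getOrDefault(term, 0);
            total += termScore(tf, docLength, avgDocLength, docCount, df);
        }
        return total;
    }
}
```

A caller would construct `new Bm25Sketch(1.2, 0.75)` (commonly used default values) and pass in term and document frequencies read from an index. For a document of exactly average length and a term frequency of 1, the term score reduces to the IDF value.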

LGTE 1.1.9 and later versions also make it possible to isolate index fields in different index folders. Another more recent feature is the configuration of hierarchic indexes using foreign key fields. This makes it possible, for example, to compute a score from the text of a sentence combined with the overall score of the entire page.
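The idea of combining a sentence-level score with the score of its parent page can be sketched as follows. This is only an illustration under stated assumptions: the class `HierarchicScoreSketch`, its nested types, and the linear interpolation with a `sentenceWeight` parameter are hypothetical and do not reflect how LGTE actually combines scores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative sentence/page score combination via a foreign key (not LGTE's API). */
final class HierarchicScoreSketch {

    /** A sentence-level hit that points to its parent page through pageId. */
    static final class SentenceHit {
        final String sentenceId;
        final String pageId; // foreign key into the page-level index
        final double score;

        SentenceHit(String sentenceId, String pageId, double score) {
            this.sentenceId = sentenceId;
            this.pageId = pageId;
            this.score = score;
        }
    }

    /**
     * Re-scores each sentence as
     * sentenceWeight * sentenceScore + (1 - sentenceWeight) * pageScore
     * and returns the hits sorted by descending combined score.
     */
    static List<SentenceHit> combine(List<SentenceHit> hits,
                                     Map<String, Double> pageScores,
                                     double sentenceWeight) {
        if (sentenceWeight < 0.0 || sentenceWeight > 1.0) {
            throw new IllegalArgumentException("sentenceWeight must be in [0, 1]");
        }
        List<SentenceHit> combined = new ArrayList<>();
        for (SentenceHit hit : hits) {
            // Pages missing from the page-level results contribute a score of 0.
            double pageScore = pageScores.getOrDefault(hit.pageId, 0.0);
            double score = sentenceWeight * hit.score
                    + (1.0 - sentenceWeight) * pageScore;
            combined.add(new SentenceHit(hit.sentenceId, hit.pageId, score));
        }
        combined.sort((a, b) -> Double.compare(b.score, a.score));
        return combined;
    }
}
```

With equal weighting, a moderately matching sentence on a highly ranked page can outrank a strongly matching sentence on a poorly ranked page, which is the kind of effect the hierarchic configuration is described as enabling.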

Features

  • Provides isolated fields using different index folders.
  • Provides hierarchic indexes through foreign key fields.
  • Provides classes to parse documents using Yahoo PlaceMaker.
  • Provides a simple and effective abstraction layer on top of Lucene.
  • Supports integrated retrieval and ranking based on thematic, temporal and geographical aspects.
  • Supports the standard Lucene retrieval model, as well as more advanced probabilistic retrieval approaches.
  • Supports Rocchio query expansion.
  • Provides a framework for IR evaluation experiments (e.g. handling CLEF/TREC topics).
  • Includes a Java alternative to the trec_eval tool, capable of performing significance tests over pairs of runs.
  • Includes a simple test application for searching over the Braun Corpus or the Cranfield Corpus.
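
As an illustration of the Rocchio query expansion listed among the features, the sketch below applies the classic Rocchio formula to sparse term-weight vectors. It is a generic version of the technique, not LGTE's implementation: the class name `RocchioSketch`, the map-based vector representation, and the choice to drop terms with non-positive weight are assumptions.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Classic Rocchio relevance feedback over sparse term vectors (illustrative only). */
final class RocchioSketch {

    /**
     * Computes alpha * query + beta * centroid(relevant) - gamma * centroid(nonRelevant).
     * Terms whose resulting weight is not positive are dropped from the expanded query.
     */
    static Map<String, Double> expand(Map<String, Double> query,
                                      List<Map<String, Double>> relevant,
                                      List<Map<String, Double>> nonRelevant,
                                      double alpha, double beta, double gamma) {
        Map<String, Double> expanded = new HashMap<>();
        addScaled(expanded, query, alpha);
        for (Map<String, Double> doc : relevant) {
            addScaled(expanded, doc, beta / relevant.size());
        }
        for (Map<String, Double> doc : nonRelevant) {
            addScaled(expanded, doc, -gamma / nonRelevant.size());
        }
        expanded.values().removeIf(weight -> weight <= 0.0);
        return expanded;
    }

    /** Adds factor * source to target, term by term. */
    private static void addScaled(Map<String, Double> target,
                                  Map<String, Double> source, double factor) {
        for (Map.Entry<String, Double> entry : source.entrySet()) {
            target.merge(entry.getKey(), factor * entry.getValue(), Double::sum);
        }
    }
}
```

In a pseudo-relevance feedback setting, the top-ranked documents of an initial search would typically be passed as `relevant`, with an empty `nonRelevant` list (equivalently, `gamma` set to 0).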