Neha Patil (Editor)

GermaNet

GermaNet is a lexical-semantic network for the German language. It relates nouns, verbs, and adjectives semantically by grouping lexical units that express the same concept into synsets and by defining semantic relations between these synsets. GermaNet has much in common with the English WordNet and can be viewed as an online thesaurus or a lightweight ontology. It has been developed and maintained since 1997 within various projects at the research group for General and Computational Linguistics, University of Tübingen. It has been integrated into EuroWordNet, a multilingual lexical-semantic database.

Contents

GermaNet partitions the lexical space into a set of concepts that are interlinked by semantic relations. Each concept is modeled by a synset: a set of words (called lexical units) that are all taken to have (almost) the same meaning. A synset is thus a set representation of the synonymy relation, and it consists of a list of lexical units and a definition (paraphrase). The lexical units in turn have frames, which specify syntactic valence, and examples of their use. As in WordNet, the semantic space of each word category is divided into a number of semantic fields that are closely related to major nodes in the semantic network, such as Ort ("location") or Körper ("body").
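The structure described above can be sketched as a pair of record types. This is only an illustration: the class names, field names, and the sample entry are assumptions made for this sketch, not GermaNet's actual schema or data.

```python
from dataclasses import dataclass, field


@dataclass
class LexicalUnit:
    """One word sense. Field names are illustrative, not GermaNet's schema."""
    orth_form: str
    frames: list[str] = field(default_factory=list)    # syntactic valence frames
    examples: list[str] = field(default_factory=list)  # usage examples


@dataclass
class Synset:
    """A concept: near-synonymous lexical units plus a paraphrase."""
    word_category: str   # "noun", "verb", or "adjective"
    semantic_field: str  # e.g. "Ort" (location) or "Körper" (body)
    paraphrase: str
    lexical_units: list[LexicalUnit] = field(default_factory=list)

    def orth_forms(self) -> list[str]:
        """Return the written forms of all lexical units in this synset."""
        return [lu.orth_form for lu in self.lexical_units]


# Made-up example entry, for illustration only.
place = Synset(
    word_category="noun",
    semantic_field="Ort",
    paraphrase="a point or area in space",
    lexical_units=[LexicalUnit("Ort"), LexicalUnit("Platz"), LexicalUnit("Stelle")],
)
print(place.orth_forms())
```

Keeping frames and examples on the lexical unit rather than on the synset mirrors the description above, where the synset carries only the unit list and the paraphrase.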

The following statistics describe the contents of GermaNet version 6.0 (released April 2011):

  • Number of synsets: 69594
      • Adjectives: 5991
      • Nouns: 53753
      • Verbs: 9850
  • Number of lexical units: 93407
      • Adjectives: 8582
      • Nouns: 71844
      • Verbs: 12981
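
As a quick consistency check on the figures listed above, the per-category counts can be summed and compared with the reported totals. The sketch below also derives the average number of lexical units per synset, a ratio that is not stated in the source and is computed here purely from the listed numbers.

```python
# Counts copied from the GermaNet 6.0 statistics listed above.
synsets = {"adjectives": 5991, "nouns": 53753, "verbs": 9850}
lexical_units = {"adjectives": 8582, "nouns": 71844, "verbs": 12981}

total_synsets = sum(synsets.values())
total_units = sum(lexical_units.values())

# The per-category counts add up to the reported totals.
assert total_synsets == 69594
assert total_units == 93407

# Average number of lexical units per synset, overall and per category.
print(f"overall: {total_units / total_synsets:.2f}")
for category in synsets:
    print(f"{category}: {lexical_units[category] / synsets[category]:.2f}")
```

Overall this comes to roughly 1.3 lexical units per synset.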

Format

All GermaNet data is stored in a relational PostgreSQL database. The database model follows the internal structure of GermaNet: there are tables for synsets, lexical units, conceptual and lexical relations, and so on. GermaNet data is distributed as XML. Two types of files, one for synsets and the other for relations, together represent all data that is available in the GermaNet database.
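
To give a feel for working with an XML distribution of this kind, here is a minimal sketch that reads a synset file with Python's standard library. The element and attribute names (`synset`, `lexUnit`, `orthForm`, `paraphrase`, `id`, `category`, `class`) and the sample content are assumptions made for illustration and may not match the real GermaNet files.

```python
import xml.etree.ElementTree as ET

# Hypothetical synset file fragment; element and attribute names are assumptions.
SAMPLE = """
<synsets>
  <synset id="s1" category="noun" class="Ort">
    <lexUnit id="l1"><orthForm>Ort</orthForm></lexUnit>
    <lexUnit id="l2"><orthForm>Platz</orthForm></lexUnit>
    <paraphrase>a point or area in space</paraphrase>
  </synset>
</synsets>
"""


def read_synsets(xml_text: str) -> dict[str, list[str]]:
    """Map each synset id to the orthographic forms of its lexical units."""
    root = ET.fromstring(xml_text)
    return {
        synset.get("id"): [lu.findtext("orthForm") for lu in synset.findall("lexUnit")]
        for synset in root.findall("synset")
    }


synsets = read_synsets(SAMPLE)
print(synsets)
```

A relations file could be read the same way and joined to the synsets by id, which would mirror the separation between synset tables and relation tables in the database.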

Interfaces

Several application programming interfaces (APIs) are available for Java and for Perl. These APIs are distributed freely and provide access to all information in various versions of GermaNet.

Licenses

GermaNet 6.0 (released April 2011) can be distributed under one of the following types of license agreements: Academic Research Agreement, Research and Development Agreement, or Commercial Agreement. GermaNet is free for academic use.

Applications

GermaNet has been used for a variety of applications, including semantic analysis, shallow recognition of implicit document structure, compound analysis, the analysis of selectional preferences, and word sense disambiguation.
