Rahul Sharma (Editor)

Unified Medical Language System


The Unified Medical Language System (UMLS), created in 1986, is a compendium of many controlled vocabularies in the biomedical sciences. It provides a mapping structure among these vocabularies and thus allows one to translate among the various terminology systems; it may also be viewed as a comprehensive thesaurus and ontology of biomedical concepts. UMLS further provides facilities for natural language processing. It is intended mainly for developers of medical informatics systems.


UMLS consists of Knowledge Sources (databases) and a set of software tools.

The UMLS was designed and is maintained by the US National Library of Medicine, is updated quarterly, and may be used for free. The project was initiated in 1986 by Donald A.B. Lindberg, M.D., then Director of the National Library of Medicine.

Purpose and applications

The number of biomedical resources available to researchers is enormous. This often becomes a problem, because a search of the medical literature can retrieve a very large volume of documents. The purpose of the UMLS is to enhance access to this literature by facilitating the development of computer systems that understand biomedical language. This is achieved by overcoming two significant barriers: "the variety of ways the same concepts are expressed in different machine-readable sources & by different people" and "the distribution of useful information among many disparate databases & systems".
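The first barrier, the same concept being expressed differently across sources, is what a concept-based mapping addresses: terms from different vocabularies are grouped under one shared concept, and translation goes through that concept. The following Python sketch illustrates the idea only. The row layout, the vocabulary names (`VOCAB_A`, `VOCAB_B`, `VOCAB_C`), the concept identifiers, and the codes are all hypothetical placeholders, not actual UMLS data or file formats, and `translate` is an illustrative helper rather than part of any UMLS tool.

```python
from collections import defaultdict

# Hypothetical rows of (concept_id, source_vocabulary, code, term).
# All identifiers and codes are placeholders, not real UMLS content.
ROWS = [
    ("C-0001", "VOCAB_A", "A10", "Headache"),
    ("C-0001", "VOCAB_B", "B-77", "Cephalalgia"),
    ("C-0001", "VOCAB_C", "C.5", "Head pain"),
    ("C-0002", "VOCAB_A", "A20", "Fever"),
    ("C-0002", "VOCAB_B", "B-12", "Pyrexia"),
]


def build_indexes(rows):
    """Index rows by (vocabulary, code) and by concept identifier."""
    concept_of = {}
    entries_of = defaultdict(list)
    for concept_id, vocab, code, term in rows:
        concept_of[(vocab, code)] = concept_id
        entries_of[concept_id].append((vocab, code, term))
    return concept_of, entries_of


def translate(rows, source_vocab, code, target_vocab):
    """Return (code, term) pairs in target_vocab sharing a concept with the input code."""
    concept_of, entries_of = build_indexes(rows)
    concept_id = concept_of.get((source_vocab, code))
    if concept_id is None:
        return []  # the source code is not known to the mapping
    return [(c, t) for v, c, t in entries_of[concept_id] if v == target_vocab]


# Translate a VOCAB_A code into VOCAB_B through the shared concept.
print(translate(ROWS, "VOCAB_A", "A10", "VOCAB_B"))
```

Routing every lookup through the concept identifier means that adding a new vocabulary only requires linking its terms to concepts, not to every other vocabulary.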

Licensing

Users of the system are required to sign a "UMLS agreement" and file brief annual usage reports. Academic users may use the UMLS free of charge for research purposes. Commercial or production use requires copyright licenses for some of the incorporated source vocabularies.

Inconsistencies and other errors

Given the size and complexity of the UMLS and its permissive policy on integrating terms, errors are inevitable. Errors include ambiguity and redundancy, hierarchical relationship cycles (a concept is both an ancestor and a descendant of another), missing ancestors (semantic types of parent and child concepts are unrelated), and semantic inversion (the child/parent relationship between the semantic types is not consistent with that between the concepts).

These errors are discovered and resolved by auditing the UMLS. Manual audits can be very time-consuming and costly, so researchers have tried to address the issue in a number of ways, including automated tools that search for these errors. For structural inconsistencies (such as loops), a trivial solution based on the order of the hierarchy would work. However, the same would not apply when the inconsistency is at the term or concept level (the context-specific meaning of a term); this requires an informed search strategy (knowledge representation).
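As an illustration of an automated structural check, a hierarchical relationship cycle can be found with a standard depth-first search over the child-to-parent links. The sketch below is a generic graph technique, not the method of any particular UMLS auditing tool; the function name `find_cycle`, the dictionary layout, and the concept names are assumptions made for the example.

```python
def find_cycle(parents):
    """Return a list of concepts forming a cycle, or None if there is none.

    `parents` maps each concept to a list of its direct parents.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {}
    path = []

    def visit(node):
        state[node] = GREY
        path.append(node)
        for parent in parents.get(node, []):
            parent_state = state.get(parent, WHITE)
            if parent_state == GREY:
                # The parent is already on the current path: a cycle.
                return path[path.index(parent):] + [parent]
            if parent_state == WHITE:
                cycle = visit(parent)
                if cycle:
                    return cycle
        path.pop()
        state[node] = BLACK
        return None

    for node in list(parents):
        if state.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical hierarchy in which concept_a is its own ancestor.
hierarchy = {
    "concept_a": ["concept_b"],
    "concept_b": ["concept_c"],
    "concept_c": ["concept_a"],
    "concept_d": ["concept_a"],
}
print(find_cycle(hierarchy))
```

The recursive form keeps the sketch short; a hierarchy with very deep ancestor chains would need an iterative version to stay within Python's recursion limit.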

Supporting software tools

In addition to the knowledge sources, the National Library of Medicine also provides supporting tools.

Third party software

  • UMLS-Similarity, an open source software package that implements many measures of semantic similarity and relatedness.
  • UMLS-Similarity web interface, a web interface to UMLS-Similarity.