Girish Mahajan (Editor)

Viral Bioinformatics Resource Center


The Viral Bioinformatics Resource Center (VBRC) is an online resource providing access to a database of curated viral genomes and a variety of tools for bioinformatic genome analysis. This resource is one of eight BRCs (Bioinformatics Resource Centers) funded by NIAID with the goal of promoting research against emerging and re-emerging pathogens, particularly those seen as potential bioterrorism threats. The VBRC is supported by researchers at the University of Victoria and University of Alabama at Birmingham.

The curated VBRC database contains all publicly available genomic sequences for seven of the virus families covered under the NIAID initiative, together with an additional six families specific to VBRC. A unique aspect of this resource relative to other genomic databases is its grouping of all annotated genes into ortholog groups (i.e. protein families) based on pre-run BLASTP sequence similarity searches.

The curated database is accessed through VOCS (Viral Orthologous Clusters), a downloadable Java-based user interface, and acts as the central information source for other programs of the VBRC workbench. These programs serve a variety of bioinformatic analysis functions (whole- or subgenome alignments, genome display, and several types of gene/protein sequence analysis). The majority of these tools are programmed to take user-supplied input as well.

Virus families covered in the VBRC database

The VBRC covers the following viruses under its NIAID mandate:

  • Arenaviridae
  • Bunyaviridae
  • Filoviridae
  • Flaviviridae
  • Paramyxoviridae
  • Poxviridae
  • Togaviridae

The following virus families are also covered by Virology.ca:

  • Adenoviridae
  • Asfarviridae
  • Baculoviridae
  • Herpesviridae
  • Iridoviridae

Organization of the VBRC database

    The VBRC database stores viral bioinformatic data on three levels:

    1. Whole genomes. This level contains information about the virus species or isolate and its entire genomic sequence.
    2. Annotated genes. This level contains all the predicted ORFs (open reading frames) in a particular virus genome, together with their DNA and (translated) protein sequences.
    3. Ortholog groups (families). This level is a distinguishing feature of the VBRC database. Each annotated gene, after it has been entered into the database, is subjected to BLASTP searching against all other genes already in the database. Based on the search results, it is either assigned to a pre-existing ortholog group or placed in a newly created ortholog group of its own. The goal of this level is to "allow for quick comparison of similar genes across a given virus family."
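
The assignment step described for the ortholog level can be sketched roughly as follows. This is only an illustration under stated assumptions: the `similarity` function is a stand-in built on Python's `difflib`, not BLASTP, and the `threshold` value and all function names are hypothetical rather than taken from VBRC.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def similarity(a: str, b: str) -> float:
    """Stand-in for a BLASTP comparison (assumption): match ratio from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def assign_to_group(groups: Dict[int, List[str]], protein: str,
                    threshold: float = 0.7) -> int:
    """Add `protein` to the best-matching group, or open a new group for it.

    `groups` maps a group id to the protein sequences already in that group.
    The threshold is an arbitrary illustrative cutoff, not a VBRC setting.
    """
    best_id, best_score = None, 0.0
    for group_id, members in groups.items():
        # Score the new protein against every member already stored.
        score = max(similarity(protein, member) for member in members)
        if score > best_score:
            best_id, best_score = group_id, score

    if best_id is not None and best_score >= threshold:
        groups[best_id].append(protein)   # joins a pre-existing group
        return best_id

    new_id = max(groups, default=0) + 1   # otherwise a group of its own
    groups[new_id] = [protein]
    return new_id
```

Calling `assign_to_group` once per newly entered gene mirrors the described workflow: each sequence is compared only with what is already stored, so the grouping grows incrementally as genomes are added.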

    Central Tools Provided by VBRC

    VBRC provides researchers with a wide variety of database-linked tools. Of these, the central four programs are VOCs, VGO, BBB, and JDotter.

    1. VOCs (Viral Orthologous Clusters)
      VOCs is the main database access interface. Users can search the available data by a number of criteria related to genome, gene, or ortholog group characteristics. Search results are displayed in table format; from here the user may obtain further information about a particular database entry, or launch a VOCs-linked tool (see below) for analysis of selected data. Additional analysis tools such as BLAST searches, genome maps, genome or gene alignment, phylogenetic trees, etc. are provided.
    2. VGO (Viral Genome Organizer)
      VGO is a Java-based interface used for viewing and searching viral genome sequences. Together with a graphical representation of the selected VBRC (or user-supplied) genome, the program displays information relevant to a genome of interest, including its genes, ORFs and start/stop codons. Tools are provided for regular expression, fuzzy motif, and masslist searches. VGO can also be used to identify related genes across multiple sequences.
    3. BBB (Base-By-Base)
      Base-By-Base is a platform-independent (Java-based), whole-genome pairwise and multiple alignment editor. The program highlights differences between consecutive pairs of sequences within an alignment, thus allowing the user to survey a large alignment at a single-residue level. Annotations from the VBRC database or user-supplied files are displayed alongside each sequence.
      Although Base-By-Base was intended as an editor and viewer for alignments of highly similar sequences, it also generates multiple alignments using ClustalW, T-Coffee and MUSCLE. Edit functions are provided to allow users to fine-tune such alignments manually; users may also annotate genomes with comments or primer sequences.
    4. JDotter
      JDotter is a Java-based user interface providing VBRC-linked access to the Linux version of Dotter. JDotter can both access pre-processed dotplots of the genome and gene (DNA or protein) sequences available in the VBRC database, and take user input for generation of new dotplots. JDotter also interfaces with the curated database or the user-supplied file to display supplementary feature data such as gene annotations.
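
As a rough illustration of what a dotplot shows, the sketch below marks every position pair at which two sequences share an identical word. It is a simplified exact-match variant written for this article, not Dotter's scoring method, and the function names and the default word length are assumptions.

```python
def dotplot(seq_a, seq_b, word=3):
    """Return the set of (i, j) pairs where seq_a[i:i+word] == seq_b[j:j+word]."""
    # Index every word of seq_b by its start position.
    index = {}
    for j in range(len(seq_b) - word + 1):
        index.setdefault(seq_b[j:j + word], []).append(j)

    # Look up each word of seq_a in that index.
    hits = set()
    for i in range(len(seq_a) - word + 1):
        for j in index.get(seq_a[i:i + word], []):
            hits.add((i, j))
    return hits


def render(hits, len_a, len_b):
    """Draw the hits as text: rows follow seq_a, columns follow seq_b."""
    rows = []
    for i in range(len_a):
        rows.append("".join("*" if (i, j) in hits else "." for j in range(len_b)))
    return "\n".join(rows)
```

Printing `render(dotplot(a, b), len(a), len(b))` for two related sequences gives diagonal runs of `*` where they are similar; repeats appear as additional parallel diagonals.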

    Other Tools Provided by VBRC

    VBRC provides a number of additional Java-based analysis tools on its website. Several are interfaces to pre-existing bioinformatics tools (e.g. napC, R’MES), while others were independently developed by VBRC. The tools in this category are each designed to perform a very specific task (e.g. regular expression searches, DNA skew plotting) and, though they can be accessed as stand-alone programs with user-supplied input, most have increased utility when launched from the central VOCS application with VBRC-supplied data.

    These additional tools are as follows:

  • Sequence Searcher performs regular expression and fuzzy motif searches of DNA or protein sequences, and is built into VOCS.
  • GFS (Genome Fingerprint Scanning) maps peptide mass fingerprint data to genomic sequences. It is built into VOCS.
  • NAP (Nucleotide Amino Acid Alignment) is a Java interface to napC, a program designed to align a nucleotide and protein sequence, taking terminal gaps and insertion/deletion mutations into account. It can be accessed from VOCS.
  • GraphDNA provides DNA skews and walks (a Cartesian plane-based representation of nucleotide content) from a VBRC database- or user-supplied DNA sequence. It is integrated into VOCS.
  • Hydrophobicity Plotter generates a hydrophobicity graph for a VBRC database- or user-supplied protein sequence. Three hydrophobicity scales (Kyte-Doolittle, Hopp-Woods, and Parker-Guo-Hodges) are supported; the graphing procedure is based on a sliding window of user-determined length. It can be accessed from VOCS.
  • CS (Codon Statistics) allows the user to generate statistical data and graphical representations of the nucleotide content of a VBRC database- or user-supplied DNA sequence (generally an entire genome).
  • JFreq (Java Word Frequencies) is a Java interface to the R’MES program, which allows the user to find and statistically analyze unusual frequencies or distributions of words (short sequence patterns) within a DNA sequence.
  • GATU (Genome Annotation Transfer Utility) allows a user to annotate a newly sequenced genome based on the annotations present in a reference genome; it can also predict new genes in the query genome.
  • REHAB (Recent Hits Acquired from BLAST) is a stand-alone program allowing the bioinformatics researcher to store and compare information obtained by successive PSI-BLAST runs of a single sequence against the continually updated NCBI GenBank database.
  • JIPS (Java GUI for InterProScan) is similar to REHAB in that it allows the user to identify new results (motifs, fingerprints, or domains) in successive InterProScan searches of a protein sequence.
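
The sliding-window procedure mentioned for the Hydrophobicity Plotter can be sketched as below. This is a minimal example, not VBRC code: it uses the published Kyte-Doolittle scale, while the function name and the default window length are assumptions made for illustration.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=9):
    """Return the mean Kyte-Doolittle score of each full window along the protein.

    The result has len(protein) - window + 1 values, one per window position.
    Non-standard residue codes raise KeyError in this sketch.
    """
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the sequence length")

    scores = [KYTE_DOOLITTLE[residue] for residue in protein.upper()]
    total = sum(scores[:window])
    profile = [total / window]
    # Slide the window one residue at a time, updating the running sum.
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]
        profile.append(total / window)
    return profile
```

Plotting the returned values against window position gives the kind of graph described above; a longer window smooths the curve, while a shorter one exposes local peaks.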

The VBRC also provides a number of Web-based, rather than downloadable, analysis tools on its site, including:

  • A Genome List and Ortholog Comparison tool, giving Web-based text and/or graphical access to much of the data supplied by VOCS (see above).
  • An XS-Blast tool allowing the researcher to form a private SQL database for storing, retrieving, and filtering the XML results of repeated BLAST searches for a particular query sequence.
  • A separate Hepatitis C (HCV) Database providing bioinformatic and immunological data. It is maintained by a consortium of VBRC, the Immune Epitope Database and Analysis Resource (IEDB), and the Hepatitis C Virus Resource at the Los Alamos National Laboratory (HCV-LANL).
  • A separate Dengue Database providing bioinformatic, immunological, and epidemiological data. It is maintained by a consortium of VBRC, the Immune Epitope Database and Analysis Resource (IEDB), and the Broad Institute Microbial Sequencing Center (MSC).
  • A Knowledge Database containing curated genes and graphical gene maps for several reference genomes selected from the Filoviridae, Flaviviridae, and Poxviridae families. Information presented includes functional, structural, and gene expression data, together with relevant references.
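
The idea of keeping repeated BLAST results in a private SQL database, as described for XS-Blast and in a similar spirit for REHAB, might look roughly like the sketch below. The table layout, function names, and E-value cutoff are all hypothetical; the example uses Python's built-in `sqlite3` module and stores already-parsed hits rather than raw BLAST XML.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small hit store; the schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits ("
        "run_id INTEGER, accession TEXT, evalue REAL, "
        "PRIMARY KEY (run_id, accession))"
    )
    return conn


def save_run(conn, run_id, hits):
    """Store one search run as (accession, evalue) pairs."""
    conn.executemany(
        "INSERT OR REPLACE INTO hits VALUES (?, ?, ?)",
        [(run_id, accession, evalue) for accession, evalue in hits],
    )
    conn.commit()


def new_hits(conn, old_run, new_run, max_evalue=1e-5):
    """Accessions that pass the E-value filter in new_run and are absent from old_run."""
    rows = conn.execute(
        "SELECT accession FROM hits "
        "WHERE run_id = ? AND evalue <= ? "
        "AND accession NOT IN (SELECT accession FROM hits WHERE run_id = ?) "
        "ORDER BY accession",
        (new_run, max_evalue, old_run),
    )
    return [row[0] for row in rows]
```

Saving each search under a new `run_id` and then calling `new_hits` on two runs reports only the sequences that appeared since the earlier search, which is the comparison these tools are described as supporting.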