Kalpana Kalpana (Editor)

G quadruplex


In molecular biology, G-quadruplexes (also known as G4 DNA) are secondary structures formed in nucleic acids by sequences that are rich in guanine. They are four-stranded helical structures that occur naturally. They are typically located near the ends of chromosomes, in the regions better known as telomeres, and in the transcriptional regulatory regions of multiple oncogenes. Four guanine bases can associate through Hoogsteen hydrogen bonding to form a square planar structure called a guanine tetrad, and two or more guanine tetrads can stack on top of each other to form a G-quadruplex. The placement and bonding that form G-quadruplexes are not random and serve unusual functional purposes. The quadruplex structure is further stabilized by the presence of a cation, especially potassium, which sits in a central channel between each pair of tetrads. G-quadruplexes can be formed of DNA, RNA, LNA, and PNA, and may be intramolecular, bimolecular, or tetramolecular. Depending on the direction of the strands or parts of a strand that form the tetrads, structures may be described as parallel or antiparallel. G-quadruplex structures can be computationally predicted from DNA or RNA sequence motifs, but their actual structures can vary considerably within and between the motifs, which can number over 100,000 per genome. Their roles in basic genetic processes are an active area of research in telomere biology, gene regulation, and functional genomics (Rhodes et al., NAR 2015).

Quadruplex topology

The length of the nucleic acid sequences involved in tetrad formation determines how the quadruplex folds. Short sequences, consisting of only a single contiguous run of three or more guanine bases, require four individual strands to form a quadruplex. Such a quadruplex is described as tetramolecular, reflecting the requirement of four separate strands. Longer sequences, which contain two contiguous runs of three or more guanine bases, where the guanine regions are separated by one or more bases, only require two such sequences to provide enough guanine bases to form a quadruplex. These structures, formed from two separate G-rich strands, are termed bimolecular quadruplexes. Finally, sequences which contain four distinct runs of guanine bases can form stable quadruplex structures by themselves, and a quadruplex formed entirely from a single strand is called an intramolecular quadruplex.
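
The relationship between the number of guanine runs and the number of strands required can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not an established tool: the function names are hypothetical, a "run" is taken to be three or more contiguous G bases, and sequences with four or more runs are all treated as intramolecular candidates (an assumption, since the text only discusses exactly one, two, and four runs).

```python
import re

# A "run" is three or more contiguous guanines (assumption taken from the text).
G_RUN = re.compile(r"G{3,}")


def count_g_runs(seq: str) -> int:
    """Count maximal runs of three or more contiguous G bases."""
    return len(G_RUN.findall(seq.upper()))


def minimal_molecularity(seq: str) -> str:
    """Name the quadruplex type suggested by the number of G-runs.

    Hypothetical helper: it follows only the cases described in the text
    and returns "unclassified" for anything else (zero or three runs).
    """
    runs = count_g_runs(seq)
    if runs == 1:
        return "tetramolecular"   # four separate strands needed
    if runs == 2:
        return "bimolecular"      # two strands supply four runs
    if runs >= 4:
        return "intramolecular"   # one strand can fold on itself
    return "unclassified"


print(minimal_molecularity("TGGGGT"))                 # one run
print(minimal_molecularity("GGGGTTTTGGGG"))           # two runs
print(minimal_molecularity("GGGTTAGGGTTAGGGTTAGGG"))  # four runs
```

The count says nothing about whether a structure actually forms; it only reflects how many guanine runs a single strand can contribute.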

Depending on how the individual runs of guanine bases are arranged in a bimolecular or intramolecular quadruplex, a quadruplex can adopt one of a number of topologies with varying loop configurations. If all strands of DNA proceed in the same direction, the quadruplex is termed parallel. For intramolecular quadruplexes, this means that any loop regions present must be of the propeller type, positioned to the sides of the quadruplex. If one or more of the runs of guanine bases has a 5’-3’ direction opposite to the other runs of guanine bases, the quadruplex is said to have adopted an antiparallel topology. The loops joining runs of guanine bases in intramolecular antiparallel quadruplexes are either diagonal, joining two diagonally opposite runs of guanine bases, or lateral (edgewise), joining two adjacent runs of guanine bases.

In quadruplexes formed from double-stranded DNA, possible interstrand topologies have also been discussed. Interstrand quadruplexes contain guanines that originate from both strands of dsDNA.

Telomeric quadruplexes

Telomeric repeats in a variety of organisms have been shown to form these quadruplex structures in vitro, and subsequently they have also been shown to form in vivo. The human telomeric repeat (which is the same for all vertebrates) consists of many repeats of the sequence d(GGTTAG), and the quadruplexes formed by this sequence have been well studied by NMR and X-ray crystal structure determination. The formation of these quadruplexes in telomeres has been shown to decrease the activity of the enzyme telomerase, which is responsible for maintaining the length of telomeres and is involved in around 85% of all cancers. Telomeric quadruplexes are therefore an active target of drug discovery, with telomestatin among the compounds studied.

Non-telomeric quadruplexes

Recently, there has been increasing interest in quadruplexes in locations other than at the telomere. For example, the proto-oncogene c-myc was shown to form a quadruplex in a nuclease hypersensitive region critical for gene activity. Since then, many other genes have been shown to have G-quadruplexes in their promoter regions, including the chicken β-globin gene, human ubiquitin-ligase RFP2 and the proto-oncogenes c-kit, bcl-2, VEGF, H-ras and N-ras. This list is ever-increasing.

Genome-wide surveys based on a quadruplex folding rule have identified 376,000 Putative Quadruplex Sequences (PQS) in the human genome, although probably not all of these form in vivo. A similar study has identified putative G-quadruplexes in prokaryotes. There are several possible models for how quadruplexes could influence gene activity, either by upregulation or downregulation. In one model, G-quadruplex formation in or near a promoter blocks transcription of the gene and hence deactivates it. In another model, a quadruplex formed on the non-coding DNA strand helps to maintain an open conformation of the coding DNA strand and enhances expression of the respective gene.

Quadruplex function

Nucleic acid quadruplexes have been described as "structures in search of a function", as for many years there was minimal evidence pointing towards a biological role for these structures. It has been suggested that quadruplex formation plays a role in immunoglobulin heavy chain switching. Because cells have evolved mechanisms for resolving (i.e., unwinding) quadruplexes that form, quadruplex formation may be damaging for a cell; the helicases WRN and Bloom syndrome protein have a high affinity for resolving G4 DNA. More recently, many studies have implicated quadruplexes in both positive and negative transcriptional regulation, and in allowing programmed recombination of immunoglobulin heavy chain genes and the pilin antigenic variation system of the pathogenic Neisseria. The roles of quadruplex structure in translation control are not as well explored. The direct visualization of quadruplex structures in human cells has provided an important confirmation of their existence. The potential positive and negative roles of quadruplexes in telomere replication and function remain controversial. T-loops and G-quadruplexes are described as the two tertiary DNA structures that protect telomere ends and regulate telomere length.

Ligands which bind quadruplexes

One way of inducing or stabilizing G-quadruplex formation is to introduce a molecule that can bind to the G-quadruplex structure. A number of ligands, both small molecules and proteins, can bind to the G-quadruplex. These ligands can be naturally occurring or synthetic. This has become an increasingly large field of research in genetics, biochemistry, and pharmacology.

A number of naturally occurring proteins have been identified which selectively bind to G-quadruplexes. These include the helicases implicated in Bloom's and Werner's syndromes and the Saccharomyces cerevisiae protein RAP1. An artificially derived three-zinc-finger protein called Gq1, which is specific for G-quadruplexes, has also been developed, as have specific antibodies.

Cationic porphyrins have been shown to bind intercalatively with G-quadruplexes, and the molecule telomestatin also binds them.

Quadruplex prediction techniques

Identifying and predicting sequences that have the capacity to form quadruplexes is an important tool in further understanding their role. Generally, a simple pattern match is used to search for possible intrastrand quadruplex-forming sequences: d(G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+), that is, four runs of three or more guanines (G3+) separated by three loops of one to seven nucleotides (N1-7), where N is any nucleotide base (including guanine). This rule has been widely used in on-line algorithms.
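
The pattern match can be written as a regular expression: four runs of three or more G bases separated by three loops of one to seven bases each. The following is a minimal sketch under stated assumptions rather than a reference implementation of any particular on-line tool. The function name `find_pqs` is hypothetical, only the given strand is scanned, matches are reported without overlap, and the loop alphabet is assumed to be A, C, G, T, and U.

```python
import re

# G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+, where N may be any base, including G.
# Assumption: the loop alphabet is limited to A, C, G, T and U.
PQS_PATTERN = re.compile(
    r"G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}"
)


def find_pqs(seq: str) -> list[tuple[int, int, str]]:
    """Return (start, end, motif) for each non-overlapping match.

    Coordinates are 0-based with an exclusive end, as in Python slicing.
    Only the strand given is scanned; a match is a putative quadruplex
    sequence, not evidence that the structure forms.
    """
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in PQS_PATTERN.finditer(seq)]


# Five copies of the telomeric repeat d(GGTTAG) contain one match.
telomere = "GGTTAG" * 5
print(find_pqs(telomere))
# → [(5, 26, 'GGGTTAGGGTTAGGGTTAGGG')]
```

Because the search is run on a single strand, a quadruplex-forming sequence on the complementary strand would appear as the equivalent C-rich pattern and is not reported by this sketch.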

References

G-quadruplex Wikipedia