Pfam clan CL0012
In biology, histones are highly alkaline proteins found in eukaryotic cell nuclei that package and order the DNA into structural units called nucleosomes. They are the chief protein components of chromatin, acting as spools around which DNA winds, and playing a role in gene regulation. Without histones, the unwound DNA in chromosomes would be very long (a length to width ratio of more than 10 million to 1 in human DNA). For example, each human diploid cell (containing 23 pairs of chromosomes) has about 1.8 meters of DNA, but wound on the histones it has about 90 micrometers (0.09 mm) of chromatin, which, when duplicated and condensed during mitosis, result in about 120 micrometers of chromosomes.
- Classes and histone variants
- Conservation across species
- Compacting DNA strands
- Chromatin regulation
- Lysine methylation
- Arginine methylation
- Arginine citrullination
- Lysine acetylation
- Serinethreoninetyrosine phosphorylation
- Functions in transcription
- Actively transcribed genes
- Repressed genes
- Bivalent promoters
- DNA damage
- Chromosome condensation
Classes and histone variants
Five major families of histones exist: H1/H5, H2A, H2B, H3, and H4. Histones H2A, H2B, H3 and H4 are known as the core histones, while histones H1/H5 are known as the linker histones.
The core histones all exist as dimers, which are similar in that they all possess the histone fold domain; three alpha helices linked by two loops. It is this helical structure that allows for interaction between distinct dimers, particularly in a head-tail fashion (also called the handshake motif). The resulting four distinct dimers then come together to form one octameric nucleosome core, approximately 63 Angstroms in diameter (a solenoid (DNA)-like particle). Around 146 base pairs (bp) of DNA wrap around this core particle 1.65 times in a left-handed super-helical turn to give a particle of around 100 Angstroms across. The linker histone H1 binds the nucleosome at the entry and exit sites of the DNA, thus locking the DNA into place and allowing the formation of higher order structure. The most basic such formation is the 10 nm fiber or beads on a string conformation. This involves the wrapping of DNA around nucleosomes with approximately 50 base pairs of DNA separating each pair of nucleosomes (also referred to as linker DNA). Higher-order structures include the 30 nm fiber (forming an irregular zigzag) and 100 nm fiber, these being the structures found in normal cells. During mitosis and meiosis, the condensed chromosomes are assembled through interactions between nucleosomes and other regulatory proteins.
Histones are subdivided into canonical replication-dependent histones that are expressed during the S-phase of cell cycle and replication-independent histone variants, expressed during the whole cell cycle. In animals, genes encoding canonical histones are typically clustered along the chromosome, lack introns and use a stem loop structure at the 3’ end instead of a polyA tail. Genes encoding histone variants are usually not clustered, have introns and their mRNAs are regulated with polyA tails. Complex multicellular organisms typically have a higher number of histone variants providing a variety of different functions. Recent data are accumulating about the roles of diverse histone variants highlighting the functional links between variants and the delicate regulation of organism development. Histone variants from different organisms, their classification and variant specific features can be found in "HistoneDB 2.0 - Variants" database.
The following is a list of human histone proteins:
The nucleosome core is formed of two H2A-H2B dimers and a H3-H4 tetramer, forming two nearly symmetrical halves by tertiary structure (C2 symmetry; one macromolecule is the mirror image of the other). The H2A-H2B dimers and H3-H4 tetramer also show pseudodyad symmetry. The 4 'core' histones (H2A, H2B, H3 and H4) are relatively similar in structure and are highly conserved through evolution, all featuring a 'helix turn helix turn helix' motif (which allows the easy dimerisation). They also share the feature of long 'tails' on one end of the amino acid structure - this being the location of post-translational modification (see below).
It has been proposed that histone proteins are evolutionarily related to the helical part of the extended AAA+ ATPase domain, the C-domain, and to the N-terminal substrate recognition domain of Clp/Hsp100 proteins. Despite the differences in their topology, these three folds share a homologous helix-strand-helix (HSH) motif.
Using an electron paramagnetic resonance spin-labeling technique, British researchers measured the distances between the spools around which eukaryotic cells wind their DNA. They determined the spacings range from 59 to 70 Å.
In all, histones make five types of interactions with DNA:
The highly basic nature of histones, aside from facilitating DNA-histone interactions, contributes to their water solubility.
Histones are subject to post translational modification by enzymes primarily on their N-terminal tails, but also in their globular domains. Such modifications include methylation, citrullination, acetylation, phosphorylation, SUMOylation, ubiquitination, and ADP-ribosylation. This affects their function of gene regulation.
In general, genes that are active have less bound histone, while inactive genes are highly associated with histones during interphase. It also appears that the structure of histones has been evolutionarily conserved, as any deleterious mutations would be severely maladaptive. All histones have a highly positively charged N-terminus with many lysine and arginine residues.
Histones were discovered in 1884 by Albrecht Kossel. The word "histone" dates from the late 19th century and is from the German word "Histon", a word itself of uncertain origin - perhaps from the Greek histanai or histos. Until the early 1990s, histones were dismissed by most as inert packing material for eukaryotic nuclear DNA, a view based in part on the models of Mark Ptashne and others, who believed that transcription was activated by protein-DNA and protein-protein interactions on largely naked DNA templates, as is the case in bacteria.
During the 1980s, Yahli Lorch and Roger Kornberg showed that a nucleosome on a core promoter prevents the initiation of transcription in vitro, and Michael Grunstein demonstrated that histones repress transcription in vivo, leading to the idea of the nucleosome as a general gene repressor. Relief from repression is believed to involve both histone modification and the action of chromatin-remodeling complexes. Vincent Allfrey and Alfred Mirsky earlier proposed a role of histone modification in transcriptional activation, regarded as a molecular manifestation of epigenetics. Michael Grunstein and David Allis found support for this proposal, in the importance of histone acetylation for transcription in yeast and the activity of the transcriptional activator Gcn5 as a histone acetyltransferase.
The discovery of the H5 histone appears to date back to the 1970s, and it is now considered an isoform of Histone H1.
Conservation across species
Histones are found in the nuclei of eukaryotic cells, and in certain Archaea, namely Thermoproteales and Euryarchaea, but not in bacteria. The unicellular algae known as dinoflagellates were previously thought to be the only eukaryotes that completely lack histones, however, later studies showed that their DNA still encodes histone genes.
Archaeal histones may well resemble the evolutionary precursors to eukaryotic histones. Histone proteins are among the most highly conserved proteins in eukaryotes, emphasizing their important role in the biology of the nucleus. In contrast mature sperm cells largely use protamines to package their genomic DNA, most likely because this allows them to achieve an even higher packaging ratio.
Core histones are highly conserved proteins; that is, there are very few differences among the amino acid sequences of the histone proteins of different species.
There are some variant forms in some of the major classes. They share amino acid sequence homology and core structural similarity to a specific class of major histones but also have their own feature that is distinct from the major histones. These minor histones usually carry out specific functions of the chromatin metabolism. For example, histone H3-like CenpA is associated with only the centromere region of the chromosome. Histone H2A variant H2A.Z is associated with the promoters of actively transcribed genes and also involved in the prevention of the spread of silent heterochromatin. Furthermore, H2A.Z has roles in chromatin for genome stability. Another H2A variant H2A.X is phosphorylated at S139 in regions around double-strand breaks and marks the region undergoing DNA repair. Histone H3.3 is associated with the body of actively transcribed genes.
Compacting DNA strands
Histones act as spools around which DNA winds. This enables the compaction necessary to fit the large genomes of eukaryotes inside cell nuclei: the compacted molecule is 40,000 times shorter than an unpacked molecule.
Histones undergo posttranslational modifications that alter their interaction with DNA and nuclear proteins. The H3 and H4 histones have long tails protruding from the nucleosome, which can be covalently modified at several places. Modifications of the tail include methylation, acetylation, phosphorylation, ubiquitination, SUMOylation, citrullination, and ADP-ribosylation. The core of the histones H2A and H2B can also be modified. Combinations of modifications are thought to constitute a code, the so-called "histone code". Histone modifications act in diverse biological processes such as gene regulation, DNA repair, chromosome condensation (mitosis) and spermatogenesis (meiosis).
The common nomenclature of histone modifications is:
So H3K4me1 denotes the monomethylation of the 4th residue (a lysine) from the start (i.e., the N-terminal) of the H3 protein.
Examples of histone modifications in transcription regulation include:
A huge catalogue of histone modifications have been described, but a functional understanding of most is still lacking. Collectively, it is thought that histone modifications may underlie a histone code, whereby combinations of histone modifications have specific meanings. However, most functional data concerns individual prominent histone modifications that are biochemically amenable to detailed study.
The addition of one, two or three methyl groups to lysine has little effect on the chemistry of the histone; methylation leaves the charge of the lysine intact and adds a minimal number of atoms so steric interactions are mostly unaffected. However, proteins containing Tudor, chromo or PHD domains, amongst others, can recognise lysine methylation with exquisite sensitivity and differentiate mono, di and tri-methyl lysine, to the extent that, for some lysines (e.g.: H4K20) mono, di and tri-methylation appear to have different meanings. Because of this, lysine methylation tends to be a very informative mark and dominates the known histone modification functions.
What was said above of the chemistry of lysine methylation also applies to arginine methylation, and some protein domains—e.g., Tudor domains—can be specific for methyl arginine instead of methyl lysine. Arginine is known to be mono- or di-methylated, and methylation can be symmetric or asymmetric, potentially with different meanings.
Enzymes called peptidylarginine deiminases (PADs) hydrolyze the imine group of arginines and attach a keto group, so that there is one less positive charge on the amino acid residue. This process has been involved in the activation of gene expression by making the modified histones less tightly bound to DNA and thus making the chromatin more accessible. PADs can also produce the opposite effect by removing or inhibiting mono-methylation of arginine residues on histones and thus antagonizing the positive effect arginine methylation has on transcriptional activity.
Addition of an acetyl group has a major chemical effect on lysine as it neutralises the positive charge. This reduces electrostatic attraction between the histone and the negatively charged DNA backbone, loosening the chromatin structure; highly acetylated histones form more accessible chromatin and tend to be associated with active transcription. Lysine acetylation appears to be less precise in meaning than methylation, in that histone acetyltransferases tend to act on more than one lysine; presumably this reflects the need to alter multiple lysines to have a significant effect on chromatin structure. The modification includes H3K27ac.
Addition of a negatively charged phosphate group can lead to major changes in protein structure, leading to the well-characterised role of phosphorylation in controlling protein function. It is not clear what structural implications histone phosphorylation has, but histone phosphorylation has clear functions as a post-translational modification, and binding domains such as BRCT have been characterised.
Functions in transcription
Most well-studied histone modifications are involved in control of transcription.
Actively transcribed genes
Two histone modifications are particularly associated with active transcription:
Three histone modifications are particularly associated with repressed genes:
Analysis of histone modifications in embryonic stem cells (and other stem cells) revealed many gene promoters carrying both H3K4Me3 and H3K27Me3, in other words these promoters display both activating and repressing marks simultaneously. This peculiar combination of modifications marks genes that are poised for transcription; they are not required in stem cells, but are rapidly required after differentiation into some lineages. Once the cell starts to differentiate, these bivalent promoters are resolved to either active or repressive states depending on the chosen lineage.
Marking sites of DNA damage is an important function for histone modifications. It also protects DNA from getting destroyed by ultraviolet radiation of sun.