Escherichia coli (/ˌɛʃᵻˈrɪkiə ˈkoʊlaɪ/; also known as E. coli) is a Gram-negative, facultatively anaerobic, rod-shaped, coliform bacterium of the genus Escherichia that is commonly found in the lower intestine of warm-blooded organisms (endotherms). Most E. coli strains are harmless, but some serotypes can cause serious food poisoning in their hosts, and are occasionally responsible for product recalls due to food contamination. The harmless strains are part of the normal flora of the gut, and can benefit their hosts by producing vitamin K2, and preventing colonization of the intestine with pathogenic bacteria. E. coli is expelled into the environment within fecal matter. The bacterium grows massively in fresh fecal matter under aerobic conditions for 3 days, but its numbers decline slowly afterwards.
E. coli and other facultative anaerobes constitute about 0.1% of gut flora, and fecal–oral transmission is the major route through which pathogenic strains of the bacterium cause disease. Cells are able to survive outside the body for a limited amount of time, which makes them potential indicator organisms to test environmental samples for fecal contamination. A growing body of research, though, has examined environmentally persistent E. coli which can survive for extended periods outside of a host.
The bacterium can be grown and cultured easily and inexpensively in a laboratory setting, and has been intensively investigated for over 60 years. E. coli is a chemoheterotroph whose chemically defined medium must include a source of carbon and energy. E. coli is the most widely studied prokaryotic model organism, and an important species in the fields of biotechnology and microbiology, where it has served as the host organism for the majority of work with recombinant DNA. Under favorable conditions, it takes only 20 minutes to reproduce.
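The 20-minute figure implies exponential growth. As a rough illustration only, the sketch below assumes unlimited nutrients and a constant doubling time, which no real culture sustains for long; the function name and its parameters are hypothetical.

```python
def cells_after(minutes, initial_cells=1, doubling_time=20.0):
    """Idealized cell count after `minutes` of unrestricted exponential growth.

    Assumes a constant doubling time (in minutes) and no nutrient limitation.
    """
    if doubling_time <= 0:
        raise ValueError("doubling_time must be positive")
    return initial_cells * 2 ** (minutes / doubling_time)


# One cell dividing every 20 minutes for one hour undergoes three doublings.
print(cells_after(60))  # 8.0
```

In practice growth slows as nutrients are depleted, so this formula describes only the exponential phase of a culture.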
Type and morphology
E. coli is a Gram-negative, facultatively anaerobic, nonsporulating bacterium: it makes ATP by aerobic respiration if oxygen is present, but can switch to fermentation or anaerobic respiration if oxygen is absent. Cells are typically rod-shaped, about 2.0 μm long and 0.25–1.0 μm in diameter, with a cell volume of 0.6–0.7 μm³.
E. coli stains Gram-negative because its cell wall is composed of a thin peptidoglycan layer and an outer membrane. During the staining process, E. coli picks up the color of the counterstain safranin and stains pink. The outer membrane surrounding the cell wall provides a barrier to certain antibiotics such that E. coli is not damaged by penicillin.
Strains that possess flagella are motile. The flagella have a peritrichous arrangement.
E. coli can live on a wide variety of substrates and uses mixed-acid fermentation in anaerobic conditions, producing lactate, succinate, ethanol, acetate, and carbon dioxide. Since many pathways in mixed-acid fermentation produce hydrogen gas, these pathways require the levels of hydrogen to be low, as is the case when E. coli lives together with hydrogen-consuming organisms, such as methanogens or sulphate-reducing bacteria.
Optimum growth of E. coli occurs at 37 °C (98.6 °F), but some laboratory strains can multiply at temperatures up to 49 °C (120 °F). E. coli grows in a variety of laboratory media, such as lysogeny broth, or in any defined medium that contains glucose, monobasic ammonium phosphate, sodium chloride, magnesium sulfate, dibasic potassium phosphate, and water. Growth can be driven by aerobic or anaerobic respiration, using a large variety of redox pairs, including the oxidation of pyruvic acid, formic acid, hydrogen, and amino acids, and the reduction of substrates such as oxygen, nitrate, fumarate, dimethyl sulfoxide, and trimethylamine N-oxide. E. coli is classified as a facultative anaerobe. It uses oxygen when it is present and available. It can, however, continue to grow in the absence of oxygen using fermentation or anaerobic respiration. The ability to continue growing in the absence of oxygen is an advantage to bacteria because their survival is increased in environments where water predominates.
The bacterial cell cycle is divided into three stages. The B period occurs between the completion of cell division and the beginning of DNA replication. The C period encompasses the time it takes to replicate the chromosomal DNA. The D period refers to the stage between the conclusion of DNA replication and the end of cell division. The doubling rate of E. coli is higher when more nutrients are available. However, the lengths of the C and D periods do not change, even when the doubling time becomes less than the sum of the C and D periods. At the fastest growth rates, replication begins before the previous round of replication has completed, resulting in multiple replication forks along the DNA and overlapping cell cycles.
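The relationship between the three periods and the doubling time can be sketched in a few lines. This is a minimal illustration, not a model taken from the text: the default values of 40 minutes for the C period and 20 minutes for the D period are assumed, commonly quoted figures, and the function name is hypothetical.

```python
def describe_cell_cycle(doubling_time, c_period=40.0, d_period=20.0):
    """Classify a cell cycle from its doubling time (all values in minutes).

    c_period and d_period are assumed constants, as described in the text;
    the defaults are illustrative and not taken from the source.
    """
    if doubling_time <= 0:
        raise ValueError("doubling_time must be positive")
    return {
        # A B period exists only if the cycle is longer than C + D.
        "b_period": max(0.0, doubling_time - (c_period + d_period)),
        # Replication must start in an earlier generation than the division it serves.
        "cycles_overlap": doubling_time < c_period + d_period,
        # A new round starts before the previous one finishes (multiple forks).
        "rounds_overlap": doubling_time < c_period,
    }


for tau in (90, 50, 20):
    print(tau, describe_cell_cycle(tau))
```

With these assumed values, a 90-minute doubling time leaves a 30-minute B period, whereas at a 20-minute doubling time there is no B period and successive rounds of replication overlap.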
Unlike eukaryotes, prokaryotes do not rely upon either changes in gene expression or changes in protein synthesis to control the cell cycle. This probably explains why they do not have similar proteins to those used by eukaryotes to control their cell cycle, such as cdk1. This has led to research on what the control mechanism is in prokaryotes. Recent evidence suggests that it may be membrane- or lipid-based.
E. coli and related bacteria possess the ability to transfer DNA via bacterial conjugation or transduction, which allows genetic material to spread horizontally through an existing population. Transduction, in which a bacterial virus (a bacteriophage) carries DNA from one cell to another, is how the gene encoding the Shiga toxin spread from Shigella to E. coli, helping to produce E. coli O157:H7, the Shiga toxin-producing strain of E. coli.
E. coli encompasses an enormous population of bacteria that exhibit a very high degree of both genetic and phenotypic diversity. Genome sequencing of a large number of isolates of E. coli and related bacteria shows that a taxonomic reclassification would be desirable. However, this has not been done, largely due to its medical importance, and E. coli remains one of the most diverse bacterial species: only 20% of the genes in a typical E. coli genome are shared among all strains.
In fact, from the evolutionary point of view, the members of genus Shigella (S. dysenteriae, S. flexneri, S. boydii, and S. sonnei) should be classified as E. coli strains, a phenomenon termed taxa in disguise. Similarly, other strains of E. coli (e.g. the K-12 strain commonly used in recombinant DNA work) are sufficiently different that they would merit reclassification.
A strain is a subgroup within the species that has unique characteristics that distinguish it from other strains. These differences are often detectable only at the molecular level; however, they may result in changes to the physiology or lifecycle of the bacterium. For example, a strain may gain pathogenic capacity, the ability to use a unique carbon source, the ability to take up a particular ecological niche, or the ability to resist antimicrobial agents. Different strains of E. coli are often host-specific, making it possible to determine the source of fecal contamination in environmental samples. For example, knowing which E. coli strains are present in a water sample allows researchers to make assumptions about whether the contamination originated from a human, another mammal, or a bird.
A common system for subdividing E. coli, though not one based on evolutionary relatedness, is by serotype, which is based on the major surface antigens (O antigen: part of the lipopolysaccharide layer; H antigen: flagellin; K antigen: capsule), e.g. O157:H7. It is, however, common to cite only the serogroup, i.e. the O antigen. At present, about 190 serogroups are known. The common laboratory strain has a mutation that prevents the formation of an O antigen and is thus not typeable.
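The serotype notation can be read mechanically. The sketch below is illustrative and assumes the simple colon-separated form seen in names such as O157:H7 or O1:K1:H7; the function names are hypothetical, and variants used in practice (for example, notations for non-motile strains) are not handled.

```python
def parse_serotype(serotype):
    """Split a serotype such as 'O157:H7' into its O, K and H antigen types."""
    antigens = {}
    for part in serotype.split(":"):
        part = part.strip()
        kind, number = part[:1].upper(), part[1:]
        if kind not in ("O", "H", "K") or not number:
            raise ValueError(f"unrecognized antigen: {part!r}")
        antigens[kind] = number
    return antigens


def serogroup(serotype):
    """Return the serogroup (the O antigen only), or None if it is absent."""
    o_number = parse_serotype(serotype).get("O")
    return f"O{o_number}" if o_number is not None else None


print(parse_serotype("O1:K1:H7"))  # {'O': '1', 'K': '1', 'H': '7'}
print(serogroup("O157:H7"))        # O157
```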
Genome plasticity and evolution
As with all lifeforms, new strains of E. coli evolve through the natural biological processes of mutation, gene duplication, and horizontal gene transfer; in particular, 18% of the genome of the laboratory strain MG1655 was horizontally acquired since the divergence from Salmonella. E. coli K-12 and E. coli B strains are the most frequently used varieties for laboratory purposes. Some strains develop traits that can be harmful to a host animal. These virulent strains typically cause a bout of diarrhea that is often self-limiting in healthy adults but is frequently lethal to children in the developing world. More virulent strains, such as O157:H7, cause serious illness or death in the elderly, the very young, or the immunocompromised.
The genera Escherichia and Salmonella diverged around 102 million years ago (credibility interval: 57–176 mya), which coincides with the divergence of their hosts: the former being found in mammals and the latter in birds and reptiles. This was followed by a split of an Escherichia ancestor into five species (E. albertii, E. coli, E. fergusonii, E. hermannii, and E. vulneris). The last E. coli ancestor split between 20 and 30 million years ago.
The long-term evolution experiment using E. coli, begun by Richard Lenski in 1988, has allowed direct observation of major evolutionary shifts in the laboratory. In this experiment, one population of E. coli unexpectedly evolved the ability to aerobically metabolize citrate, which is extremely rare in E. coli. As the inability to grow aerobically on citrate is normally used as a diagnostic criterion with which to differentiate E. coli from other, closely related bacteria, such as Salmonella, this innovation may mark a speciation event observed in the laboratory.
E. coli is the type species of the genus (Escherichia) and in turn Escherichia is the type genus of the family Enterobacteriaceae, where the family name does not stem from the genus Enterobacter + "i" (sic.) + "aceae", but from "enterobacterium" + "aceae" (enterobacterium being not a genus, but an alternative trivial name to enteric bacterium).
The original strain described by Escherich is believed to be lost; consequently, a new type strain (neotype) was chosen as a representative. The neotype strain is U5/41T, also known under the deposit names DSM 30083, ATCC 11775, and NCTC 9001; it is pathogenic to chickens and has an O1:K1:H7 serotype. However, most studies have used O157:H7, K-12 MG1655, or K-12 W3110 as a representative E. coli. The genome of the type strain has only recently been sequenced. Whole-genome sequences in particular yield highly supported phylogenies. Based on such data, five subspecies of E. coli were distinguished.
The link between phylogenetic distance ("relatedness") and pathology is weak: for example, the O157:H7 serotype strains, which form a clade ("an exclusive group"; group E below), are all enterohaemorrhagic strains (EHEC), but not all EHEC strains are closely related. In fact, four different species of Shigella are nested among E. coli strains (vide supra), while E. albertii and E. fergusonii are outside of this group. Indeed, all Shigella species were placed within a single subspecies of E. coli in a phylogenomic study that included the type strain, and for this reason a corresponding reclassification is difficult. All commonly used research strains of E. coli belong to group A and are derived mainly from Clifton's K-12 strain (λ⁺ F⁺; O16) and to a lesser degree from d'Herelle's Bacillus coli strain (B strain) (O7).
The first complete DNA sequence of an E. coli genome (laboratory strain K-12 derivative MG1655) was published in 1997. It was found to be a circular DNA molecule 4.6 million base pairs in length, containing 4288 annotated protein-coding genes (organized into 2584 operons), seven ribosomal RNA (rRNA) operons, and 86 transfer RNA (tRNA) genes. Despite having been the subject of intensive genetic analysis for about 40 years, a large number of these genes were previously unknown. The coding density was found to be very high, with a mean distance between genes of only 118 base pairs. The genome was observed to contain a significant number of transposable genetic elements, repeat elements, cryptic prophages, and bacteriophage remnants.
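The figures quoted for MG1655 allow a quick back-of-the-envelope check of the coding density. The calculation below is a rough sketch under stated assumptions: it treats the rounded 4.6 million base pairs as exact, assumes one intergenic gap per protein-coding gene on the circular chromosome, and ignores RNA genes and overlapping genes. The variable names are hypothetical.

```python
# Values quoted in the text for E. coli K-12 MG1655.
GENOME_LENGTH_BP = 4_600_000      # rounded figure, treated here as exact
PROTEIN_CODING_GENES = 4288
MEAN_INTERGENIC_BP = 118

# On a circular chromosome, N genes are separated by N intergenic gaps.
intergenic_bp = PROTEIN_CODING_GENES * MEAN_INTERGENIC_BP
genic_bp = GENOME_LENGTH_BP - intergenic_bp

mean_gene_length_bp = genic_bp / PROTEIN_CODING_GENES
genic_fraction = genic_bp / GENOME_LENGTH_BP

print(f"total intergenic sequence: {intergenic_bp} bp")
print(f"mean gene length: {mean_gene_length_bp:.0f} bp")
print(f"fraction of genome within genes: {genic_fraction:.1%}")
```

Under these simplifications, the estimate comes to roughly 955 base pairs per gene and about 89% of the genome lying within genes, consistent with the description of the coding density as very high.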
Today, several hundred complete genomic sequences of Escherichia and Shigella species are available. The genome sequence of the type strain of E. coli was not added to this collection until 2014. Comparison of these sequences shows a remarkable amount of diversity; only about 20% of each genome represents sequences present in every one of the isolates, while around 80% of each genome can vary among isolates. Each individual genome contains between 4,000 and 5,500 genes, but the total number of different genes among all of the sequenced E. coli strains (the pangenome) exceeds 16,000. This very large variety of component genes has been interpreted to mean that two-thirds of the E. coli pangenome originated in other species and arrived through the process of horizontal gene transfer.
Genes in E. coli are usually named by 4-letter acronyms that derive from their function (when known). For instance, recA is named after its role in homologous recombination plus the letter A. Functionally related genes are named recB, recC, recD etc. The proteins are named by uppercase acronyms, e.g. RecA, RecB, etc. When the genome of E. coli was sequenced, all genes were numbered (more or less) in their order on the genome and abbreviated by b numbers, such as b2819 (= recD). The "b" numbers were named after Fred Blattner, who led the genome sequencing effort. Another numbering system was introduced with the sequence of another E. coli strain, W3110, which was sequenced in Japan and hence uses numbers starting with JW (Japanese W3110), e.g. JW2787 (= recD). Hence, recD = b2819 = JW2787. Note, however, that most databases have their own numbering system, e.g. the EcoGene database uses EG10826 for recD. Finally, ECK numbers are specifically used for alleles in the MG1655 strain of E. coli K-12. Complete lists of genes and their synonyms can be obtained from databases such as EcoGene or Uniprot.
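The distinction between the genes shared by every isolate and the pangenome can be expressed with set operations. The example below is a toy sketch: the three strains and their gene lists are invented for illustration (only recA and recD are gene names that appear in this article), and the function name is hypothetical.

```python
def core_and_pangenome(genomes):
    """Return (core, pan): genes present in every genome, and in any genome.

    `genomes` maps a strain name to an iterable of gene names.
    """
    gene_sets = [set(genes) for genes in genomes.values()]
    if not gene_sets:
        raise ValueError("at least one genome is required")
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    return core, pan


# Hypothetical toy data; real genomes contain thousands of genes.
toy_genomes = {
    "strain_1": {"recA", "recD", "accessory_1"},
    "strain_2": {"recA", "recD", "accessory_2", "accessory_3"},
    "strain_3": {"recA", "accessory_1", "accessory_4"},
}

core, pan = core_and_pangenome(toy_genomes)
print(sorted(core))  # ['recA']
print(len(pan))      # 6
```

As more genomes are added, the shared set can only shrink or stay the same while the pangenome can only grow, which is why the two figures diverge so strongly across hundreds of sequenced isolates.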
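Because one gene can carry several identifiers, a lookup table that maps every synonym to a single name is a common way to reconcile them. The sketch below uses only the recD identifiers given above; the table layout and function names are hypothetical and do not reflect the actual format of any database.

```python
# Synonyms for recD as given in the text: b number, JW number, EcoGene ID.
SYNONYMS = {
    "recD": {"b2819", "JW2787", "EG10826"},
}


def build_index(synonyms):
    """Map every identifier, including the gene name itself, to the gene name."""
    index = {}
    for gene_name, identifiers in synonyms.items():
        index[gene_name] = gene_name
        for identifier in identifiers:
            index[identifier] = gene_name
    return index


def resolve(identifier, index):
    """Return the gene name for an identifier, or None if it is unknown.

    Matching is case-sensitive, since case distinguishes a gene (recD)
    from its protein (RecD).
    """
    return index.get(identifier)


INDEX = build_index(SYNONYMS)
print(resolve("JW2787", INDEX))  # recD
print(resolve("b2819", INDEX))   # recD
```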
Several studies have investigated the proteome of E. coli. By 2006, 1,627 (38%) of the 4,237 open reading frames (ORFs) had been identified experimentally.
Protein complexes. A 2006 study purified 4,339 proteins from cultures of strain K-12 and found interacting partners for 2,667 proteins, many of which had unknown functions at the time. A 2009 study found 5,993 interactions between proteins of the same E. coli strain, though these data showed little overlap with those of the 2006 publication.
Binary interactions. Rajagopala et al. (2014) have carried out systematic yeast two-hybrid screens with most E. coli proteins, and found a total of 2,234 protein-protein interactions. This study also integrated genetic interactions and protein structures and mapped 458 interactions within 227 protein complexes.
E. coli belongs to a group of bacteria informally known as coliforms that are found in the gastrointestinal tract of warm-blooded animals. E. coli normally colonizes an infant's gastrointestinal tract within 40 hours of birth, arriving with food or water or from the individuals handling the child. In the bowel, E. coli adheres to the mucus of the large intestine. It is the primary facultative anaerobe of the human gastrointestinal tract. (Facultative anaerobes are organisms that can grow in either the presence or absence of oxygen.) As long as these bacteria do not acquire genetic elements encoding for virulence factors, they remain benign commensals.
Nonpathogenic E. coli strain Nissle 1917, also known as Mutaflor, and E. coli O83:K24:H31 (known as Colinfant) are used as probiotic agents in medicine, mainly for the treatment of various gastroenterological diseases, including inflammatory bowel disease.
Role in disease
Most E. coli strains do not cause disease, but virulent strains can cause gastroenteritis, urinary tract infections, and neonatal meningitis. Infection can also be characterized by severe abdominal cramps, diarrhea that typically turns bloody within 24 hours, and sometimes fever. In rarer cases, virulent strains are also responsible for bowel necrosis (tissue death) and perforation without progressing to hemolytic-uremic syndrome, as well as for peritonitis, mastitis, septicemia, and Gram-negative pneumonia.
There is one strain, E. coli O157:H7, that produces the Shiga toxin (classified as a bioterrorism agent). This toxin causes premature destruction of the red blood cells, which then clog the body's filtering system, the kidneys, causing hemolytic-uremic syndrome (HUS). This in turn causes strokes due to small clots of blood which lodge in capillaries in the brain. This causes the body parts controlled by this region of the brain not to work properly. In addition, this strain causes the buildup of fluid (since the kidneys do not work), leading to edema around the lungs and legs and arms. This increase in fluid buildup especially around the lungs impedes the functioning of the heart, causing an increase in blood pressure.
Uropathogenic E. coli (UPEC) is one of the main causes of urinary tract infections. It is part of the normal flora in the gut and can be introduced in many ways. In particular for females, the direction of wiping after defecation (wiping back to front) can lead to fecal contamination of the urogenital orifices. Anal intercourse can also introduce this bacterium into the male urethra, and in switching from anal to vaginal intercourse, the male can also introduce UPEC to the female urogenital system.
Certain strains of E. coli are a major cause of foodborne illness. In May 2011, one E. coli strain, O104:H4, was the subject of a bacterial outbreak that began in Germany. The outbreak started when several people in Germany were infected with enterohemorrhagic E. coli (EHEC) bacteria, leading to hemolytic-uremic syndrome (HUS), a medical emergency that requires urgent treatment. The outbreak affected not only Germany but also 11 other countries, including regions in North America. On 30 June 2011, the German Bundesinstitut für Risikobewertung (BfR) (Federal Institute for Risk Assessment, a federal institute within the German Federal Ministry of Food, Agriculture and Consumer Protection) announced that seeds of fenugreek from Egypt were likely the cause of the EHEC outbreak.
The mainstay of treatment is the assessment of dehydration and replacement of fluid and electrolytes. Administration of antibiotics has been shown to shorten the course of illness and duration of excretion of enterotoxigenic E. coli (ETEC) in adults in endemic areas and in traveller’s diarrhoea, though the rate of resistance to commonly used antibiotics is increasing and they are generally not recommended. The antibiotic used depends upon susceptibility patterns in the particular geographical region. Currently, the antibiotics of choice are fluoroquinolones or azithromycin, with an emerging role for rifaximin. Oral rifaximin, a semisynthetic rifamycin derivative, is an effective and well-tolerated antibacterial for the management of adults with non-invasive traveller’s diarrhoea. Rifaximin was significantly more effective than placebo and no less effective than ciprofloxacin in reducing the duration of diarrhoea. While rifaximin is effective in patients with E. coli-predominant traveller’s diarrhoea, it appears ineffective in patients infected with inflammatory or invasive enteropathogens.
ETEC is the type of E. coli that most vaccine development efforts are focused on. Antibodies against the heat-labile toxin (LT) and major colonization factors (CFs) of ETEC provide protection against LT-producing ETEC expressing homologous CFs. Oral inactivated vaccines consisting of toxin antigen and whole cells, such as the licensed recombinant cholera B subunit (rCTB)-WC cholera vaccine Dukoral, have been developed. There are currently no licensed vaccines for ETEC, though several are in various stages of development. In different trials, the rCTB-WC cholera vaccine provided high (85–100%) short-term protection. An oral ETEC vaccine candidate consisting of rCTB and formalin-inactivated E. coli bacteria expressing major CFs has been shown in clinical trials to be safe, immunogenic, and effective against severe diarrhoea in American travelers but not against ETEC diarrhoea in young children in Egypt. A modified ETEC vaccine consisting of recombinant E. coli strains overexpressing the major CFs and a more LT-like hybrid toxoid called LCTBA is undergoing clinical testing.
Other proven prevention methods for E. coli transmission include handwashing and improved sanitation and drinking water, as transmission occurs through fecal contamination of food and water supplies.
Causes and risk factors
Role in biotechnology
Because of its long history of laboratory culture and ease of manipulation, E. coli plays an important role in modern biological engineering and industrial microbiology. The work of Stanley Norman Cohen and Herbert Boyer in E. coli, using plasmids and restriction enzymes to create recombinant DNA, became a foundation of biotechnology.
E. coli is a very versatile host for the production of heterologous proteins, and various protein expression systems have been developed which allow the production of recombinant proteins in E. coli. Researchers can introduce genes into the microbes using plasmids which permit high level expression of protein, and such protein may be mass-produced in industrial fermentation processes. One of the first useful applications of recombinant DNA technology was the manipulation of E. coli to produce human insulin.
Many proteins previously thought difficult or impossible to express in E. coli in folded form have since been expressed successfully. For example, proteins with multiple disulphide bonds may be produced in the periplasmic space or in the cytoplasm of mutants rendered sufficiently oxidizing to allow disulphide bonds to form, while proteins requiring post-translational modification such as glycosylation for stability or function have been expressed using the N-linked glycosylation system of Campylobacter jejuni engineered into E. coli.
E. coli is frequently used as a model organism in microbiology studies. Cultivated strains (e.g. E. coli K12) are well-adapted to the laboratory environment, and, unlike wild-type strains, have lost their ability to thrive in the intestine. Many laboratory strains lose their ability to form biofilms. These features protect wild-type strains from antibodies and other chemical attacks, but require a large expenditure of energy and material resources.
In 1946, Joshua Lederberg and Edward Tatum first described the phenomenon known as bacterial conjugation using E. coli as a model bacterium, and it remains the primary model to study conjugation. E. coli was an integral part of the first experiments to understand phage genetics, and early researchers, such as Seymour Benzer, used E. coli and phage T4 to understand the topography of gene structure. Prior to Benzer's research, it was not known whether the gene was a linear structure, or if it had a branching pattern.
E. coli was one of the first organisms to have its genome sequenced; the complete genome of E. coli K12 was published in Science in 1997.
By evaluating the possible combination of nanotechnologies with landscape ecology, complex habitat landscapes can be generated with details at the nanoscale. On such synthetic ecosystems, evolutionary experiments with E. coli have been performed to study the spatial biophysics of adaptation in an island biogeography on-chip.
Studies are also being performed attempting to program E. coli to solve complicated mathematics problems, such as the Hamiltonian path problem.
In 1885, the German-Austrian pediatrician Theodor Escherich discovered this organism in the feces of healthy individuals. He called it Bacterium coli commune because it is found in the colon. Early classifications of prokaryotes placed these in a handful of genera based on their shape and motility (at that time Ernst Haeckel's classification of bacteria in the kingdom Monera was in place).
Bacterium coli was the type species of the now invalid genus Bacterium when it was revealed that the former type species ("Bacterium triloculare") was missing. Following a revision of Bacterium, it was reclassified as Bacillus coli by Migula in 1895 and later reclassified in the newly created genus Escherichia, named after its original discoverer.