Rahul Sharma (Editor)

Molecular graphics


Molecular graphics (MG) is the discipline and philosophy of studying molecules and their properties through graphical representation. IUPAC limits the definition to representations on a "graphical display device". Ever since Dalton's atoms and Kekulé's benzene, there has been a rich history of hand-drawn atoms and molecules, and these representations have had an important influence on modern molecular graphics. This article concentrates on the use of computers to create molecular graphics. Note, however, that many molecular graphics programs and systems have close coupling between the graphics and editing commands or calculations such as in molecular modelling.

Relation to molecular models

There has been a long tradition of creating molecular models from physical materials. Perhaps the best known is Crick and Watson's model of DNA built from rods and planar sheets, but the most widely used approach is to represent all atoms and bonds explicitly using the "ball and stick" approach. This can demonstrate a wide range of properties, such as shape, relative size, and flexibility. Many chemistry courses expect that students will have access to ball and stick models. One goal of mainstream molecular graphics has been to represent the "ball and stick" model as realistically as possible and to couple this with calculations of molecular properties.

Figure 1 shows a small molecule (NH₃CH₂CH₂C(OH)(PO₃H)(PO₃H)⁻), as drawn by the Jmol program. It is important to realize that the colors and shapes are purely a convention, as individual atoms are not colored, nor do they have hard surfaces. Bonds between atoms are also not rod-shaped.

Comparison of physical models with molecular graphics

Physical models and computer models have partially complementary strengths and weaknesses. Physical models can be used by those without access to a computer and now can be made cheaply out of plastic materials. Their tactile and visual aspects cannot be easily reproduced by computers (although haptic devices have occasionally been built). On a computer screen, the flexibility of molecules is also difficult to appreciate; illustrating the pseudorotation of cyclohexane is a good example of the value of mechanical models.

However, it is difficult to build large physical molecules, and all-atom physical models of even simple proteins could take weeks or months to build. Moreover, physical models are not robust and they decay over time. Molecular graphics is particularly valuable for representing global and local properties of molecules, such as electrostatic potential. Graphics can also be animated to represent molecular processes and chemical reactions, a feat that is not easy to reproduce physically.

History

Initially the rendering was on early cathode-ray tube screens or through plotters drawing on paper. Molecular structures have always been an attractive choice for developing new computer graphics tools, since the input data are easy to create and the results are usually highly appealing. The first example of MG was a display of a protein molecule (Project MAC, 1966) by Cyrus Levinthal and Robert Langridge. Among the milestones in high-performance MG was the work of Nelson Max in "realistic" rendering of macromolecules using reflecting spheres.

By about 1980 many laboratories both in academia and industry had recognized the power of the computer to analyse and predict the properties of molecules, especially in materials science and the pharmaceutical industry. The discipline was often called "molecular graphics" and in 1982 a group of academics and industrialists in the UK set up the Molecular Graphics Society (MGS). Initially much of the technology concentrated on high-performance 3D graphics, including either interactive rotation or 3D rendering of atoms as spheres (sometimes with radiosity). During the 1980s a number of programs for calculating molecular properties (such as molecular dynamics and quantum mechanics) became available and the term "molecular graphics" often included these. As a result, the MGS has now changed its name to the Molecular Graphics and Modelling Society (MGMS).

The requirements of macromolecular crystallography also drove MG because the traditional techniques of physical model-building could not scale. The first two protein structures solved by molecular graphics without the aid of the Richards' Box were built with Stan Swanson's program FIT on the Vector General graphics display in the laboratory of Edgar Meyer at Texas A&M University: first, Marge Legg in Al Cotton's lab at A&M solved the structure of staphylococcal nuclease (1975), and then Jim Hogle solved the structure of monoclinic lysozyme in 1976. A full year passed before other graphics systems were used to replace the Richards' Box for modelling into density in 3-D. Alwyn Jones' FRODO program (and later "O") was developed to overlay the molecular electron density determined from X-ray crystallography and the hypothetical molecular structure.

In 2009 BALLView became the first software to use ray tracing for molecular graphics.

Art, science and technology in molecular graphics

Both computer technology and graphic arts have contributed to molecular graphics. The development of structural biology in the 1950s led to a requirement to display molecules with thousands of atoms. The existing computer technology was limited in power, and in any case a naive depiction of all atoms left viewers overwhelmed. Most systems therefore used conventions where information was implicit or stylistic. Two vectors meeting at a point implied an atom or (in macromolecules) a complete residue (10-20 atoms).

The macromolecular approach was popularized by Dickerson and Geis' presentation of proteins and the graphic work of Jane Richardson through high-quality hand-drawn diagrams such as the "ribbon" representation. In this they strove to capture the intrinsic 'meaning' of the molecule. This search for the "messages in the molecule" has always accompanied the increasing power of computer graphics processing. Typically the depiction would concentrate on specific areas of the molecule (such as the active site) and this might have different colors or more detail in the number of explicit atoms or the type of depiction (e.g., spheres for atoms).

In some cases the limitations of technology have led to serendipitous methods for rendering. Most early graphics devices used vector graphics, which meant that rendering spheres and surfaces was impossible. Michael Connolly's program "MS" calculated points on the solvent-accessible surface of a molecule, and the points were rendered as dots with good visibility using the new vector graphics technology, such as the Evans and Sutherland PS300 series. Thin sections ("slabs") through the structural display showed very clearly the complementarity of the surfaces for molecules binding to active sites, and the "Connolly surface" became a universal metaphor.

The relationship between the art and science of molecular graphics is shown in the exhibitions sponsored by the Molecular Graphics Society. Some exhibits are created with molecular graphics programs alone, while others are collages, or involve physical materials. An example from Mike Hann (1994), inspired by Magritte's painting Ceci n'est pas une pipe, uses an image of a salmeterol molecule. "Ceci n'est pas une molecule," writes Mike Hann, "serves to remind us that all of the graphics images presented here are not molecules, not even pictures of molecules, but pictures of icons which we believe represent some aspects of the molecule's properties."

Colour molecular graphics is often used on chemistry journal covers in an artistic manner.

Space-filling models

Fig. 4 is a "space-filling" representation of formic acid, where atoms are drawn as solid spheres to suggest the space they occupy. This and all space-filling models are necessarily icons or abstractions: atoms are nuclei with electron "clouds" of varying density surrounding them, and as such have no actual surfaces. For many years the size of atoms has been approximated by physical models (CPK) in which the volumes of plastic balls describe where much of the electron density is to be found (often sized to van der Waals radii). That is, the surface of these models is meant to represent a specific level of density of the electron cloud, not any putative physical surface of the atom.

Since the atomic radii (e.g. in Fig. 4) are only slightly less than the distance between bonded atoms, the iconic spheres intersect, and in the CPK models, this was achieved by planar truncations along the bonding directions, the section being circular. When raster graphics became affordable, one of the common approaches was to replicate CPK models in silico. It is relatively straightforward to calculate the circles of intersection, but more complex to represent a model with hidden surface removal. A useful side product is that a conventional value for the molecular volume can be calculated.
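The circle-of-intersection calculation mentioned above can be sketched in a few lines of Python. This is a minimal illustration for two overlapping spheres only, not code from any CPK-style program; the function names are hypothetical, and the radii and bond length in the usage note are illustrative values rather than data taken from Fig. 4. For two spheres, the union volume follows from subtracting the two spherical caps cut off by the intersection plane; molecules with triple overlaps need more bookkeeping than shown here.

```python
import math

def intersection_circle(r1, r2, d):
    """Circle where two spheres of radii r1, r2 (centres d apart) intersect.

    Returns (a, rho): a is the distance of the cutting plane from centre 1
    along the bond direction, rho is the radius of the circle.
    Returns None if the surfaces do not cross (disjoint or nested spheres).
    """
    if d <= abs(r1 - r2) or d >= r1 + r2:
        return None
    a = (d * d + r1 * r1 - r2 * r2) / (2.0 * d)
    rho = math.sqrt(max(r1 * r1 - a * a, 0.0))
    return a, rho

def cap_volume(r, h):
    """Volume of a spherical cap of height h cut from a sphere of radius r."""
    return math.pi * h * h * (3.0 * r - h) / 3.0

def two_sphere_volume(r1, r2, d):
    """Volume of the union of two spheres (a diatomic space-filling model)."""
    total = 4.0 / 3.0 * math.pi * (r1 ** 3 + r2 ** 3)
    cut = intersection_circle(r1, r2, d)
    if cut is None:
        if d >= r1 + r2:
            return total  # spheres do not touch
        return 4.0 / 3.0 * math.pi * max(r1, r2) ** 3  # one sphere inside the other
    a, _ = cut
    # Remove the lens-shaped overlap, which is counted twice in `total`.
    lens = cap_volume(r1, r1 - a) + cap_volume(r2, r2 - (d - a))
    return total - lens
```

For example, `intersection_circle(1.7, 1.52, 1.2)` gives the position and radius of the planar truncation for a pair of atoms with assumed radii of 1.7 and 1.52 Å and an assumed separation of 1.2 Å.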

The use of spheres is often for convenience, being limited both by graphics libraries and the additional effort required to compute complete electronic density or other space-filling quantities. It is now relatively common to see images of surfaces that have been colored to show quantities such as electrostatic potential. Common surfaces in molecular visualization include solvent-accessible ("Lee-Richards") surfaces, solvent-excluded ("Connolly") surfaces, and isosurfaces. The isosurface in Fig. 5 appears to show the electrostatic potential, with blue colors being negative and red/yellow (near the metal) positive (there is no absolute convention of coloring, and red/positive, blue/negative are often reversed). Opaque isosurfaces do not allow the atoms to be seen and identified and it is not easy to deduce them. Because of this, isosurfaces are often drawn with a degree of transparency.
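As a rough illustration of how a dotted solvent-accessible surface can be generated, the sketch below samples points on each atom's sphere expanded by a probe radius and keeps only the points that are not inside any other expanded sphere. This is a Shrake-Rupley-style sampling chosen here for brevity, not Connolly's MS algorithm; the function names, the probe radius of 1.4 Å, and the use of a golden-angle spiral to spread the points are all assumptions of this sketch.

```python
import math

def sphere_points(n):
    """Return n roughly evenly spread points on the unit sphere (golden-angle spiral)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n
        r = math.sqrt(max(1.0 - z * z, 0.0))
        pts.append((r * math.cos(golden * i), r * math.sin(golden * i), z))
    return pts

def accessible_dots(atoms, probe=1.4, n=100):
    """Dots on the solvent-accessible surface.

    atoms: list of (x, y, z, radius) tuples, all in the same length unit.
    A dot on atom i is kept only if it lies outside every other atom's
    sphere expanded by the probe radius.
    """
    unit = sphere_points(n)
    dots = []
    for i, (x, y, z, r) in enumerate(atoms):
        big_r = r + probe
        for ux, uy, uz in unit:
            p = (x + big_r * ux, y + big_r * uy, z + big_r * uz)
            buried = any(
                math.dist(p, other[:3]) < other[3] + probe
                for j, other in enumerate(atoms)
                if j != i
            )
            if not buried:
                dots.append(p)
    return dots
```

The surviving dots can be drawn directly as points, which is why this kind of surface suited vector displays; the fraction of dots kept per atom also gives a crude estimate of its exposed area.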

Technology

Early interactive molecular computer graphics systems were vector graphics machines, which used stroke-writing vector monitors, sometimes even oscilloscopes. The electron beam does not sweep left-and-right as in a raster display. The display hardware followed a sequential list of digital drawing instructions (the display list), directly drawing at an angle one stroke for each molecular bond. When the list was complete, drawing would begin again from the top of the list, so if the list was long (a large number of molecular bonds), the display would flicker heavily. Later vector displays could rotate complex structures with smooth motion, since the orientation of all of the coordinates in the display list could be changed by loading just a few numbers into rotation registers in the display unit, and the display unit would multiply all coordinates in the display list by the contents of these registers as the picture was drawn.
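The rotation-register idea can be mimicked in software. The sketch below is only an analogy, not a description of any particular display unit: the stored display list of strokes is never modified, and a 3×3 matrix (standing in for the registers) is applied to every coordinate each time the picture is redrawn. The function names are hypothetical.

```python
import math

def rotation_y(angle_deg):
    """3x3 matrix for a rotation about the y axis (the 'register' contents)."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return ((c, 0.0, s),
            (0.0, 1.0, 0.0),
            (-s, 0.0, c))

def transform(m, p):
    """Multiply the 3x3 matrix m by the point p."""
    return tuple(sum(m[i][j] * p[j] for j in range(3)) for i in range(3))

def refresh(display_list, registers):
    """One refresh cycle: walk the list and emit each stroke, rotated on the fly.

    display_list is a list of (start, end) pairs of 3D points, one per bond.
    The list itself is left untouched; only the matrix changes between frames.
    """
    return [(transform(registers, a), transform(registers, b))
            for a, b in display_list]

# Two bonds sharing an atom at the origin; rotate the view by 90 degrees.
bonds = [((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
         ((0.0, 0.0, 0.0), (0.0, 1.0, 0.0))]
frame = refresh(bonds, rotation_y(90.0))
```

Changing the view costs nine numbers per frame regardless of how many bonds are in the list, which is the property that made smooth rotation possible.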

The early black-and-white vector displays could somewhat distinguish, for example, a molecular structure from its surrounding electron density map for crystallographic structure solution work by drawing the molecule brighter than the map. Color display makes them easier to tell apart. During the 1970s two-color stroke-writing Penetron tubes were available, but not used in molecular computer graphics systems. In about 1980 Evans & Sutherland made the first practical full-color vector displays for molecular graphics, typically attached to an E&S PS-300 display. This early color tube was expensive, because it was originally engineered to withstand the shaking of a flight-simulator motion base.

Color raster graphics display of molecular models began around 1978, as seen in a paper by Porter on spherical shading of atomic models. Early raster molecular graphics systems displayed static images that could take around a minute to generate. Dynamically rotating color raster molecular display was phased in during 1982-1985 with the introduction of the Ikonas programmable raster display.

Molecular graphics has always pushed the limits of display technology, and has seen a number of cycles of integration and separation of compute-host and display. Early systems like Project MAC were bespoke and unique, but in the 1970s the MMS-X and similar systems used (relatively) low-cost terminals, such as the Tektronix 4014 series, often over dial-up lines to multi-user hosts. The devices could only display static pictures but were able to evangelize MG. In the late 1970s, it was possible for departments (such as crystallography) to afford their own hosts (e.g., PDP-11) and to attach a display (such as Evans & Sutherland's MPS) directly to the bus. The display list was kept on the host, and interactivity was good since updates were rapidly reflected in the display—at the cost of reducing most machines to a single-user system.

In the early 1980s, Evans & Sutherland (E&S) decoupled their PS300 display, which contained its own display information transformable through a dataflow architecture. Complex graphical objects could be downloaded over a serial line (e.g. 9600 baud) and then manipulated without impact on the host. The architecture was excellent for high performance display but very inconvenient for domain-specific calculations, such as electron-density fitting and energy calculations. Many crystallographers and modellers spent arduous months trying to fit such activities into this architecture.

The benefits for MG were considerable, but by the later 1980s, UNIX workstations such as Sun-3 with raster graphics (initially at a resolution of 256 by 256) had started to appear. Computer-assisted drug design in particular required raster graphics for the display of computed properties such as atomic charge and electrostatic potential. Although E&S had a high-end range of raster graphics (primarily aimed at the aerospace industry) they failed to respond to the low-end market challenge where single users, rather than engineering departments, bought workstations. As a result, the market for MG displays passed to Silicon Graphics, coupled with the development of minisupercomputers (e.g., CONVEX and Alliant) which were affordable for well-supported MG laboratories. Silicon Graphics provided a graphics language, IrisGL, which was easier to use and more productive than the PS300 architecture. Commercial companies (e.g., Biosym, Polygen/MSI) ported their code to Silicon Graphics, and by the early 1990s, this was the "industry standard". Dial boxes were often used as control devices.

Stereoscopic displays were developed based on liquid crystal polarized spectacles, and while this had been very expensive on the PS300, it now became a commodity item. A common alternative was to add a polarizable screen to the front of the display and to provide viewers with extremely cheap spectacles with orthogonal polarization for separate eyes. With projectors such as Barco, it was possible to project stereoscopic display onto special silvered screens and supply an audience of hundreds with spectacles. In this way molecular graphics became universally known within large sectors of chemical and biochemical science, especially in the pharmaceutical industry. Because the backgrounds of many displays were black by default, it was common for modelling sessions and lectures to be held with almost all lighting turned off.

In the last decade almost all of this technology has become commoditized. IrisGL evolved to OpenGL so that molecular graphics can be run on any machine. In 1992, Roger Sayle released his RasMol program into the public domain. RasMol contained a very high-performance molecular renderer that ran on Unix/X Window, and Sayle later ported this to the Windows and Macintosh platforms. The Richardsons developed kinemages and the Mage software, which was also multi-platform. By specifying the chemical MIME type, molecular models could be served over the Internet, so that for the first time MG could be distributed at zero cost regardless of platform. In 1995, Birkbeck College's crystallography department used this to run "Principles of Protein Structure", the first multimedia course on the Internet, which reached 100 to 200 scientists.

MG continues to see innovation that balances technology and art, and currently zero-cost or open source programs such as PyMOL and Jmol have very wide use and acceptance.

Recently the widespread diffusion of advanced graphics hardware has improved the rendering capabilities of the visualization tools. The capabilities of current shading languages allow the inclusion of advanced graphic effects (like ambient occlusion, cast shadows and non-photorealistic rendering techniques) in the interactive visualization of molecules. These graphic effects, besides being eye candy, can improve the comprehension of the three-dimensional shapes of the molecules. An example of the effects that can be achieved by exploiting recent graphics hardware can be seen in the simple open source visualization system QuteMol.

Reference frames

Drawing molecules requires a transformation between molecular coordinates (usually, but not always, in Angstrom units) and the screen. Because many molecules are chiral it is essential that the handedness of the system (almost always right-handed) is preserved. In molecular graphics the origin (0, 0) is usually at the lower left, while in many computer systems the origin is at top left. If the z-coordinate is out of the screen (towards the viewer) the molecule will be referred to right-handed axes, while the screen display will be left-handed.
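One way to guard against an accidental mirror image is to compare the sign of a signed volume before and after a coordinate transformation. The sketch below is a minimal check under the assumption that four non-coplanar atoms are available as a probe; the function names are hypothetical. Negating a single axis (as in a y flip to a top-left origin) reverses the sign, so the flipped numbers must not be treated as right-handed molecular coordinates, whereas negating two axes is a proper rotation and keeps the sign.

```python
def signed_volume(a, b, c, d):
    """Six times the signed volume of the tetrahedron a, b, c, d.

    The sign changes whenever the four points are replaced by their mirror image.
    """
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def flip_y(p, y_max):
    """Map a bottom-left-origin point to a top-left-origin screen."""
    return (p[0], y_max - p[1], p[2])

def flip_y_and_z(p, y_max):
    """Flip y and negate z: a rotation, so handedness is preserved."""
    return (p[0], y_max - p[1], -p[2])

def same_handedness(points, transformed):
    """True if the transformed points have the same chirality as the originals."""
    return signed_volume(*points) * signed_volume(*transformed) > 0

probe = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
mirrored = [flip_y(p, 10.0) for p in probe]
rotated = [flip_y_and_z(p, 10.0) for p in probe]
```

Here `same_handedness(probe, mirrored)` is false and `same_handedness(probe, rotated)` is true, which makes the check usable as an assertion in a drawing pipeline.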

Molecular transformations normally require:

  • scaling of the display (but not the molecule).
  • translations of the molecule and objects on the screen.
  • rotations about points and lines.

Conformational changes (e.g. rotations about bonds) require rotation of one part of the molecule relative to another. The programmer must decide whether a transformation on the screen reflects a change of view or a change in the molecule or its reference frame.
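A rotation about a bond can be sketched with Rodrigues' rotation formula, applied only to the atoms on one side of the bond. This is an illustrative sketch rather than the method of any specific program; the function names and the explicit `moving` set of atom indices are assumptions. Because the stored coordinates change, this is a change in the molecule, not a change of view.

```python
import math

def rotate_about_bond(p, a, b, angle_deg):
    """Rotate point p by angle_deg about the line through atoms a and b."""
    k = [b[i] - a[i] for i in range(3)]
    norm = math.sqrt(sum(c * c for c in k))
    if norm == 0.0:
        raise ValueError("bond atoms coincide; rotation axis is undefined")
    k = [c / norm for c in k]                  # unit vector along the bond
    v = [p[i] - a[i] for i in range(3)]        # p relative to a point on the axis
    t = math.radians(angle_deg)
    cos_t, sin_t = math.cos(t), math.sin(t)
    k_cross_v = [k[1] * v[2] - k[2] * v[1],
                 k[2] * v[0] - k[0] * v[2],
                 k[0] * v[1] - k[1] * v[0]]
    k_dot_v = sum(k[i] * v[i] for i in range(3))
    # Rodrigues' formula, then translate back to the original frame.
    return tuple(a[i] + v[i] * cos_t + k_cross_v[i] * sin_t
                 + k[i] * k_dot_v * (1.0 - cos_t) for i in range(3))

def twist(coords, a_idx, b_idx, moving, angle_deg):
    """Return new coordinates with the atoms in `moving` rotated about bond a-b."""
    a, b = coords[a_idx], coords[b_idx]
    return [rotate_about_bond(p, a, b, angle_deg) if i in moving else p
            for i, p in enumerate(coords)]
```

A view rotation, by contrast, would apply one matrix to every atom at draw time and leave the stored coordinates alone.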

Simple

In early displays only vectors could be drawn (e.g. Fig. 7); these are easy to draw because no rendering or hidden surface removal is required.

On vector machines the lines would be smooth, but on raster devices Bresenham's algorithm is used (note the "jaggies" on some of the bonds, which can be largely removed with antialiasing software).
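For reference, a common integer-only formulation of Bresenham's line algorithm that handles all directions is sketched below. It returns the pixel coordinates rather than drawing them, so it can be attached to any raster back end; the function name is an assumption of this sketch.

```python
def bresenham(x0, y0, x1, y1):
    """Return the list of integer pixels on the line from (x0, y0) to (x1, y1)."""
    points = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1   # step direction in x
    sy = 1 if y0 < y1 else -1   # step direction in y
    err = dx + dy               # running error term
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:            # error says: step in x
            err += dy
            x0 += sx
        if e2 <= dx:            # error says: step in y
            err += dx
            y0 += sy
    return points
```

The staircase of pixels this produces on shallow or steep bonds is the source of the "jaggies"; antialiasing replaces the all-or-nothing pixel choice with partial intensities.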

Atoms can be drawn as circles, but these should be sorted so that those with the largest z-coordinates (nearest the viewer) are drawn last. Although imperfect, this often gives a reasonably attractive display. Other simple tricks which do not include hidden surface algorithms are:

  • coloring each end of a bond with the same color as the atom to which it is attached (Fig. 7).
  • drawing less than the whole length of the bond (e.g. 10%-90%) to simulate the bond sticking out of a circle.
  • adding a small offset white circle within the circle for an atom to simulate reflection.

Typical pseudocode for creating Fig. 7 (to fit the molecule exactly to the screen):

    // assume:
    //   atoms with x, y, z coordinates (Angstrom) and elementSymbol
    //   bonds with pointers/references to atoms at ends
    //   table of colors for elementTypes
    // find limits of molecule in molecule coordinates as xMin, yMin, xMax, yMax
    scale = min(xScreenMax/(xMax-xMin), yScreenMax/(yMax-yMin))
    xOffset = -xMin * scale; yOffset = -yMin * scale
    for (bond in $bonds) {
        atom0 = bond.getAtom(0)
        atom1 = bond.getAtom(1)
        x0 = xOffset+atom0.getX()*scale; y0 = yOffset+atom0.getY()*scale // (1)
        x1 = xOffset+atom1.getX()*scale; y1 = yOffset+atom1.getY()*scale // (2)
        // split the bond at its midpoint and color each half by its atom
        xMid = (x0 + x1) / 2; yMid = (y0 + y1) / 2
        color0 = ColorTable.getColor(atom0.getSymbol())
        drawLine(color0, x0, y0, xMid, yMid)
        color1 = ColorTable.getColor(atom1.getSymbol())
        drawLine(color1, x1, y1, xMid, yMid)
    }

Note that this assumes the origin is in the bottom left corner of the screen, with Y up the screen. Many graphics systems have the origin at the top left, with Y down the screen. In this case the lines (1) and (2) should have the y coordinate generation as:

    y0 = yScreenMax - (yOffset+atom0.getY()*scale) // (1)
    y1 = yScreenMax - (yOffset+atom1.getY()*scale) // (2)

Changes of this sort change the handedness of the axes so it is easy to reverse the chirality of the displayed molecule unless care is taken.
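A runnable counterpart of the half-bond drawing scheme might look like the following Python sketch. It is not tied to any graphics library: instead of calling a drawing routine it returns the colored line segments, together with a back-to-front atom order for drawing circles. The `Atom` class, the `COLORS` table, and the `y_down` switch are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

@dataclass
class Atom:
    symbol: str
    x: float
    y: float
    z: float

# Hypothetical color table; real programs use larger, conventional tables.
COLORS = {"C": "grey", "O": "red", "H": "white", "N": "blue"}

def layout(atoms, bonds, width, height, y_down=True):
    """Fit the molecule to a width x height screen.

    bonds is a list of (i, j) index pairs into atoms.
    Returns (segments, order): segments are (color, x_a, y_a, x_mid, y_mid)
    half-bonds; order lists atom indices from farthest to nearest in z.
    """
    x_min = min(a.x for a in atoms)
    y_min = min(a.y for a in atoms)
    # Guard against a zero extent (e.g. a single atom or a vertical molecule).
    span_x = (max(a.x for a in atoms) - x_min) or 1.0
    span_y = (max(a.y for a in atoms) - y_min) or 1.0
    scale = min(width / span_x, height / span_y)

    def to_screen(a):
        sx = (a.x - x_min) * scale
        sy = (a.y - y_min) * scale
        if y_down:                 # origin at top left, y increasing downwards
            sy = height - sy
        return sx, sy

    segments = []
    for i, j in bonds:
        x0, y0 = to_screen(atoms[i])
        x1, y1 = to_screen(atoms[j])
        x_mid, y_mid = (x0 + x1) / 2, (y0 + y1) / 2
        segments.append((COLORS.get(atoms[i].symbol, "black"), x0, y0, x_mid, y_mid))
        segments.append((COLORS.get(atoms[j].symbol, "black"), x1, y1, x_mid, y_mid))

    # Painter's order: smallest z first, so the nearest atoms are drawn last.
    order = sorted(range(len(atoms)), key=lambda k: atoms[k].z)
    return segments, order
```

Only the screen y value is flipped here; the stored molecular coordinates and the z ordering are untouched, which keeps the flip a matter of presentation rather than a change to the molecule.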

Advanced

For greater realism and better comprehension of the 3D structure of a molecule many computer graphics algorithms can be used. For many years molecular graphics has stressed the capabilities of graphics hardware and has required hardware-specific approaches. With the increasing power of machines on the desktop, portability is more important and programs such as Jmol have advanced algorithms that do not rely on hardware. On the other hand, recent graphics hardware is able to interactively render very complex molecule shapes with a quality that would not be possible with standard software techniques.

Electronic Richards Box Systems

Before computer graphics could be employed, mechanical methods were used to fit large molecules to their electron density maps. Using techniques of X-ray crystallography, crystals of a substance were bombarded with X-rays, and the diffracted beams that came off were assembled by computer using a Fourier transform into a usually blurry 3-D image of the molecule, made visible by drawing contour circles around high electron density to produce a contoured electron density map.

In the earliest days, contoured electron density maps were hand drawn on large plastic sheets. Sometimes, bingo chips were placed on the plastic sheets where atoms were interpreted to be.

This was superseded by the Richards Box, in which an adjustable brass Kendrew molecular model was placed in front of a 2-way mirror, behind which were plastic sheets of the electron density map. This optically superimposed the molecular model and the electron density map. The model was moved to within the contour lines of the superimposed map. Then, atomic coordinates were recorded using a plumb bob and a meter stick. Computer graphics held out the hope of vastly speeding up this process, as well as giving a clearer view in many ways.

A noteworthy attempt to overcome the low speed of graphics displays of the time took place at Washington University in St. Louis, USA. Dave Barry's group attempted to leapfrog the state of the art in graphics displays by making custom display hardware to display images complex enough for large-molecule crystallographic structure solution, fitting molecules to their electron-density maps. The MMS-4 (table above) display modules were slow and expensive, so a second generation of modules was produced for the MMS-X (table above) system.

The first large molecule whose atomic structure was partly determined on a molecular computer graphics system was transfer RNA, by Sung-Hou Kim's team in 1976, after initial fitting on a mechanical Richards Box. The first large molecule whose atomic structure was entirely determined on a molecular computer graphics system is said to be neurotoxin A from venom of the Philippines sea snake, by Tsernoglou, Petsko, and Tu, with a statement of being first in 1977. The Richardson group published partial atomic structure results of the protein superoxide dismutase the same year, in 1977. All of these were done using the GRIP-75 system.

Other structure-fitting systems (FRODO, RING, Builder, MMS-X, etc.; table above) succeeded as well within three years and became dominant.

Most of these systems succeeded in just those years, not earlier or later, and within a short timespan, because commercial hardware that was powerful enough had arrived. Two things were needed and arrived at about the same time. First, electron density maps are large and require either a computer with at least a 24-bit address space or a combination of a computer with a lesser 16-bit address space plus several years to overcome the difficulties of an address space that is smaller than the data. The second arrival was that of interactive computer graphics displays that were fast enough to display electron-density maps, whose contour circles require the display of numerous short vectors. The first such displays were the Vector General Series 3 and the Evans and Sutherland Picture System 2, MultiPicture System, and PS-300.

Nowadays, fitting of the molecular structure to the electron density map is largely automated by algorithms, with computer graphics as a guide to the process. An example is the XtalView XFit program.

References

Molecular graphics Wikipedia