Filename extensions: .pdb, .ent, .brk
Internet media type: chemical/x-pdb
The Protein Data Bank (PDB) file format is a textual file format describing the three-dimensional structures of molecules held in the Protein Data Bank. The PDB format accordingly provides for the description and annotation of protein and nucleic acid structures, including atomic coordinates, observed side-chain rotamers, secondary structure assignments, and atomic connectivity. Structures are often deposited together with other molecules such as water, ions, nucleic acids and ligands, which can be described in the PDB format as well. The Protein Data Bank also keeps data on biological macromolecules in the newer mmCIF file format.
History
The PDB file format was invented in 1976 as a human-readable file that would allow researchers to exchange protein coordinates through a database system. Its original format was limited to 80 columns, a width inherited from the computer punch cards that had previously been used to exchange the coordinates. Over the years the file format has undergone many changes and revisions. As of 13 July 2011, the most recent revision is 3.30.
Example
A typical PDB file describing a protein consists of hundreds to thousands of lines like the following (taken from a file describing the structure of a synthetic collagen-like peptide):
HEADER    EXTRACELLULAR MATRIX                    22-JAN-98   1A3I
TITLE     X-RAY CRYSTALLOGRAPHIC DETERMINATION OF A COLLAGEN-LIKE
TITLE    2 PEPTIDE WITH THE REPEATING SEQUENCE (PRO-PRO-GLY)
...
EXPDTA    X-RAY DIFFRACTION
AUTHOR    R.Z.KRAMER,L.VITAGLIANO,J.BELLA,R.BERISIO,L.MAZZARELLA,
AUTHOR   2 B.BRODSKY,A.ZAGARI,H.M.BERMAN
...
REMARK 350 BIOMOLECULE: 1
REMARK 350 APPLY THE FOLLOWING TO CHAINS: A, B, C
REMARK 350   BIOMT1   1  1.000000  0.000000  0.000000        0.00000
REMARK 350   BIOMT2   1  0.000000  1.000000  0.000000        0.00000
...
SEQRES   1 A    9  PRO PRO GLY PRO PRO GLY PRO PRO GLY
SEQRES   1 B    6  PRO PRO GLY PRO PRO GLY
SEQRES   1 C    6  PRO PRO GLY PRO PRO GLY
...
ATOM      1  N   PRO A   1       8.316  21.206  21.530  1.00 17.44           N
ATOM      2  CA  PRO A   1       7.608  20.729  20.336  1.00 17.44           C
ATOM      3  C   PRO A   1       8.487  20.707  19.092  1.00 17.44           C
ATOM      4  O   PRO A   1       9.466  21.457  19.005  1.00 17.44           O
ATOM      5  CB  PRO A   1       6.460  21.723  20.211  1.00 22.26           C
...
HETATM  130  C   ACY   401       3.682  22.541  11.236  1.00 21.19           C
HETATM  131  O   ACY   401       2.807  23.097  10.553  1.00 21.19           O
HETATM  132  OXT ACY   401       4.306  23.101  12.291  1.00 21.19           O
...
REMARK 350 BIOMT records describe how to compute the coordinates of the experimentally observed multimer from the explicitly specified coordinates of a single repeating unit.
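Because PDB records place each field in fixed columns, a coordinate line can be read by slicing, and a BIOMT transformation can then be applied as a 3x3 matrix plus a translation vector. The following Python fragment is a minimal sketch of both steps, not a complete parser. The names `Atom`, `parse_atom_line`, `parse_biomt` and `apply_transform` are illustrative, the BIOMT rows in `HYPOTHETICAL_BIOMT` are invented for demonstration (a 180° rotation about the z axis) rather than taken from entry 1A3I, and the sketch assumes the three BIOMT rows of each transformation appear in order.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    temp_factor: float
    element: str


def parse_atom_line(line: str) -> Atom:
    """Parse one ATOM or HETATM record using fixed column positions."""
    line = line.ljust(80)  # pad short lines so every slice is in range
    return Atom(
        serial=int(line[6:11]),          # columns 7-11
        name=line[12:16].strip(),        # columns 13-16
        res_name=line[17:20].strip(),    # columns 18-20
        chain_id=line[21].strip(),       # column 22 (may be blank)
        res_seq=int(line[22:26]),        # columns 23-26
        x=float(line[30:38]),            # columns 31-38
        y=float(line[38:46]),            # columns 39-46
        z=float(line[46:54]),            # columns 47-54
        occupancy=float(line[54:60]),    # columns 55-60
        temp_factor=float(line[60:66]),  # columns 61-66
        element=line[76:78].strip(),     # columns 77-78
    )


def parse_biomt(lines):
    """Collect REMARK 350 BIOMT rows as {serial: [row1, row2, row3]}.

    Each row is [m1, m2, m3, t]: three matrix terms and one translation.
    Whitespace splitting is used for brevity instead of fixed columns.
    """
    transforms = {}
    for line in lines:
        if not line.startswith("REMARK 350"):
            continue
        fields = line[10:].split()
        if len(fields) != 6 or not fields[0].startswith("BIOMT"):
            continue
        serial = int(fields[1])
        row = [float(value) for value in fields[2:6]]
        transforms.setdefault(serial, []).append(row)
    return transforms


def apply_transform(rows, xyz):
    """Return the transformed (x, y, z) for one three-row transformation."""
    x, y, z = xyz
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in rows)


SAMPLE_ATOM = (
    "ATOM      1  N   PRO A   1       8.316  21.206  21.530  1.00 17.44           N"
)

# Invented rows for illustration only: rotate 180 degrees about z, no shift.
HYPOTHETICAL_BIOMT = [
    "REMARK 350   BIOMT1   1 -1.000000  0.000000  0.000000        0.00000",
    "REMARK 350   BIOMT2   1  0.000000 -1.000000  0.000000        0.00000",
    "REMARK 350   BIOMT3   1  0.000000  0.000000  1.000000        0.00000",
]

atom = parse_atom_line(SAMPLE_ATOM)
transforms = parse_biomt(HYPOTHETICAL_BIOMT)
moved = apply_transform(transforms[1], (atom.x, atom.y, atom.z))
```

Slicing by column rather than splitting on whitespace matters for coordinate records, since adjacent fields may touch and fields such as the chain identifier may be blank, as in the HETATM lines of the example.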