The primary function of PROATM is the transformation of atomic parameters of
protein structures into a form suitable for use in the XTAL system and back. In
PDB form the atomic coordinates are given in orthogonal Angstrom coordinates
and the thermal displacement parameters as U. Moreover, the anisotropic U
values are scaled by 10000. In addition to the scaling problem, the PDB
specifies more naming characters than the XTAL convention allows. These extra
characters serve to unambiguously locate the atom in the structure. Moreover,
atoms are classified as atoms or heteroatoms depending upon whether they are in
the polymeric part of the structure or are attached to it as solvent or
chelated moieties.
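The coordinate and thermal parameter conversions described above can be sketched as follows. This is only an illustration, not the PROATM code: it assumes the common orthogonalization convention (a along x, b in the xy plane), and the function names are hypothetical.

```python
import math

def orthogonalization_matrix(a, b, c, alpha, beta, gamma):
    """Fractional -> orthogonal Angstrom matrix (upper triangular).

    Assumed convention: a along x, b in the xy plane. Angles in degrees.
    """
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    # c * root / sin(gamma) equals V / (a * b * sin(gamma)), the z extent of c.
    root = math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)
    return [
        [a, b * cg, c * cb],
        [0.0, b * sg, c * (ca - cb * cg) / sg],
        [0.0, 0.0, c * root / sg],
    ]

def orth_to_frac(m, xyz):
    """Solve m * frac = xyz by back-substitution (m is upper triangular)."""
    x, y, z = xyz
    fz = z / m[2][2]
    fy = (y - m[1][2] * fz) / m[1][1]
    fx = (x - m[0][1] * fy - m[0][2] * fz) / m[0][0]
    return (fx, fy, fz)

def frac_to_orth(m, frac):
    """Apply m to fractional coordinates to recover Angstrom coordinates."""
    return tuple(sum(m[i][j] * frac[j] for j in range(3)) for i in range(3))

def anisou_to_u(values):
    """Undo the PDB scaling: integer U x 10**4 -> U."""
    return [v / 10000.0 for v in values]
```

For an orthorhombic 10 x 20 x 30 cell, `orth_to_frac` maps the point (5, 10, 15) to approximately (0.5, 0.5, 0.5), and `frac_to_orth` maps it back.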
Each atom is specified by a unique serial number in addition to the other
naming parameters which are listed below. All of these naming parameters are
stored in the atom record of the bdf. Once they are stored they may later be
retrieved in the exact form specified by the authors of the PDB.
As the loading of the atoms takes place a survey of the residue sequence is
made. The sequence of amino acids in the protein is stored in the bdf in
logical record sequence. This requires that all atoms be loaded in a
preliminary pass. The atoms are written to a scratch file and then copied to
the output archive bdf after the orthogonal-to-fractional transformation matrix
and the sequence of residues have been written to records lrcell: and
lrsequ: of the output archive bdf. If loading is bypassed, all
printed/punched information will be taken from the input archive bdf alone. If
loading is done the printed/punched data will be from the output archive
bdf.
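The two-pass loading described above might look like the sketch below. The dictionary keys, the record tuples and the in-memory list standing in for the scratch file are all assumptions made for illustration. The sketch also makes no distinction between atoms and heteroatoms, which a real residue survey would have to make.

```python
def survey_sequence(atoms):
    """Return the residue names in order of first appearance.

    Each atom is a dict with hypothetical keys 'chain', 'resseq',
    'icode' and 'resname'. A new residue starts whenever the
    (chain, resseq, icode) key differs from that of the previous atom.
    """
    sequence = []
    previous = None
    for atom in atoms:
        key = (atom["chain"], atom["resseq"], atom["icode"])
        if key != previous:
            sequence.append(atom["resname"])
            previous = key
    return sequence

def build_archive(atoms, cell_matrix):
    """Two-pass load: buffer the atoms, then emit records in file order."""
    scratch = list(atoms)              # pass 1: stands in for the scratch file
    sequence = survey_sequence(scratch)
    return [
        ("lrcell", cell_matrix),       # transformation matrix is written first
        ("lrsequ", sequence),          # then the residue sequence
        ("lratom", scratch),           # pass 2: atoms are copied last
    ]
```

Buffering is needed because the complete sequence is known only after the last atom has been read, yet it must precede the atoms in the output.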
Quantities stored in lratom:
The actual quantities stored in record lratom: of the output archive
bdf are, to a certain extent, under user control. At a minimum the XTAL and PDB
forms of the atom name, the atom serial number, x, y, z, the residue sequence
number, the remoteness indicator, the branch and sub-branch designators, the
alternate location indicator, the chain identifier, the residue name and the
scattering factor type will be stored for each atom. All the
other quantities are optional. The program is set up to store the most
important quantities by default. However, it is possible through the use of the
prolod line to specify any of the possible atomic parameter
items described by the PDB documentation.
The following list gives the items which may be stored; those marked with *
are stored by default; those marked ** are stored depending upon the complexity
of the thermal parameters specified. The rest will not be stored unless
specified in a prolod line. The description gives the form in which each
parameter is stored in the bdf. Translation to and from the PDB-prescribed form
is done automatically.
Mnemonic  Idnum        Description
None      14     *     XTAL form of atom name as two words:
                       Word 1 contains the atom type followed by the
                       residue number (right justified)
                       Word 2 contains the insertion code followed by the
                       remoteness indicator information.
X         1      *     x fractional coordinate
Y         2      *     y fractional coordinate
Z         3      *     z fractional coordinate
U         4      **    isotropic thermal parameter U
U11       5      **    U(1,1) individual anisotropic thermal parameter
U22       6      **    U(2,2) individual anisotropic thermal parameter
U33       7      **    U(3,3) individual anisotropic thermal parameter
U12       8      **    U(1,2) individual anisotropic thermal parameter
U13       9      **    U(1,3) individual anisotropic thermal parameter
U23       10     **    U(2,3) individual anisotropic thermal parameter
POP       11           atom population (occupancy) parameter
APP       12           atom anomalous population parameter
SEQ       15     *     character 1: remoteness indicator
                       character 2: branch designator
                       character 3: sub-branch designator
                       character 4: alternate location indicator
RSQ       16     *     characters 1-3: least significant digits of residue
                       sequence number
                       character 4: residue insertion code
SET       17           dataset to which the atom belongs
CHN       18     *     characters 1-3: chain identifier
                       character 4: most significant digit of residue
                       sequence number
RES       19     *     4-character residue name
SQN       20     *     atom serial number
SFT       22     *     scattering factor type; pointer to SF table
TFT       23           thermal parameter type
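SX        101          standard deviation in x fractional coordinate
SY        102          standard deviation in y fractional coordinate
SZ        103          standard deviation in z fractional coordinate
SPP       111          standard deviation in population parameter
SU        104          standard deviation in U
SAP       112          standard deviation in anomalous population
SU1       105          standard deviation in U(1,1)
SU2       106          standard deviation in U(2,2)
SU3       107          standard deviation in U(3,3)
SU4       108          standard deviation in U(1,2)
SU5       109          standard deviation in U(1,3)
SU6       110          standard deviation in U(2,3)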
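As an illustration of how items 16 (RSQ) and 18 (CHN) share the residue sequence number, the sketch below splits a residue number across two 4-character words and rejoins it. The blank padding, the justification and the function names are assumptions. The table fixes only which characters hold which field.

```python
def pack_residue(chain, resseq, icode=" "):
    """Split a residue number across the RSQ and CHN words.

    Assumes a 4-character, right-justified, blank-padded residue number
    and a left-justified chain identifier (illustrative choices only).
    """
    digits = f"{resseq:4d}"
    if len(digits) != 4 or len(icode) != 1 or len(chain) > 3:
        raise ValueError("field does not fit the 4-character words")
    rsq = digits[1:] + icode          # chars 1-3: low digits, char 4: insertion code
    chn = f"{chain:<3}" + digits[0]   # chars 1-3: chain, char 4: high digit
    return rsq, chn

def unpack_residue(rsq, chn):
    """Recover (chain, resseq, icode) from the two 4-character words."""
    resseq = int((chn[3] + rsq[:3]).strip())
    return chn[:3].strip(), resseq, rsq[3]
```

Under these assumptions, residue 1057 with insertion code B in chain A packs to the words `057B` and `A  1`, and unpacking restores the original values.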
Because PROATM is biased toward keeping the bdf as small as possible, a
prolod line must be used to keep any of the items not flagged with an (*)
from being purged during a run.
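The selection rule can be summarized in a short sketch. The mnemonics and idnums are taken from the table above. The function name, its arguments and the rule used for the ** items (U alone for isotropic data, the six U(i,j) for anisotropic data) are assumptions, and no attempt is made to reproduce the actual prolod line syntax.

```python
# Mnemonic -> idnum, copied from the table of storable items.
ITEMS = {
    "X": 1, "Y": 2, "Z": 3, "U": 4,
    "U11": 5, "U22": 6, "U33": 7, "U12": 8, "U13": 9, "U23": 10,
    "POP": 11, "APP": 12, "SEQ": 15, "RSQ": 16, "SET": 17, "CHN": 18,
    "RES": 19, "SQN": 20, "SFT": 22, "TFT": 23,
    "SX": 101, "SY": 102, "SZ": 103, "SU": 104,
    "SU1": 105, "SU2": 106, "SU3": 107, "SU4": 108, "SU5": 109, "SU6": 110,
    "SPP": 111, "SAP": 112,
}
ATOM_NAME = 14  # the XTAL atom name has no mnemonic in the table
STARRED = ("X", "Y", "Z", "SEQ", "RSQ", "CHN", "RES", "SQN", "SFT")
ISOTROPIC = ("U",)                                        # assumed ** rule
ANISOTROPIC = ("U11", "U22", "U33", "U12", "U13", "U23")  # assumed ** rule

def items_to_store(requested=(), anisotropic=False):
    """Return the sorted idnums kept in lratom: for one run."""
    names = set(STARRED)
    names.update(ANISOTROPIC if anisotropic else ISOTROPIC)
    for name in requested:            # extra items named by the user
        if name not in ITEMS:
            raise KeyError(f"unknown item mnemonic: {name}")
        names.add(name)
    return sorted({ATOM_NAME} | {ITEMS[name] for name in names})
```

Any item absent from the returned list would be purged, which is why optional items such as POP have to be requested explicitly.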
Hydrogen atoms
The original definition of data for the PDB did not allow for the inclusion of
hydrogen atoms. When H-atoms were added to protein structures it became
necessary to have a sub-branch designator since there could be up to three
hydrogens attached to a carbon atom. This sub-branch designator has been placed
as a number on the part of the scattering factor symbol H in column 7 of an
ATOM line. The PROATM program stores this sub-branch designator as the third
character of item 15 of the atom record.
Order of Entry of Atoms
The algorithm used in PROATM is based on the serial number as defined in the
Protein Data Bank. All atom data must be presented in serial number order.
Except for the a priori run, all runs are treated as edits of the
binary data file record lratom:. Atoms may be added, replaced, inserted
or deleted, but only in serial number order. The serial numbers must be
increasing in the list but need not be continuous. If no gaps were left
initially to provide for future additions, atoms can still be inserted
subsequently by a special provision. This does, however, increase the serial
numbers of following atoms already present in the bdf.
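The ordering rule amounts to a simple check, sketched below. The function name is hypothetical, and treating a repeated serial number within one list as an error is an assumption.

```python
def check_serial_order(serials):
    """Raise ValueError unless the serial numbers are strictly increasing.

    Gaps are allowed: 1, 2, 5, 10 is a valid sequence.
    """
    previous = None
    for serial in serials:
        if previous is not None and serial <= previous:
            raise ValueError(
                f"serial {serial} follows {previous}: not increasing")
        previous = serial
```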
There are four functions allowed in the loading of atom parameters with this
program. They are input, replacement, deletion and insertion of atoms into the
bdf. Renumbering of following atoms takes place only with insertion. All
operations are done in terms of the PDB serial numbers.
Input of Atoms
With the a priori option specified all atoms are entered in serial
number order by input lines. Any atoms present in lratom: of the input
binary data file are deleted. With the 'merge' option atoms are entered in
serial number order and merged in proper order with the atoms previously loaded
into the lratom: record.
Replacement of Atoms
An input atom whose serial number equals that of an atom already in the file
causes the values in the file to be replaced with the values from the line
input stream.
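Merging and replacement together behave like a standard merge of two sorted lists, as in the sketch below. Representing each atom as a (serial, data) pair is an assumption made for brevity.

```python
def merge_atoms(existing, incoming):
    """Merge two serial-ordered lists of (serial, data) pairs.

    An incoming atom whose serial matches an existing one replaces it.
    """
    merged = []
    i = j = 0
    while i < len(existing) and j < len(incoming):
        if existing[i][0] < incoming[j][0]:
            merged.append(existing[i])
            i += 1
        elif existing[i][0] > incoming[j][0]:
            merged.append(incoming[j])
            j += 1
        else:
            # Equal serial numbers: the input values replace the file values.
            merged.append(incoming[j])
            i += 1
            j += 1
    merged.extend(existing[i:])
    merged.extend(incoming[j:])
    return merged
```

An a priori run corresponds to calling `merge_atoms([], incoming)`, since the atoms already in the file are discarded first.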
Deletion of Atoms
If an atomd line is prepared in the following form:
atomd <first serial number> <last serial number>
then all atoms from the first serial number through the last serial number will
be deleted from the file. If the last serial number is void or smaller than the
first serial number, only the atom with the first serial number will be deleted.
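The deletion rule, including the fallback to a single atom, can be sketched as follows. The function name and the (serial, data) pair representation are assumptions.

```python
def delete_atoms(atoms, first, last=None):
    """Drop atoms with serial numbers from first through last, inclusive.

    A void (None) or smaller last serial number deletes only the first atom.
    """
    if last is None or last < first:
        last = first
    return [atom for atom in atoms if not (first <= atom[0] <= last)]
```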
Insertion of Atoms (with subsequent renumbering)
atomi and atome lines containing the serial numbers that control the
insertion are entered surrounding a group of new atoms to be loaded, e.g.
atomi <existing serial number of the atom preceding the insert>
............atom input lines in any recognized form
atome <existing serial number of the last atom to be pushed down>
All the inserted atoms will be forced into the file just after the
atomi-specified atom. All the atoms in the bdf following the inserted atoms,
down to and including the one specified in the atome line, will have their
serial numbers increased by the number of inserted atoms.
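Insertion with renumbering can be sketched as below. The text does not say which serial numbers the inserted atoms receive. The sketch assumes they take the numbers immediately following the atomi atom, and it rejects an insertion that would leave the list out of order. The names are hypothetical.

```python
def insert_atoms(atoms, after_serial, new_data, push_through):
    """Insert new atoms after after_serial and push down following atoms.

    atoms is a serial-ordered list of (serial, data) pairs; new_data holds
    the data of the inserted atoms. Atoms with serial numbers above
    after_serial, up to and including push_through, are renumbered.
    """
    if after_serial not in [serial for serial, _ in atoms]:
        raise ValueError(f"no atom with serial {after_serial}")
    count = len(new_data)
    result = []
    for serial, data in atoms:
        if after_serial < serial <= push_through:
            result.append((serial + count, data))   # pushed down
        else:
            result.append((serial, data))
        if serial == after_serial:
            # Assumption: inserted atoms are numbered right after atomi.
            for offset, new in enumerate(new_data, start=1):
                result.append((after_serial + offset, new))
    for (before, _), (after, _) in zip(result, result[1:]):
        if after <= before:
            raise ValueError("renumbering breaks serial number order")
    return result
```

With atoms 1, 2, 3, 4 and 10 in the file, inserting two atoms after atom 2 and pushing down through atom 4 gives serial numbers 1 to 6 followed by 10. The gap before atom 10 absorbs the shift, so atom 10 keeps its number.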
All the editing features described here may be used in any combination so long
as all are presented in the input stream in increasing serial number order.
After the binary data file has been prepared, it may be printed and/or punched
in standard Protein Data Bank form. This operation can be carried out just at
the end of a loading/editing session or as a separate operation on an existing
bdf.