CEDAR: Crystallographic, Energy and Dynamics Refinement
Authors: KeithWatenpaugh, Michael Carson & Jan Hermans
Contact: Keith Watenpaugh, Physical and Analytical Chemistry,
7255-209-1, The Upjohn Company, Kalamazoo, MI 49007, USA
CEDAR refines molecular structures using energy minimization, dynamics and
crystallographic techniques either separately or together. Forces that may
enter into the calculations include those from target atoms, bondlengths,
bondangles, dihedral angles, interatomic distances, van der Waals interaction
and partial ionic charges. In addition, least-squares refinement against the
crystallographic data; symmetrically related and lattice related forces; and
solvent molecules may be included in the calculations and minimization
procedure.
Calculations Performed
Based on the dictionary of bonding forces entered onto the bdf by PRECED, a
process of minimizing a non-linear function of the following form is
performed.
Φ = Σφi
where φ1a =
Σ(||-||)2
Structure factors
φ1b =
Σ(∊p)2 Parameter
shifts
φ2 =
1/2Σ(-)2
Bond lengths
φ3 =
1/2Σ(-)2
Bond angles
φ4 =
Σ(-)2
Fixed dihedral angles
φ5 =
Σ(1-cos(nx))2 Flexible dihedral
angles
φ6 =
Σ(/-/+/
Non-bonded
φ7 =
Σ(/-/
1-4 distances
∊p = shift in atomic positions from crystallographic
terms
L = interatomic distance
T = interatomic bond angles
D = fixed torsion angles
nX = flexible torsion angle with "n" minimum positions
and = van der Waals terms between non-bonded
atoms
and = partial charges on atoms "i" and "j"
d = dielectric constant
= interatomic distance of non-bonded atoms
and = van der Waals terms 1-4 interactions
(i.e. separated by two bonds)
Any member of the set of functions may be weighted from 0.0 upwards. If all the
energy terms are weighted at 0.0, then the program is doing a conventional
crystallographic unrestrained refinement. If the crystallographic refinement is
weighted at 0.0, then the program is doing an energy minimization of the terms
selected without using any crystallographic structure factors. Likewise,
dynamics simulation or combined dynamics and crystallographic refinement may be
turned on.
During the calculation of the crystallographic shifts, a block diagonal matrix
is used, each block consisting of the variable parameters for each atom. The
shifts of the parameters, adjusted or weighted by user defined parameters are
combined with the other terms of Φ. Φ is then solved by an interative
conjugate gradient method. In other words φ1b rather than
φ1a is used. If φ1b is applied using only the
diagonal terms as is the case with other restrained refinement programs (e.g.
PROLSQ), then it reduces to φ1a. The advantage of
φ1b is that bad or large shifts may adjusted to reasonable
values or weighted by their e.s.d.'s before applying the conjugant gradient
step. The block diagonal calculation also takes into account correlation
between atom parameters. In addition, shifts calculated in other programs may
be used in place of the internal crystallographic least-squares routines. The
most frequently used is CRYLSQ in order to do rigid body refinement at early
stages of refinement.
When non-bonded energy terms are incorporated into the minimization process,
the symmetry related atoms are also included in the calculations. The
neibor option must have been used in PRECED previously to put
the necessary symmetry and translational operators on the bdf.The number of
interatomic interactions can become quite large and thus computing time can
increase greatly. Turning on the dynamics simulations allows for complete
subversion of a computer for vast amounts of time.
Description of Some Input Lines
mode line:
The mod option sets weights 1-5 to 1.0, and weights 6-8 to
0.0. The ene option sets weights 1-7 to 1.0 and the target
weight on hydrogen (weight 8) to a pseudo sigma of 25.0 or a weight of
1.0/25.0. xta turns on crystallographic least-squares cycle
while ext uses the previously calculated shifts and sigmas. The
previously calculated shifts could be from CEDAR (see the xmode
line) or CRYLSQ. Target atoms with larger estimated standard deviations can be
down weighted using the sig option.
weight, initwt and finlwt lines:
If a finlwt line is included in addition to the
weight (initwt) line, the the weights will
change from the initial weighting to the final weighting in the number of
conjugate gradient refinement cycles requested. For example, a large target
weighting may be used initially and small target weighting at the end.
xmode line:
Only the isotropic B factor option works; anisotropic B factors have been
coded in but not activated and tested. The pre option puts the
input atomic coordinates into the previous atomic coordinate fields on the
output bdf and the currently refined coordinates in the current coordinate
fields. A similar option exists on some versions of CRYLSQ. The
ext option on the mode line uses the previous
and current coordinates to calculate shifts. The fca,
fxe and fxa options allow inclusion but no
refinement of additional atoms, for example, hydrogens. fca
uses the A(calc) and B(calc) stored on the bdf and adds them into the terms
calculated in this run. The fixed A and B are over-written and lost. In the
fxe option the A(calc) and B(calc) are strored for future use
in the fields reserved for them on the bdf and also applied. The
fxa option used the valued stored in the fixed atom
contribution fields on the bdf, but doesn't modifiy them. The
ski option calculates the FC and partial derivatives for every
nth reflection. This allows for an estimate of the R-factor to be calculated
with the FC option. It is not recommended for the least-squares refinement.
xlimit line:
B factors are reset to be between the minimum and maximum limits before each
cycle of refinement. At the same time, the bch option allows
B's to change by a maximum amount between connected atoms. For example, if
bch=2., and atom1 has a B of 5.0, then atom2 that is bonded to
atom1 will be reset to be between 5.0/2.0 or 2.5 and 5.0*2.0 or 10.0. I don't
defend this a the best way of maintaining reasonable B's and I am open to other
suggestions. I don't beleive the "riding motion" restraint in PROLSQ is a
correct way to handle it.
scale line:
In the current version, a scale line should always be included if the scale is
not unity.
Refinement Strategies
Normal combined crystallographic and energy minimization refinement usually
consists of refinement with the input lines
CEDAR
mode xtal model
weight 100. 1 1 4 1 0 0
refine 4
In this case the crystallographic shifts are weighted 100.0 times the default
weight. Frequently even a weighting of 1000.0 can be used to place more
importance on the crystallographic terms. In later stages of refinement the
crystallographic weighting may be dropped to reduce deviations from the
standard bonding parameters. In addition, the fixed torsion angles are weighted
more heavily to force peptides and aromatics to maintain greater planarity than
their nornal forces parameters suggest. Four complete cycles of refinement will
be carried out. As with other restrained refinement programs, the decrease of
R-factors usually greatly slows down after a few cycles of refinement. An
unconstrained refinement cycle followed by idealization in a separate step
using the input lines
CEDAR
mode xtal
weight 1 0 0 0 0 0 0
refine
mode model
refine
can sometimes get regions out of local minima. Otherwise, more drastic methods
such as rebuilding with computer graphics, non-bonded energy terms or dynamic
simulations are required to correct areas of a structure. Turning on the
listing of bad energy terms (weight line, field 9) helps
identify bad inter-atomic parameters (lengths, angles, dihedrals).
Including the non-bonded terms is much more compute-intensive, but may help in
the refinement process. It especially helps in keeping non-bonded atoms from
getting too close and helps form hydrogen-bonding networks. The pseudo energy
terms for chirality can help low resolution refinements from flipping to a
wrong hand. A typical combined non-bonded energy refinement with
crystallographic refinement could include the input lines.
CEDAR
mode xtal energy
weight 100.
refine
The energy option on the mode line
automatically places unit weight on weigths 1-6. The weight line changes weight
of crystallographic shifts to 100. without changing the others. The non-bonded
terms tend to push inter-atomic parameters slightly away from the standard
values in systematic and hopefully in realistic ways depending on secondary
structure or neighbouring contacts. In this mode, the non-bonded energy for
each residue is listed after each cycle of refinement. Limited experience has
shown that residues with positive energies are suspect and should be checked.
It is also good to check graphically where the hydrogens are and to adjust them
to give more reasonable H-bonds sometimes. Be default, the hydrogen positions
are weighted down by 1/25.0 relative to other atoms (weight
line, field 8)=25.), but they still don't always move from their initially
calculated postions very well.
One can further play around with trying to get the best set of refinement
parameters by breaking the process up into the computing crystallographic step
and the energy minimization step by first doing and storing the results of the
crystallographic calculations using the input lines
CEDAR
mode xtal
weight 1 0 0 0 0 0
xmode previous
refine
This will store the shifts and e.s.d.s on the output bdf for input into the
energy minimization step. Then by using a set of input lines such as
CEDAR
mode external
weight 1000 1 1 1 1 0 0
refine
xmode skip 50
fc
nobdf
the effect of different weighting schemes can be tested quickly and usually
interactively on the computer.
Externally calculated shifts can be used in another and sometimes powerful way
by using CRYLSQ to do rigid-body refinement coupled with CEDAR doing restrained
refinement. As yet a program to conveniently set-up the sometimes extensive
control lines for many rigid groups does not exist and needs to be written. A
sample input could be:
title crylsq refinement cycle 1
CRYLSQ iso fu .75 cy 1 ms pp
scale 1.00 1
maxhkl *4 0.05 0.35
select o n c
group *6 -1
atomgr c001_a-c002_a
group *6 -1
atomgr c002_b-c002_z
group *6 -1
atomgr c002-n003
group *6 -1
atomgr c003_a-o003_xt
noref pop
title cedar refinement cycle 2
CEDAR
scale 1.00 1
mode model
weight 1 1 1 1 1 0
refine
maxhkl *4 0.05 0.3333
fc
Debugging dictionary entries
It is not always possible to generate a new residue dictionary correctly the
first time. A way to see if connectivity, bond angle, torsion angles, etc are
correct is to let CEDAR either put out bad energy terms using the
weight line or print out all observed and calculated values of
bond lengths, angles, dihedrals using printer priority of 4 on the
reset line. The number of lines output can be controled by just
putting a range of residues of interest into the calculations by using a
reszon line. Any new residue dictionary entries should be
checked before refining a structure.
Dynamics
The currently distributed version of CEDAR does not contain the dynamics
routines. The dynamics version can be obtained by communication directly with
the author.
Referencing
Use of PRECED and CEDAR should be referenced as:
K.D. Watenpaugh, "Conformational Energy as a Restraint in Refinement", In:
**
Molecular Dynamics and Protein Structure (ed. J. Hermans), Polycrystal
Book,
Yellow Springs, OH, 77-80, 1985.
Prerequisites
PROATM must have been run to load atom parameters
PRECED must have been run after any loading of atoms
File Assignments
Read input from input archive bdf
Write output to the output archive bdf
Examples
CEDAR
mode sigma xtal
weight 8 1 1 1 1 1 0 1
scale 8.4
maxhkl *4 0.05 0.27
xlimit bmin 3.0 bmax 50.0 bchange 1.5 damp .8 sft .4
refine 1
CEDAR will be run with crystallographic refinement. During the energy
minimization step the target atomic postions will be given a weighting
proportional to the inverse of the least-squares derived standard deviations
weighted up by a factor of 8. The F(obs) will be scaled by a factor of 8.4 and
only reflections between sinθ/λ limits of 0.05 and 0.27 will be
used in the refinement. During the least-squares refinement B-factors will be
restrained to be between 3.0 and 50.0. In addition, B-factors can only vary by
af factor of 1.5 from the previous atom to which an atom is bound. The
least-squares calculated shifts will be damped by 0.8 and the maximum shift of
an atomic positional coordinate is 0.4 Angstroms. One pass will be made through
the procedure set-up with the current options and parameters.
CEDAR
mode energy
weight 0 1 1 1 1 1 0 0
mmode 50
refine
This is a straight energy refinement. Starting with an initial set of
coordinates, both the bonded and non-bonded energies are minimized. All
interatomic interactions are included, including those to symmetry related
molecules.
CEDAR
mode model
weight 1 1 2 8 1 0 1 0
reszon 1 24
refine 50
In this refinement, non-bonded energies are ignored, while bond angles are
given double weight and torsion angles eight times their normal weight. This
will force them to be more ideal than the normal weighting would have done.
Only residues 1 through 24 will be shifted.
References
Molecular Mechanics Calculations
U. Burkert and N.L. Allinger, Molecular Mechanics, Amer. Chem. Soc.,
Washington,D.C., 1982.
S.J. Weiner, P.A.Kollman, D.A. Case, U.C. Singh, C Ghio, G. Alagona, S.
Profeta, Jr and P.Weiner, "A New Force Field for Molecular Simulations of
Nucleic Acids and Proteins", J. Amer. Chem. Soc., 106, 765-784,
1984.
Molecular Dynamics Simulations
J.A. McCammon and S.C. Harvey, Molecular Dynamics of Proteins and Nucleic
Acids, Cambridge Univ. Press, NY, 1987.
M. Karplus, B.M. Pettit and C. A. Brooks, Proteins: Dynamics, Academic
Press, NY, 1988.
Combined Crystallographic and Molecular Mechanics/Dynamics
K.D. Watenpaugh, "Conformational Energy as a Restraint in Refinement", In:
Molecular Dynamics and Protein Structure (ed. J. Hermans), Polycrystal
Book, Yellow Springs, OH, 77-80, 1985.
A.T. Brunger, G.M. Clore, A.M. Gronenborn and M. Karplus, "Three dimensional
structures of proteins determined by molecular dynamics with interproton
distance restrainsts: Application to crambin", Proc. Natl Acad Sci, USA,
83, 3801-3805, 1986.