##
CEDAR: Crystallographic, Energy and Dynamics Refinement

**Authors:** KeithWatenpaugh, Michael Carson & Jan Hermans

**Contact: **Keith Watenpaugh, Physical and Analytical Chemistry,

7255-209-1, The Upjohn Company, Kalamazoo, MI 49007, USA

*CEDAR refines molecular structures using energy minimization, dynamics and
crystallographic techniques either separately or together. Forces that may
enter into the calculations include those from target atoms, bondlengths,
bondangles, dihedral angles, interatomic distances, van der Waals interaction
and partial ionic charges. In addition, least-squares refinement against the
crystallographic data; symmetrically related and lattice related forces; and
solvent molecules may be included in the calculations and minimization
procedure.*

**Calculations Performed**

Based on the dictionary of bonding forces entered onto the bdf by PRECED, a
process of minimizing a non-linear function of the following form is
performed.

Φ = Σφ_{i}

where φ_{1a} =
Σ(||-||)^{2}
Structure factors

φ_{1b} =
Σ(∊_{p})^{2} Parameter
shifts

φ_{2} =
1/2Σ(-)^{2}
Bond lengths

φ_{3} =
1/2Σ(-)^{2}
Bond angles

φ_{4} =
Σ(-)^{2}
Fixed dihedral angles

φ_{5} =
Σ(1-cos(nx))^{2} Flexible dihedral
angles

φ_{6} =
Σ(/-/+/
Non-bonded

φ_{7} =
Σ(/-/
1-4 distances

∊_{p} = shift in atomic positions from crystallographic
terms

L = interatomic distance

T = interatomic bond angles

D = fixed torsion angles

nX = flexible torsion angle with "n" minimum positions

and = van der Waals terms between non-bonded
atoms

and = partial charges on atoms "i" and "j"

d = dielectric constant

= interatomic distance of non-bonded atoms

and = van der Waals terms 1-4 interactions
(i.e. separated by two bonds)

Any member of the set of functions may be weighted from 0.0 upwards. If all the
energy terms are weighted at 0.0, then the program is doing a conventional
crystallographic unrestrained refinement. If the crystallographic refinement is
weighted at 0.0, then the program is doing an energy minimization of the terms
selected without using any crystallographic structure factors. Likewise,
dynamics simulation or combined dynamics and crystallographic refinement may be
turned on.

During the calculation of the crystallographic shifts, a block diagonal matrix
is used, each block consisting of the variable parameters for each atom. The
shifts of the parameters, adjusted or weighted by user defined parameters are
combined with the other terms of Φ. Φ is then solved by an interative
conjugate gradient method. In other words φ_{1b} rather than
φ_{1a} is used. If φ_{1b} is applied using only the
diagonal terms as is the case with other restrained refinement programs (e.g.
PROLSQ), then it reduces to φ_{1a}. The advantage of
φ_{1b} is that bad or large shifts may adjusted to reasonable
values or weighted by their e.s.d.'s before applying the conjugant gradient
step. The block diagonal calculation also takes into account correlation
between atom parameters. In addition, shifts calculated in other programs may
be used in place of the internal crystallographic least-squares routines. The
most frequently used is CRYLSQ in order to do rigid body refinement at early
stages of refinement.

When non-bonded energy terms are incorporated into the minimization process,
the symmetry related atoms are also included in the calculations. The
**neibor** option must have been used in PRECED previously to put
the necessary symmetry and translational operators on the bdf.The number of
interatomic interactions can become quite large and thus computing time can
increase greatly. Turning on the dynamics simulations allows for complete
subversion of a computer for vast amounts of time.

**Description of Some Input Lines**

**mode**** line:**

The **mod** option sets weights 1-5 to 1.0, and weights 6-8 to
0.0. The **ene** option sets weights 1-7 to 1.0 and the target
weight on hydrogen (weight 8) to a pseudo sigma of 25.0 or a weight of
1.0/25.0. **xta** turns on crystallographic least-squares cycle
while **ext** uses the previously calculated shifts and sigmas. The
previously calculated shifts could be from CEDAR (see the **xmode**
line) or CRYLSQ. Target atoms with larger estimated standard deviations can be
down weighted using the **sig** option.

**weight****, initwt and finlwt lines:**

If a **finlwt** line is included in addition to the
**weight** (**initwt**) line, the the weights will
change from the initial weighting to the final weighting in the number of
conjugate gradient refinement cycles requested. For example, a large target
weighting may be used initially and small target weighting at the end.

**xmode**** line:**

Only the isotropic B factor option works; anisotropic B factors have been
coded in but not activated and tested. The **pre** option puts the
input atomic coordinates into the previous atomic coordinate fields on the
output bdf and the currently refined coordinates in the current coordinate
fields. A similar option exists on some versions of CRYLSQ. The
**ext** option on the **mode** line uses the previous
and current coordinates to calculate shifts. The **fca**,
**fxe** and **fxa** options allow inclusion but no
refinement of additional atoms, for example, hydrogens. **fca**
uses the A(calc) and B(calc) stored on the bdf and adds them into the terms
calculated in this run. The fixed A and B are over-written and lost. In the
**fxe** option the A(calc) and B(calc) are strored for future use
in the fields reserved for them on the bdf and also applied. The
**fxa** option used the valued stored in the fixed atom
contribution fields on the bdf, but doesn't modifiy them. The
**ski** option calculates the FC and partial derivatives for every
nth reflection. This allows for an estimate of the R-factor to be calculated
with the FC option. It is not recommended for the least-squares refinement.

**xlimit**** line:**

B factors are reset to be between the minimum and maximum limits before each
cycle of refinement. At the same time, the **bch** option allows
B's to change by a maximum amount between connected atoms. For example, if
**bch**=2., and atom1 has a B of 5.0, then atom2 that is bonded to
atom1 will be reset to be between 5.0/2.0 or 2.5 and 5.0*2.0 or 10.0. I don't
defend this a the best way of maintaining reasonable B's and I am open to other
suggestions. I don't beleive the "riding motion" restraint in PROLSQ is a
correct way to handle it.

**scale**** line:**

In the current version, a scale line should always be included if the scale is
not unity.

**Refinement Strategies**

Normal combined crystallographic and energy minimization refinement usually
consists of refinement with the input lines

** CEDAR**

** mode xtal model**

** weight 100. 1 1 4 1 0 0**

** refine 4**

In this case the crystallographic shifts are weighted 100.0 times the default
weight. Frequently even a weighting of 1000.0 can be used to place more
importance on the crystallographic terms. In later stages of refinement the
crystallographic weighting may be dropped to reduce deviations from the
standard bonding parameters. In addition, the fixed torsion angles are weighted
more heavily to force peptides and aromatics to maintain greater planarity than
their nornal forces parameters suggest. Four complete cycles of refinement will
be carried out. As with other restrained refinement programs, the decrease of
R-factors usually greatly slows down after a few cycles of refinement. An
unconstrained refinement cycle followed by idealization in a separate step
using the input lines

** CEDAR**

** mode xtal**

** weight 1 0 0 0 0 0 0**

** refine**

** mode model**

** refine**

can sometimes get regions out of local minima. Otherwise, more drastic methods
such as rebuilding with computer graphics, non-bonded energy terms or dynamic
simulations are required to correct areas of a structure. Turning on the
listing of bad energy terms (**weight** line, field 9) helps
identify bad inter-atomic parameters (lengths, angles, dihedrals).

Including the non-bonded terms is much more compute-intensive, but may help in
the refinement process. It especially helps in keeping non-bonded atoms from
getting too close and helps form hydrogen-bonding networks. The pseudo energy
terms for chirality can help low resolution refinements from flipping to a
wrong hand. A typical combined non-bonded energy refinement with
crystallographic refinement could include the input lines.

** CEDAR**

** mode xtal energy**

** weight 100.**

** refine**

The **energy** option on the **mode** line
automatically places unit weight on weigths 1-6. The weight line changes weight
of crystallographic shifts to 100. without changing the others. The non-bonded
terms tend to push inter-atomic parameters slightly away from the standard
values in systematic and hopefully in realistic ways depending on secondary
structure or neighbouring contacts. In this mode, the non-bonded energy for
each residue is listed after each cycle of refinement. Limited experience has
shown that residues with positive energies are suspect and should be checked.
It is also good to check graphically where the hydrogens are and to adjust them
to give more reasonable H-bonds sometimes. Be default, the hydrogen positions
are weighted down by 1/25.0 relative to other atoms (**weight**
line, field 8)=25.), but they still don't always move from their initially
calculated postions very well.

One can further play around with trying to get the best set of refinement
parameters by breaking the process up into the computing crystallographic step
and the energy minimization step by first doing and storing the results of the
crystallographic calculations using the input lines

** CEDAR**

** mode xtal**

** weight 1 0 0 0 0 0**

** xmode previous**

** refine**

This will store the shifts and e.s.d.s on the output bdf for input into the
energy minimization step. Then by using a set of input lines such as

** CEDAR**

** mode external**

** weight 1000 1 1 1 1 0 0**

** refine**

** xmode skip 50**

** fc**

** nobdf**

the effect of different weighting schemes can be tested quickly and usually
interactively on the computer.

Externally calculated shifts can be used in another and sometimes powerful way
by using CRYLSQ to do rigid-body refinement coupled with CEDAR doing restrained
refinement. As yet a program to conveniently set-up the sometimes extensive
control lines for many rigid groups does not exist and needs to be written. A
sample input could be:

** title crylsq refinement cycle 1**

** CRYLSQ iso fu .75 cy 1 ms pp**

** scale 1.00 1**

** maxhkl *4 0.05 0.35**

** select o n c**

** group *6 -1**

** atomgr c001_a-c002_a**

** group *6 -1**

** atomgr c002_b-c002_z**

** group *6 -1**

** atomgr c002-n003**

** group *6 -1**

** atomgr c003_a-o003_xt**

** noref pop**

** title cedar refinement cycle 2**

** CEDAR**

** scale 1.00 1**

** mode model**

** weight 1 1 1 1 1 0**

** refine**

** maxhkl *4 0.05 0.3333**

** fc**

**Debugging dictionary entries**

It is not always possible to generate a new residue dictionary correctly the
first time. A way to see if connectivity, bond angle, torsion angles, etc are
correct is to let CEDAR either put out bad energy terms using the
**weight** line or print out all observed and calculated values of
bond lengths, angles, dihedrals using printer priority of 4 on the
**reset** line. The number of lines output can be controled by just
putting a range of residues of interest into the calculations by using a
**reszon** line. Any new residue dictionary entries should be
checked before refining a structure.

**Dynamics**

The currently distributed version of CEDAR does not contain the dynamics
routines. The dynamics version can be obtained by communication directly with
the author.

**Referencing**

Use of PRECED and CEDAR should be referenced as:

K.D. Watenpaugh, "Conformational Energy as a Restraint in Refinement", In:
**

*Molecular Dynamics and Protein Structure* (ed. J. Hermans), Polycrystal
Book,

Yellow Springs, OH, 77-80, 1985.

**Prerequisites**

PROATM must have been run to load atom parameters

PRECED must have been run after any loading of atoms

**File Assignments**

Read input from input archive bdf

Write output to the output archive bdf

**Examples**

** CEDAR**

** mode sigma xtal**

** weight 8 1 1 1 1 1 0 1**

** scale 8.4**

** maxhkl *4 0.05 0.27**

** xlimit bmin 3.0 bmax 50.0 bchange 1.5 damp .8 sft .4**

** refine 1**

CEDAR will be run with crystallographic refinement. During the energy
minimization step the target atomic postions will be given a weighting
proportional to the inverse of the least-squares derived standard deviations
weighted up by a factor of 8. The F(obs) will be scaled by a factor of 8.4 and
only reflections between sinθ/λ limits of 0.05 and 0.27 will be
used in the refinement. During the least-squares refinement B-factors will be
restrained to be between 3.0 and 50.0. In addition, B-factors can only vary by
af factor of 1.5 from the previous atom to which an atom is bound. The
least-squares calculated shifts will be damped by 0.8 and the maximum shift of
an atomic positional coordinate is 0.4 Angstroms. One pass will be made through
the procedure set-up with the current options and parameters.

** CEDAR**

** mode energy**

** weight 0 1 1 1 1 1 0 0**

** mmode 50**

** refine**

This is a straight energy refinement. Starting with an initial set of
coordinates, both the bonded and non-bonded energies are minimized. All
interatomic interactions are included, including those to symmetry related
molecules.

** CEDAR**

** mode model**

** weight 1 1 2 8 1 0 1 0**

** reszon 1 24**

** refine 50**

In this refinement, non-bonded energies are ignored, while bond angles are
given double weight and torsion angles eight times their normal weight. This
will force them to be more ideal than the normal weighting would have done.
Only residues 1 through 24 will be shifted.

**References**

*Molecular Mechanics Calculations*

U. Burkert and N.L. Allinger, *Molecular Mechanics*, Amer. Chem. Soc.,
Washington,D.C., 1982.

S.J. Weiner, P.A.Kollman, D.A. Case, U.C. Singh, C Ghio, G. Alagona, S.
Profeta, Jr and P.Weiner, "A New Force Field for Molecular Simulations of
Nucleic Acids and Proteins", *J. Amer. Chem. Soc.*, **106**, 765-784,
1984.

*Molecular Dynamics Simulations*

J.A. McCammon and S.C. Harvey, *Molecular Dynamics of Proteins and Nucleic
Acids, *Cambridge Univ. Press, NY, 1987.

M. Karplus, B.M. Pettit and C. A. Brooks,* Proteins: Dynamics*, Academic
Press, NY, 1988.

*Combined Crystallographic and Molecular Mechanics/Dynamics*

K.D. Watenpaugh, "Conformational Energy as a Restraint in Refinement", In:
*Molecular Dynamics and Protein Structure* (ed. J. Hermans), Polycrystal
Book, Yellow Springs, OH, 77-80, 1985.

A.T. Brunger, G.M. Clore, A.M. Gronenborn and M. Karplus, "Three dimensional
structures of proteins determined by molecular dynamics with interproton
distance restrainsts: Application to crambin", *Proc. Natl Acad Sci, USA*,
**83**, 3801-3805, 1986.