MEPHAS: Establish Phases of Selected Reflections by
Application of a Maximum Entropy Procedure
Authors: Edward Prince, Douglas M Collins & James M Stewart
Contact: James M Stewart, Department of Chemistry,
University of Maryland, College Park, MD 20742, USA
MEPHAS calculates reflection moduli from a low-resolution direct-space
constrained-exponential electron density distribution and a selected subset of
reflections. It also calculates an updated constrained exponential electron
density map and its entropy. Trials are performed with different phases
assigned to the chosen reflections by a fractional factorial design. The trial
which shows the map with maximum entropy is accepted as the updated
constrained exponential electron density distribution and the determined phases
are assigned to the chosen reflection subset. The input electron density map is
based on a smaller number of starting reflections. It is used as a basis to
establish, by maximum entropy criteria, the phases of a "next" subset of
chosen reflections. This process is used to extend the number of phased
reflections. At each stage in the process of establishing the maximum entropy
phases the constrained exponential electron density function is held everywhere
positive. The magnitudes of F calculated implied by the electron density are
refined by a dual function procedure which forces a match between the observed
and calculated F magnitudes of the chosen subset of reflections.
This program is modeled on the programs MESIGN and MEFFIT written by E.
Prince at the National Institute of Standards and Technology, Gaithersburg,
Maryland.
Purpose
MEPHAS reads a direct space exponential electron density map generated by the
programs FOURR and MEDENS or by MEPHAS itself. It extends the number of
reflections phased and adjusts the map so that the moduli of the observed
reflections are expressed in it. The input map can be prepared from only the
phases needed to define the origin and enantiomorph of the space group with,
perhaps, others determined from other information or a previous MEPHAS run.
Method
Two terms are used repeatedly in this description and in the program output.
They are trial and iteration. A trial refers to the use of one
set of phases applied to a subset of reflections to be phased. An iteration
refers to an application of Prince's dual function to bring the moduli to
conformance.
MEPHAS operates in four distinct modes:
1) The 'MEsign' mode.
Sixteen centric reflections are tested. From 49 to 289 separate trials are
carried out until the phases corresponding to the maximum entropy of the
constrained exponential electron density have been found. In each trial dual
function iterations are applied until a good fit of the 16 reflection moduli is
achieved. The first 32 trials consist of assigning 16 patterns of centric
phases and their supplements to the chosen subset of reflections to be phased.
A 'centric' reflection is one with phase restricted to 0/180, 30/210, 45/225,
60/240, 90/270, 120/300, 135/315, or 150/330 degrees. The difference in entropy
associated with the maps from each pair of trials is used in Yates's algorithm
to predict the maximum entropy phase set. A check is made to see if any of the
32 trials shows a greater entropy than the Yates prediction. The phases from
whichever source gives the greater entropy are then tested by taking the
supplement of each phase, reflection by reflection in turn. If the entropy
rises the new phase is accepted, and the next reflection is tested until all
have been 'checked'. This checking is confined to a single pass over the 16
reflections or, optionally, is repeated until no changes are detected, up to a
maximum of 256 checking trials. Usually the process is complete in 49 trials.
The final phases and their F(calc) values are written to the bdf without
changing any other reflections previously treated. In addition the final
constrained exponential density as modified in the run is written to an output
file.
2) The 'MEosc' mode.
A specified number of centric reflections are oscillated by 180 degrees in
phase. At each trial, when the F fitting iterations are completed, if the
entropy has risen, the current value of the reflection phase is saved and left
unchanged as the trials continue. This is tantamount to running the last part
of the MEsign mode.
3) The 'MEcyc' mode.
A small number of general reflections, usually one, may be specified along with
a phase 'step' to be applied. These reflections will then be tested against the
maximum entropy criterion at each step from 0 to 360 degrees less the step.
Once the maximum entropy phase is established for each reflection in turn that
value is used while the next reflection is tested.
4) The 'MEfit' mode.
A number of general reflections may be specified for moduli fitting. In this
mode the initial phases will be those obtained from the input bdf. Usually a
previous run of RFOURR using the most recent constrained exponential electron
density map will be required. In this case the map will be refined in one trial
of as many iterations as are required to bring the moduli into conformance.
This is the many-reflection moduli fitting mode of operation. Note that, unlike
the MEFFIT approach, except for special cases the input phases are held fixed
in the logarithm of density while the constrained exponential map is modified
to fit the input |F(obs)| exactly. The phases derived from the modified map
will then show a departure from the phases of the prior map. The program
MEFFIT, by contrast, refines the phases as well as forcing the moduli to
conform.
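The phase scan of the 'MEcyc' mode (mode 3 above) can be sketched in Python as
follows. This is an illustration only, not the MEPHAS code: the names
cycle_phases and toy_entropy are hypothetical, and toy_entropy is a synthetic
stand-in for the real work of fitting the moduli and computing the entropy of
the resulting map. The step is assumed to divide 360 exactly.

```python
import math


def cycle_phases(n_reflections, step, entropy, start=0):
    """Scan each reflection in turn over 0, step, ..., 360 - step degrees.

    `entropy` is a caller-supplied (hypothetical) function returning the
    entropy obtained with a given list of trial phases.
    """
    phases = [start] * n_reflections
    trials = 0
    for i in range(n_reflections):
        best_phase, best_entropy = phases[i], -math.inf
        for phi in range(0, 360, step):
            phases[i] = phi
            s = entropy(phases)
            trials += 1
            if s > best_entropy:
                best_phase, best_entropy = phi, s
        # Keep the best phase fixed while the next reflection is tested.
        phases[i] = best_phase
    return phases, trials


def toy_entropy(phases, targets=(90, 210)):
    """Synthetic score that peaks when every phase equals its target."""
    return sum(math.cos(math.radians(p - t)) for p, t in zip(phases, targets))


best, n_trials = cycle_phases(2, 30, toy_entropy)
print(best, n_trials)   # [90, 210] 24
```

With a 30 degree step each reflection costs 12 trials, which is why only a
small number of general reflections is practical in this mode.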
The overall process may be described in terms of the 'MEsign' mode (1). The
other three modes are essentially similar except that in the case of trials of
general reflections, 3), far fewer reflections may be treated at a time. In the
case of moduli fitting, 4), many reflections are treated for just one trial. In
every case the updated constrained exponential electron density map is written
to file mee at the end of the run. The final phases and calculated
moduli of the chosen reflections are placed in the output binary data file.
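The trial counting in the 'MEsign' mode can be illustrated with a toy search in
Python. This is a sketch under several assumptions and should not be read as
the MEPHAS implementation: the fractional factorial design actually used by the
program is not spelled out in this writeup, so the sketch assumes a 16-run
two-level design in Yates standard order with one design column per reflection;
the function names are hypothetical; and toy_entropy is a synthetic score
standing in for "fit the moduli, then compute the map entropy".

```python
def design_sign(run, col, k):
    """+1/-1 entry of a 2**k two-level design in Yates standard order."""
    sign = 1
    for bit in range(k):
        if (col >> bit) & 1:
            sign *= 1 if (run >> bit) & 1 else -1
    return sign


def yates(values):
    """Yates's algorithm: repeated pairwise sums followed by differences."""
    v = list(values)
    passes = len(v).bit_length() - 1          # len(v) must be a power of two
    for _ in range(passes):
        sums = [v[i] + v[i + 1] for i in range(0, len(v), 2)]
        diffs = [v[i + 1] - v[i] for i in range(0, len(v), 2)]
        v = sums + diffs
    return v


def mesign_like_search(entropy, n=16):
    """Return (signs, best_entropy, trials) for a caller-supplied entropy."""
    k = n.bit_length() - 1
    trials = 0
    pair_diffs = []
    best_design = None
    for run in range(n):                      # 16 patterns + supplements = 32
        pattern = [design_sign(run, col, k) for col in range(n)]
        supplement = [-s for s in pattern]
        s_pat, s_sup = entropy(pattern), entropy(supplement)
        trials += 2
        pair_diffs.append(s_pat - s_sup)
        for s, p in ((s_pat, pattern), (s_sup, supplement)):
            if best_design is None or s > best_design[0]:
                best_design = (s, p)
    effects = yates(pair_diffs)               # one contrast per reflection
    signs = [1 if e >= 0 else -1 for e in effects]
    best = entropy(signs)                     # trial 33: predicted pattern
    trials += 1
    if best_design[0] > best:                 # a design trial did better
        best, signs = best_design[0], list(best_design[1])
    for i in range(n):                        # trials 34-49: one flip each
        signs[i] = -signs[i]
        s = entropy(signs)
        trials += 1
        if s > best:
            best = s
        else:
            signs[i] = -signs[i]              # entropy did not rise: revert
    return signs, best, trials


weights = [(-1) ** i * (i + 1) for i in range(16)]   # synthetic, not real data


def toy_entropy(signs):
    return sum(w * s for w, s in zip(weights, signs))


signs, best, trials = mesign_like_search(toy_entropy)
print(trials, best)   # 49 136
```

Under these assumptions the count of 49 is 32 design trials, one trial of the
predicted pattern, and 16 single-reflection checks. Repeating the checking pass
up to 16 times would give the upper figure of 33 + 256 = 289.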
The following steps give an overview of the procedure in mode 1:
I) The program must have as input a "prior" exponential electron density map
which has been prepared using a subset of previously phased reflections. This
requires the use of the XTAL links FOURR and MEDENS. The resolution of this map
must be sufficient to accommodate the highest order reflections used in the
calculation of Prince's dual function which follows.
II) Sixteen reflections which are to be phased in the run must be chosen and
specified by means of PHI input lines or by allowing automatic selection based
on magnitude of F observed. In the 'MEsign' stages of the phasing process these
will be "centric" reflections, that is, ones restricted to two possible values
differing by 180 degrees.
The steps which follow are under program control:
III) If "centric" reflections are being determined, the sixteen reflections
specified by the user are assigned phases, iteratively, in 32 patterns. For
each pattern the entropy of the "prior" electron density as modified by the
inclusion of these new reflections is calculated. The 'MEsign' feature sets up
a search for the phase (sign) pattern associated with the highest entropy, the
"maximum entropy". Once the sign pattern with the highest entropy is
identified, the phase of each of the sixteen reflections in turn is switched
and the result tested by the maximum entropy criterion. By this technique the
testing of the 65536 possible sign combinations is reduced to 49 trials. For
every trial there are a number of iterations carried out using Prince's dual
function to force the map to reflect the observed moduli of the reflections
being tested.
The entropy of each map for each iteration of each trial is calculated by:
S = sum(-rho*ln(rho))/sum(rho) + ln(sum(rho)/(nx*ny*nz))
where the sums are over all the pixels in the unit cell sampled in nx points
along x, ny along y, and nz along z. The values of rho used in the calculation
are in electrons per pixel. The entropy, S, is independent of the scale of rho,
and, provided rho is sufficiently sampled by the pixel grid of nx, ny, and nz
points, S is also independent of the number of pixels. This "standard relative
entropy", S, is calculated against "the uniform distribution" so that it
represents a measure of the "lumpiness" of rho. The maximum entropy corresponds
to the smoothest possible all positive map.
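The entropy expression above can be checked numerically with a few lines of
Python. This is a minimal sketch, assuming the map has been flattened into a
list of strictly positive values in electrons per pixel, so that the list
length plays the role of nx*ny*nz. The function name relative_entropy and the
sample maps are hypothetical.

```python
import math


def relative_entropy(rho):
    """S = sum(-rho*ln(rho))/sum(rho) + ln(sum(rho)/N) for positive rho."""
    total = sum(rho)
    n_pixels = len(rho)                  # nx*ny*nz for a flattened map
    return (sum(-r * math.log(r) for r in rho) / total
            + math.log(total / n_pixels))


flat = [0.5] * 8                                        # uniform map
lumpy = [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 3.3]        # one strong peak
rescaled = [10.0 * r for r in lumpy]                    # same map, new scale
refined = [r / 2.0 for r in lumpy for _ in range(2)]    # each pixel halved

for name, rho in (("flat", flat), ("lumpy", lumpy),
                  ("rescaled", rescaled), ("refined", refined)):
    print(name, relative_entropy(rho))
```

The uniform map gives S = 0 (to rounding), any lumpier map gives a negative S,
and rescaling rho or subdividing the pixels leaves S unchanged, in line with
the properties stated above.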
IV) At any stage in the search, the subset of sixteen selected reflections
produces 304 reflections which will be used in the Hessian matrix to compute
corrections to the electron density. These corrections will force the
magnitudes of F observed and F calculated closer to one another. The "reverse"
Fourier transform of an output map from MEPHAS will give F calculated values
with the derived phase, and the magnitude of F calculated will conform to the
observed F.
The application of Prince's dual function in a system of equations which lead
to a modification in the electron density map forces the magnitudes of the F
values closer together while maintaining smoothness and positivity of the map.
The number of reflections (304) in the expanded list arises from the
combination of the indices of the sixteen chosen reflections:
h'(+) = h(i) + h(j) ; h'(-) = h(i) - h(j)
with i ranging from 1 to 16 and j ranging from i to 17, where in the 17th
case h'(+) = h(i) and h'(-) = h(i). This gives 17 + 16 + ... + 2 = 152 (i,j)
pairs, each contributing two reflections. In each phase search cycle the structure
factors of all 304 reflections are needed to set up a Hessian lower triangular
positive definite matrix and a gradient based on the delta F values. This
system of equations is solved by Cholesky factorization to give Fourier
coefficients for the sixteen reflections being phased. These coefficients are
then used in a 'forward' Fourier transform to produce a map which contains
values on a logarithmic scale which are used to adjust the ln(rho) of the
'prior' constrained exponential map. The effect of the corrections to the
"prior" map, obtained by applying Prince's dual function to determine the
Fourier coefficients, is to change the value of rho at each pixel such that
each of the 16 |F calc| moduli moves closer to the corresponding |F obs|. At
the same time the map remains everywhere positive and as smooth as possible.
After each
trial the prior map is left untouched to be adjusted anew in the succeeding
trials.
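The count of 304 reflections in the expanded list can be reproduced with a
short Python sketch. It assumes that the 17th case is equivalent to combining
h(i) with the index (0,0,0), which gives h'(+) = h'(-) = h(i) as stated above.
The function name expand_indices and the sixteen indices are hypothetical, and
the sketch does not merge duplicate or symmetry-equivalent indices, since the
text does not say how MEPHAS treats them.

```python
def expand_indices(chosen):
    """Return the h'(+) and h'(-) index lists for the chosen reflections."""
    n = len(chosen)
    extended = list(chosen) + [(0, 0, 0)]    # 17th case: h'(+) = h'(-) = h(i)
    plus, minus = [], []
    for i in range(n):
        for j in range(i, n + 1):            # j runs from i to the 17th entry
            hi, hj = extended[i], extended[j]
            plus.append(tuple(a + b for a, b in zip(hi, hj)))
            minus.append(tuple(a - b for a, b in zip(hi, hj)))
    return plus, minus


# Sixteen made-up indices (4 * 2 * 2), for illustration only.
chosen = [(h, k, l) for h in range(1, 5) for k in range(2) for l in range(1, 3)]
plus, minus = expand_indices(chosen)
print(len(plus), len(minus), len(plus) + len(minus))   # 152 152 304
```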
The trials are carried out until the phase combination is found which, in
conjunction with the reflection moduli, produces a maximum entropy constrained
exponential electron density (c.e.e.d.). The phases determined are placed in
the output binary data file on aa2 without changing the phase values for
any reflections other than those established in the run. The c.e.e.d.
map corresponding to the new result is written to file mee in
the same format as the 'prior' map on file med. By copying file
mee to file med it may then be used as a new
'prior' c.e.e.d. input map and other reflections may be phased under one of the
four modes in which the program may be run.
Automatic Selection of Reflections
If no phi lines are supplied the reflections to be phased will
be taken, in order, from the input binary data file. The reflections selected
are the largest unphased reflections encountered for which delta(F)>DF. In the
case of a 'MEsign' run the next 16 centric reflections will be treated. In the
case of a 'MEcyc' run where a general reflection is being tested, or a 'MEfit'
run, the number of phases specified in the MEPHAS line will be used. For any of
the phase cycling modes delta(F) is ABS(Fobs - Fc) where the value of Fc is that
found in the input bdf. If Fc in the input is void, i.e. never determined, Fc
is taken to be zero, hence delta(F) is just F observed. For the MEfit mode
delta(F) is taken as Fc*(Fo-Fc) so that all large Fc reflections will be left
out. The use of a MAXHKL input line will restrict the choices to reflections
with magnitudes of h, k, and l less than the values given. It is important to
note that the generation of the Hessian matrix involves the combination of all
the reflection indices with themselves. Thus if the maximum h is 6 a reflection
with h = 12 will be generated. In order for this reflection to have an F calc
by reverse Fourier transform, the number of grid points of the Fourier map
along that axis must be at least 2n+1, where n is the largest generated index.
In this example the x grid must be 25 or finer. To save computation time and to
assure the most meaningful values of entropy it is best to choose grids
appropriate to the maximum h, k, and l being fitted, i.e. 4*maxh+1 or as close
thereto as FOURR and RFOURR allow.
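The grid rule and the two delta(F) definitions given in this section can be
written out as a small Python sketch. The function names and the mode strings
are hypothetical and are not MEPHAS input keywords. Treating a void Fc as zero
in the MEfit case is an assumption made here for simplicity; the text states
that rule only for the phase cycling modes.

```python
def min_grid_points(max_index):
    """Smallest grid along one axis for a given maximum fitted index."""
    largest_generated = 2 * max_index      # e.g. h(i) + h(i) in the Hessian
    return 2 * largest_generated + 1       # the 2n+1 rule, i.e. 4*max + 1


def delta_f(f_obs, f_calc, mode):
    """delta(F) used to rank candidate reflections.

    f_calc is None when Fc is void (never determined) in the input bdf.
    """
    fc = 0.0 if f_calc is None else f_calc   # for "mefit": an assumption
    if mode == "mefit":
        return fc * (f_obs - fc)
    return abs(f_obs - fc)                   # phase cycling modes


print(min_grid_points(6))                    # 25
print(delta_f(10.0, None, "mecyc"))          # 10.0
print(delta_f(10.0, 4.0, "mefit"))           # 24.0
```

Read the other way round, a grid of g points along an axis accommodates fitted
indices up to (g - 1) // 4 along that axis.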
File Assignments
Reads space group, cell, and reflection information from the input binary data
file
Reads a 'prior' constrained exponential electron density map from med
Writes the updated constrained exponential electron density map to mee
Updates the newly phased reflections on the output binary data file
Example
: Test of maximum entropy routines MEDENS and MEPHAS
: REMARK HYDROLASE 06-APR-81
1BP2*
REMARK PHOSPHOLIPASE A=2= (E.C.3.1.1.4) (PHOSPHATIDE
REMARK 2 ACYL-HYDROLASE)
REMARK BOVINE (BOS TAURUS L.) PANCREAS
REMARK B.W.DIJKSTRA,K.H.KALK,W.G.J.HOL,J.DRENTH
TITLE BOVINE (BOS TAURUS L.) PHOSPHOLIPASE A=2= B.W.DIJKSTRA, et. al.
compid BPHOLP
STARTX
cell 47.070 64.450 38.150 90. 90. 90.
cellsd 0.02 0.02 0.02 0. 0. 0.
sgname P 2AC 2AB : int tab P212121
celcon H 5350
celcon C 2408
celcon N 652
celcon O 1188
celcon S 56
celcon Ca 4
exper 1 1.54178
ADDREF ffac frie
limits *4 .2941176
hklgen HKL FREL SIGF FFRI SFFR
proatm
scale1 .021245 0.000000 0.000000 0.00000
scale2 0.000000 .015516 0.000000 0.00000
scale3 0.000000 0.000000 .026212 0.00000
: PDB atom coordinates omitted end
SORTRF order KHL sort 3 FREL pakfrl aver
end
FC iso archiv 1800 1801 1802 1803 1804 1700
end
PHONYD fri err 0.00001
end
reset psta 3
MESTAR
phi 1 0 1 90
phi 0 1 1 270
phi 0 2 1 180
phi 1 1 0 90
end
FOURR fobs ffsum nw
grid 60 80 48
layout down across layer 60 80 48 0 0 0 1 1 1
end
MEDENS
end
MEPHAS nr 4 mf si 0.005
: MEfit test
phi 1 0 1 90
phi 0 1 1 270
phi 0 2 1 180
phi 1 1 0 90
end : There must be an end line
copybdf mee med : This will overwrite the initial "prior"
MEPHAS CR 350 CV 0.5 SI 0.005 ms
: MEsign test
phi 1 0 1 90 FIX
phi 0 1 1 270 FIX
phi 1 0 2 0
phi 1 5 0 270
phi 0 2 0 180
phi 0 0 2 0
phi 2 2 0 180
phi 0 2 1 180 FIX
phi 1 1 0 90 FIX
phi 0 3 1 270
phi 2 1 0 0
phi 2 0 1 90
phi 1 4 0 90
phi 3 4 0 270
phi 2 0 3 270
phi 0 9 9 270
phi 0 2 3 0
phi 4 2 0 0
phi 2 0 8 180
phi 2 0 0 180
end : There must be an end line
copybdf mee med
MEPHAS CR 350 CV 0.5 SI 0.005 ms
: MEsign test program chooses refl
end : There must be an end line
MEPHAS CR 350 si 0.005 mc nr 10
: MEcyc test
phi 1 1 1
end : There must be an end line
copybdf mee med
PHACMP ext PHD shel 16
phaso1 ds 1 fc
phaso2 ds 1 fc
end
FINISH
Reference
E. Prince, L. Sjolin, and R. Alenljung (1988) Phase Extension by Combined
Entropy Maximization and Solvent Flattening. Acta Cryst. A44,
216-222