| $\psi_3$ | 0° | 45° | 90° | 135° | 180° |
|---|---|---|---|---|---|
| $P(\psi_3 \mid A = 3)$ | .65 | .27 | .03 | .004 | .002 |
| $P(\psi_3 \mid A = 1)$ | .34 | .25 | .13 | .06 | .05 |
Direct methods procedures depend critically
on being able to predict the value of $\psi_3$
and on being able to apply triplet relationships
as a series of equations. For large values of A,
$\psi_3$ has a high probability of being 0, and this makes
large-A triplets particularly important for these
processes.
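The dependence of the triplet distribution on A can be checked numerically. The following is a minimal sketch, assuming that P(ψ3|A) has the Cochran-type form exp(A cos ψ3) / (2π I0(A)); the function names are illustrative and are not part of GENSIN. Under that assumption, A = 3 and A = 1 give values that agree with the two tabulated rows.

```python
import math

def bessel_i0(x, terms=60):
    """Modified Bessel function of the first kind, order zero, by power series."""
    total = 0.0
    term = 1.0  # k = 0 term of sum_k (x^2/4)^k / (k!)^2
    for k in range(terms):
        total += term
        term *= (x * x / 4.0) / ((k + 1) ** 2)
    return total

def invariant_probability(psi_deg, factor):
    """Density exp(factor*cos(psi)) / (2*pi*I0(factor)) at psi given in degrees.

    Assumed form of P(psi3|A); 'factor' plays the role of A.
    """
    psi = math.radians(psi_deg)
    return math.exp(factor * math.cos(psi)) / (2.0 * math.pi * bessel_i0(factor))

if __name__ == "__main__":
    for a in (3.0, 1.0):
        row = [invariant_probability(angle, a) for angle in (0, 45, 90, 135, 180)]
        print(a, ["%.3f" % p for p in row])
```

The sharp peak at 0° for A = 3, compared with the much flatter curve for A = 1, is the reason large-A triplets carry most of the weight in phase determination.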
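For the quartet structure invariant relationship

$$\psi_4 = \varphi(h_1) + \varphi(h_2) + \varphi(h_3) + \varphi(h_4), \qquad h_1 + h_2 + h_3 + h_4 = 0$$

the value of $\psi_4$ depends principally on the probability factor

$$B = 2b\,|E(h_1)\,E(h_2)\,E(h_3)\,E(h_4)|$$

For uniform-atom structures b tends to 1/N. The
probability distribution of $\psi_4$, ignoring the
cross-vector magnitudes E($h_1$+$h_2$), E($h_2$+$h_3$), and
E($h_3$+$h_1$) (referred to later as the cross-vectors),
may be written as follows (Hauptman, 1976)

$$P(\psi_4 \mid B) = \frac{\exp(B \cos\psi_4)}{2\pi I_0(B)} \tag{4.23}$$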
This function is similar to $P(\psi_3 \mid A)$, except that the value of B will tend to be
much smaller for large structures. The probability
distribution of $\psi_4$
is dependent on more than the principal vectors
that go to make up B and is more correctly given by
expression (4.24), where Z is a function of the seven
vectors: the four principal vectors E($h_1$) to E($h_4$)
and the three cross-vectors. An important property
of the probability expression (4.24) is that if the
cross-vectors are large, then expression (4.23) holds, but if
the cross-vectors are
small then expression (4.24) approximates to expression (4.25).
The importance of this result may be illustrated for a fixed value of B (for example, 3.0) and for small and large cross-vectors (XV):

| $\psi_4$ | 0° | 45° | 90° | 135° | 180° |
|---|---|---|---|---|---|
| $P(\psi_4)$, large XV | .65 | .27 | .03 | .004 | .002 |
| $P(\psi_4)$, small XV | .0001 | .0003 | .002 | .005 | .007 |
Because of the dependence of $\psi_4$ on the magnitude of the
cross-vectors, quartets are usually grouped into two
classes according to the sum of cross-vector
magnitudes:

$$\mathrm{XVsum} = |E(h_1+h_2)| + |E(h_2+h_3)| + |E(h_3+h_1)| \tag{4.26}$$

If XVsum is greater than a certain threshold (e.g.
XSHI), a quartet is referred to as a
positive quartet (because $\cos\psi_4$ should be positive). If it
is less than a lower threshold (e.g. XSLO), a quartet is
known as a
negative quartet (because $\cos\psi_4$ should be negative).
GENSIN estimates the value of XVsum and various procedures
can be adopted by the user to control the generation of
quartets using this sum.
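The grouping by cross-vector sum can be sketched as below. This is an illustration only: the function name and the threshold values used in the usage line are hypothetical, and the thresholds in GENSIN itself are set by the XSLO and XSHI parameters.

```python
def classify_quartet(cross_vector_es, xslo, xshi):
    """Classify a quartet from the |E| values of its three cross-vectors.

    Returns "positive" if the sum exceeds xshi, "negative" if it is below
    xslo, and "intermediate" otherwise.
    """
    xvsum = sum(abs(e) for e in cross_vector_es)
    if xvsum > xshi:
        return "positive"
    if xvsum < xslo:
        return "negative"
    return "intermediate"

if __name__ == "__main__":
    # Hypothetical cross-vector |E| values and thresholds.
    print(classify_quartet((2.1, 1.9, 2.0), xslo=1.0, xshi=5.0))
```

Quartets that fall in the intermediate class carry little predictive information about the invariant, which is why only the two outer classes are named.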
The normalization program, GENEV, converts known structural
information into one or
more group structure factors G(h) for each reflection.
These group structure factors are used in
GENEV
to calculate an expectation value for
(h) where M depends on
the number of molecular fragments and the nature of the
fragment information (see
GENEV
for details).
GENEV outputs the group structure factor values as the
magnitude G(h) and the phase g(h). Knowledge of the
structure influences the values expected for $\psi$. If the atomic
parameters are known to a certain precision, then G(h) and
g(h) values (which in this instance are the same as F(h)
and $\varphi$(h)) may be used to
predict $\psi$ to the same
precision. The group structure factor is in fact an
important component in the conditional probability
expressions for $\psi_3$ and $\psi_4$.
Main (1976) modifies the value of A to the form:
where
The correspondence between
a' and
a in expression (4.20) is
quite apparent when the only knowledge of a structure is
its atomic content. Then each <
(h)>=
(s); each
(s) and
a' =
a. With increasing
structural information the value of
a' may differ
substantially from
a. The most important
term in expression (4.29) is the joint group structure
factor G($h_1$, $h_2$, $h_3$), which is calculated
directly for each invariant from the known structural
information (see Main, 1976). This calculation is,
however, a time-consuming task even for small structures.
A more efficient approach involving a minimal loss in
precision is the use of the individual group structure
factors of the form (Hall, 1978):
This approximation applied to quartet relationships gives
where
In this way fragment information is used to predict
the distribution of $\psi_3$ and $\psi_4$
according to A' and B'. Most
importantly, however, the probability terms A' and B'
from (4.30) and (4.31) are complex,
with phase values that are estimates of
$\psi_3$ and $\psi_4$, respectively. The
reliability of these phase values as phase estimates
depends on the precision of the structural information,
and the magnitude of A' and B', respectively. For
random-atom structures (i.e., fragment information type-1
in GENEV) A' = A and B' = B, and both phase values are zero.
If random-fragment (type-2) information is used in
GENEV, the phase values are also assumed
to be zero (this is a limitation of using approximation
(4.32) instead of (4.31)).
For type-3 and type-4 fragment information the phase values
may be non-zero. In the
following description and input line formats
these phase values are referred to
as the fragment estimates "QPSI".
In addition to generating structure invariants,
GENSIN provides the conditions for the origin and
enantiomorph definition of the cell. Fixing the origin and
enantiomorph is a necessary first step in the
GENTAN phase extension process. It is performed
automatically; the user may, however, override this
procedure using phases selected according to the
definitions output by GENSIN. The conditions for specifying
the origin in terms of structure factor seminvariant phases
are detailed by Hauptman and Karle (1956) and Karle and
Hauptman (1956). Procedures for applying these
conditions are described by Stewart and Hall (1971), Luger
(1980), and Hall (1983).
It should be noted that for GENSIN and GENTAN
the seminvariant vector conditions are always in
terms of the input indices. It is therefore unnecessary to
transform centred indices to primitive indices for the
purposes of origin specification. Details of the
seminvariant vectors for centred space groups are
described by Hall (1982).
The origin of a cell is fixed by specifying the structure factor phases of p linearly-independent reflections. The value of p ranges from 0 to 3, and is determined by the space group symmetry.
Any reciprocal lattice vector h is a linear combination of p origin defining vectors h(1), ..., h(p)

$$h = n(1)\,h(1) + \dots + n(p)\,h(p)$$

where n(j) is any integer value. This
relationship may be expressed as the vector
transformation

$$h = nH$$

where n is the set of integers n(1), ..., n(p) and H is the set of origin defining reflections h(1), ..., h(p). The linear relationship of a reflection h to the set of origin defining reflections H is given by
A reflection vector h may also be transformed into the seminvariant indices u by the operations

$$u' = hV$$

and

$$u = u' \bmod m$$

where V is the seminvariant vector matrix and m the seminvariant moduli (Hauptman and Karle, 1956). A necessary requirement of any set of origin defining reflections is that the matrix of seminvariant indices U
has the magnitude

$$|U| = 1$$
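The origin-set requirement can be illustrated with a short sketch. It rests on stated assumptions rather than on the GENSIN source: the seminvariant indices are taken as u = hV reduced modulo m, the requirement on U is taken as a determinant of magnitude 1, and the example uses p = 3 with V as the identity and m = (2, 2, 2), as applies to a primitive centrosymmetric triclinic cell. All function names are hypothetical.

```python
def seminvariant_indices(h, v, m):
    """Return u = (h V) mod m for one reflection h = (h, k, l)."""
    return tuple(
        sum(h[i] * v[i][j] for i in range(3)) % m[j] for j in range(len(m))
    )

def det3(a):
    """Determinant of a 3x3 matrix given as a sequence of rows."""
    return (
        a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
        - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
        + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0])
    )

def is_origin_set(reflections, v, m):
    """Check the assumed condition |det U| = 1 for three reflections."""
    u = [seminvariant_indices(h, v, m) for h in reflections]
    return abs(det3(u)) == 1

if __name__ == "__main__":
    identity = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
    moduli = (2, 2, 2)
    print(is_origin_set([(3, 0, 2), (0, 1, 4), (2, 2, 5)], identity, moduli))
    print(is_origin_set([(1, 1, 0), (0, 1, 1), (1, 0, 1)], identity, moduli))
```

The second set fails because its three rows sum to (2, 2, 2), which is zero modulo 2, so the reflections are not linearly independent in seminvariant space.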
The linear relationship of any structure factor phase
$\varphi$(h) may be
expressed in terms of the
u from (4.37) as:
If Q is defined as the set of origin defining phases
The seminvariant phase due to linear relationship of h to H may be derived from the linear combination of these phases (Hauptman and Karle, 1956)
If the seminvariant phase q(h) is equal, modulo $\pi$, to the phase of
vector h, then the value of $\varphi$(h) is independent of the
enantiomorphous structure. If q(h) is significantly
different (modulo $\pi$) from $\varphi$(h), then the enantiomorph
may be specified by fixing $\varphi$(h) at one of its two
possible values.
For space groups where $\varphi$(h) is restricted to one of
two values,
the calculation of q(h) from the integer set
n' and its application to
the phase set Q provides a
straightforward approach to identifying the formal
requirements of enantiomorphic discrimination (Hall,
1983).
If all restricted phases have values of
$\varphi$(h) = q(h), then a reflection with
a non-restricted phase value must be used to specify the
enantiomorph. For these space groups, q(h) indicates the range of
values $\varphi$(h) should have to
separate the enantiomorphs satisfactorily. Typically a
$\varphi$(h) value would be permuted
in a multisolution process through the series of values
q(h)+$\pi$/4 to q(h)+3$\pi$/4 in increments of $\pi$/4. Differences
between $\varphi$(h) and q(h) of less than $\pi$/6 will not provide
strong enantiomorphic discrimination and are likely to lead
to instability in the phasing process. For examples of
enantiomorphic discrimination see Hall (1983).
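The permutation and the discrimination test described above can be sketched as follows. The sketch assumes angles in radians and that the comparison between the phase and q(h) is made modulo π; the function names are hypothetical and do not correspond to GENSIN or GENTAN routines.

```python
import math

def permuted_phases(q):
    """Trial phase values q + pi/4, q + pi/2 and q + 3*pi/4."""
    return [q + k * math.pi / 4.0 for k in (1, 2, 3)]

def difference_mod_pi(phi, q):
    """Smallest separation between phi and q when compared modulo pi."""
    d = (phi - q) % math.pi
    return min(d, math.pi - d)

def discriminates_enantiomorph(phi, q, limit=math.pi / 6.0):
    """True if phi is at least 'limit' away from q (modulo pi)."""
    return difference_mod_pi(phi, q) >= limit

if __name__ == "__main__":
    q = 0.0
    for phi in permuted_phases(q):
        print(round(math.degrees(phi)), discriminates_enantiomorph(phi, q))
```

All three permuted values lie at least π/4 from q(h), so each of them passes the π/6 test by construction.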
In the default mode GENSIN generates both triplet and quartet structure invariant relationships. Invariant types may be specified by the user with the trip and quar control lines. The number of structure invariants generated is determined by a range of parameters, including the number of generators, the magnitude of the E values, and the magnitude of the A and B thresholds. In default mode, the maximum number of invariants for either type is set at 2000.
The reflections used in the generation process are
selected from the largest En, where n designates the
E-type (1 or 2) output by
GENEV
. The default value of n is 1. The number of
generators is controlled by the user via the
gener line, or set
automatically according to the algorithm
MAXGEN = max (10000, 150 + NNHA*(4 + ICNT + 1/NEQP)),
where ICNT=1 if centrosymmetric and ICNT=0 if noncentrosymmetric, NNHA is the number of non-hydrogen atoms in the molecule, and NEQP is the number of general equivalent positions.
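The automatic generator count can be evaluated directly from the expression as printed. The sketch below is illustrative: the function names are hypothetical, and truncating the structure-dependent term to an integer is an assumption. Note that, taken literally, the printed max() makes 10000 a lower bound on the result.

```python
def structure_term(nnha, centrosymmetric, neqp):
    """Structure-dependent term 150 + NNHA*(4 + ICNT + 1/NEQP)."""
    icnt = 1 if centrosymmetric else 0
    return 150 + nnha * (4 + icnt + 1.0 / neqp)

def maxgen_as_printed(nnha, centrosymmetric, neqp):
    """MAXGEN evaluated literally as max(10000, structure term)."""
    return max(10000, int(structure_term(nnha, centrosymmetric, neqp)))

if __name__ == "__main__":
    # Hypothetical structure: 20 non-hydrogen atoms, centrosymmetric,
    # 4 general equivalent positions.
    print(structure_term(20, True, 4))
    print(maxgen_as_printed(20, True, 4))
```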
Triplet invariants are always generated for up to
100 of the smallest E-values independent of the TRIP
parameters. These are used subsequently in the
$\psi$(zero) figure-of-merit tests in GENTAN,
and are referred to as $\psi_0$ triplets
(Cochran and Douglas, 1957).
GENSIN provides, via the
psical line, the
facility to calculate $\psi$ from phases and
structure factor values stored on the bdf. The
$\varphi$ and |F| values on
the bdf may be from a previous GENTAN run, an FC
calculation on a partial structure, or the back
transform of modified density for protein structures. The
bdf lrrefl: ID numbers for $\varphi$ and F are assumed
to be n750 and n751 unless otherwise specified on the
psical line.
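Calculating an invariant from stored phases amounts to summing the phases of the contributing reflections and wrapping the result. The following is a minimal sketch, assuming phases in degrees and reflections whose indices already sum to zero; it does not read a bdf, and the function names are hypothetical.

```python
def wrap_degrees(angle):
    """Wrap an angle in degrees into the interval (-180, 180]."""
    wrapped = (angle + 180.0) % 360.0 - 180.0
    return 180.0 if wrapped == -180.0 else wrapped

def invariant_phase(phases_deg):
    """Invariant value psi as the wrapped sum of the contributing phases."""
    return wrap_degrees(sum(phases_deg))

if __name__ == "__main__":
    # Hypothetical phases for the three reflections of one triplet.
    print(invariant_phase([120.0, 150.0, 100.0]))
```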
The inclusion of fragment QPSI values (see the GENSIN line and description above) in the invariant generation process will modify the calculated A and G values, and therefore change the number and nature of invariants generated. This is quite independent of the E used (i.e., E1 or E2). It is recommended that E1, based on random-atom expectation values, is always used, except in special circumstances (see Subramanian and Hall, 1982; Hall and Subramanian, 1982a,b). This is because the QPSI values contain the group structure phase information of the fragment, and E1 has been shown to provide more reliable structure invariant relationships. As a rule of thumb, E2 should be treated as a second option that the user can invoke in cases of severe non-randomness (e.g., hypersymmetry, super-structures or very dominant heavy atoms).
The
change control line is
available for modifying the magnitude of specific E or
(E) values.
change can be used to
enhance or to suppress a particular E in the generation
process by increasing or decreasing its E value. It is
also useful for running "replica" tests against other
software and different machines. Please note that
change lines must be
entered in the order of reflection data on the input bdf,
and be the last control lines entered.
The nature of quartet invariants generated by GENSIN is determined largely by the cross-vector magnitudes. The types and magnitudes of cross-vectors permitted during the generation process are controlled via the parameters IXVF, XVMN, XVMX, XSLO, and XSHI on the quar line. The XSLO and XSHI parameters apply only to quartets with cross-vectors inside the data sphere. The cross-vector limits XVMN and XVMX are applied to individual cross-vector E values: if an individual cross-vector E lies outside the range XVMN to XVMX the quartet is rejected. In the default mode, the sum of the cross-vector magnitudes XVsum (eqn (4.26)) is calculated for quartets with all cross-vectors inside the data sphere, and a quartet is accepted if XVsum < XSLO or XVsum > XSHI.
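The two-stage test described above, first on the individual cross-vector E values and then on their sum, can be sketched as follows. The function name and the numerical limits in the usage example are hypothetical; they are not GENSIN defaults.

```python
def accept_quartet(cross_vector_es, xvmn, xvmx, xslo, xshi):
    """Apply the individual and summed cross-vector tests to one quartet.

    The quartet is rejected if any cross-vector E lies outside
    [xvmn, xvmx]; otherwise it is accepted when the sum is below xslo
    or above xshi.
    """
    if any(e < xvmn or e > xvmx for e in cross_vector_es):
        return False
    xvsum = sum(cross_vector_es)
    return xvsum < xslo or xvsum > xshi

if __name__ == "__main__":
    limits = dict(xvmn=0.0, xvmx=10.0, xslo=1.0, xshi=5.0)
    print(accept_quartet((2.0, 2.0, 2.0), **limits))
    print(accept_quartet((1.0, 1.0, 1.0), **limits))
```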
Quartets generated by GENSIN are used in several
different ways in the subsequent calculations. Typically
quartets are divided into three categories: those with
cross-vector sums above an upper threshold XSHI (known as
positive quartets), those below a lower threshold XSLO
(known as negative quartets), and those in between. The
last category is often not used in the phasing process
because of the unpredictability of the
$\psi_4$ values. The user can specify these upper and
lower thresholds, XSHI and XSLO, with the
quar line for both the
GENSIN and GENTAN calculations.
It is usual to use only quartets with cross-vectors
inside the data sphere. The XVsum can then be
calculated and a prediction made about the value of
$\psi_4$. There are, however, some drawbacks to this
approach. When XVsum is greater than XSHI it is probable
that the quartet generated will in fact be equivalent to
a combination of three triplet invariants also generated
by the GENSIN process. The phase relationships provided
by positive quartets tend therefore to reinforce, rather
than add to, those provided by the triplets. The phase
"pathways" provided by quartets will, of course, be
different from those of triplets but the generators they
connect will, in effect, be the same.
Quartets with XSLO < XVsum < XSHI are usually not redundant to triplets but are less useful for the reasons already discussed.
Negative quartets with XVsum less than XSLO provide completely different phase information from triplets but are very few in number. For this reason they are used in GENTAN for a figure-of-merit parameter.
In contrast, quartet invariants with one or more
cross-vectors outside the data sphere provide
relationships that cannot be represented by a combination
of triplets. These quartets provide new phase pathways
and, as such, could prove crucial in particularly
difficult solutions. The disadvantage of these
"extra-terrestrial" quartets is the lack of cross-vector
information and, therefore, the inability to predict the
value of $\psi_4$. However, it may be assumed that $\psi_4$
for these quartets has a distribution based on B
(just as $\psi_3$ is a function of A) and an overall reliability
comparable to that of triplets (ignoring the relative
magnitudes of A and B). See Examples 3 and 4.
Reads E values from the input archive bdf
Writes structure invariant relationships to the file inv
GENSIN
Generate triplet and quartet invariants to a maximum of 2000 using type-1 E values. Only quartets with cross-vectors inside the data sphere will be output. QPSI values will be applied if fragment information is on the input bdf. Print invariant totals for all generators.
    GENSIN nqpi               :do not use QPSI information
    gener *2 300              :use top 300 E values
    trip yes 1.5 3000         :set min A and max triplets
    quar no                   :do not generate quartets
    print *2 1 50 100         :print SI for gens 1-50 to N of 100

Generate a maximum of 3000 triplets with A values greater than 1.5 from 300 generators. Available QPSI values are not applied. Structure invariants for the top 50 generators are printed provided all generator numbers are <= 100.
    GENSIN smax .45           :exclude all s values >.45
    quar yes 0.75 *8 1. 5.    :for Q4 B>.75 and XVsum>5 or <1.
    change 1 7 3 3.2          :make E = 3.2
    change 2 3 -4 2.75        :make E = 2.75

Generate triplets and quartets for generators selected from Es with s<.45. Quartets will be accepted if B>.75 and the cross-vector sum (all cross-vectors inside the data sphere) is greater than 5. or less than 1. The E values of reflections 1, 7, 3 and 2, 3, -4 will be modified on input.
    GENSIN
    quar *5 outxv             :generate quartets with outside cross-vectors

Generate triplets and quartets. The quartets must have at least one of their three cross-vectors outside the data sphere.

    GENSIN
    quar *5 outxv -5. 0.      :all cross-vectors outside sphere

Generate triplets and quartets. The quartets must have all three cross-vectors outside the data sphere.
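When preparing or checking input decks such as those above, it can help to separate each control line into its keyword, its data fields, and its trailing comment. The sketch below assumes only what the examples show, namely that a line starts with a keyword and that a colon introduces a comment. Tokens such as `*5` are returned unchanged and are not interpreted. The function is a hypothetical helper and is not part of the Xtal system.

```python
def split_control_line(line):
    """Split one control line into (keyword, fields, comment).

    Everything after the first ':' is treated as a comment; the
    remaining text is split on whitespace.
    """
    body, _, comment = line.partition(":")
    tokens = body.split()
    keyword = tokens[0] if tokens else ""
    return keyword, tokens[1:], comment.strip()

if __name__ == "__main__":
    deck = [
        "GENSIN nqpi               :do not use QPSI information",
        "gener *2 300              :use top 300 E values",
        "quar no                   :do not generate quartets",
    ]
    for entry in deck:
        print(split_control_line(entry))
```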
Cochran, W. and Douglas, D. 1957. The Use of a High-speed Digital Computer for the Direct Determination of Crystal Structures. Proc. Roy. Soc. A243, 281.
Hall, S.R. 1981. A Procedure for Random-access to Reflection Data. J. Appl. Cryst. 14, 214-215.
Hall, S.R. 1982. Seminvariant Vectors for Centred Space Groups. Acta Cryst. A38, 874-875.
Hall, S.R. 1983. A Procedure for Identifying Enantiomorph-Defining Phases. Acta Cryst. A39, 22-26.
Hall, S.R. and Subramanian, V. 1982a. Normalized Structure Factors. II. Estimating a Reliable Value of B. Acta Cryst. A38, 590-598.
Hall, S.R. and Subramanian, V. 1982b. Normalized Structure Factors. III. Estimation of Errors. Acta Cryst. A38, 598-608.
Hauptman, H. 1976. Some Recent Advances in the Probabilistic Theory of the Structure Invariants. Crystallographic Computing Techniques. F.R. Ahmed, K. Huml, B. Sedlacek, eds., Munksgaard: Copenhagen, 129-130.
Hauptman, H. and Karle, J. 1956. Structure Invariants and Seminvariants for Noncentrosymmetric Space Groups. Acta Cryst. 9, 45-55.
Karle, J. and Hauptman, H. 1956. Theory of Phase Determination for the Four Types of Non-centrosymmetric Space Groups 1P222, 2P222, 3P12, 3P22. Acta Cryst. 9, 635-654.
Luger, P. 1980. Modern X-ray Analysis on Single Crystals. de Gruyter: New York.
Main, P. 1976. Recent Developments in the MULTAN System - The Use of Molecular Structure. Crystallographic Computing Techniques. F.R. Ahmed, K. Huml, B. Sedlacek, eds., Munksgaard: Copenhagen, 97-105.
Stewart, R.F. and Hall, S.R. 1971. X-ray Diffraction. Determination of Organic Structures by Physical Methods. F.C. Nachod and J.J. Zuckerman, eds., Academic Press: New York, 74-132.
Subramanian, V. and Hall, S.R. 1982. Normalized Structure Factors. I. Choice of Scaling Function. Acta Cryst. A38, 577-590.