Authors: Ruth Doherty and Sydney Hall
Contact: Syd Hall, Crystallography Centre, University of Western Australia, Nedlands 6907, Australia
MODEL searches peak or atom sites for a connected molecular model. The model is interpreted on the basis of bond lengths and angles or, if the connectivity sequence of a fragment or molecule is supplied, by matching a canonical description of the input fragment to the connected peak and atom sites. Connected sites are plotted on the printer and a comprehensive list of distances and angles is produced.
The first step in the interpretation of a molecular model is to determine which sites are connected. Sites are considered connected if their bond radii overlap. The bond radii of atom sites are extracted from the archive bdf. The bond radius for peak sites is input on the limits line. The search for these connections includes all possible symmetry transformations of the sites. Sites connected in this way are placed in the same group and labelled as a cluster.
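The connection test described above can be illustrated with a minimal Python sketch. It is not the MODEL code: the function name cluster_sites and the input layout are hypothetical, coordinates are assumed to be Cartesian angstroms, and symmetry-equivalent positions are not generated (MODEL itself searches all symmetry transformations).

```python
import math
from itertools import combinations

def cluster_sites(sites):
    """Group sites whose bond radii overlap into clusters.

    `sites` is a list of (label, (x, y, z), bond_radius) tuples in
    Cartesian angstroms (an assumed layout). Symmetry-equivalent
    positions are NOT generated in this sketch.
    """
    parent = list(range(len(sites)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(sites)), 2):
        _, pos_i, rad_i = sites[i]
        _, pos_j, rad_j = sites[j]
        # Two sites are connected if their bond radii overlap.
        if math.dist(pos_i, pos_j) < rad_i + rad_j:
            parent[find(i)] = find(j)

    clusters = {}
    for i, (label, _, _) in enumerate(sites):
        clusters.setdefault(find(i), []).append(label)
    return sorted(clusters.values())

# Hypothetical peak sites, all with the same bond radius.
example_sites = [
    ("p1", (0.0, 0.0, 0.0), 0.75),
    ("p2", (1.4, 0.0, 0.0), 0.75),
    ("p3", (5.0, 0.0, 0.0), 0.75),
]
example_clusters = cluster_sites(example_sites)
```

Here p1 and p2 are 1.4 apart, which is less than the sum of their radii, so they fall in one cluster; p3 is too far from both and forms its own.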
Maximum and minimum bond angles represent a different type of constraint on the site coordination. Within a cluster of sites there may be a large number of different ways in which the angle constraints can be satisfied. Sites that satisfy a particular interpretation of the angle constraints are grouped in a subcluster. One of three different approaches must be selected (on the MODEL line) for identifying sites that belong to the same subcluster. They are:
In this mode no angle limits are applied to the connected sites. All sites within the specified site radii will be included in the model.
In this mode sites are accepted into a subcluster by testing the angles in the order of decreasing peak height and bond connectivity (the number of bonds to each site). This is referred to as the maximum connectivity approach.
A more geometric approach to angle constraints is based on weighting sites according to the bond lengths and angles they form. Peak sites are assigned weights based on the proximity of each connecting bond length to the mean accepted value, (bondmax+bondmin)/2, and of each bond angle to the mean, (angmax+angmin)/2. This can be an important approach for very regular (e.g. 'chickenwire') or highly-coordinated structures. It is not suitable, however, for structures with a range of coordination geometries (e.g. heavy atoms or 5- and 6-membered rings).
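As an illustration of the geometric weighting idea, the following Python sketch scores a value by how close it lies to the midpoint of its accepted range. The linear fall-off, the averaging of bond and angle weights, and the names proximity_weight and site_weight are all assumptions made for this example; the weighting scheme actually used by MODEL is not specified here.

```python
def proximity_weight(value, vmin, vmax):
    """Weight in [0, 1]: 1 at the midpoint of the range, 0 at or beyond a limit.

    The linear fall-off is an assumption of this sketch.
    """
    if vmax <= vmin:
        raise ValueError("vmax must exceed vmin")
    mid = (vmin + vmax) / 2.0
    half_width = (vmax - vmin) / 2.0
    return max(0.0, 1.0 - abs(value - mid) / half_width)

def site_weight(bond_lengths, bond_angles, bond_limits, angle_limits):
    """Average the proximity weights of all bonds and angles at one site.

    Averaging is a hypothetical way of combining the two kinds of weight.
    """
    weights = [proximity_weight(d, *bond_limits) for d in bond_lengths]
    weights += [proximity_weight(a, *angle_limits) for a in bond_angles]
    return sum(weights) / len(weights) if weights else 0.0
```

With angle limits of 100 and 140 degrees, an angle of 120 receives full weight and an angle of 130 receives half weight. This shows why the approach suits very regular structures: any site whose geometry departs from the single expected mean is penalised, even if that geometry is chemically reasonable.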
The ability of MODEL to identify the correct set of connected sites is strongly dependent on the bond length and angle constraints discussed above. When only peak sites are input the default values (see the limits line) will permit a range of geometries. In such cases the limits will be adequate for a molecular search provided the peak sites are well-defined and there are not too many spurious peaks in the map. If possible the user should specify limits to suit the stereochemistry of the structure. Be careful - if you are too restrictive, some legitimate peaks may be rejected; if you are too permissive, spurious peaks will complicate the interpretation of the subcluster.
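The sensitivity to the angle limits can be shown with a small Python sketch of a greedy acceptance loop in the spirit of the maximum connectivity approach. This is an illustration under stated assumptions, not the MODEL algorithm: sites are sorted by peak height with the number of bonds as a tie-break (the exact ordering rule is assumed), the bond list is supplied by hand, and all names and coordinates are hypothetical.

```python
import math
from itertools import combinations

def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b at the vertex."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(p * q for p, q in zip(v1, v2))
    cosang = dot / (math.hypot(*v1) * math.hypot(*v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cosang))))

def angles_ok(members, coords, bonds, angmin, angmax):
    """True if every bond angle among `members` lies within the limits."""
    inside = set(members)
    for b in members:
        nbrs = [n for n in bonds[b] if n in inside]
        for a, c in combinations(nbrs, 2):
            if not angmin <= angle_deg(coords[a], coords[b], coords[c]) <= angmax:
                return False
    return True

def build_subcluster(coords, heights, bonds, angmin, angmax):
    """Accept sites greedily in order of decreasing height, then connectivity."""
    order = sorted(coords, key=lambda s: (-heights[s], -len(bonds[s])))
    accepted = []
    for s in order:
        if angles_ok(accepted + [s], coords, bonds, angmin, angmax):
            accepted.append(s)
    return accepted

# Hypothetical peaks: p2 and p3 subtend 120 degrees at p1; the weak
# peak p4 sits between them, 60 degrees from each.
r = 1.4
coords = {
    "p1": (0.0, 0.0, 0.0),
    "p2": (r, 0.0, 0.0),
    "p3": (r * math.cos(math.radians(120)), r * math.sin(math.radians(120)), 0.0),
    "p4": (r * math.cos(math.radians(60)), r * math.sin(math.radians(60)), 0.0),
}
heights = {"p1": 10.0, "p2": 9.0, "p3": 8.0, "p4": 3.0}
bonds = {"p1": ["p2", "p3", "p4"], "p2": ["p1"], "p3": ["p1"], "p4": ["p1"]}

default_like = build_subcluster(coords, heights, bonds, 100.0, 140.0)
too_strict = build_subcluster(coords, heights, bonds, 100.0, 110.0)
too_loose = build_subcluster(coords, heights, bonds, 50.0, 140.0)
```

In this constructed case, limits of 100 to 140 degrees keep p1, p2 and p3 and reject the weak peak p4. Narrowing the limits to 100 to 110 also rejects the legitimate site p3, while widening them to 50 to 140 admits p4.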
In addition to the bond length and angle modes for site selection, the user may specify a known fragment as a template (see the conect line). This approach places stringent stereochemical constraints on the selection of connected sites and should be used whenever possible. It should be emphasised, however, that the fitting of the input model to the peak sites is strongly dependent on the sites that are connected in rings. Special care must be taken that the input ring sites are correct - otherwise the fitting process will probably fail. The reliability of the non-ring atoms is less critical.
Interpretation of the connected peaks depends on the 'quality' of the Fourier map, the appropriateness of the bond length and angle constraints and the availability of reliable stereochemical information. The 'best' model is selected using two figure-of-merit (FOM) values.
The first FOM is based on the sum of the subcluster connections, weighted according to peak height. This is a reasonable measure provided the bond length and angle constraints are appropriate for the structure. The expected value for the FOM depends strongly on the number of peak sites entered compared to the number of non-hydrogen sites in the structure.
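A height-weighted sum over connections might look like the Python sketch below. The exact formula used by MODEL is not given in this text, so the choice of weighting each connection by the mean height of its two sites, and the name connection_fom, are assumptions made purely for illustration.

```python
def connection_fom(bonds, heights):
    """Sum over subcluster connections, each weighted by peak height.

    `bonds` is a list of (site_i, site_j) pairs and `heights` maps a
    site label to its peak height. Weighting each connection by the
    mean height of its two sites is an assumption of this sketch.
    """
    return sum((heights[i] + heights[j]) / 2.0 for i, j in bonds)

# Hypothetical three-site chain a-b-c.
example_fom = connection_fom(
    [("a", "b"), ("b", "c")],
    {"a": 10.0, "b": 8.0, "c": 6.0},
)
```

A sum of this kind grows with the number of connected sites, which is consistent with the expected value depending on how many peak sites are entered.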
A second FOM relies on the stereochemical information input on conect lines. If conect lines are supplied, an FOM value is calculated based on the fit of the input model to the subcluster peaks. This is a more sensitive FOM than the first and has an optimal value around 2.0. It should be emphasised that, just as the fitting process is strongly influenced by sites connected in rings, this FOM is especially sensitive to matching atoms connected in rings.
MODEL outputs a range of numerical and graphical information about the modelling process. Here is a summary of the different sections.
This example shows the fully defaulted run. All default search conditions apply. No molecular connections (as defined by conect lines) are used in the FOM assessment.
MODEL conn
limits 4 0.6 0.9 100 140
conect c1 o1 o2 c6; c7 c2 c6 o3; c5 c4 c6; c3 c4 c2
The compound described in this input is salicylic