Example Inputs

This section contains some typical input data sequences for crystallographic calculations. These trace the basic steps from the data reduction of the raw diffraction intensities to the preparation of the publication material. The Xtal system divides these different steps of an analysis into separate calculation modules. Users must learn which module does what and how these can be chained together in the best way to suit the study. The archive bdf's provide the main communication link between calculations, with the auxiliary files supplying a specific data link between some calculations.

Usually several crystallographic calculations are performed together. The number of calculations in each run will vary according to the nature of the analysis and the expertise of the user. In fact, the length of the run is usually dictated by how often the user wishes to check how the calculations are proceeding. Between these runs the archive and auxiliary files retain the history and the accumulated knowledge of the calculation sequence.

Other example inputs are given with the program descriptions. More detailed examples are provided as the supplied test input files p6122.dat, saly.dat, diam.dat, ags4.dat and lac1.dat. Novice Xtal users are advised to print these files as an additional reference.

Step 1: Input primary data

The first step in any analysis is to enter the primary crystallographic information such as cell, symmetry and chemical formula. In Xtal this is usually done with the program STARTX, which is responsible for creating and checking the initial archive file. [Note that all input command sequences in the .dat file must start with the compid declaration.]

compid exampl
title study of the compound C23 H28 O6 P212121 Z=4
STARTX
cell 17.076 16.604 7.425 90. 90. 90.
cellsd .004 .002 .007
sgname P 2ac 2ab        : H-M notation P212121
celcon C 92
celcon O 24
celcon H 112

Note that it is also possible to start an analysis with CIF input (using CIFIO) because this automatically invokes the STARTX routine in the process.
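The celcon values in the STARTX example above are consistent with the formula unit C23 H28 O6 multiplied by Z=4. The following Python sketch is not part of Xtal; it only illustrates that arithmetic, and the function name cell_contents is hypothetical.

```python
import re

def cell_contents(formula: str, z: int) -> dict:
    """Multiply each element count in a formula such as 'C23 H28 O6' by Z."""
    counts = {}
    # Each match is (element symbol, optional count); a missing count means 1.
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        per_formula_unit = int(number) if number else 1
        counts[symbol] = counts.get(symbol, 0) + per_formula_unit * z
    return counts

contents = cell_contents("C23 H28 O6", 4)
for element, total in contents.items():
    # Printed in the same layout as the celcon lines of the example
    print(f"celcon {element} {total}")
```

The totals agree with the celcon lines given in the example (C 92, O 24, H 112).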

Step 2: Input reflection data

The next step, after creating the initial archive file, is to input the reflection data. This may be done in a number of different ways using DIFDAT, ADDREF, CIFIO or REFM90. In these examples we will look at the two most common approaches involving DIFDAT and ADDREF.

If the reflection data are raw intensities or counts from a diffractometer then there needs to be a calculation that reads these data and scales them for conversion to structure factors. DIFDAT is usually used to process a diffractometer file.

Here is a control sequence for processing a standard Siemens ASCII file labelled compid.sie. DIFDAT reads this file and stores the scaled intensities on the archive file. ADDREF reads the archive file and reduces the intensities into structure factors (note that ADDREF stores both |Fmeas| and |Fmeas| squared on the archive file). The other data, such as diffractometer angles, are not entered (i.e. absorption corrections will not be applied). If they were, one would place all on the bdfin line. [Note that the compid line is omitted from this sequence, but if it were used as a separate run it would need to be present.]

DIFDAT sie baln 3
genscl
ADDREF 
reduce itof rlp4 12.2
bdfin hkl irel sigi

If the reflection data have already been scaled they may be entered as part of the .dat file as follows.

ADDREF 
reduce itof rlp4 12.2
hklin hkl rcod irel sigi
hkl 0 0 2 1 21023   644
hkl 0 0 4 1    40   5.97
hkl 0 0 6 1   252   11.4
hkl 0 0 8 1   33.4   4.9
.................................reflection data omitted for brevity
hkl 19 11 0 2 .214+00 .333+01
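As a rough illustration of what "reducing intensities into structure factors" involves, the Python sketch below converts one scaled intensity and its sigma into |F| and sigma(|F|). It assumes the simple relations |F| = sqrt(I) and sigma(|F|) = sigma(I) / (2|F|), and it deliberately leaves out the Lorentz-polarisation correction and any scale factor. It is not a description of the ADDREF algorithm, and the function name intensity_to_f is hypothetical.

```python
import math

def intensity_to_f(i_rel: float, sig_i: float) -> tuple:
    """Return (|F|, sigma(|F|)) for a positive scaled intensity.

    Assumption of this sketch: |F| = sqrt(I), with first-order error
    propagation sigma(|F|) = sigma(I) / (2 |F|).
    """
    if i_rel <= 0.0:
        # Weak or negative intensities need a separate treatment that is
        # outside the scope of this sketch.
        raise ValueError("this sketch handles positive intensities only")
    f_mod = math.sqrt(i_rel)
    return f_mod, sig_i / (2.0 * f_mod)

# First reflection of the hklin example: hkl 0 0 2, irel 21023, sigi 644
f_002, sig_002 = intensity_to_f(21023.0, 644.0)
print(f"|F| = {f_002:.2f}  sigma = {sig_002:.2f}")
```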

Other initial reflection processing calculations are provided by programs such as SORTRF and ABSORB. Absorption corrections to the intensity data, if required, should be applied before the ADDREF step. The SORTRF calculation is used to sort the reflection data into a specified order and to average multiply-observed data into an asymmetric set. Most Xtal calculations need to operate with a unique asymmetric set of reflections and atoms. For example:

SORTRF order hlk aver 1

Look carefully at the examples given in the program descriptions.
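The idea of averaging multiply-observed data into a unique set can be sketched in a few lines of Python. This is only an illustration and not the SORTRF algorithm: it assumes Laue group mmm (appropriate for an orthorhombic cell such as the P212121 example, with Friedel pairs merged), so that equivalent reflections differ only in the signs of h, k and l, and it uses a plain unweighted mean. The function name merge_mmm is hypothetical, and apart from the first intensity the input values are invented for the demonstration.

```python
from collections import defaultdict

def merge_mmm(reflections):
    """Average intensities of reflections equivalent under Laue group mmm.

    reflections: iterable of (h, k, l, intensity).
    Returns a sorted list of (h, k, l, mean_intensity, n_contributors).
    """
    groups = defaultdict(list)
    for h, k, l, intensity in reflections:
        # Under mmm every sign combination of (h, k, l) is equivalent,
        # so the all-positive indices serve as the unique representative.
        groups[(abs(h), abs(k), abs(l))].append(intensity)
    merged = []
    for (h, k, l), values in sorted(groups.items()):
        merged.append((h, k, l, sum(values) / len(values), len(values)))
    return merged

# Illustrative measurements (invented, except 21023 from the hklin example)
data = [
    (0, 0, 2, 21023.0),
    (0, 0, -2, 20977.0),
    (1, -2, 3, 50.0),
    (-1, 2, 3, 54.0),
    (1, 2, 4, 10.0),
]
unique = merge_mmm(data)
for row in unique:
    print(row)
```

A real merge would normally weight each contribution by its sigma and use the symmetry operations stored on the archive file rather than a hard-coded Laue group.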

Step 3: Solving the structure

The next step is to determine the crystal structure. This can be done in a variety of ways. The most automatic approach is to use the new iterative solution program CRISP, which amalgamates many of the individual program steps used in the alternative approaches described below. The command needed is simply

CRISP

Another direct methods approach, which does not involve the automatic iterative program CRISP, uses the programs GENTAN, SIMPEL or PATSEE. Here is a typical sequence for these approaches.

GENTAN   
FOURR emap
PEKPIK
PIG

For some centrosymmetric structures it may be more effective to run

SIMPEL       
FOURR emap
PEKPIK
PIG

Or in difficult cases where a fragment of the structure is known one should try the sequence

GENEV
GENSIN
FOURR epat
PATSEE       

Alternatively the user can apply the heavy atom method with a combination of the programs GENEV, FOURR, PEKPIK and FC. There are many combinations of these calculations, plus routines such as MODEL and PIG, which can be used to visualise the proposed models.

In these sequences the program GENEV calculates the E values needed for direct methods. GENSIN generates triplets and quartets and outputs them to the auxiliary file .inv. This file is read by subsequent phasing programs such as GENTAN (or SIMPEL or PATSEE). If GENTAN is to be rerun for later phasing attempts the .inv file should be retained. GENTAN has a wide range of options for applying the multi-solution tangent phasing methods. The program SIMPEL can replace GENTAN in this sequence to determine phases using the symbolic addition approach.

The FOURR program inputs the phases estimated by GENTAN to produce an E map. This map is output to the file map. PEKPIK reads the map file and searches for the unique set of highest peaks in the asymmetric unit of the density map. These are output to the file pek, which is in turn read by the programs MODEL and/or PIG. Both these programs search for interatomic connections between the peaks according to a preset molecular geometry based on distance and angle criteria. PIG is an interactive graphics version of MODEL and has a number of visualisation advantages. With MODEL special care must be taken with the specification of the parameters on the limits line, as the default values may not be appropriate for the structure being studied. MODEL produces a table of bond lengths and angles, a printer plot of the connected peaks, and outputs peak sites as ATOM lines to the line file .pch. The ATOM lines may be used in subsequent calculations such as BONDLA or ADDATM. Changes to the atomic site information with PIG are automatically applied to the archive file.
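The search for interatomic connections on a distance criterion can be pictured with the following Python sketch. It is not the MODEL or PIG algorithm: it assumes an orthogonal cell (the 17.076, 16.604, 7.425 cell of the Step 1 example), considers only lattice translations and not the space-group symmetry, and uses hypothetical limits dmin and dmax that are not the program defaults. The C1 and O1 coordinates are taken from the ADDATM example in Step 4; the third site, Q3, is invented.

```python
import math
from itertools import combinations

CELL = (17.076, 16.604, 7.425)  # a, b, c in angstroms; all angles 90 degrees

def distance(p, q, cell=CELL):
    """Shortest distance between two fractional sites in an orthogonal cell,
    allowing for lattice translations only (symmetry operations ignored)."""
    total = 0.0
    for x1, x2, edge in zip(p, q, cell):
        d = x1 - x2
        d -= round(d)  # bring the fractional difference into [-0.5, 0.5]
        total += (d * edge) ** 2
    return math.sqrt(total)

def connections(peaks, dmin=0.9, dmax=1.8):
    """Return (label1, label2, distance) for pairs within [dmin, dmax].

    dmin and dmax are hypothetical limits chosen for this sketch.
    """
    found = []
    for (name1, site1), (name2, site2) in combinations(peaks.items(), 2):
        d = distance(site1, site2)
        if dmin <= d <= dmax:
            found.append((name1, name2, d))
    return found

peaks = {
    "C1": (0.48236, -0.11502, 0.93088),  # from the ADDATM example
    "O1": (0.47364, -0.18767, 0.83988),  # from the ADDATM example
    "Q3": (0.10000, 0.40000, 0.30000),   # invented, far from the others
}
bonds = connections(peaks)
for name1, name2, d in bonds:
    print(f"{name1}-{name2} {d:.3f}")
```

Only the C1-O1 pair falls inside the limits here, which shows why limits unsuited to the structure can hide or invent connections.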

Direct methods strategies.

In the event that solution programs such as CRISP, GENTAN, SIMPEL or PATSEE do not provide a clear solution of the initial structure, it may be necessary to adjust the standard options used in these calculations. Note that in the above example input only a few command lines are needed, because the default controls are appropriate for most analyses. For some analyses additional control signals may be required. Here are some tips for the application of the direct methods programs.

  • For the first attempt, use the default values for a structure solution sequence as shown in the above example. This has been shown to solve about 90% of small and medium sized structures automatically.

  • For centrosymmetric structures SIMPEL may often be used in place of GENTAN and is about four times faster.

For structures not solved with the default options, try the following alternatives separately. Note that the judicious saving of the bdf's permits you to start calculations at the program in which the option needs to be changed. Do not always start with STARTX, or even GENEV!

  • Increase the number of starting phases in GENTAN using the select line (particularly for low symmetry structures).

  • Decrease Emin in GENSIN to 1.5 - 1.3 (particularly for triclinic).

  • If there are sufficient triplets (i.e. >10 times the number of E's), increase the Emin to about 1.6 in GENTAN (or GENSIN).

  • Increase the number of triplets and quartets by reducing Amin and Bmin to about 0.8 and 0.4 on the GENSIN lines trip and quar. Even lower values can be considered for large structures.

  • If a fragment is found in the space group P1bar with a possible origin translation, redo STARTX as P1 and locate the origin using the fragment in FC, FOURR, etc. Then return to P1bar for the refinement.

  • Exclude quartets from the GENTAN calculation using the invar line.

  • For noncentrosymmetric structures, change the magic option to permute on the select line of GENTAN. Alternatively, use the random option instead of permute.

  • Apply the block and/or w2 options on the refine line of GENTAN.

When assessing a direct methods run for success or failure, do not rely wholly on the GENTAN or SIMPEL figure of merit estimates. For some structure types these will be unreliable. It is good practice to calculate the E maps for the top four (or even eight) phase sets and then look closely at the PEKPIK results. One or two dominant peaks usually signal an incorrect phase set (except if there are one or two heavy atoms in the structure!). The best E maps contain a large number of medium height peaks and this may be used as a quite sensitive criterion for identifying the more correct phase sets.
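The "one or two dominant peaks" criterion can be turned into a quick screening test on a list of peak heights. The Python sketch below is only a heuristic illustration: the function name looks_dominated and the thresholds n_top and ratio are hypothetical choices, not values used by PEKPIK or any other Xtal program, and the peak heights are invented.

```python
from statistics import median

def looks_dominated(peak_heights, n_top=2, ratio=3.0):
    """Return True if the highest peak towers over the remaining peaks.

    The highest peak is compared with the median of the peaks ranked
    below the first n_top. Both n_top and ratio are hypothetical.
    """
    ranked = sorted(peak_heights, reverse=True)
    if len(ranked) <= n_top:
        # Too few peaks to compare: treat the map as suspicious.
        return True
    rest = ranked[n_top:]
    return ranked[0] > ratio * median(rest)

# Invented peak lists for two phase sets
suspect = [900, 850, 120, 110, 100, 95, 90]
promising = [300, 290, 280, 270, 250, 240, 230]
print(looks_dominated(suspect), looks_dominated(promising))
```

A structure that really contains one or two heavy atoms will trip this test as well, so the result is a prompt to look at the map, not a verdict.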

There are many other combinations of options available in GENSIN and GENTAN. If the above alternatives fail you should look critically at the reflection data, both in terms of its precision and high angle limit, and at the possibility of incorrect space group symmetry. In some cases structures are solved automatically by recollecting the data with better precision and/or higher angle data. In cases where there are very high B-values it may be necessary to collect low-temperature data. It cannot be emphasised enough that these methods are very dependent on good E-values. GENEV works well with good data, but if the Wilson plot shows anomalies at high angles it is better to exclude this data from the structure solution (by setting Smax in GENEV). Always check the Wilson plots carefully if the default run fails.
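For readers unfamiliar with the Wilson plot, the following Python sketch shows the straight-line fit that lies behind it. It assumes the common form ln(<I> / sum f^2) = ln K - 2 B s^2 with s = sin(theta)/lambda, averaged in resolution shells, and an unweighted least-squares fit. It is not the GENEV implementation, the function name wilson_fit is hypothetical, and the input data are synthetic values generated from a known K and B so that the fit can be checked.

```python
import math

def wilson_fit(shells):
    """Fit ln(<I>/sum_f2) = ln(K) - 2*B*s2 by unweighted least squares.

    shells: list of (s2, mean_intensity, sum_f2) per resolution shell,
            where s2 = (sin(theta)/lambda)**2.
    Returns (K, B).
    """
    xs = [s2 for s2, _, _ in shells]
    ys = [math.log(mean_i / sum_f2) for _, mean_i, sum_f2 in shells]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Slope is -2B, intercept is ln K
    return math.exp(intercept), -slope / 2.0

# Synthetic shells generated from K = 2.0 and B = 4.0 (not real data)
true_k, true_b = 2.0, 4.0
shells = [
    (s2, true_k * 1000.0 * math.exp(-2.0 * true_b * s2), 1000.0)
    for s2 in (0.05, 0.10, 0.15, 0.20, 0.25)
]
k_fit, b_fit = wilson_fit(shells)
print(f"K = {k_fit:.3f}  B = {b_fit:.3f}")
```

With real data the points scatter about the line; shells at high angle that fall systematically off the line are the anomalies referred to above.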

Step 4: Refining atom parameters

Once a structure is determined the coordinate and displacement parameters need to be refined. A variety of programs are needed for a refinement sequence. Often the initial atomic coordinates are loaded onto the archive file by PEKPIK and PIG. In other instances all, or some, of the coordinates will need to be loaded with ADDATM.

In this simple example ADDATM is used to load atom data and CRILSQ or CRYLSQ to refine these parameters. The programs RSCAN, PIG and BONDLA are used to gauge the progress of the refinement and check the reflection data and current model.

ADDATM
scale 1.02033 1
atom       C1  .48236 -.11502  .93088  .0332 1.0 .00023 .00024 .00053
uij       C1  .00244  .00391  .01162 -.00008  .00009 -.00021
atom       O1  .47364 -.18767  .83988  .0385 1.0 .00017 .00015 .00039
uij        O1  .00378  .00323  .01594 -.00027 -.00053  .00071
.......................................atom data omitted for brevity
atom        H6  .75500 -.10800  .93600 .0570 1.0 .00000 .00000 .0000
atom        H7  .64700 -.14500 1.20600 .0680 1.0 .00000 .00000 .0000
atom        H81 .81360 -.11540 1.23320 .0740 1.0 .00000 .00000 .0000
atom        H82 .79070 -.20340 1.18740 .0740 1.0 .00000 .00000 .0000
atom        H83 .78040 -.17080 1.37730 .0740 1.0 .00000 .00000 .0000
CRYLSQ cy 2 ad fm fu 0.7   : CRILSQ could be used instead
noref  H
RSCAN
PIG
BONDLA

This sequence emphasises the modularity of Xtal calculations. There are many other combinations involving many programs. For example, when atoms are still missing from the model (or one is searching for the H atom sites) the following combination is very useful.

CRYLSQ cy 2
noref  H
FOURR fdif
PEKPIK
PIG input
peaks prad 1.

Here the program PIG is used to decide which peak sites are to become atom sites (and are labelled) and which should be rejected.
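The progress of a refinement such as the CRYLSQ runs above is conventionally summarised by the residual R = sum(||Fo| - |Fc||) / sum(|Fo|). The Python sketch below computes this quantity for a short list of values. It is a generic illustration: the function name r_factor is hypothetical, the numbers are invented, and the exact statistics reported by CRYLSQ or RSCAN are described in their program descriptions, not here.

```python
def r_factor(f_obs, f_calc):
    """Conventional residual R = sum(||Fo| - |Fc||) / sum(|Fo|)."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

# Invented observed and calculated structure factor moduli
f_obs = [100.0, 50.0, 25.0]
f_calc = [90.0, 55.0, 25.0]
residual = r_factor(f_obs, f_calc)
print(f"R = {residual:.4f}")
```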

Step 5: Checking molecular geometry

Periodically during an analysis the model and refined structure factors need to be checked. The model can, of course, be monitored with PIG, but many other checking tools are provided in the Xtal system. These include BUNYIP, LSQPL, BONDLA, REGFE and RSCAN, to name a few.

Here is one sort of run, which also inserts H atom sites calculated from the non-H atom geometry.

BONDLA  cont  dihe
atrad H 1.   .5
FOURR fdif r2 0
PEKPIK  punch
BONDAT
calcat tetchn 0.95 O5 C1 C2 H1 H11
calcat teterm 0.95 C1 O1 C11 H111 H112 H113
calcat tetchn 0.95 C1 C2 C3 H21 H22
calcat tetchn 0.95 O2 C21 C22 H211 H212
calcat trigon 0.95 C22 C23 C24 H23
calcat trigon 0.95 C23 C24 C25 H24
calcat trigon 0.95 C24 C25 C26 H25
calcat tetchn 0.95 C3 C4 C5 H41 H42
calcat tetchn 0.95 C4 C5 O5 H51 H52
calcat teterm 0.95 C6 C7 C8 H81 H82 H83

BONDAT calculates new coordinates for the H atoms based on the prescribed geometries of the refined non-hydrogen sites. Note that PIG could also have been used after PEKPIK for viewing and manipulating the atom sites, and for checking the geometry.
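To make the geometric idea concrete, the Python sketch below places a single H atom on a trigonal (sp2) centre, as in the calcat trigon lines. It is an illustration and not the BONDAT code: it works in Cartesian coordinates in angstroms (fractional coordinates would first have to be orthogonalised), it assumes that the middle atom of the three named on the line is the one carrying the H, and the function name place_trigonal_h is hypothetical. The test geometry is an idealised 120 degree arrangement, not data from the example structure.

```python
import math

def place_trigonal_h(center, neighbour_a, neighbour_b, bond_length=0.95):
    """Cartesian position of an H atom on a trigonal (sp2) centre.

    The H is placed in the plane of the three atoms, along the direction
    opposite to the sum of the unit vectors from the centre to its two
    bonded neighbours. Collinear neighbours are not handled.
    """
    def unit(vector):
        norm = math.sqrt(sum(c * c for c in vector))
        return [c / norm for c in vector]

    to_a = unit([a - c for a, c in zip(neighbour_a, center)])
    to_b = unit([b - c for b, c in zip(neighbour_b, center)])
    direction = unit([-(p + q) for p, q in zip(to_a, to_b)])
    return [c + bond_length * d for c, d in zip(center, direction)]

# Idealised geometry: neighbours 1.39 angstroms away at 120 and 240 degrees
centre = (0.0, 0.0, 0.0)
left = (1.39 * math.cos(math.radians(120)), 1.39 * math.sin(math.radians(120)), 0.0)
right = (1.39 * math.cos(math.radians(240)), 1.39 * math.sin(math.radians(240)), 0.0)
h_site = place_trigonal_h(centre, left, right)
print([round(c, 3) for c in h_site])
```

The tetrahedral cases (tetchn, teterm) follow the same principle but require an out-of-plane component, which is omitted here.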

Step 6: Preparing for publication

When the structure is ready for publication various molecular plots and tables need to be prepared. Here are some of the calculations that can be used for this purpose.

BONDLA
LSQPL
PLANE
DEFINE C22 C23 C24 C25 C26 C27
nondef O2 C21
PLANE
DEFINE C32 C33 C34 C35 C36 C37
nondef O3 C31
PIG
ORTEP mole acta
PREVUE
PLOTX postl pre
LISTFC nofl wid 120 lin 64
ATABLE isou
CIFIO cifo

General Advice

The examples shown above are a very small sample of the combinations possible with Xtal programs. A novice user should experiment with input sequences using known data so that there will be a proper understanding of the various functions. And always remember that the archive bdfs aa1 and aa2 are interchanged with every update, so it is best to initially rely on the master file procedure, which always reads and generates the latest aaa file. The aaa file is also readable with a text editor and this provides a new user with the opportunity to learn how the data is archived. However, only the most experienced users should ever attempt to change values in this file with a text editor, and then only after keeping a copy of the unedited version!
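One simple way to keep a copy of the unedited version is to make a timestamped backup before opening the file in an editor. The Python sketch below is only a convenience suggestion and not part of Xtal; the function name backup_archive and the file name exampl.aaa in the usage comment are assumptions made for illustration.

```python
import shutil
import time
from pathlib import Path

def backup_archive(path):
    """Copy a file to a timestamped sibling and return the new path."""
    source = Path(path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = source.with_name(f"{source.name}.{stamp}.bak")
    shutil.copy2(source, target)  # copy2 also preserves the modification time
    return target

# Hypothetical usage before editing an archive file by hand:
#     saved = backup_archive("exampl.aaa")
#     print("unedited copy kept as", saved)
```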