PREPUB : Pre-publication tests on CIF structural data.

Authors: Doug du Boulay & Syd Hall

Contact: Syd Hall, Crystallography Centre, University of Western Australia, Nedlands 6907, Australia

PREPUB performs a subset of the IUCr data validation series of CIF tests on all data blocks of a supplied CIF containing structural data.

Description

A series of checks is made on a CIF containing one or more data blocks. The following tests are applied:

  1. Test that the data organisation of the CIF conforms to the STAR syntax. Mismatched tags and values, unclosed text blocks, and items missing from a looped list are identified.

  2. Check that the data items in the CIF are defined in the standard CoreCIF dictionary.

  3. SYMMG01 DVtest: Check that the _symmetry_space_group_number value matches that expected for the _symmetry_space_group_name_H-M entry.

  4. SYMMG02 DVtest: Check that the _symmetry_equiv_pos_as_xyz values are consistent with the _symmetry_space_group_name_H-M entry.

  5. CELLZ01 DVtest: Check that the combination of the _cell_formula_units_Z number and the _chemical_formula_sum entry matches the atomic content derived from the atomic site information in the _atom_site list.

  6. CHEMW03 DVtest: Check that the _chemical_formula_weight value matches that calculated from the atomic site information in the _atom_site list.

  7. REFLT03 DVtest: Check that the calculated number of reflections in the diffraction sphere out to the _diffrn_reflns_theta_max value is consistent with the number reported in the CIF by _reflns_number_total. Checks are also made on the _diffrn_reflns_limit_*_max and _diffrn_reflns_limit_*_min values (for h, k and l), and on the presence of Friedel pairs.

  8. STRVAL02 DVtest: Check that the _refine_ls_abs_structure_Flack value is in a sensible range.
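
The formula side of tests 5 and 6 can be illustrated with a short sketch. It is not the PREPUB source: the function names and the table of atomic masses are assumptions made for this illustration, and the comparison against the _atom_site list (which PREPUB performs) is left out.

```python
import re

# Assumed atomic masses for this sketch; PREPUB's internal table may differ.
ATOMIC_MASS = {"O": 15.9994, "Mo": 95.94}

def parse_formula(formula_sum):
    """Parse a _chemical_formula_sum string such as 'Mo4 O11' into {element: count}."""
    counts = {}
    for token in formula_sum.split():
        match = re.fullmatch(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", token)
        if match is None:
            raise ValueError(f"unrecognised formula token: {token!r}")
        element, number = match.groups()
        # A missing count means one atom of that element.
        counts[element] = counts.get(element, 0.0) + (float(number) if number else 1.0)
    return counts

def cell_contents(formula_sum, z):
    """Atoms per unit cell implied by the formula and _cell_formula_units_Z."""
    return {el: n * z for el, n in parse_formula(formula_sum).items()}

def formula_weight(formula_sum):
    """Formula weight from the element counts and the assumed mass table."""
    return sum(ATOMIC_MASS[el] * n for el, n in parse_formula(formula_sum).items())
```

For the formula Mo4 O11 with Z = 4, cell_contents gives 16 Mo and 44 O per cell, the figures that CELLZ01 would then compare with the totals summed over the _atom_site list.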

Examples

A typical input sequence is as follows, assuming the CIF is named banana.cif:

compid banana
prepub

This produces output such as:

 PREPUB banana   16-Mar99   Page  1   Xtal3.6 FEB97:UNXDEC
 ********************
 *** Begin PREPUB ***
 ********************
 INPUT FILE:  banana.cif
 >>>>> Check input data names against dictionary.
 ciftbx warning: banana.cif data_global line:     38
  Data name _iucr_compatibility_tag not in dictionary!
 ciftbx warning: banana.cif data_global line:     38
  No category defined for _iucr_compatibility_tag
 ciftbx warning: banana.cif data_banana line:    342
  Numb type violated  _exptl_crystal_density_meas
 >>>>> End of CIF dictionary checking.
===========================================================================
 >>>>>>>>>>>>>>>>>
Data block: global
===========================================================================
 >>>>>>>>>>>>>>>>>
Data block: banana
 SYMMG_01: Check the space group number
 --------------------------------------
 From the CIF: _symmetry_space_group_number            ?
 From the CIF: _symmetry_space_group_name_H-M          P n a 21
 International Tables space group number for P n a 21 is  33
 SYMMG_01: OK
 SYMMG_02: Check the space group is recognised
 ---------------------------------------------
 From the CIF: _symmetry_equiv_pos_as_xyz
                                           x, y, z
                                           -x, -y, z+1/2
                                           x+1/2, -y+1/2, z
                                           -x+1/2, y+1/2, z+1/2
 The xyz symops generate the Hall space group symbol   p_2c_-2n
 The xyz symops consistent with the H-M space group    P_n_a_21
 From the CIF: _symmetry_space_group_name_H-M          p_n_a_21
 SYMMG_02: OK
 CELLZ_01: Check formula with the supplied model
 -----------------------------------------------
 From the CIF: _cell_formula_units_Z        4
 From the CIF: _chemical_formula_sum  Mo4 O11
      TEST: Compare cell contents of formula and atom_site data
             atom    Z*formula  cif sites diff
             Mo        16.00     16.00    0.00
             O         44.00     44.00    0.00
 CELLZ_01: OK
 CHEMW_03: Check formula weight with the model
 ---------------------------------------------
 From the CIF: _cell_formula_units_Z          4
 From the CIF: _chemical_formula_weight     559.76
      TEST: Calculate formula weight from _atom_site_*
             atom     mass    num     sum
             O        16.00   11.00  175.99
             Mo       95.94    4.00  383.76
            Calculated formula weight       559.75
 CHEMW_03: OK
 REFLT_03: Check the reflection counts
 -------------------------------------
 From the CIF: _diffrn_reflns_theta_max           29.99
 From the CIF: _reflns_number_total               2267
 From the CIF: _diffrn_reflns_limit_ max hkl   33.    9.    7.
 From the CIF: _diffrn_reflns_limit_ min hkl    0.    0.   -7.
      TEST1: Expected hkl limits for theta max
                      Calculated maximum hkl   33.    9.    7.
                      Calculated minimum hkl  -33.   -9.   -7.
      TEST2: Reflns within _diffrn_reflns_theta_max
          Count of symmetry unique reflns         1373
          Completeness (_total/calc)            165.11%
      TEST3: Check Friedels for noncentro structure
          Estimate of Friedel pairs measured       894
          Fraction of Friedel pairs measured     0.651
          Are heavy atom types Z>Si present        yes
 REFLT_03: OK
 STRVAL_02: Check absolute structure measures
 --------------------------------------------
 From the CIF: _refine_ls_abs_structure_Flack       0.390
 From the CIF: _refine_ls_abs_structure_Flack_su    0.070
 ALERT: Flack test results are ambiguous.
 Time  h  m  s     CPU secs    Total CPU secs     Memory words
      12:29:53      9.26126           9.40571         201001
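
The arithmetic behind the REFLT_03 and STRVAL_02 sections of this listing can be reproduced with a small sketch. The function names are hypothetical. The Friedel-pair estimate (reported total minus symmetry-unique count) is inferred from the figures in the listing, not from a documented formula, and the three-su threshold used for the Flack value is an assumption, since the criteria PREPUB applies are not stated here.

```python
def reflection_summary(n_total, n_unique):
    """Completeness (%) plus an estimated Friedel-pair count and fraction."""
    completeness = 100.0 * n_total / n_unique
    # Assumption: reflections beyond the unique count are Friedel mates.
    friedel_pairs = max(n_total - n_unique, 0)
    friedel_fraction = friedel_pairs / n_unique
    return completeness, friedel_pairs, friedel_fraction

def flack_status(flack, su, n_su=3.0):
    """Classify a Flack value by its distance from 0 and 1 in units of su."""
    near_zero = abs(flack) <= n_su * su        # assumed threshold of n_su * su
    near_one = abs(flack - 1.0) <= n_su * su
    if near_zero and not near_one:
        return "consistent with 0"
    if near_one and not near_zero:
        return "consistent with 1"
    return "ambiguous"

completeness, pairs, fraction = reflection_summary(2267, 1373)
print(f"{completeness:.2f}% {pairs} {fraction:.3f}")
print(flack_status(0.390, 0.070))
```

With the values from the listing (2267 reflections reported, 1373 symmetry unique), the sketch gives the same completeness of 165.11%, 894 Friedel pairs and a fraction of 0.651. A Flack value of 0.390 with su 0.070 lies more than three su from both 0 and 1, so under the assumed threshold it is classed as ambiguous, in line with the ALERT shown.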