The processing of binary files is one of the most important aspects of Xtal
programming. Here we summarise the various procedures and macros used to read
and write these files. Knowledge of the archive file structure is assumed (see
Section 5). In particular, familiarity with the concepts of a logical record, a
logical record directory and a logical record packet is needed to understand
the procedures discussed here.
As with line input/output, a developer is provided with a set of tools for
handling binary files. The standard Fortran input/output instructions are
inadequate for this purpose, both for the reasons cited for line I/O, and
because of the enormous flexibility that exists in the structure of the archive
file. This flexibility necessitates considerable checking and bookkeeping on
the part of the nucleus routines.
The checking operations have been largely concealed from the programmer through
the use of the macro tools:
writepkt:
    provides the position for the next packet to be inserted into the
    designated output bdf buffer, and writes the buffer when full.
readwpkt:
    provides the position of the next packet to be extracted from the
    designated input bdf buffer, and reads a new buffer when complete.
indexpkt:
    reads and, if requested, constructs the directory packet of a logical
    record.
copyfile:
    copies designated logical records from one bdf to another.
The nucleus permits up to eight different binary files to be assigned to a
calculation at one time. Typically between two and four files are used. The
device number for each binary file is stored in the system array IOUNIT(1) to
IOUNIT(8). In the program, files are referenced by the index of IOUNIT, i.e. bdf
1 to bdf 8. Externally these bdf's are usually referred to by their filename
extensions.
The line files discussed above are character files, and, as such, require the
use of the character buffers CHRIN and CHROT. Bdf's are output in 'binary',
i.e. exactly as they appear in computer memory. The buffers used to output
these files are allocated by the programmer as part of the QX array. Each bdf
used in a program must be assigned a buffer of real words in this array. This
is quite simple to do. For example, the buffer area for bdf 2 is designated in
the QX array with the following lines:
IOMARK(2)=MARKS                        # mark start of bdf 2 buffer
incrqx:(MARKE,MARKS+bdfbuf:,XX0502)    # mark end of bdf 2 buffer
The first line specifies the start of the buffer space in the QX array by
setting the system variable IOMARK(2) to the current QX limit MARKS. The next
line uses the macro incrqx: to set MARKE equal to MARKS plus the length of the
buffer in words (defined by the macro bdfbuf:) and to request that the QX limit
be extended to this value. The bdf reading and writing macros will
automatically use this allocated buffer space.
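The bookkeeping behind these two lines can be pictured with a small model. The following Python sketch is not Xtal code: the names WorkArray, reserve_bdf_buffer and BDFBUF_WORDS are hypothetical, the buffer length of 512 words is an arbitrary stand-in for whatever bdfbuf: expands to, and the error raised when the array is exhausted is only an assumption about what happens when the request cannot be met.

```python
# Toy model of reserving bdf buffers inside one shared work array.
# All names here are hypothetical; this is not the nucleus code.

BDFBUF_WORDS = 512  # assumed buffer length; the real value comes from bdfbuf:


class WorkArray:
    """Stand-in for the QX array with a movable upper limit."""

    def __init__(self, capacity):
        self.capacity = capacity  # total words available
        self.limit = 0            # current limit, plays the role of MARKS
        self.iomark = {}          # bdf number -> buffer start, like IOMARK

    def reserve_bdf_buffer(self, nfil):
        """Record the buffer start for bdf nfil and extend the limit."""
        start = self.limit          # IOMARK(nfil) = MARKS
        end = start + BDFBUF_WORDS  # MARKE = MARKS + buffer length
        if end > self.capacity:     # assumed failure mode of the request
            raise MemoryError("work array exhausted")
        self.iomark[nfil] = start
        self.limit = end            # the limit now sits at MARKE
        return start, end


qx = WorkArray(capacity=2000)
first = qx.reserve_bdf_buffer(1)   # buffer for bdf 1
second = qx.reserve_bdf_buffer(2)  # buffer for bdf 2 follows directly
```

Each buffer starts where the previous limit stood, so buffers for different bdf's never overlap as long as every reservation goes through the same limit.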
Here is a brief summary of how these macros are applied. For detailed
information refer to the later macro definitions.
writepkt:(NFIL, LREC, PSIZ, PAKPT)
writepkt: is responsible for the 'putting' of the binary file buffer to the
physical device. It is equivalent to writeline: for the line files, except that
each application of writepkt: does not necessarily result in a buffer being
output. The different mechanism is because information is stored in a bdf
buffer as 'packets'. Each call to writepkt: provides the position in the buffer
where the next packet starts. If there are insufficient words for another
packet, writepkt: outputs the buffer and positions the pointer at the front of
the buffer. writepkt: performs all of the functions needed to construct a
logical record (bdf 'lead words' are described in Section 6), but the
programmer is responsible for transferring the packet information into the
buffer.
The arguments of writepkt: are as follows. NFIL is the bdf file number (1 to
8). LREC is the logical record number in the form of a system macro (see the
XMACRO file or Section 6). PSIZ is the number of words to output as a packet.
PAKPT is the QX array index returned by writepkt: which points to the first
word of the packet minus one. That is, the first word of a packet is
QX(PAKPT+1).
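The pointer convention and the "output only when full" behaviour can be illustrated with a toy model. The Python sketch below is an assumption-laden simplification, not the nucleus implementation: PacketWriter and next_packet are hypothetical names, lead words and logical record numbers are ignored, and a plain list stands in for the physical device. As with PAKPT, the returned index points to the word before the packet.

```python
class PacketWriter:
    """Toy packet-oriented output buffer (hypothetical, not Xtal code)."""

    def __init__(self, buffer_words):
        self.buffer = [0.0] * buffer_words
        self.used = 0      # words already handed out in the buffer
        self.flushed = []  # stands in for the physical device

    def next_packet(self, psiz):
        """Return the index of the word before the next packet."""
        if psiz > len(self.buffer):
            raise ValueError("packet larger than buffer")
        if self.used + psiz > len(self.buffer):
            self.flush()          # not enough room: output the buffer
        pakpt = self.used - 1     # first packet word is buffer[pakpt + 1]
        self.used += psiz
        return pakpt

    def flush(self):
        """Output the filled part of the buffer and reset the pointer."""
        self.flushed.append(self.buffer[:self.used])
        self.used = 0


writer = PacketWriter(buffer_words=10)
positions = []
packets = ([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], [9.0, 10.0, 11.0, 12.0])
for packet in packets:
    pakpt = writer.next_packet(len(packet))
    positions.append(pakpt)
    # The caller, not the writer, moves the data into the buffer.
    for offset, value in enumerate(packet, start=1):
        writer.buffer[pakpt + offset] = value
```

In this model the third request does not fit in the remaining two words, so the first two packets are output together and the pointer returns to the front of the buffer.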
writepkt: is a file creator but it is usually used in conjunction with
copyfile: and readwpkt:. Logical record 1 is special, in that it is
automatically updated with the calculation history by the nucleus, and is
usually handled by these latter two macros (see example below). For all other
records, writepkt: may be either used separately or in conjunction with
readwpkt: and copyfile:.
readwpkt:(NFIL, LREC, PSIZ, PAKPT, MFIL)
readwpkt: is used to read files, or to read and write them. It is the 'read'
equivalent of writepkt:, with PAKPT pointing to the start of the packet to be
read. readwpkt: inputs from bdf NFIL into a buffer starting at IOMARK(NFIL).
Provided the input packets do not need to be expanded or contracted, readwpkt:
may also be used to output this buffer to file MFIL. This is referred to as the
read-write single buffer mode. When packets being output differ in size from
those being read it is necessary to use both readwpkt: and writepkt: and
separate buffers. Note that in single-buffer mode the values of words in a
packet can be changed in place; no transfer of data between buffers is needed,
as it is in double-buffer mode.
copyfile:(NFIL, MFIL, LREC1, LREC2)
copyfile: copies all data of logical records LREC1 to LREC2 from the buffer of
the binary file NFIL to the buffer of the binary file MFIL. Note that if LREC2
is endrecord: both bdf's NFIL and MFIL will be closed. If NFIL=1 and MFIL=2,
the device numbers stored in IOUNIT(1) and IOUNIT(2) will be interchanged. This
'automatic interchange' process is a standard feature of the nucleus, and
ensures that the most recent data will always be read from bdf 1 (i.e. fileA).
Note also that using readwpkt: to read logical record endrecord: from bdf 1
and, in double-buffer mode, writepkt: to write endrecord: to bdf 2 will also
result in the interchange of these device numbers.
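The effect of the automatic interchange can be shown in a few lines. This Python sketch is only a model of the bookkeeping: the function name, the dictionary standing in for IOUNIT and the device numbers 21 and 22 are all hypothetical, and the condition reflects one reading of the text above, namely that the swap happens when the copy from bdf 1 to bdf 2 runs through to endrecord:.

```python
def interchange_after_copy(iounit, nfil, mfil, copied_to_end):
    """Swap the device numbers of bdf 1 and bdf 2 after a full 1 -> 2 copy."""
    if copied_to_end and nfil == 1 and mfil == 2:
        iounit[1], iounit[2] = iounit[2], iounit[1]
    return iounit


iounit = {1: 21, 2: 22}  # hypothetical device numbers
interchange_after_copy(iounit, 1, 2, copied_to_end=True)

# A copy between other bdf's leaves the assignments alone.
untouched = interchange_after_copy({1: 21, 2: 22}, 1, 3, copied_to_end=True)
```

After the swap, index 1 refers to the device that has just been written, which is why the next program finds the most recent data on bdf 1.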
indexpkt:(NFIL, LREC, PSIZ, PAKPT, MFIL, KEY, WANT, RELPT)
When processing directory-format (rather than fixed-format) logical records
the programmer needs to set up a procedure for extracting (or installing)
information about the contents of a given record. The directory packet is the
first packet in the logical record, and contains a unique identification number
for each item stored in subsequent packets. The order of the ID numbers in the
directory packet is identical to the order of the item values in all subsequent
packets in the logical record.
indexpkt: is used to extract, and to instal, item identification numbers from
and into the directory packet of a logical record. The first five arguments in
this command are identical in function to readwpkt:. WANT is an integer array
of item identification numbers (see Section 6) that are to be searched for,
appended, updated or deleted from the directory of the designated logical
record LREC. The purpose of the ID numbers in WANT is specified by control
signals in the four-element array KEY. The results of the directory search by
indexpkt: are returned in the integer array RELPT. RELPT contains the relative
position in the packet of each ID number listed in WANT. There is usually a
one-for-one correspondence between WANT and RELPT. Note that indexpkt: requires
that the dimension of the RELPT array be greater than the dimension of the
WANT array.
Warning: If the WANT array contains the same ID number repeated consecutively,
as used for example to point to all of the words of a large character string data
item (e.g. an atom label), the RELPT array will be returned containing pointers
to the consecutive words. That is, if WANT contains the IDN's 11 11 11 11,
RELPT will be returned with 5 6 7 8 if the first IDN 11 is located in word 5.
This is correct but a problem can occur if by chance two non-character IDN's
are loaded into WANT consecutively. indexpkt: will automatically return the
pointers as consecutive words, and this may not be what is needed. This is an
unusual circumstance but some care is required on the part of the programmer.
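The behaviour described in this warning can be reproduced with a small model. The Python sketch below is not the indexpkt: algorithm: relative_pointers is a hypothetical function, the directory contents are invented, a multi-word item is assumed to occupy consecutive directory words, and a returned value of 0 for an ID that is not found is an assumption of the sketch.

```python
def relative_pointers(directory, want):
    """Return 1-based packet positions for the ID numbers in want.

    directory[i] is the ID number of the item held in word i + 1 of each
    packet. A repeated ID is given the word after the previous pointer.
    """
    relpt = []
    for k, idn in enumerate(want):
        repeated = k > 0 and idn == want[k - 1] and relpt[-1] > 0
        if repeated:
            relpt.append(relpt[-1] + 1)  # consecutive word, no new search
        elif idn in directory:
            relpt.append(directory.index(idn) + 1)
        else:
            relpt.append(0)              # assumed "not found" marker
    return relpt


# Invented directory: a four-word item with ID 11 starts at word 5.
directory = [3, 5, 7, 9, 11, 11, 11, 11, 17]

label_words = relative_pointers(directory, [11, 11, 11, 11])
pitfall = relative_pointers(directory, [17, 17])
missing = relative_pointers(directory, [99])
```

Here label_words is [5, 6, 7, 8], as in the warning. The pitfall case shows the problem: ID 17 is a single word at position 9, yet repeating it in the request yields a second pointer to word 10, which lies beyond the item.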
Complete details of this powerful command are given in the macro definitions.
It will suffice here to show the purpose of indexpkt: in the overall binary
file manipulation process. The RELPT packet pointers returned by indexpkt: are
used during subsequent file reading and writing processes to extract, modify
and insert data into the buffer. Note that it is only necessary to locate items
which are actually 'used' in the calculation - all other items in the packet
are transferred as a block. Below is an example which shows the general
principles. Refer to the program FC for more complete examples.
Example of binary file processing
Here is an illustrative example of processing the logical record lrtest:. Note
that the dimension of the REL array is one greater than that of the WNT array.
INTEGER KEY(4)                              # Indexpkt controls
INTEGER REL(6)                              # Indexpkt rel pointers
intdata:(WNT,[11,17,18,8,23])               # Indexpkt list of ID numbers
..............
KEY(1)=1                                    # Item 1 is mandatory
KEY(2)=3                                    # Items 2-3 are optional
KEY(3)=5                                    # Items 4-5 are to be appended
KEY(4)=0                                    # Set append-item control
indexpkt:(1,lrtest:,NP,IP,2,KEY,WNT,REL)    # Process directory
IF(KEY(4)<=0) iquit:(90103.)                # If item 1 not found-- exit
MP=KEY(4)                                   # Set expanded packet size
..............
REPEAT                                      # Loop over lrtest: packets
$(                                          # >>>>>>>>>>>>>>>>>>>>>>>>>>>> 3
readwpkt:(1,lrtest:,NP,IP,0)                # Point to input lrtest packet
IF(IP<=0) BREAK                             # Exit after last packet
writepkt:(2,lrtest:,MP,JP)                  # Point to expanded o/p packet
movereal:(QX,IP,QX,JP,NP,0)                 # Transfer input items to o/p
.............
I=JP+REL(1); QX(I)=QX(I)+1.                 # Increment mandatory item
.............
I=JP+REL(3); IF(I>JP)AMX=QX(I)              # Extract optional item
.............
I=JP+REL(4); QX(I)=QX(MM+2)                 # Store appended item
$)                                          # <<<<<<<<<<<<<<<<<<<<<<<<<<<< 3