Revised February 15, 2001

miniCarlo command language

[ This section is under construction. I will start with the absolutely required minimum, and then gradually extend it. ]


Overview

The flow of calculations by miniCarlo is controlled by a user-specified protocol. The protocol ("way-file") consists of a sequence of commands, which invoke simulation steps (i/o, energy minimization, Monte Carlo), control the flow of the protocol, or modify various parameters (both parameters of simulations, and independent parameters determining the nucleic acid structure). The protocol file must be specified with the "-w" option in the miniCarlo command line. The command language allows loops (which can be nested) and calls of files with sub-protocols. At the start of miniCarlo the sequence of commands is compiled to minimize communications with disk.

Commands in the protocol are specified with four-character keywords. Empty lines in the protocol file and lines starting with the pound sign ("#") are ignored. Leading white space (spaces and tabs) in each line is ignored, which allows indentation. For each command, only the first four characters are interpreted and the rest of the line is ignored. All three of the following examples are equivalent:


TITL
TITLE
   TITLE this is a command specifying the title of the job
All required numeric data are in free format, but such data must be supplied on lines separate from commands. The following is a correct example of setting the temperature to 300 K:

TEMP
300
The following example is incorrect:

TEMP 300
In the following, commands (keywords) are shown in boldface; <file> stands for a valid Unix file name, <data> for numeric data, <integer> for integer numeric data, etc.
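The line-handling rules described in this overview can be illustrated with a short Python sketch. This is not miniCarlo's own parser; the function name parse_keyword is hypothetical, and the sketch covers only command lines (data lines, which follow commands, are not interpreted here).

```python
def parse_keyword(line):
    """Return the four-character keyword of a command line, or None if the line is ignored."""
    # Leading spaces and tabs are ignored, which is what permits indentation.
    stripped = line.lstrip(" \t").rstrip("\r\n")
    # Empty lines and lines starting with "#" are ignored.
    if not stripped.strip() or stripped.startswith("#"):
        return None
    # Only the first four characters are interpreted; the rest is ignored.
    return stripped[:4]


examples = [
    "TITL",
    "TITLE",
    "   TITLE this is a command specifying the title of the job",
]
keywords = [parse_keyword(s) for s in examples]
print(keywords)
```

All three example lines reduce to the same keyword, which is why they are equivalent.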



Index of commands and parameters

The commands without a link are not described yet. An asterisk denotes parameters that can be changed with ADDD, CHNG, or MULT.


ADDD AVER BLTZ CHNG COMM
COPY CPAV CTOF* CVAL* DIEL*
FAVE FILE FIND FORN FOUT
IJON INFO INPT ITER
IVAR LABE LDAV LOOP MINI
MMOL MULT NBLS NEXT OFST*
OUTP PAIR PAUS PDQ0* PDQS*
PDQX PRNT PROB RATE REGH
REST ROUT RSAV SHAK STEP
STOR SUG4 SYMM TEMP* TITL
UWND* WGAP* WOBJ* WRES* XMOL


Default values


Parameters changeable with ADDD/CHNG/MULT
parameter  default  description
PAIR       0.0      all PAIR parameters
STEP       0.0      all STEP parameters
CTOF       8.0      cutoff for non-bonded interactions
CVAL       50.0     force constant for bond angles
DIEL       -1.0     dielectric constant
OFST       0        offset for PAIR and STEP selection
PDQ0       0.0
PDQS       0.0
TEMP       300.0    temperature
UWND       0.0
WGAP       1.0
WOBJ       1.0      weight of relaxation rates-based objective function
WRES       1.0      weight of distance restraints

Some other parameters (not changeable with ADDD/CHNG/MULT)
parameter  default  description
IJON       0        mode of backbone closure
PRNT       0        controls the details of STOR output
REGH       0        modifies energy calculation for regular helices
SUG4       1        sugar model


Commands description


ADDD, CHNG, MULT

Commands modifying selected parameter(s). Usage:


ADDD, CHNG, or MULT
<selection of parameter(s)>
<list of values>
Each of these three commands modifies the selected parameters using specified <values>. CHNG sets the parameters to <values>, ADDD adds <values> to the current values of parameters, and MULT multiplies the current values of parameters by <values>.

Two types of parameters can be selected for modification:
  1. Internal coordinates can be selected with PAIR or STEP commands in the same manner as they are selected for MINI or SHAK. In this case, <list of values> should correspond exactly to the list of selected parameters.
  2. Certain parameters controlling the flow of calculations can be selected by specifying their name (CTOF, CVAL, DIEL, OFST, PDQ0, PDQS, TEMP, UWND, WGAP, WOBJ, WRES). Each of these parameters has its default value at the start of miniCarlo; each parameter must be changed with a separate ADDD, CHNG, or MULT command. The CHNG command can be omitted when changing these parameters (but not when changing PAIR or STEP).
Examples:

Both

CHNG
CTOF
10.0
and

CTOF
10
will change the cutoff value for non-bonded interactions to 10 Å.

MULT
WRES
1.2
will multiply the current weight of distance restraints by 1.2.

ADDD
TEMP
100
will add 100K to the current value of temperature.

CHNG
CTOF TEMP
10 400
The above is incorrect: each parameter must be changed with a separate CHNG.

It does not seem very useful to multiply such parameters as PAIR, STEP, OFST, but it is not disallowed.
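The arithmetic of the three commands can be summarized in a small Python sketch. The function name modify and the list-based representation of parameters are assumptions made for illustration only; they do not describe miniCarlo internals.

```python
def modify(command, current, values):
    """Apply CHNG, ADDD, or MULT to a list of current parameter values."""
    # The list of values must correspond exactly to the selected parameters.
    if len(current) != len(values):
        raise ValueError("number of values must match number of selected parameters")
    if command == "CHNG":
        return list(values)                               # set to the new values
    if command == "ADDD":
        return [c + v for c, v in zip(current, values)]   # add to current values
    if command == "MULT":
        return [c * v for c, v in zip(current, values)]   # multiply current values
    raise ValueError("unknown command: " + command)


print(modify("CHNG", [8.0], [10.0]))     # CTOF example
print(modify("MULT", [1.0], [1.2]))      # WRES example
print(modify("ADDD", [300.0], [100.0]))  # TEMP example
```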

See Example 4, Example 6.


AVER

Calculating and outputting mean and std values for Metropolis Monte Carlo simulation. Usage:


AVER
Strictly speaking, the usage of this command is not restricted to Monte Carlo simulations, but this is where it is most useful. Each time the command ITER appears in the protocol, a number of structural parameters for the current structure (including its internal coordinates) are added to the running sums for the mean and std values. AVER simply averages these running sums and outputs results in the file set with FAVE.

See Example 9.

See also FAVE, ITER, RSAV, LDAV.


COMM

Commentary line. Usage:


COMM
This line is ignored. Useful to insert commentaries in the protocol.

See Example 3.


CPAV

Averages internal coordinates among all current copies of the molecule. This command is used only when miniCarlo is run with multiple copies (with the option "-m"). Usage:


CPAV
<copy number>
Internal coordinates are averaged among all current copies of the molecule taking into account current probabilities of each copy. The result is loaded into the copy <copy number>. The resulting probabilities are also modified: the new copy <copy number> gets a probability of 1.0 and the rest of copies get probabilities of 0.0.
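A minimal Python sketch of this averaging is given below. It assumes a plain probability-weighted arithmetic mean with no special handling of angle periodicity, normalizes by the sum of the probabilities, and uses a 0-based copy index; none of these details is confirmed by the description above, and the function name cpav is hypothetical.

```python
def cpav(coords, probs, target):
    """Average per-copy coordinates weighted by probability and load them into one copy.

    coords: list of coordinate lists, one per copy (all the same length)
    probs:  list of copy probabilities
    target: 0-based index of the copy that receives the average (assumption)
    """
    total = sum(probs)
    n = len(coords[0])
    averaged = [
        sum(p * c[k] for p, c in zip(probs, coords)) / total
        for k in range(n)
    ]
    new_coords = [list(c) for c in coords]
    new_coords[target] = averaged
    # The target copy gets probability 1.0, all other copies get 0.0.
    new_probs = [0.0] * len(probs)
    new_probs[target] = 1.0
    return new_coords, new_probs


new_coords, new_probs = cpav([[10.0, 20.0], [30.0, 40.0]], [0.25, 0.75], 0)
print(new_coords[0], new_probs)
```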

This operation is different from that performed with LDAV. LDAV averages the accumulated segment of the Monte Carlo chain, and it can be used in both single- and multi-copy mode. (In multi-copy mode, the Monte Carlo chain is averaged for each copy separately.)

See also: LDAV, COPY, PROB, BLTZ, PDQX.


DIEL

Dielectric constant. Usage: parameter changeable with ADDD/CHNG/MULT. Default value -1.

When DIEL = 0, electrostatic interactions are ignored.
When DIEL is positive, the dielectric constant is distance independent; electrostatic interactions are divided by DIEL.
When DIEL is negative, the dielectric constant is distance dependent; electrostatic interactions are divided by DIEL times the distance between partial charges.
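The three cases can be sketched in Python as follows. The sketch assumes that for a negative DIEL its magnitude is used as the multiplier of the distance; the text above does not state this explicitly. The function names and the notion of a "raw" (unscaled) electrostatic term are hypothetical.

```python
def effective_dielectric(diel, distance):
    """Return the divisor applied to an electrostatic term, or None if the term is ignored."""
    if diel == 0:
        return None                  # electrostatics switched off
    if diel > 0:
        return diel                  # distance-independent dielectric constant
    return abs(diel) * distance      # distance-dependent (assumes magnitude of DIEL)


def scaled_electrostatics(raw_term, diel, distance):
    """Scale a raw electrostatic term according to the DIEL setting."""
    eps = effective_dielectric(diel, distance)
    return 0.0 if eps is None else raw_term / eps


for diel in (0, 4.0, -1.0):
    print(diel, scaled_electrostatics(8.0, diel, 2.0))
```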


CTOF

Cutoff distance for non-bonded interactions. Usage: parameter changeable with ADDD/CHNG/MULT. Default value 8 Å.

The list of non-bonded interactions is calculated and stored as a list of residue pairs rather than atom pairs. All interactions between two residues are calculated if at least one pair of atoms from these residues is at a distance below the cutoff value. This scheme of energy calculation, combined with the use of internal coordinates, has the advantage of not having to evaluate energy terms that have not changed. If, e.g., only the roll parameter of step # 5 in a 10-bp duplex has changed, then the energy terms for the first four base pairs do not need to be calculated again.
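The residue-pair criterion can be illustrated with the Python sketch below. The data layout (each residue as a list of atom coordinates) and the function name residue_pair_list are assumptions for illustration; the actual miniCarlo data structures are not described here.

```python
import math
from itertools import combinations


def residue_pair_list(residues, cutoff=8.0):
    """Return residue index pairs that have at least one atom pair closer than the cutoff.

    residues: list of residues, each a list of (x, y, z) atom coordinates
    """
    pairs = []
    for i, j in combinations(range(len(residues)), 2):
        # One atom pair below the cutoff is enough to include the whole residue pair.
        if any(math.dist(a, b) < cutoff for a in residues[i] for b in residues[j]):
            pairs.append((i, j))
    return pairs


residues = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(5.0, 0.0, 0.0)],
    [(20.0, 0.0, 0.0)],
]
print(residue_pair_list(residues))        # default cutoff of 8
print(residue_pair_list(residues, 16.0))  # larger cutoff adds a pair
```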

See also NBLS.


FAVE

Sets file for output of mean and std values. Usage:


FAVE
<file>
Any output created by AVER command will be written to this file. If no file was set with FAVE, the output will be written in the file "fort.22". Existing files are overwritten with AVER without warning. Any new FAVE will close the previously opened file and open a new one.

See Example 9.

See also AVER, ITER.


FILE

Specifies file with a sub-protocol (subprogram). Usage:


FILE
<file>
Executes protocol from <file>. Nesting of FILE calls is allowed (maximum depth of nesting is 10).

See Example 5, Example 8.


FORN and NEXT

Specify loop in the protocol. Usage:


FORN
<integer>
<commands>
...
NEXT
The part of the protocol between FORN and NEXT is repeated <integer> times. Nested loops are allowed. No more than 20 FORN ... NEXT pairs are allowed in a protocol.

Loops are required for Monte Carlo simulations and simulated annealings. They are also convenient to run repeated similar operations, such as simulations of large molecules, organizing grid searches, and in many other cases.
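The effect of nested FORN ... NEXT loops can be pictured as an expansion of the protocol into a flat sequence of lines. The Python sketch below shows one way to do this; it is not how miniCarlo compiles protocols, and it assumes that no data line begins with the characters FORN or NEXT. The function name expand_loops is hypothetical.

```python
def expand_loops(lines):
    """Expand FORN ... NEXT loops in a list of protocol lines into a flat list."""

    def expand(pos):
        out = []
        while pos < len(lines):
            key = lines[pos].strip()[:4]
            if key == "FORN":
                count = int(lines[pos + 1].split()[0])  # repeat count on the next line
                body, pos = expand(pos + 2)             # recurse into the loop body
                out.extend(body * count)
            elif key == "NEXT":
                return out, pos + 1                     # close the innermost loop
            else:
                out.append(lines[pos].strip())
                pos += 1
        return out, pos

    flat, _ = expand(0)
    return flat


protocol = [
    "FORN", "2",
    "  FORN", "  3",
    "    ITER",
    "  NEXT",
    "  AVER",
    "NEXT",
    "RSAV",
]
flat = expand_loops(protocol)
print(flat)
```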

Bugs: Currently, the code does not check if the number of FORN ... NEXT pairs exceeded their maximum allowed number (currently, 20).

See Example 3.


FOUT

Opens the output file for the internal coordinates. Usage:


FOUT
<file>
Any output created by STOR or ROUT commands will be written to this file. If no file was set with the FOUT command, the output will be written in the file "fort.50". Any new FOUT command will close the previously opened file and open a new one.

By default, existing files are not overwritten with FOUT, so if FOUT tries to open an existing file, miniCarlo will abort with an error message. To allow overwriting, use the "-O" option in the miniCarlo command line.

See Example 1, Example 7, Example 8.


INFO

Sets the output file for short info. Usage:


INFO
<file>
The output created by OUTP will be written to this file. The file is overwritten during each output. It is useful during long Monte Carlo simulations when miniCarlo is run in the background.

The default name of this file is "info_current.". It is useful to change this name when several copies of miniCarlo are run simultaneously from the same directory.


INPT

Input internal coordinates. Usage:


INPT
<file>
<integer record>
This command inputs internal coordinates using record # <integer record> of file <file>; the previous internal coordinates are overwritten. The format of the input file is explained in another section; it is exactly the same as the output written by the STOR command. If multiple copies were set in the miniCarlo command line with "-m <integer copies>", then internal coordinates will be input for each copy, starting with record number <integer record> of <file>. If the total number of records in <file> is less than the number of copies, the last record of <file> will be used to fill in the internal coordinates of all remaining copies. For example, if the protocol of Example 1 is executed with the following command

miniCarlo -m 2 -O -s aaa.seq -w aaa_1.way

then both copies will have identical internal coordinates input from "aaa.inp".
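The assignment of records to copies described above can be sketched in Python. The sketch assumes that record numbers are 1-based and that the starting record exists in the file; the function name records_for_copies is hypothetical.

```python
def records_for_copies(first_record, num_copies, total_records):
    """Return the record number read for each copy, in copy order.

    Copies read consecutive records starting at first_record; once the file
    runs out, the last record is reused for all remaining copies.
    """
    return [min(first_record + k, total_records) for k in range(num_copies)]


# Two copies and a file with a single record: both copies read record 1.
print(records_for_copies(1, 2, 1))
# Four copies, starting at record 2 of a three-record file.
print(records_for_copies(2, 4, 3))
```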

The format of input file depends on the value of parameter SUG4.

The input file with internal coordinates can also be specified in the command line. Input from the protocol overwrites input from the command line. miniCarlo does not check whether the internal coordinates have been input at least once. If they have never been input, they are all zero by default.

Bugs: If the input file was prepared using STOR with parameter PRNT set to 1, then INPT will be able to read only the first record of this file.

See Example 1, Example 6, Example 11.


ITER

Defines the completion of one iteration of Metropolis Monte Carlo simulation. Usage:


ITER
Strictly speaking, the usage of this command is not restricted to Monte Carlo simulations, but this is where it is most useful. Each time this command appears in the protocol, a number of structural parameters for the current structure (including its internal coordinates) are added to the running sums for the mean and std values; this is called an "iteration". Then, AVER can be used to average these running sums and to output the results in the file set with FAVE.
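The bookkeeping of running sums can be sketched in Python as follows. The class and method names are hypothetical, and the sketch computes the population standard deviation (division by the number of iterations); whether miniCarlo divides by N or N-1 is not stated here.

```python
import math


class RunningStats:
    """Running sums for the mean and std values of a fixed set of parameters."""

    def __init__(self, size):
        self.size = size
        self.restart()

    def restart(self):
        # Zero all accumulated sums (the role played by RSAV).
        self.count = 0
        self.sums = [0.0] * self.size
        self.sums_sq = [0.0] * self.size

    def iterate(self, values):
        # Add the current parameters to the running sums (the role played by ITER).
        self.count += 1
        for k, v in enumerate(values):
            self.sums[k] += v
            self.sums_sq[k] += v * v

    def average(self):
        # Turn the running sums into mean and std values (the role played by AVER).
        means = [s / self.count for s in self.sums]
        stds = [
            math.sqrt(max(sq / self.count - m * m, 0.0))
            for sq, m in zip(self.sums_sq, means)
        ]
        return means, stds


stats = RunningStats(1)
stats.iterate([1.0])
stats.iterate([3.0])
means, stds = stats.average()
print(means, stds)
```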

There is some flexibility as to where ITER can appear in the protocol. Most logically, ITER should appear after the whole molecule has been randomly tried with the SHAK command(s). Iterations normally must be repeated using the FORN ... NEXT construction. During output of the current internal coordinates with STOR, the current iteration number is printed as parameter "RITER". It also appears in the output of OUTP.

See Example 9.

See also FAVE, AVER, SHAK, OUTP, RSAV, LDAV.


IVAR

This is an obsolete keyword. In pre-release versions it had a purely syntactic function; it was used with MINI and SHAK commands to indicate that the selection of STEP or PAIR parameters would follow. For consistency with old protocols, this keyword is still recognized, but is no longer necessary.


LDAV

Averages internal coordinates over the accumulated segment of the Monte Carlo chain and loads them into the current structure. Usage:


LDAV
See Example 10.

See also RSAV, ITER.


MINI

Energy minimization. Usage:


MINI
<num_var>, <num_cycles>
PAIR or STEP
<num>, <num_list>, <list of PAIR or STEP parameters' ID numbers>
<list of maximum increments for selected parameters>
...
PAIR or STEP
<num>, <num_list>, <list of PAIR or STEP parameters' ID numbers>
<list of maximum increments for selected parameters>
It runs <num_cycles> cycles of minimization using <num_var> selected variables (internal coordinates); non-selected variables are not changed during the minimization. Variables are selected with sequential PAIR or STEP commands. Each PAIR or STEP command selects <num_list> variables specified with the list of ID numbers of individual internal coordinates from PAIR or STEP number <num> (see description of pair and step numbers). Also, each PAIR or STEP command must have a list of maximum allowed increments corresponding to the selected parameters. In most cases, the actual values of the maximum increments are not important, as long as they are big enough.

Selection of internal coordinates continues until total number of selected variables becomes equal to <num_var>. No more than 400 parameters can be selected for any single minimization. Don't try to minimize all independent coordinates at once; often it is more effective to split variables into several (possibly overlapping) subsets, and minimize them sequentially. After the minimization is finished, all variables are automatically deselected.

In the case of multiple copies, all selected copies are minimized sequentially (see keyword COPY).

See Example 2, Example 3.


MMOL/XMOL

Output of pdb file. Usage:


MMOL
<file>
or

XMOL
<file>
When the number of copies of the molecule (which is set with the "-m" option in the command line) equals one, then the two keywords are equivalent. When the number of copies is greater than one, MMOL outputs structures of all copies into a single file; individual structures in the pdb file are separated by MODEL and ENDMDL records. XMOL in that case creates separate pdb files for each copy, with file names derived from <file>.

See Example 3, Example 11.


NBLS

Update list of non-bonded interactions. Usage:


NBLS
Calculates new list of non-bonded interactions using the current value of cutoff distance CTOF and current structure. In case of multiple copies of the molecule, each has its own list of non-bonded interactions. Also, this command writes on disk file "non-bonded.list", mostly for debugging purposes.

List of non-bonded interactions is updated in the following cases:

See Example 9.

See also: CTOF.


OFST

This is a parameter changeable by CHNG, ADDD and MULT. Default value zero. It modifies the number of a pair or a step during PAIR or STEP selection: it is added to the pair and step number specified in the PAIR or STEP command.

See Example 4, Example 5.


OUTP

Short output of structure info. Usage:


OUTP
<integer>
This command is useful during long Monte Carlo simulations. If <integer> = 0, a short output is written to standard output; if <integer> = 10, it is written to the file set with the INFO command (the default name of this file is "info_current."; the file is overwritten each time). The latter is useful when miniCarlo is run in the background.

See Example 9.


PAIR

Selection of PAIR parameters. Usage:


PAIR
<num>, <num_list>, <list of PAIR parameters' ID numbers>
<list of values>
This command selects <num_list> variables from pair # <num> + OFST. The variables (internal coordinates) are specified in the list using their ID numbers. See also a description of pair and step numbers.

This command together with STEP is used within the context of MINI, SHAK, CHNG, ADDD, or MULT commands. <list of values> must correspond to the list of selected variables, but it is interpreted differently in each case by MINI, SHAK, CHNG, ADDD, or MULT.

See Example 2, Example 3.


PDQX

Inputs spt-file with experimental cross-relaxation rates and sets the mode for the pdqpro/RELAX calculations. This command is used only when miniCarlo is run with multiple copies (with the option "-m"). Usage:


PDQX
<mode value>
<spt-file>
or

PDQX
<mode value>
Allowed values for the <mode value>:
-1 No pdqpro/RELAX calculations (this is a default value for this parameter when no PDQX command was executed).
0 Probabilities are calculated each time when objective function Qr is evaluated (i.e., during each energy call). This is the most common usage of PDQX.
1 Probabilities are not calculated, even though the objective function Qr is still evaluated.


When the PDQX mode is set to "0" or "1" for the first time, the file name for the spt-file must be provided.

Once the spt-file is input, and the PDQX mode is not "-1", the objective function Qr, scaled by the parameter WOBJ, is always added as penalty to the total energy of each conformer in the ensemble (copy of the molecule).

See also: WOBJ, COPY, PROB, RATE, PDQ0, PDQS.

See Example 11.


PRNT

Parameter modifying the output during STOR command. Default value: 0. Usage:


PRNT
<value>
Allowed values:
0 Normal STOR output.
1 In addition to output of internal coordinates, STOR will output individual energy terms corresponding to the current list of non-bonded interactions.

Bugs: Repeated use of STOR with PRNT = 1 interferes with the subsequent use of the output file for input with INPT. INPT will successfully read only the first record of such a file, because it cannot get past the lines with individual energy terms. Workaround: manually delete the lines with individual energy terms, or prepare a separate input file with PRNT = 0:

PRNT
0
FOUT
separate_file.inp
STOR
1


REST

Input distance restraints. Usage:


REST
<file>
Distance restraints are input from the <file>; see a description of the format of this file. Only one file of distance restraints can be input (files of distance restraints specified later in protocol will overwrite the previous ones).

After the distance restraints were input, the penalty Erestraint is always added to the total energy. Setting WRES to zero will zero this energy term. If there is only one copy of the molecule, actual distances r are used to calculate Erestraint(r). In the case of multiple copies, distances are third-root ensemble averaged accounting for the probabilities of each copy. In the latter case the term Erestraint is common for all copies of the molecule.
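The multi-copy averaging of distances can be sketched in Python. The sketch interprets "third-root ensemble averaged" as the probability-weighted average of r^-3 raised to the power -1/3; this interpretation is an assumption and is not confirmed by the text above. The function name ensemble_distance is hypothetical.

```python
def ensemble_distance(distances, probs):
    """Probability-weighted <r^-3>^(-1/3) average of one distance over all copies.

    Assumption: "third-root" averaging means averaging r**-3 and taking the
    inverse cube root of the result.
    """
    total = sum(probs)
    mean_inv_cube = sum(p * r ** -3 for p, r in zip(probs, distances)) / total
    return mean_inv_cube ** (-1.0 / 3.0)


# Identical distances in both copies give that same distance back.
print(ensemble_distance([2.0, 2.0], [0.5, 0.5]))
# Unequal distances: the average is weighted toward the shorter distance.
print(ensemble_distance([2.0, 4.0], [0.5, 0.5]))
```

Under this assumption the averaged distance is always dominated by the copies in which the two atoms are close, so it falls below the arithmetic mean of the individual distances.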

See Example 8.

See also: WRES, ROUT.


ROUT

Outputs distances and distance deviations in the file set with FOUT. Usage:


ROUT
<integer>
When <integer> = 1, ROUT produces short output (a number of distance deviation indexes). When <integer> = 2, ROUT also outputs all individual distances corresponding to distance restraints.

See Example 8.

See also: REST, WRES, and format of distance restraints.

Bugs: ROUT writes distance restraints in a file set with FOUT, i.e., the same file that is used by STOR to output internal coordinates.


RSAV

Restarts the Metropolis Monte Carlo chain. Usage:


RSAV
This command zeroes all internal variables accumulated during previous executions of the ITER command.

See Example 10.

See also LDAV, ITER.


SHAK

A single Metropolis Monte Carlo step (shake). Usage:


SHAK
<num_var>
PAIR or STEP
<num>, <num_list>, <list of PAIR or STEP parameters' ID numbers>
<list of maximum increments for selected parameters>
...
PAIR or STEP
<num>, <num_list>, <list of PAIR or STEP parameters' ID numbers>
<list of maximum increments for selected parameters>
This command randomly changes <num_var> selected variables within the limits of specified maximum increments and performs a single trial in the Metropolis algorithm (the new structure is accepted or rejected based on comparison of delta energy with the Boltzmann factor).

Selection of variables is done the same way as during minimization and it is explained there.

Total number of selected variables <num_var> and the maximum increments should not be very big, otherwise, all new structures will be rejected. In practice, values of maximum increments can be adjusted so that about 50% of all trials are accepted.
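A single shake and the acceptance test can be sketched in Python as follows. The sketch assumes that each selected variable is changed by a uniformly distributed random increment within plus or minus its maximum increment, and that energies are in kcal/mol; neither is stated above. The function names and the constant BOLTZMANN_KCAL are hypothetical.

```python
import math
import random

# Boltzmann constant in kcal/(mol K); the energy unit is an assumption.
BOLTZMANN_KCAL = 0.0019872


def shake(values, max_increments, rng):
    """Randomly change each variable within +/- its maximum increment (uniform, assumed)."""
    return [v + rng.uniform(-m, m) for v, m in zip(values, max_increments)]


def metropolis_accept(delta_energy, temperature, rng):
    """Metropolis criterion: always accept downhill moves, uphill with Boltzmann probability."""
    if delta_energy <= 0.0:
        return True
    boltzmann_factor = math.exp(-delta_energy / (BOLTZMANN_KCAL * temperature))
    return rng.random() < boltzmann_factor


rng = random.Random(0)
trial = shake([10.0, 20.0], [0.5, 2.0], rng)
print(trial, metropolis_accept(0.3, 300.0, rng))
```

Larger maximum increments tend to produce larger energy changes and hence a lower acceptance rate, which is why the increments are tuned toward roughly 50% acceptance.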

In the case of multiple copies, all selected copies are shaken sequentially (see keyword COPY).

See Example 9.

See also FAVE, AVER, ITER, RSAV, LDAV.


STEP

Selection of STEP parameters. Usage:


STEP
<num>, <num_list>, <list of STEP parameters' ID numbers>
<list of values>
This command selects <num_list> variables from step # <num> + OFST. The variables (internal coordinates) are specified in the list using their ID numbers. See also a description of pair and step numbers.

This command together with PAIR is used within the context of MINI, SHAK, CHNG, ADDD, or MULT commands. <list of values> must correspond to the list of selected variables, but it is interpreted differently in each case by MINI, SHAK, CHNG, ADDD, or MULT.

See Example 2, Example 3.


STOR

Output of internal coordinates. Usage:


STOR
<integer>
Internal coordinates of all copies will be sequentially output into the file set by the FOUT command. If such a file already exists, miniCarlo will abort with an error status, unless the "-O" option ("overwrite") was specified in the command line. If the file was never set with FOUT, the output will appear in the file "fort.50". The value of <integer> will appear in the output file before the sequence of the molecule. It can be used to facilitate searching in long output files, but it does not have any function within miniCarlo. Each new STOR will append internal coordinates to the end of this file.

The format of the output file is generally consistent with the input file, so it can be used for input with INPT (but see bugs).

Some features of the output file depend on values of parameters SUG4, PRNT, PDQX.

Command ROUT also produces output into the file set with FOUT.

Bugs:

See Example 1, Example 3, Example 4.


TEMP

Temperature (in kelvin). Usage: parameter changeable with ADDD/CHNG/MULT. Default value 300 K.

See Example 9.


TITL

Sets the title of the job. Usage:


TITL
<string>
The first 50 characters of <string> will appear in the first line of the output file set by FOUT. It does not have any other function.

See Example 1.


WOBJ

Weight of cross-relaxation rates-based objective function Qr. Usage: parameter changeable with ADDD/CHNG/MULT. Default value 1.0.

See Example 11.

See also: PDQX, format of spt-file.


WRES

Weight of distance restraints. Usage: parameter changeable with ADDD/CHNG/MULT. Default value 1.0.

See Example 8, Example 9.

See also: REST, ROUT, format of distance restraints.