Revised November 21, 2006
miniCarlo performs conformational calculations (energy minimization and
Metropolis Monte Carlo simulations) on nucleic acids. Similar to conventional
Molecular Dynamics (MD) simulation packages (such as AMBER), it uses Cartesian
coordinates of individual atoms to calculate the conformational energy of a molecule.
miniCarlo uses the Zhurkin-Poltev-Florentiev force field, which is
currently hard-coded in the program.
To generate the Cartesian coordinates, miniCarlo uses a specialized set of independent internal coordinates, which include helical parameters. This approach drastically reduces the number of degrees of freedom in a molecule by treating aromatic bases as rigid bodies and by using idealized values for bond lengths and most bond angles. Consequently, miniCarlo calculations start from a set of helical parameters, not from a set of xyz coordinates. A set of helical parameters is also the output of the calculations (although a pdb file can also be written for use with other programs). Sample input files of helical parameters for some standard structures will be provided with the example files. Helical parameters can also be calculated from a pdb file using the auxiliary program fitparam. However, because miniCarlo requires idealized geometries for aromatic bases, bond lengths, and bond angles, a nucleic acid structure from a pdb file may require "regularization" before it can be used with miniCarlo. As practice shows, this is not always trivial.
Because of the specialized nature of the internal coordinates used, miniCarlo can presently work with nucleic acids only. An arbitrary number of strands and any combination of deoxy and ribo residues are supported; the allowed nucleic acid bases are adenine, cytosine, guanine, thymine, and uracil.
The flow of calculations in miniCarlo is controlled by a user-specified protocol. The protocol ("way-file") consists of a sequence of commands, which invoke simulation steps (i/o, energy minimization, Monte Carlo), control the flow of the protocol, or modify various parameters (both simulation parameters and the independent parameters that determine the nucleic acid structure). The command language allows loops (which can be nested) and calls to files with sub-protocols. At startup, miniCarlo compiles the sequence of commands to minimize disk access.
The core of miniCarlo is the backbone closure algorithm by Zhurkin and co-authors, which was used in the early version of the program that worked with regular DNA duplexes. The current version of miniCarlo still uses portions of the code of the original program, including the backbone closure routine, matrix mathematics, etc. The program adopted its present shape in 1988. Since then it has undergone numerous modifications, and it exists in several divergent versions [Refs]. The current version of miniCarlo is being developed in Tom James' NMR Lab at UCSF and includes NMR-related options: distance restraints, and multiple-copy refinement with floating probabilities against NOE-derived proton-proton dipolar relaxation rates. During the multiple-copy refinement, floating probabilities are calculated using the pdqpro algorithm, and theoretical relaxation rates are calculated using routines of the RELAX program. A proper description of miniCarlo has not yet been published (we intend to do so soon), but the program has been used extensively in several labs [Refs].
The program is written mostly in FORTRAN (with the exception of pdqpro and RELAX routines) without dynamic memory allocation. This alpha-version of miniCarlo is compiled allowing for a maximum of 100 residues, 50 base pairs, 2000 distance restraints, and 10 copies of a molecule. Please contact Nick Ulyanov if you need to recompile the program with different dimensions.
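Before preparing a large system, it can be handy to check it against these compiled-in limits. The following is a minimal Python sketch, not part of miniCarlo; the names LIMITS and exceeded_limits are hypothetical, and the numbers are simply the ones quoted above for this alpha-version.

```python
# Compiled-in maxima quoted in the text for this alpha-version of miniCarlo.
LIMITS = {
    "residues": 100,
    "base_pairs": 50,
    "distance_restraints": 2000,
    "copies": 10,
}


def exceeded_limits(**counts):
    """Return the names of all counts that are larger than the compiled-in maximum."""
    return [name for name, value in counts.items() if value > LIMITS[name]]
```

For example, exceeded_limits(residues=120, base_pairs=60) reports both quantities, which would mean that a recompiled executable is needed.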
The following are the independent variables used by miniCarlo to determine the Cartesian coordinates of all atoms in a nucleic acid molecule:
|ID numbers of PAIR parameters|
STEP Helical parameters.
Also, there are three rotations and three translations defining the relative position of two consecutive pairs; they are associated with steps. The table below shows the ID numbers of step parameters; the ID numbers are used for the selection of these parameters with the STEP command (described in the "miniCarlo command language" section).
|ID numbers of STEP parameters|
Note that the definitions of helical parameters used internally in
miniCarlo do not conform to the guidelines of the Cambridge
convention (e.g., because of a different choice of frames of reference,
see Figure 1). Consequently, it is not advised to publish these
parameters; they are for internal use in miniCarlo only. The easiest
way to obtain a set of helical parameters conforming to the guidelines
of the Cambridge convention is to output the structure in pdb format
and use one of the available nucleic acid analysis programs.
It is our intention to change the internal definitions of helical parameters in the next release of miniCarlo, so that the same set of parameters will be used for the definition of structure and for its description.
Residue (NUCL) parameters.
The remaining parameters define the sugar conformation, the orientation of the hydroxyl group for the ribose, the orientation of the methyl group for T and m5C, and the orientation (, , , Sx,Sy,Sz) of the 3rd or 4th base in a triple or a quadruple. Since winter 2004, they are all referenced with the NUCL keyword.
|ID numbers of the Residue (NUCL) parameters|
|, , , Sx,Sy,Sz||1-6||For the 3rd or 4th base in a triple or a quadruple only|
|Sugar pseudorotation P||7||SUG4 = 1|
|Four-parameter sugar, endocyclic parameters|
|3||10||SUG4 = 4 or 10|
|4||11||SUG4 = 4 or 10|
|4||12||SUG4 = 4 or 10|
|5||13||SUG4 = 4 or 10|
|Ten-parameter sugar, exocyclic parameters|
|C1'||14||SUG4 = 10|
|C1'||15||SUG4 = 10|
|C2'||16||not used currently|
|C2'||17||not used currently|
|C3'||18||SUG4 = 10|
|C3'||19||SUG4 = 10|
|C4'||20||SUG4 = 10|
|C4'||21||SUG4 = 10|
|2' hydroxyl group||22||riboses only|
|methyl group||23||T and m5C only|
The glycosidic angles are internally defined as C2'-C1'-N9-C4 for purines
and C2'-C1'-N1-C2 for pyrimidines; this will also change in the next release.
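To compare this internal glycosidic angle with values from other programs, it can be recomputed from the Cartesian coordinates in a pdb file. The sketch below is an illustration in Python, not miniCarlo code: the atom quadruples come from the definition above, while the function names, the dictionary layout for the atoms, and the sign convention (the usual IUPAC-style torsion sign) are assumptions, since the text does not state miniCarlo's own sign convention.

```python
import math

# Atom quadruples for the glycosidic angle, as defined in the text above.
GLYCOSIDIC_ATOMS = {
    "purine": ("C2'", "C1'", "N9", "C4"),
    "pyrimidine": ("C2'", "C1'", "N1", "C2"),
}
PURINES = {"A", "G"}


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Torsion angle p0-p1-p2-p3 in degrees, in the range -180..180."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n2 = _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(_cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))


def glycosidic_angle(base, atoms):
    """atoms maps atom names to (x, y, z) for one residue (assumed layout)."""
    kind = "purine" if base in PURINES else "pyrimidine"
    return dihedral(*(atoms[name] for name in GLYCOSIDIC_ATOMS[kind]))
```

The base argument is the one-letter code (A, C, G, T, or U); anything that is not A or G is treated as a pyrimidine.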
For the internal geometry of sugars, one of two models must be selected: one-parameter (SUG4 = 1, default) or four-parameter (SUG4 = 4). In addition, with the four-parameter sugar model the exocyclic bond angles may also be varied (with SUG4 = 10).
Below is an example of a protocol selecting all independent variables of a DNA molecule consisting of two base pairs which does not have any thymines, assuming that the one-parameter sugar model is used:
PAIR 1, 6, 1,2,3,4,5,6
PAIR 2, 6, 1,2,3,4,5,6
NUCL 1, 2, 7,8
NUCL 2, 2, 7,8
NUCL 3, 2, 7,8
NUCL 4, 2, 7,8
STEP 1, 6, 1,2,3,4,5,6
Note that in the case of PAIR and STEP parameters, the base pair number is referenced, but in the case of the NUCL parameters, the nucleotide number is referenced.
If the molecule had thymines or RNA residues, the torsion angles determining
the orientation of methyl groups in T's and hydroxyl groups in riboses
would also have to be selected.
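For longer molecules, writing these selection commands by hand is tedious. The Python sketch below generates them; it is an illustration, not part of miniCarlo. It assumes that each command has the form "KEYWORD index, count, id,id,...", which is inferred from the example above rather than taken from the command language description, and the helper names are hypothetical. The default NUCL IDs 7 and 8 are copied from the example.

```python
def select_command(keyword, index, ids):
    """Format one selection command as 'KEYWORD index, count, id,id,...'."""
    id_list = ",".join(str(i) for i in ids)
    return f"{keyword} {index}, {len(ids)}, {id_list}"


def selection_protocol(n_pairs, n_residues, nucl_ids=(7, 8)):
    """Select all PAIR, NUCL and STEP variables of a molecule.

    PAIR and STEP commands are numbered by base pair, NUCL commands by
    nucleotide; a molecule with n_pairs base pairs has n_pairs - 1 steps.
    """
    helical_ids = (1, 2, 3, 4, 5, 6)
    lines = [select_command("PAIR", p, helical_ids) for p in range(1, n_pairs + 1)]
    lines += [select_command("NUCL", r, nucl_ids) for r in range(1, n_residues + 1)]
    lines += [select_command("STEP", s, helical_ids) for s in range(1, n_pairs)]
    return lines
```

Calling selection_protocol(2, 4) gives the seven commands of the two-base-pair example. For thymines or riboses, the extra NUCL IDs (23 and 22 in the table above) would have to be added for the relevant residues.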
The sequence of nucleotides and their assignments to base pairs must be
specified in a "sequence file", which is required to run miniCarlo.
The first line of the sequence file is an integer specifying the number of nucleotides in the molecule.
The second line is a string specifying ribo and deoxy residues. Allowed characters are r (ribo), d (deoxy), and space (deoxy). An empty string is interpreted as all deoxy residues.
The third line is a string specifying the sequence in one-character format. Allowed characters are A, C, G, T, U; spaces are not allowed.
The fourth line is a string specifying which residues are 5'- or 3'-ends of a strand. Allowed characters are 5, 3, and space (neither 5'- nor 3'-end). If a strand consists of a single residue, this residue must be specified as a 5'-end rather than a 3'-end. The molecule must have at least one 5'-end and one 3'-end (circular molecules are not currently supported).
The fifth and subsequent lines specify consecutive base pairs, starting with base pair 1. Each line has two integers (two residue numbers). The bases in a pair need not be complementary, and a pair may consist of one base only. If a base pair consists of one base only, the other must still be listed as zero.
Example 1. A DNA:RNA hybrid d(AAA):r(UUU).
Sequence file "aaa.seq"
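6
   rrr
AAAUUU
5 35 3
1,6
2,5
3,4

(The second line starts with three spaces, which mark the three deoxy residues; the spaces in the fourth line mark residues that are not strand ends.)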
This molecule has six residues that are assigned to three base pairs in the sequence file. Each pair has six degrees of freedom (Propeller, Buckle, Opening, Shear, Stretch, and Stagger); these parameters must be specified for each pair in the input file with helical parameters. Also, each of two steps in this molecule has six degrees of freedom (Twist, Tilt, Roll, Shift, Slide, and Rise); they too must be specified in the same input file.
Example 2. An RNA hairpin loop r(GGUUUCC).
7
rrrrrrr
GGUUUCC
5     3
1,7
2,6
3,0
4,0
5,0
Example 3. DNA triplex d(AA):d(TT):d(TT).
6

AATTTT
535353
1,4, 5
2,3, 6

(The second line is empty, so all residues are deoxy.)
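The sequence file layout described above can be checked before a run with a small reader. The Python sketch below is an illustration, not the parser used by miniCarlo; the function name, the returned data layout, and the tolerant handling of commas and spaces in the base pair lines are assumptions.

```python
ALLOWED_BASES = set("ACGTU")


def parse_sequence_file(text):
    """Parse the text of a sequence file into (residues, pairs)."""
    lines = text.splitlines()
    if len(lines) < 4:
        raise ValueError("a sequence file needs at least four lines")
    n = int(lines[0])
    # Pad with spaces: a missing character means deoxy / not a strand end.
    sugars = lines[1].rstrip().ljust(n)
    sequence = lines[2].strip()
    ends = lines[3].rstrip().ljust(n)
    if len(sequence) != n or not set(sequence) <= ALLOWED_BASES:
        raise ValueError("line 3 must contain exactly n characters from ACGTU")
    if len(sugars) != n or not set(sugars) <= set("rd "):
        raise ValueError("line 2 may contain only r, d and space")
    if len(ends) != n or not set(ends) <= set("53 "):
        raise ValueError("line 4 may contain only 5, 3 and space")
    if "5" not in ends or "3" not in ends:
        raise ValueError("at least one 5'-end and one 3'-end are required")
    residues = [
        {"number": i + 1, "base": sequence[i],
         "ribo": sugars[i] == "r", "end": ends[i].strip() or None}
        for i in range(n)
    ]
    pairs = []
    for line in lines[4:]:
        members = tuple(int(tok) for tok in line.replace(",", " ").split())
        if members:  # skip blank lines
            pairs.append(members)
    return residues, pairs
```

A residue number of 0 in a pair is kept as is, so single-base pairs such as 3,0 in Example 2 come back as (3, 0).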
To run miniCarlo, you need to place the executable miniCarlo in a
directory included in the path. Also, you need to have the file
"dupinp.txt" in the working directory. ("dupinp.txt" contains information
about standard geometries, force field, partial charges, etc. Do not
edit this file!)
Running miniCarlo requires two parameters; several options may also be specified:
miniCarlo [-cdfhimoOp] -s <sequence file> -w <protocol file>
Options in square brackets are optional; the parameters "-s" and "-w" are required. Replace <sequence file> and <protocol file> with the names of the actual sequence file and protocol file.
|-c <file>||compile all sub-protocols into a single|
|-d||checks the syntax of the protocol file (debug)|
|-f <pdb file>||reads a pdb file and bypasses the calculation of coordinates|
|-h||prints short help|
|-i <input file> <record>||reads internal coordinates from <input file> using record # <record>|
|-m <number>||sets the number of multiple copies to <number> (default 1).|
|-O||allows overwriting output files with internal coordinates|
|-o <output file>||sets file name for output of internal coordinates|
|-p||skips protocol file and generates pdb files from input internal coordinates|
|-v||verbose; prints more messages to stdout|
Internal coordinates can also be input from the protocol file, and the output file for internal coordinates can also be set there. (Specifying both input and output files on the command line makes it possible to use the same protocol file with different input files.) The remaining options (including the number of multiple copies) can be set only on the command line. miniCarlo also writes a fair amount of output to stdout; redirect it to a file if you want to run the program in the background.
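When the same protocol is run on many input files, the command line can be assembled by a script. The Python sketch below is one possible wrapper, not part of the distribution: it uses only the options listed in the table above, the function names are hypothetical, and it assumes that the miniCarlo executable is on the path and that "dupinp.txt" is in the working directory.

```python
import subprocess


def build_command(sequence_file, protocol_file, input_file=None, record=1,
                  output_file=None, overwrite=False, copies=None):
    """Assemble the miniCarlo argument list; options precede -s and -w."""
    argv = ["miniCarlo"]
    if input_file is not None:
        argv += ["-i", input_file, str(record)]  # -i <input file> <record>
    if output_file is not None:
        argv += ["-o", output_file]
    if overwrite:
        argv.append("-O")  # allow overwriting the output file
    if copies is not None:
        argv += ["-m", str(copies)]
    argv += ["-s", sequence_file, "-w", protocol_file]
    return argv


def run(argv, log_file):
    """Run miniCarlo and redirect its standard output and errors to log_file."""
    with open(log_file, "w") as log:
        return subprocess.run(argv, stdout=log, stderr=subprocess.STDOUT, check=True)
```

For instance, build_command("aaa.seq", "min.way", input_file="start.par", output_file="final.par") describes one run; here "min.way", "start.par", and "final.par" are made-up file names.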