Revised October 2, 2000

miniCarlo input files

Input file with internal coordinates (helical parameters).

Below is an example of input internal coordinates (helical parameters) for a DNA:RNA hybrid molecule d(AAA):r(UUU). This file is called "aaa.inp" in the examples of protocols 1 through 5 (all examples can be downloaded).

hybrid                                             Mon Jul 10 18:33:19 2000
     3  1  TEMP=  300.  DIEL=-1.0   unwind = 0.0 RITER=      0.   BEG= 0
    1  AAAUUU
       5 35 3

           TWIST       TILT       ROLL          DX         DY         DZ
    1     37.199     -1.519     -6.752        0.0621    -0.0947     3.1723
    2     37.199     -1.519     -6.752        0.0621    -0.0947     3.1723

                  PROP     BUCKLE    OPENING          SX         SY         SZ
  1 A 1:U 6    -10.042      6.545     -0.650       -0.0226    -0.0281     0.0746
  2 A 2:U 5    -10.042      6.545     -0.650       -0.0226    -0.0281     0.0746
  3 A 3:U 4    -10.042      6.545     -0.650       -0.0226    -0.0281     0.0746

        P       Chi   IGJG  TET3   TET5   Gamma    Beta   Alpha   Dzeta     Eps   PhiRib  PhiMet
 A 1  158.98  141.83
 A 2  158.98  141.83  1 2  -0.11  -0.11    59.1   180.5   287.7   248.3   186.9
 A 3  158.98  141.83  1 2  -0.11  -0.11    59.1   180.5   287.7   248.3   186.9
 U 4  138.95  127.77                                                                0.0
 U 5  138.95  127.77  1 2  -0.13  -0.12    61.2   174.7   292.9   256.7   185.5     0.0
 U 6  138.95  127.77  1 2  -0.13  -0.12    61.2   174.7   292.9   256.7   185.5     0.0

tot =       33.187 sug =  67.82 tors =  4.05 val =  0.00 clos = 0.111E+00 restr =    0.00 conf =   33.19

This is an example of the input file for the molecule shown in Figure 2. The molecule has 6 residues, 3 base pairs, and 2 steps between pairs. Because the input file has a rather complicated format, the most convenient way to prepare input is to edit one of the existing files. A number of files are provided with the examples; an input file can also be obtained as output of miniCarlo (using the STOR command) or of the stand-alone program fitparam. The format of this file is described below.

The input and output files with helical parameters have exactly the same format, but not all of the information matters on input. The example file above contains three sections of input data: step parameters, pair parameters, and backbone parameters. The numeric data in each line are in free format; however, the number of lines and their order are important. The first six lines of the input file are read as character strings and ignored: they must be present, but their content is not important for the input. During output, these lines contain the title of the job, the date, the sequence, and some other parameters, which are self-explanatory in most cases; the content of these lines will most likely change in future releases. Next comes the first section with step parameters (twist, tilt, roll, shift, slide, rise), six parameters for each step (there are two steps in this molecule). The step numbers ("1" and "2") are read but ignored. The next two lines are also read and ignored. The next section contains the pair parameters (propeller, buckle, opening, shear, stretch, and stagger), six for each base pair (three base pairs in this case). These parameters are read starting at the 12th position of each line, so that the base pair identifiers (e.g., "1 A 1:U 6") are skipped. The next two lines are again read and ignored.

Then the section with the backbone parameters starts; it has one entry for each residue. Each line in the backbone section is read starting at the fifth position, so that residue identifiers (such as "A 1") are ignored. For each residue, the sugar pseudorotation value P is read from the first column, and the glycosidic angle Chi from the second column. If SUG4 = 1, the pseudorotation P is used to define the sugar conformation. If SUG4 > 1, the pseudorotation parameter P is ignored, and sugar parameters are read from another section of the input file (not present in this example). In addition, each non-5'-terminal residue has the following parameters in this section: IGJG (a pair of integers related to the backbone closure procedure; these parameters will be explained later, and for most standard structures they must be "1 2"); TET3 and TET5 (deviations of the bond angles C3'-O3'-P and C5'-O5'-P, respectively, from their ideal values, in degrees); and the backbone torsions gamma, beta, alpha, dzeta, and epsilon (make a figure explaining the order of the torsions). The IGJG parameters are input; they affect the way the backbone is calculated. However, TET3, TET5, and the backbone torsion angles are not independent parameters: they are calculated as a result of backbone closure and reported in this file during output. During input, they are read and ignored. All ribo residues also have the parameter PhiRib, which defines the orientation of the 2'-hydroxyl group. PhiRib values can be omitted from the input file; their default value is zero. Finally, all thymines have the parameter PhiMet, which defines the orientation of the methyl group (not shown in this file). The value of PhiMet is input at the end of the line in the format "m <value>". PhiMet values can also be omitted from the input file; their default value is 90 degrees.

In the end, four lines are read as character strings and ignored. During output, the energy components are printed here.
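
The reading order described above can be illustrated with a short Python sketch. This is not the miniCarlo reader itself, only an illustration of the layout; the function name parse_helical_record is hypothetical, the number of base pairs is passed in by the caller, and the sketch assumes a duplex whose two strands are each listed 5' to 3' (so the first residue of each strand is 5'-terminal and has no IGJG entry). PhiRib, PhiMet, and the ignored output-only columns are not parsed.

```python
def parse_helical_record(lines, n_pairs):
    """Parse one helical-parameter record given as a list of text lines.

    n_pairs (number of base pairs) is supplied by the caller; this is an
    assumption of the sketch, not something stated about miniCarlo.
    """
    n_steps = n_pairs - 1
    pos = 6                                   # first six lines: read and ignored
    steps = []
    for line in lines[pos:pos + n_steps]:
        fields = line.split()
        steps.append([float(x) for x in fields[1:7]])   # fields[0] is the step number
    pos += n_steps + 2                        # two more lines read and ignored
    pairs = []
    for line in lines[pos:pos + n_pairs]:
        # values start at the 12th position; the pair identifier is skipped
        pairs.append([float(x) for x in line[11:].split()[:6]])
    pos += n_pairs + 2                        # two more lines read and ignored
    backbone = []
    for i, line in enumerate(lines[pos:pos + 2 * n_pairs]):   # duplex assumed
        fields = line[4:].split()             # values start at the 5th position
        entry = {"P": float(fields[0]), "Chi": float(fields[1]), "IGJG": None}
        if i not in (0, n_pairs):             # assumed positions of 5'-terminal residues
            entry["IGJG"] = (int(fields[2]), int(fields[3]))
        backbone.append(entry)
    return steps, pairs, backbone


# The d(AAA):r(UUU) record from the example above (energy lines omitted).
SAMPLE = [
    "hybrid                                             Mon Jul 10 18:33:19 2000",
    "     3  1  TEMP=  300.  DIEL=-1.0   unwind = 0.0 RITER=      0.   BEG= 0",
    "    1  AAAUUU",
    "       5 35 3",
    "",
    "           TWIST       TILT       ROLL          DX         DY         DZ",
    "    1     37.199     -1.519     -6.752        0.0621    -0.0947     3.1723",
    "    2     37.199     -1.519     -6.752        0.0621    -0.0947     3.1723",
    "",
    "                  PROP     BUCKLE    OPENING          SX         SY         SZ",
    "  1 A 1:U 6    -10.042      6.545     -0.650       -0.0226    -0.0281     0.0746",
    "  2 A 2:U 5    -10.042      6.545     -0.650       -0.0226    -0.0281     0.0746",
    "  3 A 3:U 4    -10.042      6.545     -0.650       -0.0226    -0.0281     0.0746",
    "",
    "        P       Chi   IGJG  TET3   TET5   Gamma    Beta   Alpha   Dzeta     Eps   PhiRib  PhiMet",
    " A 1  158.98  141.83",
    " A 2  158.98  141.83  1 2  -0.11  -0.11    59.1   180.5   287.7   248.3   186.9",
    " A 3  158.98  141.83  1 2  -0.11  -0.11    59.1   180.5   287.7   248.3   186.9",
    " U 4  138.95  127.77                                                                0.0",
    " U 5  138.95  127.77  1 2  -0.13  -0.12    61.2   174.7   292.9   256.7   185.5     0.0",
    " U 6  138.95  127.77  1 2  -0.13  -0.12    61.2   174.7   292.9   256.7   185.5     0.0",
]

steps, pairs, backbone = parse_helical_record(SAMPLE, n_pairs=3)
```

Here steps holds two lists of six step parameters, pairs holds three lists of six pair parameters, and backbone holds one dictionary per residue.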

[ under construction: SUG4=4 and SUG4=10 ]

Everything described above constitutes a single record in the input (or output) file with helical parameters. A single file may contain many sequential records. To input the first record from the file "aaa.inp", the following commands must be present in the protocol file:

Alternatively, this input can be specified on the miniCarlo command line using "-i aaa.inp 1". Input specified in the protocol file overwrites that from the command line. In general, any new input with the INPT command overwrites the current structure.

Input file with distance restraints

A fragment of the distance restraint file "Aform.restraints" is shown below; this file is used in protocol example 8.

ATOM- i ATOM- j   r_low   r_up     k_low     k_up
H4'   1 2H2'  1   2.700   2.780    10.00    10.00
H4'   1 H1'   1   3.340   3.460    10.00    10.00
H2    1 H1'   1   4.050   4.290    10.00    10.00
H8    1 2H2'  1   3.970   4.410    10.00    10.00
H8    1 1H2'  1   3.730   3.970    10.00    10.00

This file can start with an arbitrary number of comment lines. A line with 'ATOM' in the first four columns indicates that distance restraints follow immediately below. Lines with a pound sign "#" in the first column are ignored.

Distance restraints must be in "mardigras format". Atom names and residue numbers are read using FORTRAN format ( a4, i3, 1x, a4, i3 ). Lower and upper distance limits, and two force constants are given in the same line in a free format; these numbers define a "flat-well" potential for the penalty function: penalty is zero when the actual distance is between the two limits, and it increases quadratically otherwise:
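\[
E_{restraint}(r) =
\begin{cases}
0 & \text{if } r_{low} \le r \le r_{up} \\
k_{low}\,(r - r_{low})^2 & \text{if } r < r_{low} \\
k_{up}\,(r - r_{up})^2 & \text{if } r > r_{up}
\end{cases}
\]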

Distances must be in Å and the force constants k_low and k_up in kcal/(mol·Å²). The force constants can be rescaled later in the protocol using WRES. Once distance restraints have been input, the penalty Erestraint is always added to the total energy. Setting WRES to zero zeroes this energy term.
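
As an illustration, the flat-well penalty for a single restraint can be written in a few lines of Python. This is a sketch, not the miniCarlo code: the function name flat_well_penalty is hypothetical, and the weight argument stands in for WRES under the assumption that WRES acts as a simple multiplicative factor on the penalty.

```python
def flat_well_penalty(r, r_low, r_up, k_low, k_up, weight=1.0):
    """Flat-well distance penalty in kcal/mol (distances in Å).

    weight is a hypothetical stand-in for WRES, assumed here to scale
    the penalty linearly; weight=0 switches the term off.
    """
    if r < r_low:
        energy = k_low * (r - r_low) ** 2     # below the lower limit
    elif r > r_up:
        energy = k_up * (r - r_up) ** 2       # above the upper limit
    else:
        energy = 0.0                          # inside the flat well
    return weight * energy


# First restraint of the fragment above: limits 2.700 and 2.780 Å, k = 10.
inside = flat_well_penalty(2.74, 2.700, 2.780, 10.0, 10.0)
too_short = flat_well_penalty(2.60, 2.700, 2.780, 10.0, 10.0)
```

With these numbers the penalty is zero inside the well and grows quadratically outside it; a distance 0.1 Å below the lower limit costs about 0.1 kcal/mol.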

If there is only one copy of the molecule, actual distances r are used to calculate Erestraint(r). In the case of multiple copies, distances are third-root ensemble averaged accounting for the probabilities of each copy. In the latter case the term Erestraint is common for all copies of the molecule.

Atom names may give problems during input. The code makes some effort to guess the atom name (e.g., it will understand any of H2'2, 2H2', H2"), but the safest is to output the pdb file with MMOL and use the atom names consistent with the pdb file.
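
The file layout described in this section (leading comment lines, the 'ATOM' marker line, "#" lines, fixed-column atom fields followed by free-format numbers) can be sketched as a small Python reader. This is only an illustration of the layout; the function name read_restraints is hypothetical, blank lines are skipped as an assumption, and no attempt is made to reproduce the atom-name guessing mentioned above.

```python
def read_restraints(lines):
    """Read distance restraints laid out as (a4, i3, 1x, a4, i3) + free format."""
    restraints = []
    in_data = False
    for line in lines:
        if not in_data:
            # everything before the 'ATOM' marker line is commentary
            in_data = line[:4] == "ATOM"
            continue
        if line.startswith("#") or not line.strip():
            continue                          # '#' lines (and blank lines) are ignored
        atom_i = line[0:4].strip()            # a4
        res_i = int(line[4:7])                # i3
        atom_j = line[8:12].strip()           # 1x, a4
        res_j = int(line[12:15])              # i3
        r_low, r_up, k_low, k_up = (float(x) for x in line[15:].split()[:4])
        restraints.append((atom_i, res_i, atom_j, res_j, r_low, r_up, k_low, k_up))
    return restraints


fragment = [
    "Any number of comment lines may come first.",
    "ATOM- i ATOM- j   r_low   r_up     k_low     k_up",
    "H4'   1 2H2'  1   2.700   2.780    10.00    10.00",
    "# this line is ignored",
    "H2    1 H1'   1   4.050   4.290    10.00    10.00",
]
restraints = read_restraints(fragment)
```

Each entry of restraints is a tuple of the two atom names and residue numbers, the two distance limits, and the two force constants.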

This file is input using the following protocol:


Input file with dipolar relaxation rates

This input file is required for multiple-copy refinement [5], in which the probability of each copy is calculated by invoking the pdqpro routines.

Experimental NOE-derived dipolar cross-relaxation rates must be specified in a "spectral density file" ("spt-file"). This file controls the RELAX subroutines [7] invoked by the miniCarlo program. It has a complicated format, which is described in detail in Ref. [7]. The experimental relaxation rates can be obtained from experimental NOE intensities with the program "mardigras". In the future, we will provide a program for automated generation of spt-files from the output of mardigras runs. Meanwhile, it is best to manually modify the sample spt-file "ab70.spt" from Examples 11 and 12.

The following portions of the spt-file must be modified according to the specifics of your system:

1. Spectrometer frequency is specified in the line

frequency 500

2. Classes of nuclei. See full description of classes in [7]; classes pertinent here are "DEFAULT" (class 0) and "Methyl" (class 1). An effective correlation time (in seconds) must be provided for each class, e.g.,

class 0
occupancy 1
order 1 1 1 1
correlation 3.0E-9 3.0E-9 3.0E-9 3.0E-9
internal  1.0e-10 1.0e-10 1.0e-10 1.0e-10
calibration 0 0 0 0

3. Section "assign". This section must provide residue numbers and atom names of all relevant protons (most commonly, all non-exchangeable protons of the molecule). For example,

assign res 3 atom H1' shift ??? class 0
assign res 3 atom 2H2' shift ??? class 0
assign res 3 atom 1H2' shift ??? class 0
assign res 3 atom H3' shift ??? class 0
assign res 3 atom H4' shift ??? class 0
assign res 3 atom 1H5' shift ??? class 0
assign res 3 atom 2H5' shift ??? class 0
assign res 3 atom M7 shift ??? class 1
assign res 3 atom H6 shift ??? class 0

In this example (protons of a dT residue), please note the following: chemical shifts do not have to be specified; the exchangeable imino proton (H3) is not listed; all protons except those of the methyl group belong to class 0 (DEFAULT); the methyl group belongs to class 1 (Methyl) and is listed as "M7", not as three individual protons (1H7, 2H7, 3H7).

4. Section "peak" ([7]) does not affect multiple-copy calculations.

5. Section "rate" specifies cross-relaxation rates (in Hz) for individual proton pairs, e.g.,

rate assign H5 2 1H2' 2 -0.1428
rate assign H5 2 H3' 2 -0.0479
rate assign H6 3 H6 2 -0.1301
rate assign H6 3 2H2' 3 -1.6298
rate assign M7 3 H1' 2 -0.1179

In contrast to input of distance restraints, this portion of code does not try to "guess" atom names. Atom names and residue numbers used in section "rate" must correspond exactly to those given in section "assign", and both must correspond to those used internally by miniCarlo. The latter can be checked by creating the pdb file with the MMOL command. Also, a simple csh script fixatoms recognizes most commonly used variants of atom names for nucleic acids, and converts them into the standard used in miniCarlo.
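
Because the names must match exactly, it can help to check an spt-file before a run. The Python sketch below is an illustration only: the function name unassigned_rate_atoms is hypothetical, and it recognizes only the two line shapes shown in the examples above ("assign res N atom NAME ..." and "rate assign NAME N NAME N VALUE"); the full spt-file format is described in Ref. [7]. It does not check the names against those used internally by miniCarlo.

```python
def unassigned_rate_atoms(lines):
    """Return (atom, residue) pairs used in "rate" lines but absent from "assign"."""
    assigned = set()
    used = []
    for line in lines:
        tok = line.split()
        if len(tok) >= 5 and tok[0] == "assign" and tok[1] == "res" and tok[3] == "atom":
            assigned.add((tok[4], int(tok[2])))           # (atom name, residue number)
        elif len(tok) >= 7 and tok[0] == "rate" and tok[1] == "assign":
            used.append((tok[2], int(tok[3])))            # first proton of the pair
            used.append((tok[4], int(tok[5])))            # second proton of the pair
    missing = []
    for key in used:
        # exact string comparison: no guessing of name variants
        if key not in assigned and key not in missing:
            missing.append(key)
    return missing


spt_lines = [
    "assign res 3 atom 2H2' shift ??? class 0",
    "assign res 3 atom H6 shift ??? class 0",
    "rate assign H6 3 H6 2 -0.1301",
    "rate assign H6 3 2H2' 3 -1.6298",
]
missing = unassigned_rate_atoms(spt_lines)
```

In this small fragment, H6 of residue 2 appears in a "rate" line but has no "assign" line, so it is reported as missing.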

The spt-file with cross-relaxation rates must be specified in the protocol with the PDQX command. This will instruct miniCarlo to invoke RELAX routines, which calculate theoretical cross-relaxation rates for the simulated ensemble, and pdqpro routines which calculate probabilities of each copy of the molecule (number of copies must be set in the miniCarlo command line). Simultaneously, the pdqpro routines compute the relaxation rates-based quadratic objective function Qr, which is added as penalty to the total energy of the system (the interaction between miniCarlo, RELAX and pdqpro routines is covered in more detail in [5]). Weight of this penalty is set by the parameter WOBJ.

The relaxation rate restraints (spt-file) and distance restraints can be used simultaneously. In this case, both penalties are added to the total energy of the system. The probabilities of the copies are calculated from the relaxation rates with pdqpro, and these probabilities are then used for the appropriate averaging of the distances.