**mardigras** (*Matrix Analysis of Relaxation for DIscerning the Geometry of an
Aqueous Structure*)

is a FORTRAN program for calculating proton-proton distances from cross-peak intensities measured from a 2D NOE experiment.

**The** **general** **approach**

A model structure is used to generate a theoretical 2D NOE spectrum. Wherever
possible, experimental intensities are substituted into the theoretical
spectrum to yield a full spectrum (hybrid intensity matrix) suitable for
direct solution of the relaxation rate matrix **R** and, subsequently,
the distance matrix. Iterative steps require that (a) crossrelaxation
rates between protons with known interproton distance are not allowed
to change; (b) diagonal and offdiagonal elements are required to be consistent;
and (c) cross-relaxation rates corresponding to resolved and observed
cross-peaks only are allowed to change. This process incorporates all
the effects of network relaxation and multiple spin effects.

**Programs for structure elucidation by NMR**

The **mardigras** and **corma** programs are two separate programs
for structure elucidation by nmr. **newhyd** prepares a pdb file for
input to **corma.in**. If the user has the coordinates in the PDB format
for heavy atoms but not protons, it supplies the missing protons and their
coordinates; it is a stand alone program. **corma.in** prepares the
output pdb file from **newhyd** for input to **mardigras** or **corma**.
**mardigras** uses the output pdb file from **corma.in** for the
iterative relaxation matrix calculation.

**New** **features**

The programs **mardigras**, and **corma** have been modified to
incorporate (i) an improved computation of distances when averaging due
to motion or overlap of methyl groups, methylene proton pairs or symmetry-equivalent
aromatic ring protons exist, (ii) internal motions as modeled by a `model-free'
approach, and (iii) chemical exchange described by a kinetic matrix of
exchange rates. **corma.in** has also been modified to include internal
motion parameters.

**Calculation** **of** **distances** **for** **methyl,**
**methylene** **or** **symmetry****equivalent** **aromatic**
**protons.**

One of the most important additions to the **mardigras** or **corma**
algorithm lies in the calculation of distances involving methyl groups.
Methyl rotation can be modeled in one of four ways by **mardigras**
(see METHYL JUMP below under **INPUT**). For all models except model
0, distances are calculated iteratively to a pseudoatom at the geometric
center of the three methyl protons. Geometric parameters in models 1,
2 and 3 are taken from the input model coordinates and are assumed to
be correct. One should therefore have some degree of confidence in the
relative orientations of methyl groups in the input model before using
one of these more accurate methods of calculating distances to methyl
pseudoatoms. The iterative fitting procedure, then, is simply a one-parameter
(*viz*. distance) fit. (One could conceivably perform a multi-parameter
fit over methyl orientations as well, but this would not yield a unique
solution.) The advantage of this technique is that, when using **mardigras**
for structure refinement, methyl distances can be calculated very accurately,
and no additional uncertainty need be added to calculated distance bounds.
For model 0, inter-residue distances to methyls (and other pseudoatoms)
are calculated for the two extreme geometric cases. These extremes define
the upper and lower bounds on possible distances. This will prevent unwarranted
restraint on distances when the modelstructure geometry is unreliable.

A similar procedure is followed for distances to pseudoatoms which represent
either (1) a methylene proton pair or (2) symmetry-equivalent aromatic
ring protons. These protons are chemically equivalent, or sufficiently
close such that their cross-peaks are not resolvable. In both cases we
assume that motion is occurring which is slow relative to the overall
tumbling time (and thus does not contribute to relaxation) but fast relative
to the relaxation times. *Effective* pseudoatom distances are obtained
by *r*-6 averaging over the two actual distances. This approach is
rigorous for aromatic ring flip but not for methylene pairs. The equivalence
(often approximate) of methylene chemical shifts is not necessarily a
result of their mobility, although the assumption of motion may be better
for protein side-chains and other freely-rotatable methylenes. In some
cases, namely for constrained methylenes in DNA and Proline, it may be
a better approximation to treat them as static atoms. Note that this *r*-6
averaging reduces to the same interaction as the case for static methylene
protons in the limit of isolated spins. All of the error is introduced
in the term for transfer of magnetization between the two methylene protons.
In this model, magnetization is transferred perfectly, since the two atoms
are collapsed into a single pseudoatom. The result of this treatment will
be an error in the calculation involving local spin-diffusion effects.
The error is only in a single spin-pair, and the results of preliminary
tests indicate that the the overall error introduced is minor. Consider
a methylene pair near two other protons A and B. The error introduced
is at its maximum when the four protons are in line and at the closest
possible distance. The error is zero when the intra-methylene axis is
perpendicular to the A-B axis.

Nevertheless, the pseudoatom approach is the simplest reasonable model, since we have only a single experimental cross-peak and are justified in deriving only one distance. If the model geometry is correct, then we can extract the proper distances from the effective distance (using an iterative fitting procedure similar to that for methyls). If the model geometry is poor, we can at least calculate error bounds on that distance assuming extreme-case geometries. This is what is done. If methyl jump model 0 is chosen, the reported distance is the effective average distance accounting for all surrounding protons, with upper and lower bounds adjusted accordingly. Models 1, 2 and 3 assume that the model geometry should be taken to be correct.

**Model-free** **approach** **to** **internal** **motion**

When the details of internal motion are not known, the model-free approach
is useful. This model is now one of the options for the calculation of
the relaxation matrix. The spectral density *Jmodfree* for the model-free
model is defined in terms of the frequency C, anisotropy parameter *A*,
order parameter *S*2, overall correlation times I1 and I2, and
the internal correlation time I*e* as

*Jmodfree*(C,*A*,*S*2,I1,I2,I*e*)

=*AS*2*J*(I1,C)+(1-*A*)*S*2*J*(I2,C)+*A*(1-*S*2)*J*(I*p*,C)+(1-*A*)(1-*S*2)*J*(I*pp*,C)

where

*J*(I*n*,C)=I*n*/(1+(I*n*C)2)

I*p*=I*e*I1/(I*e*+I1)

and

I*pp*=I*e*I2/(I*e*+I2)

In the program the ratio I2/I1 is used. For I2/I1=1, or *A*=1,
*isotropic* *overall* *motion* is obtained. If the order
parameter *S*2 is set to 1, the *internal* *motion* is
ignored. If both *A* and *S*2 are set to one, the result is
*isotropic* *overall* *motion* *with* *no* *internal*
*motion*. The `model-free' approach is same as the well-known `wobble-in-a-cone'
model. (*Ref*.: Lipari, G. and Szabo, A. *J*. *Am*. *Chem*.
*Soc*. , **104**, 4546, 4559 (1982))

To use model free approach, the input PDB file has to be prepared using
**corma.in** with model free option. The modified PDB file contains
the key word and parameters that are required by model free approach.
The order parameters are not included in the PDB file because they are
defined for proton pairs, not for individual protons. A seperate input
file with the same format as the intensity file is required for the order
parameters. It is not necessary to have a complete list of order parameters
in the file. The default value is 1.0 for all proton pairs. Only those
order parameters which are different from 1.0 need to be included in the
order parameter file.

The order parameter may be assessed via molecular dynamics calculations
(*Ref*.: Koning, T.M.G. *et* *al*., *Biochemistry*,
**30**, 3787-3797 (1991)) or via experimental determination.(*Ref*.:
Lane, A.N., *J*. *Magn*. *Reson*., **78**, 425-439 (1988))
Of course, if the order parameter is less than one but identical for all
proton resonances, then the relative cross-peak intensities will not be
altered.

**Quality** **factors**

We added quality factors *Q*, *Q*2, *Qx* and *Q*2x
to **mardigras**/**corma** output. The quality *Q* factor is
described in J. M. Withka, J. Srinivasan and P. H. Bolton, *J*. *Magn*.
*Reson*. **98**, 611-617 (1992). The sixth root quality factors,
*Qx* and *Q*2x, are obvious extrapolations from sixth
root *R* factors.

**Chemical** **exchange** **described** **by** **kinetic**
**matrix** Exchange with bulk solvent protons and exchange between
two amino protons are take into account in the iterative procedure of
generating distances from a hybrid NOE intensity matrix. NOE intensities
of exchangeable protons are reduced by the exchange with solvent protons.
Neglect of such effects produces the upper bounds for the distances. Using
the upper limit of the exchange rates gives the lower bounds for the distances.
Of course, the actual values of the exchange rates can be utilized if
the data are available. When the two amino protons are unsolvable, a mean
distance is calculated iteratively from the relaxation rate based on *r*-6
averaging.

The relaxation matrix is modified by adding exchange matrices. The exchange
with solvent contributes to direct relaxation, and is described by a diagonal
matrix **K**. The element *Kii* equals exchange rates for exchangeable
protons, and zero for nonexchangeable protons. A second exchange matrix
**K'** is formed for proton pairs which exchange the positions. If
two protons, *i* and *j*, are exchanging positions with each
other with a rate *k*, then *K*'*ij* = *K*'*ji*
= -*k*, and *K*'*ii* = *K*'*jj* = *k*. Zero
is assigned to **K'** elements if proton pair exchange is not involved.
(See, J. Jeener, B. H. Meier, P. Bachmann and R. R. Ernst, *J*. *Chem*.
*Phys*. **71**, 4546 (1979).) The modified relaxation matrix is
then **R'** = **R**+**K**+**K'**.

Note that if the option ARBITRARY is specified then the modified relaxation
matrix is **R'** = **R**+**K** and the exchange matrix **K**
is read as is from the exchange file and is not modified in any manner..

**Peak** **Integration** **Errors**

The percentage errors in the intensities due to peak integrations may
be included as an additional column in the intensity file. The format
is free, but a key word ERROR% is required at the top of the column, see
example1.INT.1. Note, peak integration errors in the intensity file are
not used in **corma**. They are used in **mardigras**, together
with the signal to noise level given in the PARM file, to determine the
upper and lower distance bounds (in output file prefix.BNDS).

system-prompt:mardigras input-file

or if a file with name INP.PARM is present in your working directory.

system-prompt:mardigras

If the command line arguments are not allowed on your system, then the second form with the default input file name INP.PARM must be used.

**Parameters for mardigras**

The INP.PARM (default name for the *input*-*file*) or an equivalent
file is a required file. It contains the required and optional parameters
for mardigras. The following is an example of *input*-*file*
INP.PARM.example2 for mardigras:

is the INP.PARM.example2 input file for mardigras.
Comments may be put anywhere but must not begin with keywords. All keywords are upper
case. In the following the lines beginning with the keywords PDB, INT
and OUT are required.

1. The input pdb file prepared by corma.in: PDB FILE example2.pdb

2. The input intensity file:

INT FILE example2.INT.1

3. The prefix for the output files:

OUT FILE test52

4. The file containing the kinetic exchange rate matrix and is optional:

EXCHANGE RATE FILE example2.exch

5. The exchange rates are arbitrary.(optional specification): ARBITRARY

6. The spectrometer frequency in MHz.(optional specification, the default
is 500.0 MHz):

FREQUENCY 600.0

7. DELTA6D is the minimum value of the shift in the sixth root R factor.

(optional specification, the default is 0.0005): DELTA6D 0.0

8. R6THD is the minimum value of the sixth root R factor. (optional specification,
the default is 0.0005): R6THD 0.0

9. MINITN is the minimum number of iterations. (optional specification,
the default is 2): MINITN 3

10. MAXITN is the maximum number of iterations. (optional specification,
the default is 5): MAXITN 8

11. The NOISE level.

(optional specification, the default is 0): NOISE ABSOLUTE NORMALIZED
0.0025

12. The cross-peak intensity normalization. (optional specification):

NORMALIZE ALL

13. The methyl jump model.

(optional specification):

METHYL JUMP 3

As seen above the INP.PARM (default name for the *input*-*file*)
is a self-documenting file. However, we provide a more detailed description
of some of the contents of this file. As specified in this file, **mardigras**
requires the following files as input:

**Input****coordinate****file**in the PDB format A template file*example2.pdb*is provided. The format of this file is explained more fully in the**corma****4.0**program man pages. It must be preprocessed by**corma.in**.

**Input intensity file**, again in the same format as for**corma**. The template file is*example2.INT.1*. Refer to the**corma**man pages for more details.

**Input exchange rate file**, again in the same format as for**corma**.

- Input and output file names cannot exceed 20 characters. Only
one space must separate the words and numbers in the input-file. The coordinate
file must be specified with the character string:

- The experimental intensity file must be specified by the string:

The characters ".INT.1" **must** appear at the end of the
file name.

- The exchange rate file must be specified by the string:

The exchange rates are in [1/sec].

- The prefix for output intensity files must be specified by the
string

All output file names will begin with this string.

- The minimum number of iterations for
**mardigras**to cycle through. This number is entered as the string:

where n is an integer.

- The maximum number of iterations for
**mardigras**to cycle through. This number is entered as the string:

where n is an integer.

- An estimate of experimental noise level. This is taken to be
an absolute error in experimental intensities. When
**mar****digras**calculates errors in distances, it takes into account this absolute error as well as the difference in the fit between each experimental NOE intensity and the corresponding intensity calculated from the converged relaxation matrix.

NOISE ABSOLUTE NORMALIZED 0.0025

The keyword ABSOLUTE indicates that the noise is an absolute, and not a relative quantity. The keyword NORMALIZED means that this value has been normalized so that the 2D NOE intensity of a diagonal peak extrapolated to mixing time 0 is equal to 1.0. So in this case, the noise was estimated to be 0.25% of the diagonal peak intensity at mixing time 0.

NOISE ABSOLUTE UNNORMALIZED 11000.

The keyword UNNORMALIZED means that the value is in the same units as the input experimental intensities. This value, then, will depend on the integration routine used to calculate peak volumes. This is probably the most useful option, since one can get at least a reasonable estimate of the spectral noise level by looking at the size of the smallest observable cross-peak. The noise level is some fraction of this smallest peak intensity.

NOISE RELATIVE PERCENT 1.0

The keyword RELATIVE means that the noise should be considered as a constant percentage of the cross-peak intensity. That is, the error in a given cross-peak is proportional to its magnitude. This is rarely applicable.

- How
**mardigras**should normalize the input experimental intensities relative to the calculated absolute values. There are three choices:

NORMALIZE ALL

The keyword NORMALIZE means that the experimental intensities need to
be normalized. The keyword ALL means that the normalization factor should
be computed from the sums over *all* cross-peaks which have observed
experimental intensities as follows:

*fnorm* = ||||R *Iexpt*i*j*R *Icalc*i*j*_______|||
This is the recommended option when the starting model is reasonably good.

NORMALIZE FIXED

The keyword FIXED tells **mardigras** to perform the normalization
summation only over those experimental intensities which correspond to
fixed distances, e.g., methylene proton distances and aromatic ring proton
distances. This option is recommended when the starting model is very
poor, e.g., an extended chain structure of a protein.

- How
**mardigras**should treat methyl protons. The treatment of methyl protons is specified by

where n is an integer. The keyword METHYL refers to treatment of methyl rotors. The keyword JUMP indicates that the methyl rotors should be treated as rotating fast compared to the overall molecular tumbling rate. The keyword JUMP must be followed by an integer from 0 to 3. Each value corresponds to a different method of calculating distances to a methyl pseudoatom. In all cases the distance is calculated to a pseudoatom which is the geometric mean position of the three methyl protons. This distance is found by an iterative fitting procedure, and therefore requires no additional adjustment to distance bounds.

JUMP 0: This model calculates intra-residue distances using a 3-site jump model. It assumes that the geometry of the residue itself as given in the model PDB file is essentially correct regardless of the final conformation of the molecule. Therefore, the geometric dependence of methyl relaxation can be calculated for intra-residue NOE cross-peaks. It assumes, however, that absolutely nothing else is known about the molecular conformation. In this case, the methyl orientation relative to any other group is not known, and the distance is calculated according to a simple isotropicmotion only model. This distance has been shown to be very close to the value obtained by averaging over all possible methyl-group orientations. Thus we recommend using this model when the starting model-structure is very poor.

JUMP 1: This model calculates intra-residue distances using the 3-site model as above. Inter-residue distances are calculated according to an 18-site jump, which is a good approximation to free rotational diffusion. While this model is clearly not appropriate for methyl rotation, where the staggered conformation is highly favored, it is probably the best model to use when the model is of intermediate quality. In these cases, approximate relative positions are known. The model angle between the methyl plane of rotation and the other protons is therefore a better approximation than averaging over all possible angles. Nevertheless, the model may not be accurate enough to consider the actual distribution of protons in the plane to be accurate. In this case, it makes sense to average over all possible positions of the methyl protons in the circle described by their rotation-- i.e we should use free rotational diffusion.

JUMP 2: This model calculates both intra- and interresidue methyl distances according to an 18-site jump model. This model is appropriate when the starting model does not have methyl protons placed consistently in energetically favorable (i.e. generally eclipsed) positions. A three-site model assumes that the protons spend all of their time in one of three allowed positions. If this assumption is not appropriate, we should average over all possible positions (wellapproximated by an 18-site model).

JUMP 3: This model calculates both intra- and interresidue methyl distances according to a 3-site jump model. This is the most realistic and precise model, and is recommended when the starting model-structure is good.

- The input parameter lines described above can be entered in
any order into the file INP.PARM. The maximum number of lines allowed
in this file is to 99.

*prefix*.*DSTn* prints out all distances calculated for iteration
cycle *n*. It prints out experimental and calculated intensities
for comparison, as well as a comparison of model and **mardigras**
distances. Because it is fairly timeconsuming, iterative distance fits
for methyl distances are only calculated for the final cycle, and reported
distances for earlier cycles are simple isotropic-motion-only estimates.

*prefix*.*OUT*prints out various informational messages.Particularly important are methylene and aromatic ring pseudoatoms found that have been recognized in the input intensity file. Very importantly, this file reports when an atom name in the input intensity file is not recognized. This is often indicative of a nomenclature difference between the PDB file and the intensity file or of a failure to follow pseudoatom name-formation rules (see**corma**program man pages). This file also contains the proton pairs which have been classified as fixed-distance due to connectivity constraints (for a list of possible fixed distances, see the file*constrain.dat*).

*prefix*.*CNSTRNT* prints out parameters for constraints in

- molecular dynamics simulations.These consist of distances of minimum
energy and the force constant for a parabolic potential pseudoenergy curve
for deviations from this distance. The smaller the estimated error in
distance, the larger the value of this constant. The constant is currently
scaled to a yield an energy of 1/2kT when the displacement is equal to
the estimated distance error. Preliminary results suggest that this is
probably too high. Alternatively, a flat well might be allowed over the
region of uncertainty. The format for this file is compatible with AMBER.

*prefix*.*ERRORS* prints out any errors which may have occurred
during the run. Especially importantly, **mardigras** prints out a
list of any proton pairs which have observed intensities, but to which
**mardigras** was unable to assign a new relaxation rate (and hence
a new distance). This seems to occur sometimes when the model is poor
over a region which is underdetermined by experimental data-- the resulting
inconsistencies may cause physically meaningless rates to appear upon
back-transformation from intensities. After the initial efforts at structure
refinement, an improved starting model may be available which can eliminate
many from this errors list.

*prefix*.*DSTREJ* and *prefix*.*BNDSREJ* (used to
be called *prefix*.*DGREJ*) which print out rejected distances.
Rejected distances are those for which the relaxation rate is very small
-- corresponding to about 5 A -- and thus the distances have high uncertainty.

Lacking an X-ray or homologous model structure, begin with an extended-chain
structure. Run **mardigras** using methyl-jump model 0 (see above)
to calculate distances. These distances are still much more accurate than
those derived using the isolated spin-pair approximation. Put these distances,
allowing for fairly large deviations estimated by **mardigras**, into
a distance geometry algorithm to generate a family of possible structures.
Then use several of these structures as starting models for a second cycle
through **mardigras**, this time choosing either methyl-jump model
1 or 3. The spread of **mardigras** distances obtained for independent
runs will reveal the dependence of the solution on starting model, and
will generally be quite small, though less so if the data are inaccurate
or misassigned or grossly incomplete. At this point one can generate more
distance geometry structures or refine the earlier structures using molecular
dynamics.

**mardigras** permits a logical means to calculate bounds, which are
needed for distance geometry calculations or setting flat well size in
restrained molecular dynamics calculations. There are various considerations
which can be used to establish those bounds. For one thing, the distance
"error bar" W*rij* can be estimated from

(1) the error W*aij* calculated between an observed 2D NOE cross-peak
intensity *aij* and the corresponding MARDIGRASconverged matrix element
intensity, or (2) the minimum error dictated by the experimental noise
level since, using the two-spin assumption,

*rij* ~~ (*k*I*m*)619/*aij*619 W*rij*
~~ -(*k*I*m*)619. W*aij*/6*aij*67

where the constant *k* incorporates physical constants and the correlation
time.

(3) distances calculated from spectra at different mixing times.

(4) distances calculated using different starting structures.

(5) in the case of motional or overlap averaging, worst case geometries enable calculation of maximum and minimum distances.

(6) distances involving exchangeable protons (imino, amino, amide) an upper limit on the exchange rate with bulk water modifies the lower bound.

This procedure for using **mardigras** is, of course, just a general
suggestion based on our experience with the program. It is a relatively
simple computational method for generating molecular structures (for roughly
globular and globally rather rigid molecules) from NOE data. The primary
assumptions behind **mardigras** are (1) that the overall tumbling
is isotropic (prolate ellipsoids with eccentricity of less than 2.5 should
be fine), and (2) that, aside from small local motions such as methyl
rotation and aromatic ring flip, the molecule is rigid and interproton
distances can be described by a single number. However, by using the model-free
approach the effects of anisotropy and flexibility of the molecule can
also be taken into account.

Paul D. Thomas

He Liu

Anil Kumar

Borgias and T. L. James, *Meth*. *Enzymol*. **176** (1989)
with features described in B.A. Borgias, M. Gochin, D.J. Kerwood, and
T.L. James, In *Progress* *in* *Nuclear* *Magnetic*
*Resonance* *Spectroscopy*, J.W. Emsley, J. Feeney, and L.H.
Sutcliffe (Eds), Oxford: Pergamon Press, **22**, 83-100 (1990), T.
L. James, *Curr*. *Opin*. *Struct*. *Biol*., **1**,
1042-1053, (1991). The R factor and the sixth root R factor are considered
in P.D. Thomas, V.J. Basus and T.L. James. *Proc*. *Nat*. *Acad*.
*Sci*. *USA*, **88**, 1237-1241 (1991). The description for
averaging of pseudoproton relaxation rates and distances can be found
in H. Liu, P. D. Thomas and T. L. James, *J*. *Magn*. *Reson*.
**98**, 163-175 (1992). Effect of internal motion on the interproton
distances is considered in A. Kumar, T. L. James, and G. C. Levy, *Israel*
*J*. *Chem*. **32**, 257-261 (1992). Application to the chemical
exchange problems is described in H. Liu, A. Kumar, K. Weisz, U. Schmitz,
K. D. Bishop, and T. L. James, *J*. *Am*. *Chem*. *Soc*.,
**115**, 1590 (1993).

Thomas James

Department of Pharmaceutical Chemistry

University of California

San Francisco, CA 94143-0446

fax: 415.476.8780

e-mail james@picasso.ucsf.edu

back to the software page