RNAstructure Command Line Help
Fold and Fold-smp

Fold is used to predict the lowest free energy structure and a set of suboptimal structures, i.e. low free energy structures, using a variety of constraints. Fold-smp is a parallel processing version for use on multi-core computers, built using OpenMP.

USAGE: Fold <seq file> <ct file> [options]

OR: Fold-smp <seq file> <ct file> [options]
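
A minimal sketch of how the usage line might be assembled from a script. The helper name fold_command and the file names are hypothetical, and the sketch assumes the Fold executable is on the PATH. It only builds the argument list; nothing is executed.

```python
def fold_command(seq_file, ct_file, *options, smp=False):
    """Assemble an argument list matching: Fold <seq file> <ct file> [options]."""
    program = "Fold-smp" if smp else "Fold"
    # Options are passed through in order, converted to strings.
    return [program, seq_file, ct_file, *[str(opt) for opt in options]]


# Hypothetical file names, for illustration only.
cmd = fold_command("example.seq", "example.ct", "-m", 5, "-p", 10)
print(" ".join(cmd))

# To run it (requires RNAstructure to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when file names contain spaces.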

Required parameters:

<seq file> The name of a sequence file containing input data.
Note that lowercase nucleotides are forced single-stranded in structure prediction.
<ct file> The name of a CT file to which output will be written.

Options that do not require added values:

-d, -D, --DNA Specify that the sequence is DNA, and DNA parameters are to be used.
Default is to use RNA parameters. (Note that this is superseded by --alphabet, and that --alphabet DNA also invokes DNA parameters.)
--disablecoax Specify that coaxial stacking recursions should not be used. This option uses a less realistic energy function in exchange for a faster calculation.
-h, -H, --help Display the usage details message.
-i, -I, --isolated Allow isolated base pairs. The default is to use a heuristic to forbid isolated base pairs. The heuristic forbids a pair i-j if both the (i+1)-(j-1) and (i-1)-(j+1) pairs are non-canonical.
-k, --bracket Write the predicted structure in dot-bracket notation (DBN) instead of CT format.
-mfe, -MFE, --MFE Specify that only the minimum free energy structure is needed. No save files can be generated, and the -p and -m options are ignored. This saves nearly half the calculation time, but provides less information.
-q, --quiet Suppress unnecessary output. This option is implied when the output file is '-' (STDOUT).
-v, --version Display version and copyright information for this interface.
-y, -Y, --simple_iloops Specify that the O(N^3) internal loop search should be used. This speeds up the calculation if large internal loops are allowed using the -l option.
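
The isolated-pair heuristic described for the -i option can be sketched as follows. This is an illustration only, not the Fold implementation: it assumes an uppercase RNA sequence, treats AU, GC, and GU as the canonical pairs, and ignores other rules such as minimum hairpin loop size. The function names are hypothetical.

```python
# Assumed set of canonical RNA pairs (AU, GC, GU), in both orientations.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def is_canonical(seq, i, j):
    """True if 1-based positions i < j both exist and form a canonical pair."""
    if i < 1 or j > len(seq) or i >= j:
        return False
    return (seq[i - 1], seq[j - 1]) in CANONICAL


def forbidden_as_isolated(seq, i, j):
    """True if neither the inner (i+1, j-1) nor the outer (i-1, j+1) pair is canonical."""
    inner = is_canonical(seq, i + 1, j - 1)
    outer = is_canonical(seq, i - 1, j + 1)
    return not inner and not outer


# The 1-7 pair can stack on the 2-6 G-C pair, so it is not isolated.
print(forbidden_as_isolated("GGAAACC", 1, 7))
# Here positions 2 and 6 are A and A, and there is no outer pair.
print(forbidden_as_isolated("GAAAAAC", 1, 7))
```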

Options that require added values:

-a, -A, --alphabet Specify the name of a folding alphabet and associated nearest neighbor parameters. The alphabet is the prefix for the thermodynamic parameter files, e.g. "rna" for RNA parameters, "dna" for DNA parameters, or the name of a custom extended/modified alphabet. The thermodynamic parameters must reside at the location indicated by the DATAPATH environment variable.
The default is "rna" (i.e. use RNA parameters). This option overrides the --DNA flag.
-boot, -B, --bootstrap Specify the number of bootstrap iterations to be done to retrieve base pair confidence.
Defaults to no bootstrapping.
-c, -C, --constraint Specify a folding constraints file to be applied. For Fold, the currently supported constraints are forced pairs, forcing a nucleotide to be single stranded, and forcing a nucleotide to be double stranded.
Default is to have no constraints applied.
Note: Constraints should be added with caution. The current folding algorithm does not allow base pairs that cannot stack on adjacent pairs. So constraining a pair effectively also requires that at least one adjacent pair can be formed.
-cmct, -CMC, --CMCT Specify a CMCT constraints file to be applied. These constraints are pseudoenergy constraints.
Default is to have no constraints applied.
-dms, -DMS, --DMS Specify a file with normalized DMS reactivity data. These data will be applied as a pseudoenergy restraint. The data are specified in a file using the SHAPE data file format.
-dsh, -DSH, --DSHAPE Specify a differential SHAPE data file to be used to generate restraints in addition to SHAPE restraints specified by --SHAPE. These restraints are applied as SHAPE pseudoenergy restraints with an offset of zero. The pseudo free energy for nucleotide i is calculated as (differential slope) * (differential SHAPE for nucleotide i). These pseudoenergies are added to those generated with the --SHAPE option.
Default is no differential SHAPE data file specified.
-dsm, -DSM, --DSHAPEslope Specify a slope used with differential SHAPE restraints.
Default is 2.11 kcal/mol.
-dso, -DSO, --doubleOffset Specify a double-stranded offset file, which adds energy bonuses to particular double-stranded nucleotides.
Default is to have no double-stranded offset specified.
-l, -L, --loop Specify the maximum number of unpaired nucleotides in an internal or bulge loop.
Default is 30 unpaired nucleotides.
-m, -M, --maximum Specify a maximum number of structures. Note that suboptimal structures are generated until either the maximum number of structures is reached or the maximum percent difference is reached (see -p below).
Default is 20 structures.
-md, -MD, --maxdistance Specify a maximum pairing distance; that is, the maximum number of bases between the two nucleotides in a pair.
Default is no restriction on the distance between pairs.
--name Specify a name for the sequence. This will override the name in the sequence file.
-p, -P, --percent Specify a maximum percent difference in folding free energy change for generating suboptimal structures. For example, 20 would indicate 20%.
Default is determined by the length of the sequence.
-s, -S, --save Specify the name of a save file, needed for dot plots and refolding.
Default is not to generate a save file.
-sh, -SH, --SHAPE Specify a SHAPE data file to be used to generate restraints. The data are applied as SHAPE pseudoenergy restraints.
Default is no SHAPE data file specified.
-si, -SI, --SHAPEintercept Specify an intercept used with SHAPE restraints.
Default is -0.6 kcal/mol.
-sm, -SM, --SHAPEslope Specify a slope used with SHAPE restraints.
Default is 1.8 kcal/mol.
-sso, -SSO, --singleOffset Specify a single-stranded offset file, which adds energy bonuses to particular single-stranded nucleotides.
Default is to have no single-stranded offset specified.
-t, -T, --temperature Specify the temperature, in Kelvin, at which the calculation takes place.
Default is 310.15 K, which is 37 degrees C.
-usi, -USI, --unpairedSHAPEintercept Specify an intercept used with single-stranded SHAPE constraints.
Default is 0 kcal/mol.
-usm, -USM, --unpairedSHAPEslope Specify a slope used with single-stranded SHAPE constraints.
Default is 0 kcal/mol.
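-w, -W, --window Specify a window size, which controls how different the suboptimal structures must be from one another. A larger window requires the structures to differ more.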
Default is determined by the length of the sequence.
--warnings, --warn Set the behavior for non-critical warnings (e.g. related to invalid nucleotide positions or duplicate data points in SHAPE data). Valid values are:
* on -- Warnings are written to standard output. (default)
* err -- Warnings are sent to STDERR. This can be used in automated scripts, etc., to detect problems.
* off -- Do not display warnings at all (not recommended).
-x, -X, --experimentalPairBonus Specify a text file with a two-dimensional matrix of bonuses (in kcal/mol) to apply to each residue pair, as might be obtained from a mutate/map measurement. The matrix must have the same number of rows and columns as the target RNA. The bonus is applied once to edge base pairs and twice to internal base pairs.
Default is no experimental pair bonus file specified.
-xo Specify an intercept (overall offset) to use with the 2D experimental pair bonus file.
Default is 0.0 (no change to input bonuses).
-xs Specify a number to multiply the 2D experimental pair bonus matrix.
Default is 1.0 (no change to input bonuses).
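
The SHAPE and differential SHAPE pseudoenergies described above can be sketched numerically. The differential term follows the linear form given for --DSHAPE. The logarithmic form used for the SHAPE term, slope * ln(reactivity + 1) + intercept, is an assumption taken from reference 2 and is not stated on this page. The function names are hypothetical, and the defaults mirror the -sm, -si, and -dsm defaults listed above.

```python
import math


def shape_energy(reactivity, slope=1.8, intercept=-0.6):
    """Pseudo free energy (kcal/mol) for one nucleotide from SHAPE data.

    Assumed form (reference 2): slope * ln(reactivity + 1) + intercept.
    """
    if reactivity <= -1.0:
        raise ValueError("reactivity must be greater than -1")
    return slope * math.log(reactivity + 1.0) + intercept


def differential_shape_energy(diff_reactivity, slope=2.11):
    """Pseudo free energy (kcal/mol) from differential SHAPE; the offset is zero."""
    return slope * diff_reactivity


def total_pseudoenergy(reactivity, diff_reactivity):
    """Sum of the two terms, since the pseudoenergies are added together."""
    return shape_energy(reactivity) + differential_shape_energy(diff_reactivity)


# A reactivity of zero leaves only the intercept in the SHAPE term.
print(round(shape_energy(0.0), 3))
print(round(differential_shape_energy(0.5), 3))
print(round(total_pseudoenergy(0.0, 0.5), 3))
```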

Notes about using DMS data:

There are two ways to use DMS reactivity to improve the accuracy of secondary structure prediction:

  1. The first approach, explained in reference 3, is better when the DMS data are analyzed qualitatively. In this approach, the DMS-reactive nucleotides are specified as nucleotides accessible to chemical modification in a constraint file. The constraint file is then read using the -c option.
  2. The second approach is better when the DMS reactivities are quantified. This approach requires specification of normalized reactivities, using a SHAPE data file format. This file is then read using the -dms option. This approach is explained in reference 5.
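
The two approaches differ only in the option used to supply the data. A minimal sketch of the corresponding argument lists follows; all file names are hypothetical and nothing is executed.

```python
# Hypothetical file names, for illustration only.
sequence, output = "example.seq", "example.ct"

# Approach 1: DMS-reactive nucleotides listed in a constraint file (-c).
qualitative_cmd = ["Fold", sequence, output, "-c", "dms_constraints.con"]

# Approach 2: normalized DMS reactivities in SHAPE data file format (-dms).
quantitative_cmd = ["Fold", sequence, output, "-dms", "dms_reactivities.txt"]

for cmd in (qualitative_cmd, quantitative_cmd):
    print(" ".join(cmd))
```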

Notes for smp:

Fold-smp, by default, will use all available compute cores for processing. The number of cores used can be controlled by setting the OMP_NUM_THREADS environment variable.
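
A minimal sketch of limiting the core count when launching Fold-smp from a script. The helper name smp_environment and the file names are hypothetical. The environment is only prepared here; the actual call is left as a comment because it requires RNAstructure to be installed.

```python
import os


def smp_environment(num_threads, base_env=None):
    """Copy an environment mapping and set OMP_NUM_THREADS for Fold-smp."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(num_threads)
    return env


env = smp_environment(4)
print(env["OMP_NUM_THREADS"])

# To run with at most four cores:
#   import subprocess
#   subprocess.run(["Fold-smp", "example.seq", "example.ct"], env=env, check=True)
```

Copying the environment rather than modifying os.environ keeps the setting local to the one Fold-smp call.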

References:

  1. Reuter, J.S. and Mathews, D.H.
    "RNAstructure: software for RNA secondary structure prediction and analysis."
    BMC Bioinformatics, 11:129. (2010).
  2. Deigan, K.E., Li, T.W., Mathews, D.H. and Weeks, K.M.
    "Accurate SHAPE-directed RNA structure determination."
    Proc. Natl. Acad. Sci. U.S.A., 106:97-102. (2009).
  3. Mathews, D.H., Disney, M.D., Childs, J.L., Schroeder, S.J., Zuker, M. and Turner, D.H.
    "Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of RNA secondary structure."
    Proc. Natl. Acad. Sci. U.S.A., 101:7287-7292. (2004).
  4. Mathews, D.H., Sabina, J., Zuker, M. and Turner, D.H.
    "Expanded sequence dependence of thermodynamic parameters provides improved prediction of RNA secondary structure."
    J. Mol. Biol., 288:911-940. (1999).
  5. Cordero, P., Kladwang, W., VanLang, C.C. and Das, R.
    "Quantitative dimethyl sulfate mapping for automated RNA secondary structure inference."
    Biochemistry, 51:7037-7039. (2012).