RNAstructure Command Line Help
Rsample and Rsample-smp

Rsample calculates the partition function for a sequence that has multiple conformations, using experimental restraints. A complete Rsample calculation has three steps:
  1. Run Rsample, using the options listed below, to produce a Partition Save File (PFS)
  2. Run stochastic, using this PFS file as input, to produce a CT file with a Boltzmann ensemble of 1,000 structures.
  3. Pass the CT file from step 2 as a command line argument (on Linux and macOS) to the R script RsampleCluster.R (which optionally depends on calcdistf90.R and calcdist.f90). This script uses the algorithm of Ding and Lawrence to calculate the optimal number of clusters and their centroids. In addition to R, it requires the R clustering package fpc, which can be installed by typing install.packages("fpc") inside R.
Rsample-smp is a parallel processing version for use on multi-core computers, built using OpenMP.

USAGE: Rsample <seq file> <restraints file> <pfs file> [options]

OR: Rsample-smp <seq file> <restraints file> <pfs file> [options]
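
The three steps above can be sketched as a short Python helper that builds each command line in order. This is a minimal sketch, not part of RNAstructure: the function names and the file names are hypothetical, the argument order for stochastic (PFS file, then CT file) is assumed from that program's own help page, and invoking RsampleCluster.R through Rscript with the CT file as its only argument is an assumption based on step 3.

```python
import subprocess


def rsample_pipeline(seq_file, restraints_file, prefix, rsample_exe="Rsample"):
    """Return the three workflow commands as argument lists (hypothetical helper)."""
    pfs_file = f"{prefix}.pfs"
    ct_file = f"{prefix}.ct"
    return [
        # Step 1: partition function with experimental restraints -> PFS file.
        [rsample_exe, seq_file, restraints_file, pfs_file],
        # Step 2: stochastic sampling from the PFS file -> CT file.
        # Assumed argument order: <pfs file> <ct file>.
        ["stochastic", pfs_file, ct_file],
        # Step 3: cluster the sampled structures (assumed Rscript invocation).
        ["Rscript", "RsampleCluster.R", ct_file],
    ]


def run_pipeline(commands):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch works
    # even where RNAstructure is not installed.
    for cmd in rsample_pipeline("example.seq", "example.shape", "example"):
        print(" ".join(cmd))
```

Passing rsample_exe="Rsample-smp" switches step 1 to the parallel version without changing the other steps.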

Required parameters:

<seq file> The name of a sequence file containing input data.
Note that lowercase nucleotides are forced single-stranded in structure prediction.
<restraints file> Specify a file with experimental restraints. SHAPE data file format should be used.
<pfs file> The name of a binary partition function save file to which output will be written.
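
As an illustration of preparing the restraints file, the sketch below writes reactivities in the two-column layout described in the RNAstructure file format documentation for SHAPE data: nucleotide number, then reactivity, with a value below -500 marking a nucleotide that has no data. The function names and the choice of -999 as the missing-data value are assumptions made for this example.

```python
def format_shape_lines(reactivities, missing=-999.0):
    """Format per-nucleotide reactivities as 'index<TAB>value' lines.

    `reactivities` is a list ordered from nucleotide 1; use None for
    nucleotides without data (written as `missing`, assumed to be < -500).
    """
    lines = []
    for index, value in enumerate(reactivities, start=1):
        if value is None:
            value = missing
        lines.append(f"{index}\t{value}")
    return lines


def write_shape_file(path, reactivities, missing=-999.0):
    """Write the formatted lines to `path`, one nucleotide per line."""
    with open(path, "w") as handle:
        handle.write("\n".join(format_shape_lines(reactivities, missing)) + "\n")
```

For example, write_shape_file("example.shape", [0.12, None, 1.5]) produces a three-line file in which the second nucleotide is marked as having no data.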

Options that do not require added values:

--DMS Specify that the data are DMS rather than SHAPE (the default). DMS data are calibrated to in vivo DMS-MaP experiments with SSU rRNA. See Kumar et al.
-d, -D, --DNA Specify that the sequence is DNA, and DNA parameters are to be used.
Default is to use RNA parameters.
-h, -H, --help Display the usage details message.
-v, -V, --version Display version and copyright information for this interface.

Options that require added values:

-a, -A, --alphabet Specify the name of a folding alphabet and associated nearest neighbor parameters. The alphabet is the prefix for the thermodynamic parameter files, e.g. "rna" for RNA parameters, "dna" for DNA parameters, or the name of a custom extended/modified alphabet. The thermodynamic parameter files must reside at the location indicated by the environment variable DATAPATH.
The default is "rna" (i.e. use RNA parameters). This option overrides the --DNA flag.
-c, --cparam Specify a C parameter used in Rsample calculations.
Default value derived for SHAPE experiments is 0.5 kcal/mol.
--MAX Specify a maximum reactivity value for data. This is 1000 by default. Using in vivo DMS-MaP experiments, we found that a value of 5 improved secondary structure prediction accuracy. The default (effectively no maximum) should work well for SHAPE data.
-md, --maxdistance Specify a maximum paired distance between nucleotides.
Default is no restriction on distance between pairs.
-ns, --numsamples Specify the number of samples for the stochastic sampling calculation used in Rsample.
Default is 10,000.
-O, --offset Specify an offset parameter used in Rsample calculations.
Default value derived for SHAPE experiments is 1.1 kcal/mol.
-rPE, --reactPairedEnd Give the full path to a file with the end-of-helix paired nucleotide reactivities dataset.
Default values (from SHAPE experiments) are in the rsample directory in DATAPATH.
-rPM, --reactPairedMiddle Give the full path to a file with the middle-of-helix paired nucleotide reactivities dataset.
Default values (from SHAPE experiments) are in the rsample directory in DATAPATH.
-rUP, --reactUnpaired Give the full path to a file with the unpaired nucleotide reactivities dataset.
Default values (from SHAPE experiments) are in the rsample directory in DATAPATH.
-s, --seed Specify a random seed.
Default is to set random seed from current time.
-t, -T, --temperature Specify the temperature, in Kelvin, at which the calculation takes place.
Default is 310.15 K, which is 37 degrees C.
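
The sketch below shows how a few of the options above might be combined into one command line. It is illustrative only: the function names and file names are hypothetical, and only flags listed on this page are used. The temperature is converted with K = °C + 273.15, which gives the documented default of 310.15 K for 37 degrees C.

```python
def celsius_to_kelvin(celsius):
    """Convert degrees Celsius to Kelvin, as expected by --temperature."""
    return round(celsius + 273.15, 2)


def build_rsample_command(seq_file, restraints_file, pfs_file, *,
                          dms=False, max_reactivity=None,
                          temperature_c=None, seed=None,
                          executable="Rsample"):
    """Build an Rsample argument list from optional settings (hypothetical helper)."""
    cmd = [executable, seq_file, restraints_file, pfs_file]
    if dms:
        # Treat the restraints as DMS data instead of SHAPE.
        cmd.append("--DMS")
    if max_reactivity is not None:
        # Maximum reactivity value; this page reports 5 for in vivo DMS-MaP data.
        cmd += ["--MAX", str(max_reactivity)]
    if temperature_c is not None:
        cmd += ["--temperature", str(celsius_to_kelvin(temperature_c))]
    if seed is not None:
        # A fixed seed replaces the default time-based seed.
        cmd += ["--seed", str(seed)]
    return cmd


if __name__ == "__main__":
    # DMS data with the maximum reactivity of 5 mentioned above.
    print(" ".join(build_rsample_command(
        "example.seq", "example_dms.txt", "example.pfs",
        dms=True, max_reactivity=5, seed=42)))
```

Leaving an argument at None omits the corresponding flag, so Rsample falls back to the default documented above.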

Notes for smp:

Rsample-smp, by default, will use all available compute cores for processing. The number of cores used can be controlled by setting the OMP_NUM_THREADS environment variable.
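
One way to limit the number of cores when launching Rsample-smp from a script is to set OMP_NUM_THREADS only in the child process environment. The helper below is a minimal sketch; its name and the file names in the usage note are hypothetical.

```python
import os


def smp_environment(num_threads, base_env=None):
    """Return a copy of the environment with OMP_NUM_THREADS set.

    The caller's environment (os.environ by default) is copied, not modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(num_threads)
    return env
```

The result can be passed to a process launcher, for example subprocess.run(["Rsample-smp", "example.seq", "example.shape", "example.pfs"], env=smp_environment(4), check=True), which would restrict that run to four threads.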

References

  1. Reuter, J.S. and Mathews, D.H.
    "RNAstructure: software for RNA secondary structure prediction and analysis."
    BMC Bioinformatics, 11:129. (2010).
  2. Spasic, A., Assmann, S.M., Bevilacqua, P.C. and Mathews, D.H.
    "Modeling RNA secondary structure folding ensemble using SHAPE mapping data."
    Nucleic Acids Research, 46:4883 (2018).
  3. Kumar, J., Lackey, L., Waldern, J.M., Dey, A., Mathews, D.H., and Laederach, A.
    "Quantitative integration of RNA structure and splicing elements to explain alternative splicing of Microtubule-Associated Protein Tau gene."
    In Preparation.
  4. Ding, Y. and Lawrence, C.E.
    "A statistical sampling algorithm for RNA secondary structure prediction." 
    Nucleic Acids Research, 31:7280 (2003).