Rsample calculates the partition function for a sequence that has
multiple conformations, using experimental mapping data as restraints.
A complete Rsample calculation consists of the following steps:
- Run Rsample, using the options listed below, to produce a partition function save file (PFS).
- Run stochastic, using this PFS file as input, to produce a CT
file containing a Boltzmann ensemble of 1,000 structures.
- Read the CT file from the previous step (as a command line argument on
Linux and macOS) with the R script RsampleCluster.R (which optionally depends on calcdistf90.R and calcdist.f90).
This script uses the algorithm of Ding and Lawrence to
calculate the optimal number of clusters and their
centroids. In addition to R, it requires the R clustering package fpc, which can be
installed by typing install.packages("fpc") inside R.
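The three steps above can be sketched as a small Python helper that only assembles the command lines. This is a minimal sketch, not part of RNAstructure: the function name `rsample_pipeline`, the output file names, the positional argument order for stochastic (PFS file, then CT file), and the use of `Rscript` to pass the CT file to RsampleCluster.R are all assumptions.

```python
import shlex
from pathlib import Path


def rsample_pipeline(seq_file, restraints_file, workdir=".", smp=False):
    """Return the command lines of a full Rsample run, in execution order.

    Hypothetical helper: it builds argument lists and runs nothing.
    """
    stem = Path(seq_file).stem
    # Output names are illustrative; any writable paths would do.
    pfs = str(Path(workdir) / f"{stem}.pfs")
    ct = str(Path(workdir) / f"{stem}.ct")
    return [
        # Step 1: partition function restrained by the mapping data.
        ["Rsample-smp" if smp else "Rsample", seq_file, restraints_file, pfs],
        # Step 2: sample structures from the PFS (argument order assumed).
        ["stochastic", pfs, ct],
        # Step 3: cluster the sampled structures (Rscript call assumed).
        ["Rscript", "RsampleCluster.R", ct],
    ]


for cmd in rsample_pipeline("example.seq", "example.shape"):
    print(shlex.join(cmd))
```

Each list can be handed to `subprocess.run(cmd, check=True)` once the RNAstructure programs and R are on the PATH; keeping the commands as lists avoids shell quoting problems with file names that contain spaces.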
Rsample-smp is a parallel processing version for use on multi-core
computers, built using OpenMP.
USAGE: Rsample <seq file> <restraints file> <pfs file> [options]
OR: Rsample-smp <seq file> <restraints file> <pfs file> [options]
<seq file> |
The name of a sequence
file containing input data.
Note that lowercase nucleotides are forced single-stranded
in structure prediction. |
<restraints file> |
Specify a file with experimental restraints. SHAPE
data file format should be used. |
<pfs file> |
The name of a binary partition function save file to
which output will be written. |
--DMS |
Specify that the data are DMS rather than SHAPE (the default). DMS data are calibrated to in vivo DMS-MaP experiments with SSU rRNA. See Kumar et al. |
-d, -D, --DNA |
Specify that the sequence is DNA, and DNA parameters
are to be used.
Default is to use RNA parameters. |
-h, -H, --help |
Display the usage details message. |
-v, -V, --version |
Display version and copyright information for this
interface. |
-a, -A, --alphabet |
Specify the name of a folding alphabet and associated
nearest neighbor parameters. The alphabet is the prefix
for the thermodynamic parameter files, e.g. "rna" for RNA
parameters or "dna" for DNA parameters or a custom
extended/modified alphabet. The thermodynamic parameters
need to reside at the location indicated by the
environment variable DATAPATH.
The default is "rna" (i.e. use RNA parameters). This
option overrides the --DNA flag. |
-c, --cparam |
Specify a C parameter used in Rsample calculations.
Default value derived for SHAPE experiments is 0.5
kcal/mol. |
--MAX |
Specify a maximum reactivity value for data. This is 1000 by default. Using in vivo DMS-MaP experiments, we found that a value of 5 improved secondary structure prediction accuracy. The default (effectively no maximum) should work well for SHAPE data. |
-md, --maxdistance |
Specify a maximum paired distance between nucleotides.
Default is no restriction on distance between pairs. |
-ns, --numsamples |
Specify the number of samples for the stochastic sampling
calculation used in Rsample.
Default is 10,000. |
-O, --offset |
Specify an offset parameter used in Rsample
calculations.
Default value derived for SHAPE experiments is 1.1
kcal/mol. |
-rPE, --reactPairedEnd |
Give the full path to a file with the end-of-helix paired
nucleotide reactivities dataset.
Default values (from SHAPE experiments) are in the rsample
directory in DATAPATH. |
-rPM, --reactPairedMiddle |
Give the full path to a file with the middle-of-helix paired
nucleotide reactivities dataset.
Default values (from SHAPE experiments) are in the rsample
directory in DATAPATH. |
-rUP, --reactUnpaired |
Give the full path to a file with the unpaired nucleotide
reactivities dataset.
Default values (from SHAPE experiments) are in the rsample
directory in DATAPATH. |
-s, --seed |
Specify a random seed.
Default is to set the random seed from the current time. |
-t, -T, --temperature |
Specify the temperature, in Kelvin, at which the calculation
takes place.
Default is 310.15 K, which is 37 degrees C. |
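As an illustration of how the options above combine, the following sketch assembles an option list for in vivo DMS-MaP data and converts a temperature given in degrees Celsius to the Kelvin value that --temperature expects. The function `rsample_options` and its parameter names are hypothetical; only the flags themselves come from the table above.

```python
def rsample_options(dms=False, max_reactivity=None, temp_celsius=None,
                    num_samples=None, seed=None):
    """Build an Rsample option list (hypothetical helper, runs nothing).

    Options left as None are omitted so that Rsample's defaults apply.
    """
    opts = []
    if dms:
        opts.append("--DMS")
    if max_reactivity is not None:
        opts += ["--MAX", str(max_reactivity)]
    if temp_celsius is not None:
        # --temperature takes Kelvin: K = degrees C + 273.15.
        kelvin = round(temp_celsius + 273.15, 2)
        opts += ["--temperature", str(kelvin)]
    if num_samples is not None:
        opts += ["--numsamples", str(num_samples)]
    if seed is not None:
        opts += ["--seed", str(seed)]
    return opts


# DMS data with the maximum reactivity of 5 mentioned under --MAX,
# at 37 degrees C with a fixed seed.
print(rsample_options(dms=True, max_reactivity=5, temp_celsius=37, seed=1))
```

The resulting list is meant to be appended after the three positional file arguments. Fixing --seed is worthwhile when a run needs to be reproducible, since the default seed is taken from the current time.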
Rsample-smp, by default, will use all available compute cores
for processing. The number of cores used can be controlled by
setting the OMP_NUM_THREADS environment variable.
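A minimal sketch of limiting the core count from a Python launcher is shown here. The helper `smp_environment` is hypothetical; it only prepares an environment mapping in which OMP_NUM_THREADS is set, leaving the calling process's own environment untouched.

```python
import os


def smp_environment(num_threads, base_env=None):
    """Return a copy of the environment with OMP_NUM_THREADS set.

    Hypothetical helper: base_env defaults to the current environment.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = str(num_threads)
    return env


env = smp_environment(4)
print(env["OMP_NUM_THREADS"])
# To launch (requires Rsample-smp on the PATH; file names are examples):
# subprocess.run(["Rsample-smp", "example.seq", "example.shape",
#                 "example.pfs"], env=env, check=True)
```

Passing the mapping through the `env` argument of `subprocess.run` confines the setting to that one Rsample-smp process, which is convenient on a shared machine.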
- Reuter, J.S. and Mathews, D.H.
"RNAstructure: software for RNA secondary structure prediction
and analysis."
BMC Bioinformatics, 11:129. (2010).
- Spasic, A., Assmann, S.M., Bevilacqua,
P.C. and Mathews, D.H.
"Modeling RNA secondary structure folding ensemble using SHAPE
mapping data."
Nucleic Acids Research, 46: 4883 (2018).
- Kumar, J., Lackey, L., Waldern, J.M., Dey, A., Mathews, D.H., and Laederach, A.
"Quantitative integration of RNA structure and splicing elements to explain alternative splicing of Microtubule-Associated Protein Tau gene."
In Preparation.
- Ding, Y. and Lawrence, C.E.
"A statistical sampling algorithm for RNA secondary structure
prediction."
Nucleic Acids Research, 31:7280 (2003).
|