|
CycleFold predicts both canonical and non-canonical base pairs using nucleotide cyclic motifs (Parisien & Major, 2008). It predicts either lowest free energy structures or base pairing probabilities.
By default, the minimum free energy structure is predicted. The options below can switch the calculation to predicting base pair probabilities, or to using predicted base pair probabilities to predict the maximum expected accuracy structure. If multiple sequences are entered, TurboFold mode can also be specified. All output is written to standard output.
CycleFold requires that the environment variable CYCLEFOLD_DATAPATH be set to the location of the data tables, which reside in RNAstructure/CycleFold/datafiles.
USAGE: CycleFold <seq file> [options]
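As a minimal sketch of the usage line above, the following Python helper assembles a CycleFold command line together with an environment in which CYCLEFOLD_DATAPATH is set. The helper names `build_cyclefold_call` and `run_cyclefold`, the file name `example.fasta`, and the relative data path are hypothetical; only the executable name, the environment variable, and the `-p` flag come from this page. The sketch also assumes the CycleFold executable is on the PATH.

```python
import os
import subprocess


def build_cyclefold_call(seq_file, datapath, options=()):
    """Return (argv, env) for one CycleFold run.

    Hypothetical helper, not part of RNAstructure.
    """
    # USAGE: CycleFold <seq file> [options]
    argv = ["CycleFold", seq_file, *options]
    # Copy the current environment and point CycleFold at its data tables.
    env = dict(os.environ)
    env["CYCLEFOLD_DATAPATH"] = datapath
    return argv, env


def run_cyclefold(seq_file, datapath, options=()):
    """Run CycleFold and return the text it wrote to standard output."""
    argv, env = build_cyclefold_call(seq_file, datapath, options)
    result = subprocess.run(
        argv, env=env, capture_output=True, text=True, check=True
    )
    return result.stdout


# Build (but do not run) a call that requests pair probabilities.
argv, env = build_cyclefold_call(
    "example.fasta",                     # hypothetical input file
    "RNAstructure/CycleFold/datafiles",  # adjust to the local installation
    options=["-p"],
)
```

Because all output goes to standard output, `run_cyclefold` captures it as a string rather than expecting an output file.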
| <seq file> |
The name of a fasta file that contains the sequence.
|
| -b, -B, --bigloops |
Allow large internal loops or hairpin loops, whose energies are not tabulated in the NCM model. This is sometimes necessary for structure calculations with constraints (-c or -ct options). |
| -fc, -FC, --fastaConstraints |
Specify that the input fasta file contains secondary structure constraints (in dot-bracket format) to be applied to each structure. Use this format for TurboFold calculations with constraints for multiple sequences.
Default: off. |
| -h, -H, --help |
Display the usage details message. |
| -m, -M, --maxExpect |
Specify that a MaxExpect calculation should be performed. (This predicts a structure composed of probable base pairs.) |
| -p, -P, --partitionfunction |
Specify that pair probabilities should be printed. |
| -s, -S, --seqFormat |
Switch the expected input from FASTA to .seq format.
|
| -t, -T, --turbo |
Specify that a TurboFold calculation should be performed. TurboFold writes all of the structures or pair probability tables, with their labels, in the order that they were provided. |
| -u, -U, --unpairingConstraints |
Specify that the constraints should be treated as unpairing constraints. Unpairing constraints force the first nucleotide in the specified pair(s) to be single-stranded.
Default: off. When this option is off, the pairs given with the --constraintFile (-c), --constraintCT (-ct) or --fastaConstraints (-fc) options are used as pairing constraints that force base pairs. |
| -v, -V, --version |
Display the version and copyright information. |
| -c, -C, --constraintFile |
Specify a constraint file to be applied. The pairs in the constraint file will be used to force base pairs in the prediction. Constraints should be written as rows of pairs (e.g. "1 10" for a base pair between nucleotides 1 and 10). Follow the format of the "Pairs" section of the constraints file format, with no header.
Default is to have no constraints applied. |
| -ct, -CT, --constrainCt |
Specify a constraint file to be applied (in CT format). The pairs in the ct file will be used to force base pairs in the prediction.
|
| -g, -G, --gamma |
Set gamma, the weighting parameter for extrinsic information in the turbo calculation. The default is 0.6.
|
| -i, -I, --iterations |
Set the number of iterations for the turbo calculation. The default is 2.
|
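The constraint file described for the -c option (rows of pairs, no header) can be sketched in Python as follows. The function name `format_pair_constraints` is hypothetical, and the sketch assumes 1-based nucleotide numbering as in the "1 10" example. It writes only the pair rows; whether any terminating row is expected is not stated on this page, so none is added.

```python
def format_pair_constraints(pairs):
    """Format (i, j) nucleotide pairs as one 'i j' row per pair, no header.

    Hypothetical helper; assumes 1-based nucleotide positions.
    """
    rows = []
    for i, j in pairs:
        # A pair needs two distinct, positive nucleotide positions.
        if i < 1 or j < 1 or i == j:
            raise ValueError(f"invalid pair: {i} {j}")
        rows.append(f"{i} {j}")
    return "\n".join(rows) + "\n"


# Force pairs 1-10 and 2-9.
constraint_text = format_pair_constraints([(1, 10), (2, 9)])
```

The resulting text would be saved to a file, and that file name passed with the -c option.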
Note:
Unlike most other programs in RNAstructure, lowercase nucleotides are allowed to base pair.
In the default mode, which predicts the lowest free energy structure, the text output is in the CT format.
For MaxExpect predictions of structure (-m, -M, --maxExpect), the text output is in the CT format.
For base pair probability predictions (-p, -P, --partitionfunction), the text output is a tab-delimited matrix of pair probabilities.
For TurboFold predictions (-t, -T, --turbo) with multiple constraints, use the fasta constraints file format (-fc, -FC, --fastaConstraints).
Unpairing constraints (-u, -U, --unpairingConstraints) only force the first pair in the constraints file or ct file to be single-stranded.
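For the pair probability output (-p), a minimal Python sketch of reading the tab-delimited matrix is shown here. It assumes, without confirmation from this page, that the output is a square matrix of plain numbers with no header row or label column, and that row and column order follows the sequence. The function names and the example values are hypothetical, and the labels that TurboFold mode writes are not handled.

```python
def parse_pair_matrix(text):
    """Parse tab-delimited rows of numbers into a list of lists of floats.

    Assumes no header row and no label column (an assumption, not
    something this page specifies).
    """
    matrix = []
    for line in text.splitlines():
        # Drop empty fields so a trailing tab does not break parsing.
        fields = [f for f in line.split("\t") if f.strip()]
        if fields:
            matrix.append([float(f) for f in fields])
    return matrix


def probable_pairs(matrix, threshold=0.5):
    """Return (i, j, p) for i < j with p >= threshold, using 1-based positions."""
    pairs = []
    for i, row in enumerate(matrix):
        for j in range(i + 1, len(row)):
            if row[j] >= threshold:
                pairs.append((i + 1, j + 1, row[j]))
    return pairs


# Made-up 3x3 example, not real CycleFold output.
example = "0.0\t0.1\t0.9\n0.1\t0.0\t0.0\n0.9\t0.0\t0.0\n"
matrix = parse_pair_matrix(example)
pairs = probable_pairs(matrix)
```

Only the upper triangle is scanned, so each pair is reported once.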
- Sloma, M.F. and Mathews, D.H.
"Base pair probability estimates improve the prediction accuracy of RNA non-canonical base pairs."
PLoS Computational Biology, 13: e1005827. (2017).
- Reuter, J.S. and Mathews, D.H.
"RNAstructure: software for RNA secondary structure prediction and analysis."
BMC Bioinformatics, 11:129. (2010).
- Parisien, M. and Major, F.
"The MC-Fold and MC-Sym pipeline infers RNA structure from sequence data."
Nature, 452: 51-55. (2008).
|