RNAstructure Command Line Help
TurboHomology

TurboHomology predicts the secondary structure and the alignment for a newly discovered sequence of an RNA family that has multiple existing homologs with known secondary structures and a known multiple sequence alignment (MSA).

USAGE 1: TurboHomology <configuration file>

Required parameters:

<configuration file> The name of a file containing required configuration data.

Options that do not require added values:

-h, -H, --help Display the usage details message.

Options which require added values:

NONE

Notes:

  • Unknown nucleotides (e.g. N or X) are randomly mapped to A, C, G, or U by the HMM.
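
The note above can be illustrated with a short Python sketch. It is not the HMM code used by TurboHomology: the function name replace_unknown, the set of symbols treated as unknown (only N and X), and the uniform choice among A, C, G, and U are all assumptions made for illustration.

```python
import random

# Symbols treated as "unknown" in this sketch (assumption: only N and X).
UNKNOWN = set("NX")


def replace_unknown(sequence, rng=None):
    """Return a copy of sequence with each unknown nucleotide replaced
    by a randomly chosen A, C, G, or U. Other characters are kept as-is."""
    rng = rng or random.Random()
    result = []
    for base in sequence.upper():
        if base in UNKNOWN:
            result.append(rng.choice("ACGU"))
        else:
            result.append(base)
    return "".join(result)


# Passing a seeded generator makes the replacement reproducible.
mapped = replace_unknown("GGNACXU", random.Random(0))
```

Because the replacement is random, two runs on the same input containing N or X may see different sequences unless the random source is seeded, as in the last line of the sketch.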

Configuration file format:

The following is a description of valid options allowed in the configuration file.

# IMPORTANT CONFIG FILE FORMAT NOTES:
#
# Option lines may be specified by the option name followed by an equals sign
# and the option's desired value. Option names are not case sensitive.
# When specifying an option, there may be nothing else on the line.
# <option> = <value>
#
# Specifying comment lines:
# Comment lines must begin with "#" followed by a space.
# There may not be more than one "#" in a comment line.
# However, a comment line may be an unbroken string of "#", as in a divider 
# between sets of options.
#
# Blank lines are skipped.

# Mode specifies the resolving algorithm TurboFold uses after its initial fold.
# A valid mode is required for TurboFold to run properly.
# Valid modes can be one of three options:
#       1. MEA (Maximum expected accuracy)
#       2. ProbKnot (For pseudoknotted sequences)
#       3. Threshold (Finding most probable pairs)
# Modes should be specified as text strings: MEA, ProbKnot, or Threshold.
# The default mode is MEA.
Mode = MEA|ProbKnot|Threshold

#### Listing Input Sequences ####
# Place sequence file names in braces, separated by semicolons.
#    Filenames may contain spaces, but no extra space is allowed before or after
#    semicolons or braces.
InSeq = {path/to/input_New.seq;path/to/refSeq1.seq;path/to/refSeq2.seq;path/to/refSeq3.seq;}

#### Listing Reference CT files ####
# The order of the reference CT files must match the order of the reference
# sequence files listed in InSeq above.
# Place CT file names in braces, separated by semicolons.
#    Filenames may contain spaces, but no extra space is allowed before or after
#    semicolons or braces.
RefCT = {path/to/refCT1.ct;path/to/refCT2.ct;path/to/refCT3.ct;}

#### Reference Existing Alignment file ####
# The existing alignment file is in FASTA format.

ExistingAln = path/to/existingAlignment.fasta


####  Output CT file ####
# The predicted CT file for input_New.seq.
OutCT = {path/to/output1.ct;}



# Partition function save file (PFS) names can be specified for each sequence
# if this type of output is desired. 
SaveFiles = {path/to/file1.pfs;}

# The output multiple sequence alignment filename can be specified. 
# Default is output.aln.
OutAln = <filename>


################################################################
# TurboHomology options
################################################################
# TurboHomology options affect output regardless of the mode specified.


# MaximumPairingDistance specifies the maximum distance between nucleotides that can pair,
# i.e. for nucleotide i to pair with j, |i - j| < MaximumPairingDistance.
# This applies to each sequence.
# Its default is no limit, which is indicated by a value of zero.
MaximumPairingDistance = 0

# Temperature specifies the temperature at which TurboFold is run, in Kelvin.
# Its default value is 310.15 K, which is 37 degrees C.
Temperature = 310.15

# Processors specifies the number of processors TurboFold is run on.
# Note that this flag only has an effect when TurboFold-smp, the parallel version 
# of TurboFold, is run.
# Its default value is 1.
Processors = 1

# The format of the output multiple sequence alignment can be chosen from Fasta or Clustal.
# Default is Clustal.
AlnFormat = Fasta|Clustal

# The number of columns in the output multiple sequence alignment can be specified.
# Default is 60.
ColumnNumber = 60

################################################################
# Maximum expected accuracy (MEA) mode options
################################################################
# The following options only have an effect when MEA mode is specified. 
# If they are specified when TurboFold is in a different mode, they are ignored.

# MaxPercent specifies the maximum percent energy difference.
# Its default value is 50 (percent).
MaxPercent = 50

# MaxStructures specifies the maximum number of structures to calculate.
# Its default value is 1000 structures.
MaxStructures = 1000

# MeaGamma specifies the MEA mode gamma value.
# This should not be confused with Gamma (above).
# Its default value is 1.0.
MeaGamma = 1.0

# Window specifies the window size.
# Its default value is 5 nucleotides.
Window = 5

################################################################
# Pseudoknot (ProbKnot) mode options
################################################################
# The following options only have an effect when ProbKnot mode is specified. 
# If they are specified when TurboFold is in a different mode, they are ignored.

# MinHelixLength is the minimum helix length allowed during folding.
# Its default value is 3 nucleotides.
MinHelixLength = 3

# PkIterations specifies the number of iterations ProbKnot goes through.
# This should not be confused with Iterations (above).
# Its default value is 1.
PkIterations = 1

################################################################
# Probable Pairs (Threshold) mode options
################################################################
# The following options only have an effect when Threshold mode is specified. 
# If they are specified when TurboFold is in a different mode, they are ignored.

# Threshold specifies the probability threshold at which pairs are included in a structure.
# If a threshold is explicitly specified, it should be expressed as a number >= 0.5 and <= 1.0.
# Its default value is 0.
# This signifies that structures should be generated at the following thresholds:
#       >= 0.99, >= 0.97, >= 0.95, >= 0.90, >= 0.80, >= 0.70, >= 0.60, >= 0.50
Threshold = 0
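
The meaning of the Threshold value described above can be sketched in Python. This is an illustration only, not TurboHomology code: the function names and the pair_probs dictionary (mapping a pair (i, j) to its pairing probability) are hypothetical, and the sketch simply rejects values outside the documented range.

```python
# Cutoffs used when "Threshold = 0" (the default), as listed above.
DEFAULT_THRESHOLDS = [0.99, 0.97, 0.95, 0.90, 0.80, 0.70, 0.60, 0.50]


def thresholds_to_use(threshold):
    """Return the probability cutoffs implied by a Threshold value."""
    if threshold == 0:
        return list(DEFAULT_THRESHOLDS)
    if 0.5 <= threshold <= 1.0:
        return [threshold]
    raise ValueError("Threshold must be 0 or a number >= 0.5 and <= 1.0")


def probable_pairs(pair_probs, cutoff):
    """Keep the pairs whose probability is at least the cutoff.

    pair_probs is a hypothetical dict {(i, j): probability}.
    """
    return sorted(pair for pair, p in pair_probs.items() if p >= cutoff)


# Hypothetical pair probabilities for a short sequence.
example_probs = {(1, 10): 0.98, (2, 9): 0.60, (3, 8): 0.40}
structures = {c: probable_pairs(example_probs, c) for c in thresholds_to_use(0)}
```

With the default value of 0, one structure is produced per cutoff, so lower cutoffs yield structures with the same or more pairs than higher ones.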

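The format rules for the configuration file can be checked before a run with a small Python sketch such as the one below. It is not part of RNAstructure. The function names are hypothetical, every line starting with "#" is treated as a comment (more lenient than the rules stated above), and the check assumes, following the InSeq and RefCT examples, that the first InSeq entry is the new sequence and each remaining entry has one matching CT file.

```python
def parse_config(text):
    """Parse configuration text into a dict of option name -> value.

    Names are lower-cased because option names are not case sensitive.
    """
    options = {}
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comment lines are skipped
        name, sep, value = line.partition("=")
        if not sep or not name.strip():
            raise ValueError(f"line {number}: expected '<option> = <value>'")
        options[name.strip().lower()] = value.strip()
    return options


def parse_file_list(value):
    """Split a '{a;b;c;}' value into ['a', 'b', 'c'].

    Items are not stripped, since filenames may contain spaces.
    """
    if not (value.startswith("{") and value.endswith("}")):
        raise ValueError("file lists must be enclosed in braces")
    return [item for item in value[1:-1].split(";") if item]


def check_homology_inputs(options):
    """Pair each reference sequence with its reference CT file."""
    sequences = parse_file_list(options["inseq"])
    ref_cts = parse_file_list(options["refct"])
    # Assumption: first InSeq entry is the new sequence, the rest are references.
    if len(ref_cts) != len(sequences) - 1:
        raise ValueError("RefCT must list one CT file per reference sequence")
    return sequences[0], list(zip(sequences[1:], ref_cts))


# Example configuration; all paths are placeholders.
EXAMPLE = """\
# Example configuration
Mode = MEA
InSeq = {new.seq;ref1.seq;ref2.seq;}
RefCT = {ref1.ct;ref2.ct;}
ExistingAln = existing.fasta
OutCT = {new_predicted.ct;}
"""

opts = parse_config(EXAMPLE)
new_seq, refs = check_homology_inputs(opts)
```

Pairing the files up front makes an ordering or count mismatch between InSeq and RefCT visible before the prediction is started.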


References:

  1. Harmanci, A.O., Sharma, G., and Mathews, D.H.
    "TurboFold: Iterative Probabilistic Estimation of Secondary Structures for Multiple RNA Sequences."
    BMC Bioinformatics, 12:108. (2011).
  2. Reuter, J.S. and Mathews, D.H.
    "RNAstructure: software for RNA secondary structure prediction and analysis."
    BMC Bioinformatics, 11:129. (2010).