RNAstructure Classes  Version 6.0.1
ProbScan Class Reference

#include <ProbScan.h>

Inheritance diagram for ProbScan:
ProbScan inherits from RNA, which inherits from Thermodynamics.

Public Member Functions

 ProbScan (std::string sequence, bool isRNA=true)
 Constructor - user provides a sequence as a string.

 ProbScan (const char filename[], bool from_sequence_file, bool isRNA=true)
 Constructor - user provides the filename of an existing file as a c string.

double probability_of_hairpin (int i, int j)
 Returns probability of a hairpin closed at a specific position.

std::vector< hairpin_t > probability_of_all_hairpins (int min, int max, double threshold)
 Calculates the probabilities of all possible hairpins in this sequence.

double probability_of_internal_loop (int i, int j, int k, int l)
 Returns probability of an internal loop or bulge loop closed at a specific position.

std::vector< internal_loop_t > probability_of_all_internal_loops (double threshold, std::string mode=std::string("both"))
 Calculates the probabilities of all possible internal loops and/or bulge loops in this sequence.

double probability_of_stack (int i, int j)
 Calculates probability of a base pair stack closed at a specific position.

double probability_of_helix (const int i, const int j, const int how_many_stacks)
 Calculates probability of a helix at a specific position.

std::vector< basestack_t > probability_of_all_helices (double threshold, int length)
 Calculates the probabilities of all possible helices in this sequence of a specific length.

double probability_of_multibranch_loop (const multibranch_loop_t &mb)
 Calculates probability of a multibranch loop at a specific position.
 
- Public Member Functions inherited from RNA
 RNA (const char sequence[], const bool IsRNA=true)
 Constructor - user provides a sequence as a c string.

 RNA (const char filepathOrSequence[], const RNAInputType fileType, const char *const alphabetName, const bool allowUnknownBases=false, const bool skipThermoTables=false)

 RNA (const char filepathOrSequence[], const RNAInputType fileType, const Thermodynamics *copyThermo)

 RNA (const char filename[], const RNAInputType fileType, const bool IsRNA=true)
 Constructor - user provides the filename of an existing file as a c string.

 RNA (const bool IsRNA=true)

int GetErrorCode () const
 Return an error code, where a return of zero is no error.

string GetFullErrorMessage () const

const string GetErrorDetails () const
 Returns extended details about the last error. (e.g. error messages produced during file read operations that are otherwise lost.)

void SetErrorDetails (const string &details)
 Set extended details about the last error. (e.g. error messages produced during file read operations that are otherwise lost.)

string GetErrorMessageString (const int error) const
 Return error messages based on code from GetErrorCode and other error codes.

void ResetError ()

void EnsureStructureCapcacity (const int minimumStructures)
 Ensure that at least a minimum number of structures have been created.
 
int SpecifyPair (const int i, const int j, const int structurenumber=1)
 Specify a base pair between nucleotides i and j.

int RemovePairs (const int structurenumber=1, bool removeIfLastStructure=true)
 Remove all the current base pairs in a specified structure.

int RemoveBasePair (const int i, const int structurenumber=1)
 Remove a specified pair in a specified structure.

double CalculateFreeEnergy (const int structurenumber=1, const bool UseSimpleMBLoopRules=false)
 Return the predicted Gibbs free energy change for structure # structurenumber, defaulted to 1.

int WriteThermodynamicDetails (const char filename[], const bool UseSimpleMBLoopRules=false)
 Calculate the folding free energy change for all structures and write the details of the calculation to a file.

int FoldSingleStrand (const float percent=20, const int maximumstructures=20, const int window=5, const char savefile[]="", const int maxinternalloopsize=30, bool mfeonly=false, bool simple_iloops=true, bool disablecoax=false)
 Predict the lowest free energy secondary structure and generate suboptimal structures using a heuristic.

int GenerateAllSuboptimalStructures (const float percent=5, const double deltaG=0.6)
 Predict the lowest free energy secondary structure and generate all suboptimal structures.

int MaximizeExpectedAccuracy (const double maxPercent=20, const int maxStructures=20, const int window=1, const double gamma=1.0)
 Predict the structure with maximum expected accuracy and suboptimal structures.

int PartitionFunction (const char savefile[]="", double temperature=-10.0, bool disablecoax=false, bool restoreSHAPE=true)
 Predict the partition function for a sequence.

int Rsample (const vector< double > &experimentalRestraints, RsampleData &refdata, const int randomSeed=0, const char savefile[]="", const double cparam=0.5, const double offset=1.10, const int numsamples=10000)

int PredictProbablePairs (const float probability=0)
 Predict structures containing highly probable pairs.

int ProbKnot (int iterations=1, int MinHelixLength=1)
 Predict maximum expected accuracy structures that contain pseudoknots from either a sequence or a partition function save file.

int ProbKnotFromSample (int iterations=1, int MinHelixLength=1)
 Predict maximum expected accuracy structures that contain pseudoknots from a file containing an ensemble of structures.

int ReFoldSingleStrand (const float percent=20, const int maximumstructures=20, const int window=5)
 Re-predict the lowest free energy secondary structure and generate suboptimal structures using a heuristic.

int Stochastic (const int structures=1000, const int seed=1)
 Sample structures from the Boltzmann ensemble.
 
int ForceDoubleStranded (const int i)
 Force a nucleotide to be double stranded (base paired).

int ForceFMNCleavage (const int i)
 Indicate a nucleotide that is accessible to FMN cleavage (a U in GU pair).

int ForceMaximumPairingDistance (const int distance)
 Force a maximum distance between paired nucleotides.

int ForceModification (const int i)
 Force modification for a nucleotide.

int ForcePair (const int i, const int j)
 Force a pair between two nucleotides.

int ForceProhibitPair (const int i, const int j)
 Prohibit a pair between two nucleotides.

int ForceSingleStranded (const int i)
 Force a nucleotide to be single stranded.

int GetForcedDoubleStranded (const int constraintnumber)
 Return a nucleotide that is forced double stranded.

int GetForcedFMNCleavage (const int constraintnumber)
 Return a nucleotide that is accessible to FMN cleavage.

int GetForcedModification (const int constraintnumber)
 Return a nucleotide that is accessible to modification.

int GetForcedPair (const int constraintnumber, const bool fiveprime)
 Return a nucleotide in a forced pair.

int GetForcedProhibitedPair (const int constraintnumber, const bool fiveprime)
 Return a nucleotide in a prohibited pair.

int GetForcedSingleStranded (const int constraintnumber)
 Return a nucleotide that is forced single stranded.

int GetMaximumPairingDistance ()
 Return the maximum pairing distance.

int GetNumberOfForcedDoubleStranded ()
 Return the number of nucleotides forced to be paired.

int GetNumberOfForcedFMNCleavages ()
 Return the number of nucleotides accessible to FMN cleavage.

int GetNumberOfForcedModifications ()
 Return the number of nucleotides accessible to chemical modification.

int GetNumberOfForcedPairs ()
 Return the number of forced base pairs.

int GetNumberOfForcedProhibitedPairs ()
 Return the number of prohibited base pairs.

int GetNumberOfForcedSingleStranded ()
 Return the number of nucleotides that are not allowed to pair.

int ReadConstraints (const char filename[])
 Read a set of folding constraints from disk in a plain text file.

int ReadSHAPE (const char filename[], const double slope, const double intercept, RestraintType modifier=RESTRAINT_SHAPE, const bool IsPseudoEnergy=true)
 Read SHAPE data from disk.

int ReadSHAPE (const char filename[], const double dsSlope, const double dsIntercept, const double ssSlope, const double ssIntercept, RestraintType modifier=RESTRAINT_SHAPE)
 Read SHAPE data from disk including single-stranded SHAPE pseudo free energies.

int ReadDMS (const char filename[])

int ReadDSO (const char filename[])
 Read double strand offset data from disk.

int ReadSSO (const char filename[])
 Read single strand offset data from disk.

int ReadExperimentalPairBonus (const char filename[], double const experimentalOffset, double const experimentalScaling)
 Read experimental pair bonuses from disk.

void RemoveConstraints ()
 Remove all folding constraints.

int SetExtrinsic (int i, int j, double k)
 Add extrinsic restraints for partition function calculations.

int WriteConstraints (const char filename[])
 Write the current set of folding constraints to disk in a plain text file.
 
int AddComment (const char comment[], const int structurenumber=1)
 Add a comment associated with a structure.

int WriteCt (const char filename[], bool append=false, const CTCommentProvider &commentProvider=DEFAULT_CT_ENERGY_COMMENTS) const
 Write a ct file of the structures.

int WriteDotBracket (const char filename[], const int structurenumber=-1, const DotBracketFormat format=DBN_FMT_MULTI_TITLE, const CTCommentProvider &commentProvider=DEFAULT_CT_ENERGY_COMMENTS) const
 Write dot-bracket file of structures.

int BreakPseudoknot (const bool minimum_energy=true, const int structurenumber=0, const bool useFastMethod=true)
 Break any pseudoknots that might be in a structure.

bool ContainsPseudoknot (const int structurenumber)
 Report if there are any pseudoknots in a structure.

double GetEnsembleEnergy ()
 Get the ensemble folding free energy change.

double GetEnsembleDefect (const int structurenumber=1)
 Get the ensemble defect of a secondary structure.

double GetFreeEnergy (const int structurenumber)
 Get the folding free energy change for a predicted structure.

int GetPair (const int i, const int structurenumber=1)
 Get the nucleotide to which the specified nucleotide is paired.

double GetPairEnergy (const int i, const int j)
 Get the lowest folding free energy possible for a structure containing pair i-j.

double GetPairProbability (const int i, const int j)
 Get a base pair probability.

int GetPairProbabilities (double *arr, const int size)

int GetStructureNumber () const
 Get the total number of specified or predicted structures.

int DetermineDrawingCoordinates (const int height, const int width, const int structurenumber=1)
 Determine the coordinates for drawing a secondary structure.

std::string GetCommentString (const int structurenumber=1)
 Provide the comment from the ct file as a string.

int GetNucleotideXCoordinate (const int i)
 Get the X coordinate for nucleotide i for drawing a structure.

int GetNucleotideYCoordinate (const int i)
 Get the Y coordinate for nucleotide i for drawing a structure.

int GetLabelXCoordinate (const int i)
 Get the X coordinate for placing the nucleotide index label specified by i.

int GetLabelYCoordinate (const int i)
 Get the Y coordinate for placing the nucleotide index label specified by i.

char GetNucleotide (const int i)

int GetSequenceLength () const
 Get the total length of the sequence.

const char * GetSequence () const

bool GetBackboneType () const
 Get the backbone type.

structure * GetStructure ()

void SetProgress (ProgressHandler &Progress)

void StopProgress ()

ProgressHandler * GetProgress ()

 ~RNA ()
 Destructor.

void CopyThermo (Thermodynamics &copy)
 Copy thermodynamic parameters from an instance of an RNA class.
 
- Public Member Functions inherited from Thermodynamics
 Thermodynamics (const bool isRNA=true, const char *const alphabetName=NULL, const double temperature=310.15)
 
 Thermodynamics (const Thermodynamics &copyThermo)
 
int SetTemperature (double temperature)
 Set the temperature of folding in K.

double GetTemperature () const

string GetAlphabetName () const
 Get the name of the extended alphabet for which thermodynamic parameters should be loaded.

int ReadThermodynamic (const char *directory=NULL, const char *alphabet=NULL, const double temperature=-1.0)
 Function to read the thermodynamic parameters.

int ReloadDataTables (const double new_temperature=-1.0)

bool VerifyThermodynamic ()
 Force the datatables to be read if they haven't already. Return true if the tables were already loaded or if the attempt to (re)open them succeeded.

datatable * GetDatatable ()

datatable * GetEnthalpyTable (const char *alphabet=NULL)

void ClearEnergies ()
 Clear the currently loaded energy datatable and release its resources.

void ClearEnthalpies ()
 Clear the currently loaded enthalpy datatable and release its resources.

bool GetEnergyRead () const
 Return whether this instance of Thermodynamics has the parameters populated (either from disk or from another Thermodynamics class).
 
bool IsAlphabetRead () const
 
 ~Thermodynamics ()
 

Private Member Functions

PFPRECISION equilibrium_constant_for_multibranch_loop (const multibranch_loop_t &)
 
std::vector< mb_element > construct_mb_element_array (const multibranch_loop_t &)
 

Additional Inherited Members

- Static Public Member Functions inherited from RNA
static const char * GetErrorMessage (const int error)
 Return error messages based on code from GetErrorCode and other error codes.
 
- Public Attributes inherited from Thermodynamics
bool isrna
 
- Protected Member Functions inherited from RNA
int FileReader (const char filename[], const RNAInputType fileType)
 
void init (const char *sequenceOrFileName, const RNAInputType fileType, const bool allowUnknownBases=false, const bool skipThermoTables=false)
 
- Protected Member Functions inherited from Thermodynamics
virtual void CopyThermo (const Thermodynamics &copy)
 Copy thermodynamic parameters from an instance of Thermodynamics class.
 
- Protected Attributes inherited from RNA
int ErrorCode
 
ProgressHandler * progress
 
PFPRECISION * w5
 
PFPRECISION * w3
 
PFPRECISION ** wca
 
pfdatatable * pfdata
 
DynProgArray< PFPRECISION > * w
 
DynProgArray< PFPRECISION > * v
 
DynProgArray< PFPRECISION > * wmb
 
DynProgArray< PFPRECISION > * wl
 
DynProgArray< PFPRECISION > * wmbl
 
DynProgArray< PFPRECISION > * wcoax
 
PFPRECISION Q
 
- Protected Attributes inherited from Thermodynamics
datatable * data
 
datatable * enthalpy
 
bool copied
 
double nominal_temperature
 
string nominal_alphabetName
 
bool skipThermoTables
 

Constructor & Destructor Documentation

ProbScan::ProbScan ( std::string  sequence,
bool  isRNA = true 
)

Constructor - user provides a sequence as a string.

The partition function will be calculated. If the sequence is long, this may take some time. Input sequence should contain A,C,G,T,U,a,c,g,t,u,x,X. Capitalization makes no difference. T=t=u=U. If isRNA is true, the backbone is RNA, so U is assumed. If isRNA is false, the backbone is DNA, so T is assumed. x=X= nucleotide that neither stacks nor pairs. For now, any unknown nucleotide is considered 'X'. Note that sequences will subsequently be indexed starting at 1 (like a biologist), so that the 0th position in the sequence array will be nucleotide 1.

Parameters
sequence: a string containing the nucleotide sequence.
isRNA: a bool that indicates whether this sequence is RNA or DNA. true=RNA. false=DNA. Default is true.
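
The input conventions described for this constructor can be illustrated with a small stand-alone sketch. The helpers normalize_sequence and nucleotide_at are hypothetical names written for this example; they are not part of RNAstructure and only mirror the stated rules: case-insensitive input, T and U treated as the same base and reported according to the backbone, unknown characters mapped to 'X', and 1-based nucleotide numbering.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (not RNAstructure code): apply the documented input
// conventions to a raw sequence string.
std::string normalize_sequence(const std::string& raw, bool isRNA) {
    std::string out;
    out.reserve(raw.size());
    for (char ch : raw) {
        const char c = static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
        switch (c) {
            case 'A': case 'C': case 'G':
                out.push_back(c);
                break;
            case 'T': case 'U':
                // T and U are equivalent; the backbone decides which is meant.
                out.push_back(isRNA ? 'U' : 'T');
                break;
            default:
                // 'X' and any unknown character: neither stacks nor pairs.
                out.push_back('X');
        }
    }
    return out;
}

// Hypothetical helper: nucleotide number i (1-based) sits at array index i-1.
char nucleotide_at(const std::string& normalized, int i) {
    if (i < 1 || i > static_cast<int>(normalized.size())) {
        throw std::out_of_range("nucleotide index is 1-based");
    }
    return normalized[static_cast<std::size_t>(i - 1)];
}
```

For instance, normalize_sequence("acgt", true) yields "ACGU", and nucleotide_at on that result with i = 1 yields 'A'.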

ProbScan::ProbScan ( const char  filename[],
bool  from_sequence_file,
bool  isRNA = true
)

Constructor - user provides the filename of an existing file as a c string.

The existing file, specified by filename, can either be a ct file, a sequence file, or an RNAstructure save file. Therefore, the user provides a flag for the file: type = 1 => .ct file, type = 2 => .seq file, type = 3 => partition function save (.pfs) file, type = 4 => folding save file (.sav). If the input file is not a partition function save file, the partition function will be calculated. If the sequence is long, this may take some time. This constructor generates internal error codes that can be accessed by GetErrorCode() after the constructor is called. 0 = no error. The error code can be resolved to a c string using GetErrorMessage. Note that the constructor needs to be explicitly told, via isRNA, what the backbone is because files do not store this information. Note also that save files explicitly store the thermodynamic parameters, therefore changing the backbone type as compared to the original calculation will not change structure predictions.

Parameters
filename: a null terminated c string containing the path to the input file.
from_sequence_file: a bool which tells the constructor whether we are initializing from a sequence file, in which case the partition function must be calculated.
isRNA: a bool that indicates whether this sequence is RNA or DNA. true=RNA. false=DNA. Default is true.
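
Because this constructor reports failures through error codes rather than exceptions, callers may want to check the code right after construction. The following is a minimal sketch of that check. It is written as a template over any object exposing GetErrorCode() and GetErrorMessage(int), the two calls named above. The names constructed_ok and StubScanner are hypothetical; StubScanner is only a stand-in so the sketch compiles without RNAstructure.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Works with any object exposing GetErrorCode() and GetErrorMessage(int).
// Returns true when the error code is 0 (no error); otherwise logs a message.
template <class Scanner>
bool constructed_ok(const Scanner& scanner, std::ostream& log) {
    const int code = scanner.GetErrorCode();
    if (code == 0) {
        return true;  // 0 = no error
    }
    log << "construction failed (" << code << "): "
        << scanner.GetErrorMessage(code) << '\n';
    return false;
}

// Stand-in used only so this sketch compiles without RNAstructure; a real
// program would pass a ProbScan built from a file instead.
struct StubScanner {
    int code;
    int GetErrorCode() const { return code; }
    static const char* GetErrorMessage(int error) {
        return error == 0 ? "no error" : "stub error message";
    }
};
```

With the real class, the same call would presumably look like constructed_ok(scanner, std::cerr) immediately after constructing a ProbScan from a file.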

Member Function Documentation

vector< mb_element > ProbScan::construct_mb_element_array ( const multibranch_loop_t &  mb )
private

PFPRECISION ProbScan::equilibrium_constant_for_multibranch_loop ( const multibranch_loop_t &  mb )
private

vector< hairpin_t > ProbScan::probability_of_all_hairpins ( int  min,
int  max,
double  threshold
)

Calculates the probabilities of all possible hairpins in this sequence.

Parameters
min: The minimum size of a hairpin
max: The maximum size of a hairpin
threshold: The minimum probability for candidate hairpins
Returns
A vector of hairpin objects, containing the positions of the hairpins and their probabilities
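
How min, max, and threshold might interact can be sketched as a plain enumeration. This is not the library's implementation. It assumes that hairpin "size" means the number of unpaired nucleotides between i and j, which this reference does not state. HairpinHit and scan_hairpins are hypothetical names, and the probability callback stands in for probability_of_hairpin(i, j).

```cpp
#include <functional>
#include <vector>

// Hypothetical record for this sketch; the real hairpin_t layout is not
// described in this reference.
struct HairpinHit {
    int i;               // 5' closing nucleotide (1-based)
    int j;               // 3' closing nucleotide (1-based)
    double probability;
};

// Enumerate closing pairs (i, j) whose loop has min_size..max_size unpaired
// nucleotides (assumed meaning of "size") and keep those whose probability
// is at or above the threshold.
std::vector<HairpinHit> scan_hairpins(
        int sequence_length, int min_size, int max_size, double threshold,
        const std::function<double(int, int)>& probability_of_hairpin) {
    std::vector<HairpinHit> hits;
    for (int i = 1; i <= sequence_length; ++i) {
        for (int size = min_size; size <= max_size; ++size) {
            const int j = i + size + 1;  // size nucleotides lie between i and j
            if (j > sequence_length) break;
            const double p = probability_of_hairpin(i, j);
            if (p >= threshold) hits.push_back({i, j, p});
        }
    }
    return hits;
}
```

Raising the threshold shrinks the returned vector; narrowing min and max reduces the number of candidate pairs that are evaluated at all.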

std::vector< basestack_t > ProbScan::probability_of_all_helices ( double  threshold,
int  length
)

Calculates the probabilities of all possible helices in this sequence of a specific length.

Parameters
threshold: the minimum probability of candidate helices
length: the number of base pair stacks to search for
Returns
A vector of helix objects, containing the positions of the helices and their probabilities

vector< internal_loop_t > ProbScan::probability_of_all_internal_loops ( double  threshold,
std::string  mode = std::string("both")
)

Calculates the probabilities of all possible internal loops and/or bulge loops in this sequence.

Parameters
threshold: the minimum probability of candidate loops
mode: a string which indicates what type of loops should be searched for. Allowed values are "internal", "bulge", and "both"
Returns
A vector of internal loop objects, containing the positions of the loops and their probabilities
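
Since mode is a free-form string with only three allowed values, a caller may prefer to validate it before starting a scan. The sketch below shows one way to do that. LoopMode and parse_loop_mode are hypothetical names, and how the library itself treats an unrecognized string is not stated in this reference.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical enum for this sketch; the library itself takes the mode as a
// std::string.
enum class LoopMode { Internal, Bulge, Both };

// Map the documented mode strings onto the enum and reject anything else.
LoopMode parse_loop_mode(const std::string& mode = "both") {
    if (mode == "internal") return LoopMode::Internal;
    if (mode == "bulge") return LoopMode::Bulge;
    if (mode == "both") return LoopMode::Both;
    throw std::invalid_argument("mode must be \"internal\", \"bulge\", or \"both\"");
}
```

Calling parse_loop_mode() with no argument mirrors the documented default of "both".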

double ProbScan::probability_of_hairpin ( int  i,
int  j
)

Returns probability of a hairpin closed at a specific position.

Parameters
i: The 5' nucleotide closing the hairpin
j: The 3' nucleotide closing the hairpin
Returns
A double containing the probability of the hairpin

double ProbScan::probability_of_helix ( const int  i,
const int  j,
const int  how_many_stacks
)

Calculates probability of a helix at a specific position.

Parameters
i: The 5' nucleotide closing the helix on the exterior
j: The 3' nucleotide closing the helix on the exterior
how_many_stacks: The number of base pair STACKS in the helix (this is the number of pairs minus 1)
Returns
A double containing the probability of the helix
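
The relation between how_many_stacks and the number of base pairs is easy to get wrong by one. The sketch below lists the pairs implied by a given call. It assumes an uninterrupted helix, so successive pairs step inward from the exterior pair (i, j) to (i+1, j-1) and so on. helix_pairs is a hypothetical helper, not part of RNAstructure.

```cpp
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper: list the base pairs of an uninterrupted helix whose
// exterior closing pair is (i, j). A helix with how_many_stacks stacks has
// how_many_stacks + 1 pairs.
std::vector<std::pair<int, int>> helix_pairs(int i, int j, int how_many_stacks) {
    if (how_many_stacks < 1) {
        throw std::invalid_argument("a helix needs at least one stack");
    }
    std::vector<std::pair<int, int>> pairs;
    for (int step = 0; step <= how_many_stacks; ++step) {
        const int five_prime = i + step;
        const int three_prime = j - step;
        if (five_prime >= three_prime) {
            throw std::invalid_argument("helix does not fit between i and j");
        }
        pairs.emplace_back(five_prime, three_prime);
    }
    return pairs;
}
```

For example, i = 3, j = 20, how_many_stacks = 2 describes three pairs: 3-20, 4-19, and 5-18.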

double ProbScan::probability_of_internal_loop ( int  i,
int  j,
int  k,
int  l
)

Returns probability of an internal loop or bulge loop closed at a specific position.

Parameters
i: The 5' nucleotide closing the loop on the exterior
j: The 3' nucleotide closing the loop on the exterior
k: The 5' nucleotide closing the loop on the interior
l: The 3' nucleotide closing the loop on the interior
Returns
A double containing the probability of the internal loop
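
The four indices can be read as an exterior pair (i, j) enclosing an interior pair (k, l). The sketch below classifies the enclosed loop from the unpaired nucleotides on each side, using the usual definitions: none on either side is a stack, none on exactly one side is a bulge, and some on both sides is an internal loop. It assumes the ordering i < k < l < j. LoopKind and classify_loop are hypothetical names for this example.

```cpp
#include <stdexcept>

enum class LoopKind { Stack, Bulge, Internal };

// Hypothetical helper: classify the loop closed by the exterior pair (i, j)
// and the interior pair (k, l) from the unpaired nucleotides on each side.
LoopKind classify_loop(int i, int j, int k, int l) {
    if (!(i < k && k < l && l < j)) {
        throw std::invalid_argument("expected i < k < l < j");
    }
    const int unpaired_5p = k - i - 1;  // unpaired nucleotides between i and k
    const int unpaired_3p = j - l - 1;  // unpaired nucleotides between l and j
    if (unpaired_5p == 0 && unpaired_3p == 0) return LoopKind::Stack;
    if (unpaired_5p == 0 || unpaired_3p == 0) return LoopKind::Bulge;
    return LoopKind::Internal;
}
```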

double ProbScan::probability_of_multibranch_loop ( const multibranch_loop_t &  mb )

Calculates probability of a multibranch loop at a specific position.

Parameters
mb: A multibranch loop object, containing a vector of pairs describing the multibranch loop. These can be created with the multibranch_loop function. See the text interface for the ProbScan program for an example of usage.
Returns
A double containing the probability of the multibranch loop
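
Because a multibranch loop is described by a list of pairs, a sanity check on that list can catch malformed input before it reaches the library. The sketch below is one such check under stated assumptions: the first pair is taken to be the closing pair, the remaining pairs are the branches in 5'-to-3' order, and at least three helices meet at the loop. It does not use multibranch_loop_t or the multibranch_loop function, whose interfaces are not shown in this reference. Pair and is_plausible_multibranch are hypothetical names.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using Pair = std::pair<int, int>;  // (5' nucleotide, 3' nucleotide), 1-based

// Hypothetical check: pairs[0] is taken to be the closing pair, the rest are
// the branching pairs in 5'-to-3' order.
bool is_plausible_multibranch(const std::vector<Pair>& pairs) {
    if (pairs.size() < 3) return false;  // closing pair plus at least two branches
    const Pair& closing = pairs[0];
    if (closing.first >= closing.second) return false;
    int previous_end = closing.first;  // last position already used on the 5' side
    for (std::size_t n = 1; n < pairs.size(); ++n) {
        const Pair& branch = pairs[n];
        if (branch.first >= branch.second) return false;
        if (branch.first <= previous_end) return false;     // overlaps or out of order
        if (branch.second >= closing.second) return false;  // not enclosed by closing pair
        previous_end = branch.second;
    }
    return true;
}
```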

double ProbScan::probability_of_stack ( int  i,
int  j
)

Calculates probability of a base pair stack closed at a specific position. Note that this is a special case of probability_of_helix where the size is set to 1.

Parameters
i: The 5' nucleotide closing the stack
j: The 3' nucleotide closing the stack
Returns
A double containing the probability of the stack

The documentation for this class was generated from the following files: