LocARNA-1.8.11
Public Types | Public Member Functions | Protected Types | Protected Member Functions | Protected Attributes | Friends | List of all members
LocARNA::RnaData Class Reference

represent sparsified data of RNA ensemble More...

#include <rna_data.hh>

Inheritance diagram for LocARNA::RnaData:
Inheritance graph
[legend]
Collaboration diagram for LocARNA::RnaData:
Collaboration graph
[legend]

Public Types

typedef SparseMatrix< double > arc_prob_matrix_t
 arc probability matrix
 
typedef size_t size_type
 usual size type
 

Public Member Functions

 RnaData (const RnaEnsemble &rna_ensemble, double p_bpcut, double max_bps_length_ratio, const PFoldParams &pfoldparams)
 Construct from RnaEnsemble with cutoff probability. More...
 
 RnaData (const std::string &filename, double p_bpcut, double max_bps_length_ratio, const PFoldParams &pfoldparams)
 Construct from file. More...
 
 RnaData (const RnaData &rna_dataA, const RnaData &rna_dataB, const Alignment &alignment, double p_expA, double p_expB, bool only_local=false)
 Construct as consensus of two aligned RNAs. More...
 
virtual ~RnaData ()
 destructor
 
const Sequencesequence () const
 Get the multiple alignment as sequence. More...
 
const MultipleAlignmentmultiple_alignment () const
 Get the multiple alignment. More...
 
size_type length () const
 Get the sequence length. More...
 
double arc_cutoff_prob () const
 Get base pair cutoff probability. More...
 
double arc_prob (pos_type i, pos_type j) const
 Get arc probability. More...
 
std::string mea_structure (double gamma=1.) const
 maximum expected accuracy structure More...
 
vrna_plist_t * plist () const
 Construct plist (pair list of Vienna RNA) More...
 
double joint_arc_prob (pos_type i, pos_type j) const
 Get arc probability. More...
 
double stacked_arc_prob (pos_type i, pos_type j) const
 Get arc probability. More...
 
double prob_paired_upstream (pos_type i) const
 Probability that a position is paired upstream. More...
 
double prob_paired_downstream (pos_type i) const
 Probability that a position is paired downstream. More...
 
double prob_unpaired (pos_type i) const
 Unpaired probability. More...
 
std::ostream & write_pp (std::ostream &out, double p_outbpcut=0) const
 
std::ostream & write_size_info (std::ostream &out) const
 Write object size information. More...
 
bool has_stacking () const
 Availability of stacking terms. More...
 
void set_anchors (const SequenceAnnotation &anchors)
 Write access to alignment anchors. More...
 

Protected Types

typedef arc_prob_matrix_t::const_iterator arc_probs_const_iterator
 type of constant iterator over arcs with probability above cutoff
 

Protected Member Functions

 RnaData (double p_bpcut)
 Almost empty constructor. More...
 
arc_probs_const_iterator arc_probs_begin () const
 begin of arcs with probability above cutoff Supports iteration over arcs More...
 
arc_probs_const_iterator arc_probs_end () const
 begin of arcs with probability above cutoff Supports iteration over arcs More...
 
virtual void init_from_fixed_structure (const SequenceAnnotation &structure, bool stacking)
 initialize from fixed structure More...
 
virtual void init_from_rna_ensemble (const RnaEnsemble &rna_ensemble, const PFoldParams &pfoldparams)
 initialize from rna ensemble More...
 
bool read_autodetect (const std::string &filename, bool stacking)
 read and initialize from file, autodetect format More...
 
virtual bool inloopprobs_ok () const
 check in loop probabilities More...
 
virtual void read_pp (const std::string &filename)
 
virtual std::istream & read_pp (std::istream &in)
 
void read_old_pp (const std::string &filename)
 
void read_ps (const std::string &filename)
 

Protected Attributes

RnaDataImplpimpl_
 

Friends

class RnaDataImpl
 
class ExtRnaDataImpl
 

Detailed Description

represent sparsified data of RNA ensemble

knows sequence, cutoff probability and base pair probabilities greater than the cutoff probability; potentially knows stacking probabilities

Note
This class knows the cutoff probability of its data. This cutoff can be different from the cutoff in classes like BasePairs which defines the structure elements that are considered by algorithms.
the class guarantees that sequences are normalized (uppercase, T->U) even when read in unnormalized form, e.g. from file or stream

Constructor & Destructor Documentation

LocARNA::RnaData::RnaData ( const RnaEnsemble rna_ensemble,
double  p_bpcut,
double  max_bps_length_ratio,
const PFoldParams pfoldparams 
)

Construct from RnaEnsemble with cutoff probability.

Parameters
rna_ensembleRNA ensemble data
p_bpcutcutoff probability
max_bps_length_ratiomax ratio of bps to length (0=no effect)
pfoldparamsfolding parameters (controls stacking)
Note
RnaData copies all required data from rna_ensemble and does not keep a reference; if pfoldparams.stacking is true, copy stacking terms
LocARNA::RnaData::RnaData ( const std::string &  filename,
double  p_bpcut,
double  max_bps_length_ratio,
const PFoldParams pfoldparams 
)

Construct from file.

Parameters
filenameinput file name
p_bpcutcutoff probability
pfoldparamsfolding parameters
max_bps_length_ratiomaximal ratio of number of base pairs divided by sequence length. This serves as a second filter on the "significant" base pairs. Value 0 turns this filter off.
Note
autodetect format of input; for fa or aln input formats, predict base pair probabilities
Todo:
consider to allow reading from istream; use istream::seekg(0) to reset stream to beginning (needed for format autodetect.) Is there a problem due to fail on eofbit in C++98?
LocARNA::RnaData::RnaData ( const RnaData rna_dataA,
const RnaData rna_dataB,
const Alignment alignment,
double  p_expA,
double  p_expB,
bool  only_local = false 
)

Construct as consensus of two aligned RNAs.

Parameters
rna_dataARNA ensemble data A
rna_dataBRNA ensemble data B
alignmentAlignment of A and B
p_expAbackground probability for A
p_expBbackground probability for B
only_localif true, construct only local alignment

The object uses mean cutoff probability of the given objects; The background probability is used in computing consensus probabilities. If both input rna data objects have stacking probabilities, stacking consensus probabilities are computed as well. If the object contain sequence anchors, we construct the new object with a consensus anchor string. (The latter is done as part of the consensus sequence computation.)

LocARNA::RnaData::RnaData ( double  p_bpcut)
explicitprotected

Almost empty constructor.

Parameters
p_bpcutcutoff probability

Member Function Documentation

double LocARNA::RnaData::arc_cutoff_prob ( ) const

Get base pair cutoff probability.

Returns
cutoff probability p_bpcut
double LocARNA::RnaData::arc_prob ( pos_type  i,
pos_type  j 
) const

Get arc probability.

Parameters
ileft sequence position
jright sequence position
Returns
probability p_ij of basepair (i,j) if p_ij>p_bpcut; otherwise, 0
RnaData::arc_probs_const_iterator LocARNA::RnaData::arc_probs_begin ( ) const
protected

begin of arcs with probability above cutoff Supports iteration over arcs

Returns
constant iterator
RnaData::arc_probs_const_iterator LocARNA::RnaData::arc_probs_end ( ) const
protected

begin of arcs with probability above cutoff Supports iteration over arcs

Returns
constant iterator
bool LocARNA::RnaData::has_stacking ( ) const

Availability of stacking terms.

Returns
whether stacking terms are available
void LocARNA::RnaData::init_from_fixed_structure ( const SequenceAnnotation structure,
bool  stacking 
)
protectedvirtual

initialize from fixed structure

Parameters
structurefixed structure
stackingwhether to initialize stacking terms
Note
can be overloaded to initialize with additional information (in loop probabilities)

Reimplemented in LocARNA::ExtRnaData.

void LocARNA::RnaData::init_from_rna_ensemble ( const RnaEnsemble rna_ensemble,
const PFoldParams pfoldparams 
)
protectedvirtual

initialize from rna ensemble

Parameters
rna_ensemblerna ensemble
stackingwhether to initialize stacking terms
Note
can be overloaded to initialize with additional information (in loop probabilities)

Reimplemented in LocARNA::ExtRnaData.

virtual bool LocARNA::RnaData::inloopprobs_ok ( ) const
inlineprotectedvirtual

check in loop probabilities

Returns
true iff loop probabilities are available or not required
Note
use to indicate the need for recomputation in read_autodetect(); always true in RnaData

Reimplemented in LocARNA::ExtRnaData.

double LocARNA::RnaData::joint_arc_prob ( pos_type  i,
pos_type  j 
) const

Get arc probability.

Parameters
ileft sequence position
jright sequence position
Returns
joint probability p^(2)_ij of basepair (i,j) and (i+1,j-1) if p_i+1j-1>p_bpcut and p^(2)_ij > p_bpcut; otherwise, 0
size_type LocARNA::RnaData::length ( ) const

Get the sequence length.

Returns
length of RNA sequence
std::string LocARNA::RnaData::mea_structure ( double  gamma = 1.) const

maximum expected accuracy structure

Parameters
gamma
Returns
maximum non-crossing expected accuracy structure

Works as interface to the RNAlib function MEA. From the ViennaRNA docu: Each base pair (i,j) gets a score 2*gamma*p_ij and the score of an unpaired base is given by the probability of not forming a pair (compare RNAfold)

const MultipleAlignment & LocARNA::RnaData::multiple_alignment ( ) const

Get the multiple alignment.

Returns
multiple alignment
vrna_plist_t * LocARNA::RnaData::plist ( ) const

Construct plist (pair list of Vienna RNA)

Note
the plist has to be deleted by the caller
Returns
pointer to plist
double LocARNA::RnaData::prob_paired_downstream ( pos_type  i) const

Probability that a position is paired downstream.

Parameters
isequence position
Returns
probability that a position i is paired with a position j<i (downstream)
Note
O(sequence.length()) implementation
See also
prob_paired_upstream
double LocARNA::RnaData::prob_paired_upstream ( pos_type  i) const

Probability that a position is paired upstream.

Parameters
isequence position
Returns
probability that a position i is paired with a position j>i (upstream)
Note
O(sequence.length()) implementation
See also
prob_paired_downstream
double LocARNA::RnaData::prob_unpaired ( pos_type  i) const

Unpaired probability.

Parameters
isequence position
Returns
probability that a position i is unpaired
Note
O(sequence.length()) implementation
bool LocARNA::RnaData::read_autodetect ( const std::string &  filename,
bool  stacking 
)
protected

read and initialize from file, autodetect format

Parameters
filenamename of input file
stackingwhether to read stacking terms
Returns
whether probabilities were read completely
Note
: this method is designed such that it can be used for RnaData and ExtRnaData
void LocARNA::RnaData::read_old_pp ( const std::string &  filename)
protected

Read data in the old pp format

Parameters
filenamename of input file

Reads only base pairs with probabilities greater than p_bpcut_; reads stacking probabilities only if has_stacking_ is true

Note
the old pp format starts with the sequence/alignment and then simply lists the arcs (i,j) with their probabilities p and optionally stacking probabilities p2. pp-files contain entries i j p [p2]. p denotes the probabilitiy of base pair (i,j). The optional stacking probability p2 is the joint probability of base pairs (i,j) and (i+1,j+1).
handling of stacking: after the call, has_stacking_ is true only if the file specified at least one stacking probability and has_stacking_ was true before.
void LocARNA::RnaData::read_pp ( const std::string &  filename)
protectedvirtual

Read data in pp format 2.0

Parameters
filenamename of input file

Reads only base pairs with probabilities greater than p_bpcut_; reads stacking probabilities only if has_stacking_ is true

Note
can be overloaded to read extension sections
pp is a proprietary format of LocARNA. In its simplest version, it starts with the sequence/alignment and then simply lists the arcs (i,j) with their probabilities p and optionally stacking probabilities p2. pp-files contain entries i j p [p2]. p denotes the probabilitiy of base pair (i,j). The optional stacking probability p2 is the joint probability of base pairs (i,j) and (i+1,j+1).
handling of stacking: after the call, has_stacking_ is true only if the file specified the STACK keyword and has_stacking_ was true before.
std::istream & LocARNA::RnaData::read_pp ( std::istream &  in)
protectedvirtual

Read data in pp format 2.0

Parameters
ininput stream
See also
read_pp(std::string)

Reimplemented in LocARNA::ExtRnaData.

void LocARNA::RnaData::read_ps ( const std::string &  filename)
protected

Read data in Vienna's dot plot ps format

Parameters
filenamename of input file

Reads only base pairs with probabilities greater than p_bpcut_; reads stacking probabilities only if has_stacking_ is true

Note
Recently changed behavior: reads sequence name from file (instead of guessing from filename!); stacking probabilities are read if available (then, sets has_stacking_ to true)
throws wrong_format exception if not in ps format
reading dot plot ps format is extremely fragile since the dot plot ps format was designed for viewing not for data input

sequence characters should be upper case, and Ts translated to Us

const Sequence & LocARNA::RnaData::sequence ( ) const

Get the multiple alignment as sequence.

Returns
sequence
void LocARNA::RnaData::set_anchors ( const SequenceAnnotation anchors)

Write access to alignment anchors.

Parameters
anchorsalignment anchors
See also
MultipleAlignment::set_annotation()
double LocARNA::RnaData::stacked_arc_prob ( pos_type  i,
pos_type  j 
) const

Get arc probability.

Parameters
ileft sequence position
jright sequence position
Returns
conditional probability p_ij|i+1j-1 of basepair (i,j) under condition of base pair (i+1,j-1) if p_i+1j-1>p_bpcut and p^(2)_ij > p_bpcut; throw exception if p_i+1j-1<=p_bpcut; otherwise, 0
std::ostream & LocARNA::RnaData::write_pp ( std::ostream &  out,
double  p_outbpcut = 0 
) const

Write data in pp format

Parameters
outoutput stream
p_outbpcutcutoff probability
Returns
stream

Writes only base pairs with probabilities greater than p_outbpcut

std::ostream & LocARNA::RnaData::write_size_info ( std::ostream &  out) const

Write object size information.

Parameters
outoutput stream

Writes numbers of stored probabilities to stream

Member Data Documentation

RnaDataImpl* LocARNA::RnaData::pimpl_
protected
  • pointer to corresponding implementation object

The documentation for this class was generated from the following files: