LocARNA-1.9.2
 All Classes Files Functions Variables Typedefs Enumerations Enumerator Friends
Public Types | Public Member Functions | Static Public Member Functions | Protected Member Functions | Protected Attributes
LocARNA::AlignerP Class Reference

Computes partition function of alignment, arc match and base match probabilities. More...

#include <aligner_p.hh>

List of all members.

Public Types

typedef size_t size_type
 size
typedef std::pair< size_type,
size_type
size_pair
 pair of size_type
typedef BasePairs__Arc Arc
 arc

Public Member Functions

 AlignerP (const AlignerParams &ap)
 Construct from parameters.
 AlignerP (const AlignerP &p)
 copy constructor p AlignerP object to be copied
 ~AlignerP ()
 Destructor.
pf_score_t align_inside ()
void align_outside ()
void compute_basematch_probabilities (bool basematch_probs_include_arcmatch)
void compute_arcmatch_probabilities ()
void write_arcmatch_probabilities (std::ostream &out)
 write the arc match probabilities to a stream
void write_basematch_probabilities (std::ostream &out)
 write the base match probabilities to a stream
pf_score_t virtual_Mprime (size_type al, size_type bl, size_type i, size_type j, size_type max_ar, size_type max_br) const
 Access virtual Mprime matrix.
double compute_fragment_match_prob (size_type i, size_type j, size_type k, size_type l)
 Fragment match probability.
void freeD ()
 free the space of D, take care!
void freeMprime ()
 free the space of D, take care!

Static Public Member Functions

static AlignerPParams create ()
 create with named parameters

Protected Member Functions

void init_M (size_type al, size_type ar, size_type bl, size_type br)
 initialize first column and row of M, for inside recursion
void init_E (size_type al, size_type ar, size_type bl, size_type br)
 initialize E
void init_Mrev (size_type al, size_type ar, size_type bl, size_type br)
void init_Erev (size_type al, size_type ar, size_type bl, size_type br)
pf_score_t comp_E_entry (size_type al, size_type bl, size_type i, size_type j)
 initialize first column and row of M' for outside recursion
pf_score_t comp_F_entry (size_type al, size_type bl, size_type i, size_type j)
 compute one entry in F (inside recursion cases)
pf_score_t comp_M_entry (size_type al, size_type bl, size_type i, size_type j)
 compute one entry in M (inside recursion cases)
pf_score_t comp_Mprime_entry (size_type al, size_type bl, size_type i, size_type j, size_type max_ar, size_type max_br)
 compute one entry in Mprime (outside recursion cases)
pf_score_t comp_Eprime_entry (size_type al, size_type bl, size_type i, size_type j)
 compute one entry in Eprime (outside recursion cases)
pf_score_t comp_Fprime_entry (size_type al, size_type bl, size_type i, size_type j)
 compute one entry in Fprime (outside recursion cases)
pf_score_t comp_Erev_entry (size_type i, size_type j)
 compute one entry in Erev
pf_score_t comp_Frev_entry (size_type i, size_type j)
 compute one entry in Frev
pf_score_t comp_Mrev_entry (size_type i, size_type j, size_type ar, size_type br)
void align_inside_arcmatch (size_type al, size_type ar, size_type bl, size_type br)
void align_outside_arcmatch (size_type al, size_type ar, size_type max_ar, size_type bl, size_type br, size_type max_br)
void align_reverse (size_type al, size_type ar, size_type bl, size_type br, bool copy=false)
void align_D ()
void align_Dprime ()
void fill_D (size_type al, size_type bl, size_type max_ar, size_type max_br)
void fill_Dprime (size_type al, size_type bl, size_type min_ar, size_type min_br, size_type max_ar, size_type max_br)
pf_score_t & D (const ArcMatch &am)
 returns lvalue of matrix D
pf_score_t & D (const Arc &arcA, const Arc &arcB)
 returns lvalue of matrix D
pf_score_t & Dprime (const ArcMatch &am)
 returns lvalue of matrix D'
pf_score_t & Dprime (const Arc &arcA, const Arc &arcB)
 returns lvalue of matrix D'
size_type leftmost_covering_arc (size_type s, const BasePairs &bps, size_type l, size_type r) const
std::pair< size_type, size_typeleftmost_covering_arcmatch (size_type al, size_type bl, size_type ar, size_type br) const
size_type rightmost_covering_arc (const BasePairs &bps, size_type l, size_type r, size_type s) const
std::pair< size_type, size_typerightmost_covering_arcmatch (size_type al, size_type bl, size_type ar, size_type br) const
void alloc_inside_matrices ()
 allocate space for the inside matrices
void alloc_outside_matrices ()
 allocate space for the outside matrices

Protected Attributes

const AlignerPParamsparams
 the parameter for the alignment
const Scoringscoring
 the scores
const SequenceseqA
 sequence A
const BasePairsbpsA
 base pairs A
const SequenceseqB
 sequence B
const BasePairsbpsB
 base pairs B
const ArcMatchesarc_matches
 (potential) arc matches of A and B
AlignerPRestriction r
 restriction of alignment
pf_score_t pf_scale
pf_score_t partFunc
PFScoreMatrix Dmat
PFScoreVector E
pf_score_t F
PFScoreMatrix M
PFScoreMatrix Mrev
PFScoreVector Erev
pf_score_t Frev
PFScoreMatrix Erev_mat
PFScoreMatrix Frev_mat
PFScoreMatrix Dmatprime
PFScoreVector Eprime
pf_score_t Fprime
PFScoreMatrix Mprime
SparseProbMatrix am_prob
 probabilities of arc matchs, as computed by the algo
SparsePFScoreMatrix bm_prob
bool D_created
 flag, is D already created?
bool Dprime_created
 flag, is Dprime already created?

Detailed Description

Computes partition function of alignment, arc match and base match probabilities.

Performs partition function computation over alignment consensus structure pairs of two sequences and their associated sets of weighted basepairs

Implements indise and outside algorithm.

Implements calculation of match probabilities.

The class knows about the two sequences and the two weighted base pair sets.


Constructor & Destructor Documentation

Construct from parameters.

Parameters:
apparameter for aligner
Note:
ap is copied to allow reference to a temporary
allow implicit conversion for named parameter idiom

Member Function Documentation

void LocARNA::AlignerP::align_D ( ) [protected]

create the entries in the D matrix. This function is called by align() (unless D_created)

void LocARNA::AlignerP::align_Dprime ( ) [protected]

create the entries in the Dprime matrix This function is called by align() (unless Dprime_created) uses inside recursion

compute the partition function by the inside algorithm and fill the D matrix

Returns:
partition function
void LocARNA::AlignerP::align_inside_arcmatch ( size_type  al,
size_type  ar,
size_type  bl,
size_type  br 
) [protected]

align subsequences enclosed by two arcs

Parameters:
alleft end of arc in seqA
arright end of arc in seqA
blleft end of arc in seqB
brright end of arc in seqB Align subsequences seqA(al+1,ar-1) to seqB(bl+1,br-1). Computes matrix entries in M, E, F. post: entries (i,j) are valid in the range al<i<ar, bl<j<br

perform the outside algorithm and fill the Dprime matrix, assumes that D matrix is computed already

void LocARNA::AlignerP::align_outside_arcmatch ( size_type  al,
size_type  ar,
size_type  max_ar,
size_type  bl,
size_type  br,
size_type  max_br 
) [protected]

align outside of an arc-match

Parameters:
alleft end of arc in seqA
arright end of arc in seqA
blleft end of arc in seqB
brright end of arc in seqB
max_arleftmost right end in seqA, for which the score can simply be composed from M and Mrev.
max_brleftmost right end in seqB, for which the score can simply be composed from M and Mrev.
void LocARNA::AlignerP::align_reverse ( size_type  al,
size_type  ar,
size_type  bl,
size_type  br,
bool  copy = false 
) [protected]

align reversed. fills matrices Mrev, Erev, Frev, such that Mrev(i,j) codes for subsequences seqA(i+1..ar) and seqB(j+1..br) and is valid for al-1<=i<=ar and bl-1<=j<=br

Parameters:
alleft position delimiting range of positions in seqA
arright position delimiting range of positions in seqA
blleft position delimiting range of positions in seqB
brright position delimiting range of positions in seqB pre: matrix Mrev has size 0..lenA x 0..lenB
copyif true, make a copy of Erev/Frev in Erev/Frev_mat

Note that al, ar, bl, br denote the actual limits of the subsequences, which differs from their use in align_inside_arcmatch. Otherwise the two methods correspond to each other by respectively performing forward and backward computation!

pf_score_t LocARNA::AlignerP::comp_E_entry ( size_type  al,
size_type  bl,
size_type  i,
size_type  j 
) [inline, protected]

initialize first column and row of M' for outside recursion

initialize first row of E' for outside recursion compute one entry in E (inside recursion cases)

pf_score_t LocARNA::AlignerP::comp_Mrev_entry ( size_type  i,
size_type  j,
size_type  ar,
size_type  br 
) [inline, protected]

compute one entry in Mrev, where Mrev(i,j) codes for subsequences seqA(i+1..ar) and seqB(j+1..br)

Parameters:
arright position delimiting range of positions in seqA
brright position delimiting range of positions in seqB
iposition in seqA
jposition in seqB assert i<=ar and j<=br. pre: matrix entries Mrev(i',j') and Erev(j') computed/initialised for i<=i'<=ar, j<=j'<=br, (i,j)!=(i',j')
Returns:
score of entry M(i,j)

computes the probabilitites of all arc matches and stores them internally (in a sparse matrix), no probability filtering

Todo:
check: why is that long double? do we need it? should we rather use pf_t?
void LocARNA::AlignerP::compute_basematch_probabilities ( bool  basematch_probs_include_arcmatch)

computes the probabilitites of all base matches and stores them internally (in a 2D-matrix), no probability filtering

Fragment match probability.

Computes the probability that two fragments [i..j] and [k..l] are matched in an alignment.

Parameters:
istart of first fragment
jend of first fragment
kstart of second fragment
lend of second fragment
Returns:
fragment match probability
static AlignerPParams LocARNA::AlignerP::create ( ) [inline, static]

create with named parameters

Returns:
parameter object
void LocARNA::AlignerP::fill_D ( size_type  al,
size_type  bl,
size_type  max_ar,
size_type  max_br 
) [protected]

fill in D the entries with left ends al,bl, where adjlA, adjlB are adjanceny lists of the arcs uses inside recursion

void LocARNA::AlignerP::fill_Dprime ( size_type  al,
size_type  bl,
size_type  min_ar,
size_type  min_br,
size_type  max_ar,
size_type  max_br 
) [protected]

fill in D the entries with right ends ar,br, where adjrA, adjrB are adjanceny lists of the arcs uses outside recursion

void LocARNA::AlignerP::init_Erev ( size_type  al,
size_type  ar,
size_type  bl,
size_type  br 
) [protected]

initialize the reversed E matrix/vector Erev[j] codes for subsequences seqA(i+1..ar) and seqB(j+1..br) and is valid for bl-1<=j<=br

Parameters:
alleft position delimiting range of positions in seqA
arright position delimiting range of positions in seqA
blleft position delimiting range of positions in seqB
brright position delimiting range of positions in seqB pre: Erev has size 0..lenB
void LocARNA::AlignerP::init_Mrev ( size_type  al,
size_type  ar,
size_type  bl,
size_type  br 
) [protected]

initialize the reversed M matrix, such that Mrev(i,j) codes for subsequences seqA(i+1..ar) and seqB(j+1..br) and is valid for al-1<=i<=ar and bl-1<=j<=br

Parameters:
alleft position delimiting range of positions in seqA
arright position delimiting range of positions in seqA
blleft position delimiting range of positions in seqB
brright position delimiting range of positions in seqB pre: matrix Mrev has size 0..lenA x 0..lenB

determine leftmost end of an arc that covers the range l..r

Parameters:
ssequence position, limit to base pairs right of s or equal
bpsbase pairs
lsequence position, left end of range
rsequence position, right end of range
Returns:
leftmost end l'>=s of any arc in bps that covers (l,r). Return l if there is no such arc An arc (l',r') covers (l,r) iff l'<l and r'>r.

compute the leftmost left ends of an arc match that covers (al,ar);(bl,br) (or smaller positions).

Parameters:
alleft end of base pair in seqA
arright end of base pair in seqA
blleft end of base pair in seqB
brright end of base pair in seqB
Parameters:
bpsbase pairs
lsequence position, left end of range
rsequence position, right end of range
ssequence position, limit to base pairs left of s or equal
Returns:
rightmost end r'<=s of an arc in bps that covers (l,r). Return r if there is no such arc An arc (l',r') covers (l,r) iff l'<l and r'>r.
Parameters:
alleft end of base pair in seqA
arright end of base pair in seqA
blleft end of base pair in seqB
brright end of base pair in seqB returns the rightmost left ends of an arc match that covers (al,ar);(bl,br) (or smaller positions).
pf_score_t LocARNA::AlignerP::virtual_Mprime ( size_type  al,
size_type  bl,
size_type  i,
size_type  j,
size_type  max_ar,
size_type  max_br 
) const

Access virtual Mprime matrix.

tests whether Mprime value can be composed from M and Mrev and thus is not materialized (i.e. only virtual)

Parameters:
al
bl
i
j
max_ar
max_br
Returns:
entry of virtual Mprime matrix

write the arc match probabilities to a stream

probabilities are filtered by threshold params->min_am_prob

Parameters:
outoutput stream

write the base match probabilities to a stream

probabilities are filtered by threshold params->min_bm_prob

Parameters:
outoutput stream

Member Data Documentation

probabilities of base matchs, as computed by the algo. Because we use bm_prob to accumulate conditional partition functions before dividing by the total partition function to obtain probabilities, use PFScoreMatrix. We assume that the type is more general than ProbMatrix

D(a,b) is the partition function of the subsequences seqA(al..ar) and seqB(bl..br), where the arcs a and b match

D'(a,b) is the partition function of the subsequences seqA(1..al-1,ar+1..lenA) and seqB(1..bl-1,br+1..lenB) times the contribution of the arc match (al,ar);(bl,br)

PFScoreVector LocARNA::AlignerP::E [protected]

For the current pair of left arc ends (al,bl) and a current line i E(j) is the partition function of the subsequences seqA(al+1..i) and seqB(bl+1..j) covering only alignments that gap the last position of seqA

In the algorithm, this is constantly overwritten, i.e. for a current j all entries E(j') j'<j are for the current line i and all entries j'>j are for the line i-1

PFScoreVector LocARNA::AlignerP::Eprime [protected]

For the current pair of left arc ends (al,bl) and line i, E'(j) is the partition function of the subsequences seqA(1..al-1,i+1..lenA) and seqB(1..bl-1,j+1..lenB) where i+1 is aligned to a gap

PFScoreVector LocARNA::AlignerP::Erev [protected]

reverse E "matrix"

See also:
Mrev

for outside optimization, store a complete copy of Erev and Frev

pf_score_t LocARNA::AlignerP::F [protected]

For the current pair of left arc ends (al,bl) and current indices (i,j), F is the the partition function of the subsequences seqA(al+1..i) and seqB(bl+1..j) covering only alignments that gap the last position of seqB

In the algorithm, this is constantly overwritten, i.e. when we compute the entries for (i,j) it will still contain the value of (i,j-1) and is then updated to the value for (i,j)

pf_score_t LocARNA::AlignerP::Fprime [protected]

For the current pair of left arc ends (al,bl) and (i,j), F' is the partition function of the subsequences seqA(1..al-1,i+1..lenA) and seqB(1..bl-1,j+1..lenB) where j+1 is aligned to a gap

pf_score_t LocARNA::AlignerP::Frev [protected]

reverse F "matrix"

See also:
Mrev

complete copy of Frev

See also:
Erev_mat

For the current pair of left arc ends (al,bl), M(i,j) is the partition function of the subsequences seqA(al+1..i) and seqB(bl+1..j)

For the current pair of left arc ends (al,bl), M'(i,j) is the partition function of the subsequences seqA(1..al-1,i+1..lenA) and seqB(1..bl-1,j+1..lenB)

For the current pair of left arc ends (al,bl), Mrev(i,j) is the partition function of the subsequences seqA(i+1..al-1) and seqB(j+1..bl-1)

pf_score_t LocARNA::AlignerP::partFunc [protected]

the total partition function (only defined after call of align_inside())

pf_score_t LocARNA::AlignerP::pf_scale [protected]

scales the partition function. We compute partition functions divided by pf_scale in order to avoid overflow of the double floating point range.

restriction of alignment

See also:
corrsespoding member r of Aligner
Note:
Currently, this is not used for an actual purpose in AlignerP, since we do not implement local alignment and/or k-best for partition functions.

The documentation for this class was generated from the following files:
 All Classes Files Functions Variables Typedefs Enumerations Enumerator Friends