LocARNA-1.9.2
LocARNA::AnchorConstraints Class Reference

Represents anchor constraints between two sequences.

#include <anchor_constraints.hh>

Public Types

typedef size_t size_type
 size type
typedef std::pair< size_type, size_type > size_pair_t
 size pair
typedef size_pair_t range_t
 type of range

Public Member Functions

 AnchorConstraints (size_type lenA, const std::vector< std::string > &seqCA, size_type lenB, const std::vector< std::string > &seqCB, bool strict)
 Construct from sequence lengths and anchor names.
 AnchorConstraints (size_type lenA, const std::string &seqCA, size_type lenB, const std::string &seqCB, bool strict)
 Construct from sequence lengths and anchor names.
bool allowed_match (size_type i, size_type j) const
 is match allowed
bool allowed_del_unopt (size_type i, size_type j) const
 is deletion allowed? (unoptimized)
bool allowed_del (size_type i, size_type j) const
 is deletion allowed? (unoptimized version)
bool allowed_ins_unopt (size_type i, size_type j) const
 is insertion allowed? (unoptimized)
bool allowed_ins (size_type i, size_type j) const
 is insertion allowed? (unoptimized)
std::string get_name_a (size_type i) const
 get the name of position i in A
std::string get_name_b (size_type j) const
 get the name of position j in B
size_type name_size () const
 returns length/size of the names
bool empty () const
 is the constraint declaration empty
size_pair_t rightmost_anchor () const
 Get rightmost anchor.
size_pair_t leftmost_anchor () const
 Get leftmost anchor.
bool is_anchored_a (size_type i) const
 Is position in A anchored?
bool is_anchored_b (size_type i) const
 Is position in B anchored?
bool is_named_a (size_type i) const
 Is position in A named?
bool is_named_b (size_type i) const
 Is position in B named?
void print_debug ()
 write some debug information to stderr

Detailed Description

Represents anchor constraints between two sequences.

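Maintains the constraints on (non-structural) alignment edges that have to be satisfied during the alignment.

Alignment algorithms can query whether a match, deletion, or insertion is allowed (see allowed_match(), allowed_del(), allowed_ins()) and ask for information about sequence names.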

SEMANTICS OF ANCHOR CONSTRAINTS

Generally, an anchor constraint (i,j) enforces that positions i in A and j in B are matched; neither i nor j is deleted (for local alignment, this implies that both positions occur in the local alignment). The class offers a choice between two semantics of anchor constraints. The relaxed semantics can drop constraints and produce inconsistencies during multiple alignment when some names occur only in a subset of the sequences. Therefore, the strict semantics was introduced; it avoids such problems by introducing additional (order) dependencies between different names (consequently, the constraint specification is somewhat less flexible).

Relaxed semantics (originally, the only implemented semantics):

a) Positions with equal names must be matched (aligned to each other). Consequently, positions with names that also occur in the other sequence cannot be deleted.
b) Names that occur in only one sequence do not impose any constraints. Therefore, names can occur in arbitrary order.

Strict (ordered) semantics:

a) Names must be strictly lexicographically ordered in the annotation of each sequence.
b) Positions with equal names must be matched.
c) Alignment columns must not violate the lexicographic order, in the following sense: each alignment column where at least one position is named receives this name; the names of alignment columns must be lexicographically ordered (in the order of the columns).
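Rule a) of the strict semantics can be illustrated with a small standalone check. This is a minimal sketch, not LocARNA code: the helper name names_strictly_ordered and the convention that an empty string marks an unnamed position are assumptions made for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of LocARNA): checks rule a) of the strict
// semantics. `names` holds one entry per sequence position; an empty string
// marks an unnamed position (an assumption of this sketch).
bool names_strictly_ordered(const std::vector<std::string> &names) {
    const std::string *prev = nullptr; // last name seen so far
    for (const std::string &name : names) {
        if (name.empty()) {
            continue; // unnamed positions impose no ordering constraint
        }
        // std::string::operator< compares lexicographically; equal names
        // also violate the *strict* order
        if (prev != nullptr && !(*prev < name)) {
            return false;
        }
        prev = &name;
    }
    return true;
}
```

Under the relaxed semantics no such check applies, since names may occur in arbitrary order.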


Constructor & Destructor Documentation

LocARNA::AnchorConstraints::AnchorConstraints ( size_type  lenA,
const std::vector< std::string > &  seqCA,
size_type  lenB,
const std::vector< std::string > &  seqCB,
bool  strict 
)

Construct from sequence lengths and anchor names.

Parameters:
lenA    length of sequence A
seqCA   vector of anchor strings for sequence A
lenB    length of sequence B
seqCB   vector of anchor strings for sequence B
strict  use strict semantics

The constraints (= alignment edges that have to be satisfied) are encoded as follows: equal symbols in the sequences for A and B form an edge.

In order to specify an arbitrary number of sequences, the strings can consist of several lines; a symbol then consists of all characters of the column. '.' and ' ' are neutral characters, in the sense that columns consisting only of neutral characters do not specify names that have to match. However, neutral characters are not identified in names that contain at least one non-neutral character!

Example: seqCA={"..123...."} seqCB={"...12.3...."}

specifies the edges (3,4), (4,5), and (5,7)

Example 2: seqCA={"..AAB....", "..121...."} seqCB={"...AA.B....", "...12.1...."}

specifies the same constraints, allowing a larger name space for constraints.
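The encoding can be made concrete with a standalone sketch that derives the edges from the anchor strings. It is an illustration under stated assumptions, not the constructor's actual implementation: the helpers column_names and anchor_edges are hypothetical, positions are taken to be 1-based (as in the examples above), all lines of one annotation are assumed to have equal length, and each name is assumed to occur at most once per sequence.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: maps each column name to its 1-based position.
// The name of a column is the concatenation of its characters over all
// lines; columns made only of '.' and ' ' are treated as unnamed.
std::map<std::string, std::size_t>
column_names(const std::vector<std::string> &lines) {
    std::map<std::string, std::size_t> pos_of_name;
    if (lines.empty()) {
        return pos_of_name;
    }
    for (std::size_t col = 0; col < lines[0].size(); ++col) {
        std::string name;
        for (const std::string &line : lines) {
            name += line.at(col); // at() throws if a line is too short
        }
        if (name.find_first_not_of(". ") != std::string::npos) {
            pos_of_name[name] = col + 1; // positions are 1-based
        }
    }
    return pos_of_name;
}

// Hypothetical helper: names that occur in both annotations form an
// edge (i,j). Edges are returned in lexicographic order of their names.
std::vector<std::pair<std::size_t, std::size_t>>
anchor_edges(const std::vector<std::string> &seqCA,
             const std::vector<std::string> &seqCB) {
    const std::map<std::string, std::size_t> a = column_names(seqCA);
    const std::map<std::string, std::size_t> b = column_names(seqCB);
    std::vector<std::pair<std::size_t, std::size_t>> edges;
    for (const auto &entry : a) {
        const auto it = b.find(entry.first);
        if (it != b.end()) {
            edges.emplace_back(entry.second, it->second);
        }
    }
    return edges;
}
```

Applied to either example above, this sketch yields the edges (3,4), (4,5), and (5,7).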

LocARNA::AnchorConstraints::AnchorConstraints ( size_type  lenA,
const std::string &  seqCA,
size_type  lenB,
const std::string &  seqCB,
bool  strict 
)

Construct from sequence lengths and anchor names.

Parameters:
lenA    length of sequence A
seqCA   concatenated anchor strings for sequence A (separated by '#')
lenB    length of sequence B
seqCB   concatenated anchor strings for sequence B (separated by '#')
strict  use strict semantics

For the semantics of anchor strings, see the first constructor.
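The relation between the two constructors can be sketched as splitting the concatenated string at '#'. The helper split_anchor_lines below is hypothetical and only illustrates the documented input format; how LocARNA performs the split internally is not shown in this reference.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of LocARNA): splits concatenated anchor
// strings at '#' into the vector form taken by the first constructor.
std::vector<std::string> split_anchor_lines(const std::string &concatenated) {
    std::vector<std::string> lines;
    std::istringstream in(concatenated);
    std::string line;
    // std::getline with a custom delimiter reads up to the next '#'
    while (std::getline(in, line, '#')) {
        lines.push_back(line);
    }
    return lines;
}
```

For instance, "..AAB....#..121...." would be split into the two lines of seqCA in Example 2 of the first constructor.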


Member Function Documentation

bool LocARNA::AnchorConstraints::allowed_del ( size_type  i,
size_type  j 
) const [inline]

is deletion allowed? (unoptimized version)

Parameters:
i  position/matrix index of first sequence
j  position/matrix index of second sequence
Returns:
whether it is allowed to delete i immediately right of j
See also:
allowed_del_unopt()

bool LocARNA::AnchorConstraints::allowed_del_unopt ( size_type  i,
size_type  j 
) const

is deletion allowed? (unoptimized)

Parameters:
i  position/matrix index of first sequence
j  position/matrix index of second sequence
Returns:
whether it is allowed to delete i immediately right of j
See also:
allowed_match(), allowed_ins()

Definition (strict semantics): allowed_del(i,j) iff (! is_anchored(i) && names_a_[ max { i'<=i | named(i') } ] < names_b_[ min { j'>=j+1 | named(j') } ] && names_a_[ min { i'>=i | named(i') } ] > names_b_[ max { j'<=j | named(j') } ])

Definition (relaxed semantics): allowed_del(i,j) iff i~"j+0.5" does not cross (or touch) any edge i'~j', where names_a_[i']=names_b_[j']

Todo:
profile and potentially optimize
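
The relaxed definition can be read as a direct test against the list of anchor edges. The following is a minimal sketch of that reading, not the library's implementation: allowed_del_relaxed is a hypothetical function, and the anchor edges are assumed to be given explicitly as (i',j') pairs. The virtual edge i~"j+0.5" touches an anchor only if i'=i, and crosses it if the anchor lies on opposite sides in A and B.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch of the relaxed definition: deleting position i of A
// immediately right of position j of B is allowed iff every anchor edge
// (i',j') lies entirely to the left or entirely to the right of the
// virtual edge i~"j+0.5".
bool allowed_del_relaxed(
    const std::vector<std::pair<std::size_t, std::size_t>> &anchors,
    std::size_t i, std::size_t j) {
    for (const auto &edge : anchors) {
        // left of the virtual edge: i' < i and j' < j+0.5, i.e. j' <= j
        const bool left = edge.first < i && edge.second <= j;
        // right of the virtual edge: i' > i and j' > j+0.5, i.e. j' > j
        const bool right = edge.first > i && edge.second > j;
        if (!left && !right) {
            return false; // the anchor is crossed, or touched at i' == i
        }
    }
    return true;
}
```

With the anchors (3,4), (4,5), and (5,7) from the constructor example, this sketch rejects any deletion of position 3 (it is anchored) and rejects deleting position 6 right of position 6, because that would cross the edge (5,7).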
bool LocARNA::AnchorConstraints::allowed_ins ( size_type  i,
size_type  j 
) const [inline]

is insertion allowed? (unoptimized)

Parameters:
i  position/matrix index of first sequence
j  position/matrix index of second sequence
Returns:
whether it is allowed to insert j immediately right of i
See also:
allowed_match(), allowed_del()

bool LocARNA::AnchorConstraints::allowed_ins_unopt ( size_type  i,
size_type  j 
) const

is insertion allowed? (unoptimized)

Parameters:
i  position/matrix index of first sequence
j  position/matrix index of second sequence
Returns:
whether it is allowed to insert j immediately right of i
See also:
allowed_match(), allowed_del()

bool LocARNA::AnchorConstraints::allowed_match ( size_type  i,
size_type  j 
) const

is match allowed?

Parameters:
i  position/matrix index of first sequence
j  position/matrix index of second sequence
Returns:
whether i~j is an allowed match

Tests whether the alignment edge i~j (i.e. the match of i and j) is allowed. An alignment edge is allowed iff it is not in conflict with any anchor constraint.

Definition (strict semantics): allowed_match(i,j) iff (names_a_[ max { i'<=i | named(i') } ] <= names_b_[ min { j'>=j | named(j') } ] && names_a_[ min { i'>=i | named(i') } ] >= names_b_[ max { j'<=j | named(j') } ])

Definition (relaxed semantics): allowed_match(i,j) iff i~j does not cross (or touch) any edge i'~j' != i~j, where names_a_[i']=names_b_[j']
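
The relaxed definition of allowed_match can likewise be sketched as a test against an explicit list of anchor edges. This is an illustration under assumptions, not LocARNA's implementation: allowed_match_relaxed is a hypothetical name, and anchors are passed as (i',j') pairs.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch of the relaxed definition: the match i~j is allowed
// iff every anchor edge (i',j') is either the edge i~j itself or lies
// strictly on one side of it in both sequences.
bool allowed_match_relaxed(
    const std::vector<std::pair<std::size_t, std::size_t>> &anchors,
    std::size_t i, std::size_t j) {
    for (const auto &edge : anchors) {
        const bool same = edge.first == i && edge.second == j;
        const bool left = edge.first < i && edge.second < j;
        const bool right = edge.first > i && edge.second > j;
        if (!same && !left && !right) {
            return false; // i~j crosses or touches a different anchor edge
        }
    }
    return true;
}
```

With the anchors (3,4), (4,5), and (5,7), the match 3~4 is allowed because it is itself an anchor, whereas 3~5 touches the anchors (3,4) and (4,5) and 6~6 crosses (5,7), so both are rejected.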

bool LocARNA::AnchorConstraints::is_anchored_a ( size_type  i ) const

Is position in A anchored?

Parameters:
i  position in A
Note:
defined only for positions i in 0..lenA_+1

bool LocARNA::AnchorConstraints::is_anchored_b ( size_type  i ) const

Is position in B anchored?

See also:
is_anchored_a

bool LocARNA::AnchorConstraints::is_named_a ( size_type  i ) const

Is position in A named?

Parameters:
i  position in A

bool LocARNA::AnchorConstraints::is_named_b ( size_type  i ) const

Is position in B named?

See also:
is_named_a

size_pair_t LocARNA::AnchorConstraints::leftmost_anchor ( ) const

Get leftmost anchor.

Returns:
the positions (i,j) of the leftmost anchor constraint
Note:
if there are no anchors, return (lenA+1,lenB+1)

size_pair_t LocARNA::AnchorConstraints::rightmost_anchor ( ) const

Get rightmost anchor.

Returns:
the positions (i,j) of the rightmost anchor constraint
Note:
if there are no anchors, return (0,0)
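
The documented return values of leftmost_anchor() and rightmost_anchor(), including the sentinel results for the case without anchors, can be sketched as follows. The functions leftmost_anchor_sketch and rightmost_anchor_sketch are hypothetical and operate on an explicit list of anchor edges; they assume that the anchors do not cross, so that ordering by position in A agrees with ordering by position in B.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using SizePair = std::pair<std::size_t, std::size_t>;

// Hypothetical sketch: leftmost anchor, or (lenA+1, lenB+1) if none exists.
SizePair leftmost_anchor_sketch(const std::vector<SizePair> &anchors,
                                std::size_t lenA, std::size_t lenB) {
    SizePair best(lenA + 1, lenB + 1); // sentinel right of both sequences
    for (const SizePair &edge : anchors) {
        if (edge.first < best.first) {
            best = edge;
        }
    }
    return best;
}

// Hypothetical sketch: rightmost anchor, or (0, 0) if none exists.
SizePair rightmost_anchor_sketch(const std::vector<SizePair> &anchors) {
    SizePair best(0, 0); // sentinel left of both sequences
    for (const SizePair &edge : anchors) {
        if (edge.first > best.first) {
            best = edge;
        }
    }
    return best;
}
```

The sentinel values lie just outside the valid positions 1..lenA and 1..lenB, so a caller can use them as loop bounds without a separate check for the empty case.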

The documentation for this class was generated from the following files: