Alphabets
The following alphabets are available:
| Nucleotide based | Amino acid based | Quality based |
|---|---|---|
| ivs::dna2 | ivs::aa27 | ivs::pthred42 |
| ivs::dna4 | ivs::aa20 | ivs::pthred63 |
| ivs::dna5 | ivs::aa10li | ivs::pthred68solexa |
| ivs::rna4 | ivs::aa10murphy | ivs::pthred94 |
| ivs::rna5 | | |
| ivs::iupac | | |
| ivs::dna3bs | | |
| ivs::d_dna4 | | |
| ivs::d_dna5 | | |
| ivs::d_rna4 | | |
| ivs::d_rna5 | | |
| ivs::d_iupac | | |
| ivs::d_dna3bs | | |
Common functionality
- All functions are guaranteed not to throw.
- Alphabets starting with `d_` are alphabets with an additional delimiter `$` as rank 0.
- `static size_t size()`
  Returns the number of elements of the alphabet.
- `static uint8_t char_to_rank(char c)`
  Converts the ASCII symbol `c` to a ranked `uint8_t` representation `r` with 0 ≤ r < size(). Invalid ASCII symbols return the value 255.
- `static char rank_to_char(uint8_t r)`
  Converts the ranked value `r` to its corresponding ASCII value. The value `r` must fulfill 0 ≤ r < size().
- `static char normalize_char(char c)`
  Normalizes the ASCII value `c`. Normalizing depends on the alphabet. Typically this includes representing the value as a capital letter.
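
The interface above can be illustrated with a small stand-in type. The sketch below is not the library's implementation: `toy_dna4` and `encode` are hypothetical names, the rank order A=0, C=1, G=2, T=3 is an assumption (only A=0 and T=3 follow from the complement example in this document), and it is also an assumption that `char_to_rank` accepts only normalized (upper-case) input.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in mirroring the documented interface for a dna4-like
// alphabet. Assumed rank order: A=0, C=1, G=2, T=3.
struct toy_dna4 {
    static std::size_t size() noexcept { return 4; }

    // Normalizes by mapping lower-case ASCII letters to upper case.
    static char normalize_char(char c) noexcept {
        return (c >= 'a' && c <= 'z') ? static_cast<char>(c - 'a' + 'A') : c;
    }

    // Returns the rank of c, or 255 for symbols outside the alphabet.
    // Assumption: only upper-case symbols are considered valid here.
    static std::uint8_t char_to_rank(char c) noexcept {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default:  return 255;
        }
    }

    // Precondition: r < size().
    static char rank_to_char(std::uint8_t r) noexcept {
        assert(r < size());
        static const char table[] = {'A', 'C', 'G', 'T'};
        return table[r];
    }
};

// Encodes text into ranks after normalizing each symbol.
// Returns false as soon as an invalid symbol (rank 255) is encountered.
inline bool encode(const std::string& text, std::vector<std::uint8_t>& out) {
    out.clear();
    for (char c : text) {
        const std::uint8_t r = toy_dna4::char_to_rank(toy_dna4::normalize_char(c));
        if (r == 255) return false;
        out.push_back(r);
    }
    return true;
}
```

Checking for the sentinel value 255 before using a rank keeps the precondition of `rank_to_char` intact, since 255 is never a valid rank for an alphabet of this size.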
Nucleotide based
Additional functionality for nucleotide-based alphabets:
- `static char complement_char(char c)`
  Computes the complement of the ASCII value `c`. For example, in dna4 the complement of 'A' is 'T'.
- `static uint8_t complement_rank(uint8_t)`
  Computes the complement in rank space. For example, in dna4 the complement of rank 0 is rank 3 (A -> T).
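
To make the relation between the two complement functions concrete, here is a minimal sketch under stated assumptions. The names `toy_complement_rank`, `toy_complement_char` and `toy_reverse_complement` are hypothetical and not part of the library; the rank order A=0, C=1, G=2, T=3 is assumed, and passing unknown symbols through unchanged is also an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper. With the assumed rank order A=0, C=1, G=2, T=3 the
// complement mirrors the rank: 0 <-> 3 and 1 <-> 2.
inline std::uint8_t toy_complement_rank(std::uint8_t r) noexcept {
    assert(r < 4);  // precondition: r is a valid dna4-like rank
    return static_cast<std::uint8_t>(3 - r);
}

// Hypothetical helper working on upper-case ASCII symbols.
// Assumption: symbols outside A, C, G, T are returned unchanged.
inline char toy_complement_char(char c) noexcept {
    switch (c) {
        case 'A': return 'T';
        case 'C': return 'G';
        case 'G': return 'C';
        case 'T': return 'A';
        default:  return c;
    }
}

// Illustrative use: reverse the sequence, then complement every symbol.
inline std::string toy_reverse_complement(const std::string& s) {
    std::string out(s.rbegin(), s.rend());
    for (char& c : out) c = toy_complement_char(c);
    return out;
}
```

Working in rank space avoids the character lookup entirely, which is presumably why both variants are offered.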
Ambiguous bases
Some alphabets have ambiguous bases. For example, in dna5 the letter 'N' can stand for 'A', 'C', 'G' or 'T'. For these alphabets we provide special functionality:
- `static auto ambigous_bases() -> std::array<uint8_t, /*alphabet dependent*/>`
  Returns an array of the bases that are ambiguous.
- `static auto base_alternatives(uint8_t base) -> std::vector<uint8_t>`
  Returns a list of values that `base` could stand for.
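
The following sketch shows how these two functions might be used together. `toy_dna5` and `count_expansions` are hypothetical names, not the library's code. The rank order A=0, C=1, G=2, T=3, N=4 is an assumption, as is the choice to return the base itself for a non-ambiguous input.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical dna5-like stand-in. Assumed ranks: A=0, C=1, G=2, T=3, N=4.
struct toy_dna5 {
    // Only N (rank 4) is ambiguous in this toy alphabet.
    static std::array<std::uint8_t, 1> ambigous_bases() noexcept {
        return {{4}};
    }

    // Returns the ranks that `base` could stand for.
    // Assumption: a non-ambiguous base stands only for itself.
    static std::vector<std::uint8_t> base_alternatives(std::uint8_t base) {
        assert(base < 5);  // precondition: valid rank
        if (base == 4) return {0, 1, 2, 3};
        return {base};
    }
};

// Counts how many concrete sequences a ranked sequence could represent
// by multiplying the number of alternatives at each position.
inline std::size_t count_expansions(const std::vector<std::uint8_t>& ranks) {
    std::size_t n = 1;
    for (std::uint8_t r : ranks) n *= toy_dna5::base_alternatives(r).size();
    return n;
}
```

For instance, under these assumptions the ranked sequence for "ANN" has 1 * 4 * 4 = 16 possible concrete expansions.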