Documentation ¶
Overview ¶
Package align aligns Antha sequences using the biogo implementations of the Needleman-Wunsch and Smith-Waterman alignment algorithms.
Index ¶
- Constants
- Variables
- type Alignment
- func (a *Alignment) Match() string
- func (a *Alignment) QueryEnd() int
- func (a *Alignment) QueryFrame() int
- func (a *Alignment) QueryStart() int
- func (a *Alignment) Split(maxSectionLength int) ([]Alignment, error)
- func (a *Alignment) TemplateEnd() int
- func (a *Alignment) TemplateFrame() int
- func (a *Alignment) TemplateStart() int
- type Position
- type RawAlignment
- type Result
- func DNA(template, query wtype.DNASequence, alignmentMatrix ScoringMatrix) (alignment Result, err error)
- func DNAFwd(template, query wtype.DNASequence, alignmentMatrix ScoringMatrix) (Result, error)
- func DNARev(template, query wtype.DNASequence, alignmentMatrix ScoringMatrix) (alignment Result, err error)
- func DNASet(query wtype.DNASequence, templates []wtype.DNASequence, ...) ([]Result, error)
- func (r Result) Coverage() float64
- func (r Result) Gaps() int
- func (r Result) Identity() float64
- func (r Result) LongestContinuousSequence() wtype.DNASequence
- func (r Result) Matches() int
- func (r Result) Mismatches() int
- func (r Result) Positions() (result sequences.SearchResult)
- func (r Result) Score() int
- func (r Result) String() string
- type ScoringMatrix
Constants ¶
const GAP rune = rune('-')
GAP defines a standard character representing an alignment gap
const OutputMatch = "|"
OutputMatch defines a character representing an alignment match
const OutputMismatch = " "
OutputMismatch defines a character representing an alignment mismatch
Variables ¶
var (
	// Fitted is the linear gap penalty fitted Needleman-Wunsch aligner type.
	//        Query letter
	//      -   A   C   G   T
	//  -   0  -5  -5  -5  -5
	//  A  -5  10  -3  -1  -4
	//  C  -5  -3   9  -5   0
	//  G  -5  -1  -5   7  -3
	//  T  -5  -4   0  -3   8
	Fitted = align.Fitted{
		{0, -5, -5, -5, -5},
		{-5, 10, -3, -1, -4},
		{-5, -3, 9, -5, 0},
		{-5, -1, -5, 7, -3},
		{-5, -4, 0, -3, 8},
	}

	// FittedAffine is the affine gap penalty fitted Needleman-Wunsch aligner type.
	//        Query letter
	//      -   A   C   G   T
	//  -   0  -1  -1  -1  -1
	//  A  -1   1  -1  -1  -1
	//  C  -1  -1   1  -1  -1
	//  G  -1  -1  -1   1  -1
	//  T  -1  -1  -1  -1   1
	//
	// Gap open: -5
	FittedAffine = align.FittedAffine{
		Matrix: align.Linear{
			{0, -1, -1, -1, -1},
			{-1, 1, -1, -1, -1},
			{-1, -1, 1, -1, -1},
			{-1, -1, -1, 1, -1},
			{-1, -1, -1, -1, 1},
		},
		GapOpen: -5,
	}

	// NW is the linear gap penalty Needleman-Wunsch aligner type.
	//        Query letter
	//      -   A   C   G   T
	//  -   0  -5  -5  -5  -5
	//  A  -5  10  -3  -1  -4
	//  C  -5  -3   9  -5   0
	//  G  -5  -1  -5   7  -3
	//  T  -5  -4   0  -3   8
	NW = align.NW{
		{0, -5, -5, -5, -5},
		{-5, 10, -3, -1, -4},
		{-5, -3, 9, -5, 0},
		{-5, -1, -5, 7, -3},
		{-5, -4, 0, -3, 8},
	}

	// NWAffine is the affine gap penalty Needleman-Wunsch aligner type.
	//        Query letter
	//      -   A   C   G   T
	//  -   0  -1  -1  -1  -1
	//  A  -1   1  -1  -1  -1
	//  C  -1  -1   1  -1  -1
	//  G  -1  -1  -1   1  -1
	//  T  -1  -1  -1  -1   1
	//
	// Gap open: -5
	NWAffine = align.NWAffine{
		Matrix: align.Linear{
			{0, -1, -1, -1, -1},
			{-1, 1, -1, -1, -1},
			{-1, -1, 1, -1, -1},
			{-1, -1, -1, 1, -1},
			{-1, -1, -1, -1, 1},
		},
		GapOpen: -5,
	}

	// SW1 is the Smith-Waterman aligner type.
	// Matrix is a square scoring matrix with the first column and first row
	// specifying gap penalties. Currently gap opening is not considered.
	// w(gap) = -1
	// w(match) = +2
	// w(mismatch) = -1
	SW1 = align.SW{
		{0, -1, -1, -1, -1},
		{-1, 2, -1, -1, -1},
		{-1, -1, 2, -1, -1},
		{-1, -1, -1, 2, -1},
		{-1, -1, -1, -1, 2},
	}

	// SW2 is the Smith-Waterman aligner type.
	// Matrix is a square scoring matrix with the first column and first row
	// specifying gap penalties. Currently gap opening is not considered.
	// w(gap) = 0
	// w(match) = +2
	// w(mismatch) = -1
	SW2 = align.SW{
		{0, 0, 0, 0, 0},
		{0, 2, -1, -1, -1},
		{0, -1, 2, -1, -1},
		{0, -1, -1, 2, -1},
		{0, -1, -1, -1, 2},
	}

	// SWAffine is the affine gap penalty Smith-Waterman aligner type.
	//        Query letter
	//      -   A   C   G   T
	//  -   0  -1  -1  -1  -1
	//  A  -1   1  -1  -1  -1
	//  C  -1  -1   1  -1  -1
	//  G  -1  -1  -1   1  -1
	//  T  -1  -1  -1  -1   1
	//
	// Gap open: -5
	SWAffine = align.SWAffine{
		Matrix: align.Linear{
			{0, -1, -1, -1, -1},
			{-1, 1, -1, -1, -1},
			{-1, -1, 1, -1, -1},
			{-1, -1, -1, 1, -1},
			{-1, -1, -1, -1, 1},
		},
		GapOpen: -5,
	}
)
var Algorithms = map[string]ScoringMatrix{
	"Fitted":       Fitted,
	"FittedAffine": FittedAffine,
	"NW":           NW,
	"NWAffine":     NWAffine,
	"SW1":          SW1,
	"SW2":          SW2,
	"SWAffine":     SWAffine,
}
Algorithms provides a map to look up ScoringMatrix algorithms by name. Algorithms available:
- Fitted: a modified Needleman-Wunsch algorithm which finds a local region of the reference with high similarity to the query.
- FittedAffine: the affine gap penalty variant of Fitted, which also finds a local region of the reference with high similarity to the query.
- NW: the Needleman-Wunsch algorithm.
- NWAffine: the affine gap penalty Needleman-Wunsch algorithm.
- SW1 and SW2: the Smith-Waterman algorithm.
- SWAffine: the affine gap penalty Smith-Waterman algorithm.
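To illustrate how a matrix such as SW1 is read, the following is a minimal sketch of linear gap penalty Smith-Waterman scoring in plain Go. It does not use biogo and is not how this package is implemented; the names sw1, index and swScore are hypothetical, and the letter order (gap, A, C, G, T) is taken from the matrix comments in the var block.

```go
package main

import "fmt"

// sw1 mirrors the SW1 values listed above: row and column 0 hold the gap
// penalty, rows and columns 1 to 4 correspond to A, C, G and T.
var sw1 = [5][5]int{
	{0, -1, -1, -1, -1},
	{-1, 2, -1, -1, -1},
	{-1, -1, 2, -1, -1},
	{-1, -1, -1, 2, -1},
	{-1, -1, -1, -1, 2},
}

// index maps a nucleotide to its row/column in the matrix (hypothetical helper).
func index(b byte) int {
	switch b {
	case 'A':
		return 1
	case 'C':
		return 2
	case 'G':
		return 3
	case 'T':
		return 4
	}
	return 0 // anything else is treated as the gap letter
}

// swScore returns the best local alignment score of query against template
// using the standard Smith-Waterman recurrence with a linear gap penalty.
func swScore(template, query string, m [5][5]int) int {
	prev := make([]int, len(query)+1)
	best := 0
	for i := 1; i <= len(template); i++ {
		curr := make([]int, len(query)+1)
		t := index(template[i-1])
		for j := 1; j <= len(query); j++ {
			q := index(query[j-1])
			score := prev[j-1] + m[t][q] // match or mismatch
			if s := prev[j] + m[t][0]; s > score {
				score = s // template letter aligned to a gap
			}
			if s := curr[j-1] + m[0][q]; s > score {
				score = s // query letter aligned to a gap
			}
			if score < 0 {
				score = 0 // a local alignment never drops below zero
			}
			curr[j] = score
			if score > best {
				best = score
			}
		}
		prev = curr
	}
	return best
}

func main() {
	// The query occurs exactly in the template: four matches at +2 each.
	fmt.Println(swScore("GGACGTGG", "ACGT", sw1))
}
```

With SW2 in place of SW1 the first row and column would be zero, so gaps would cost nothing and only mismatches would lower the score.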
Functions ¶
This section is empty.
Types ¶
type Alignment ¶
type Alignment struct {
	TemplateResult    string
	QueryResult       string
	Raw               []RawAlignment
	TemplatePositions []int
	QueryPositions    []int
	Score             int
}
Alignment stores the string result of an alignment of a query sequence against a template. The original RawAlignments are also included.
func (*Alignment) Match ¶
Match produces a formatted line indicating matches between aligned sequences
GCTTTTTTAT res1
|   ||||||      <- like this
GGG-TTTTAT res2
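As a rough illustration of the idea, a match line can be built column by column from two aligned strings. This is a sketch only, not the package's implementation; matchLine is a hypothetical helper, and it assumes both strings are upper case, of equal length, and use "-" for gaps.

```go
package main

import (
	"fmt"
	"strings"
)

// matchLine writes "|" for each column where both aligned strings hold the
// same non-gap letter and " " for every other column.
func matchLine(template, query string) string {
	var b strings.Builder
	for i := 0; i < len(template) && i < len(query); i++ {
		if template[i] == query[i] && template[i] != '-' {
			b.WriteString("|")
		} else {
			b.WriteString(" ")
		}
	}
	return b.String()
}

func main() {
	fmt.Println("GCTTTTTTAT")
	fmt.Println(matchLine("GCTTTTTTAT", "GGG-TTTTAT"))
	fmt.Println("GGG-TTTTAT")
}
```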
func (*Alignment) QueryFrame ¶
QueryFrame returns -1 if the query is aligned in the reverse direction, 1 otherwise.
func (*Alignment) QueryStart ¶
QueryStart returns the start position of the alignment in the query
func (*Alignment) Split ¶
Split divides an alignment into sections of up to a specified length, which helps with formatting.
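The general idea of sectioning an alignment for display can be sketched as follows. The section type and split function are hypothetical stand-ins that work on plain aligned strings; the error conditions are assumptions and may differ from those of the real Split method.

```go
package main

import (
	"errors"
	"fmt"
)

// section is a hypothetical stand-in for one piece of a split alignment.
type section struct {
	Template, Query string
}

// split cuts two equal-length aligned strings into consecutive sections of
// at most maxLen columns each.
func split(template, query string, maxLen int) ([]section, error) {
	if maxLen <= 0 {
		return nil, errors.New("maxLen must be positive")
	}
	if len(template) != len(query) {
		return nil, errors.New("aligned strings differ in length")
	}
	var out []section
	for start := 0; start < len(template); start += maxLen {
		end := start + maxLen
		if end > len(template) {
			end = len(template)
		}
		out = append(out, section{Template: template[start:end], Query: query[start:end]})
	}
	return out, nil
}

func main() {
	parts, err := split("GCTTTTTTAT", "GGG-TTTTAT", 4)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, p := range parts {
		fmt.Println(p.Template)
		fmt.Println(p.Query)
		fmt.Println()
	}
}
```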
func (*Alignment) TemplateEnd ¶
TemplateEnd returns the end position of the alignment in the template
func (*Alignment) TemplateFrame ¶
TemplateFrame returns -1 if the template is aligned in the reverse direction, 1 otherwise.
func (*Alignment) TemplateStart ¶
TemplateStart returns the start position of the alignment in the template
type RawAlignment ¶
RawAlignment contains the positions aligned between the template and query sequences
type Result ¶
type Result struct {
	Template  wtype.BioSequence
	Query     wtype.BioSequence
	Algorithm ScoringMatrix
	Alignment Alignment
}
Result stores the full results of an alignment of a query against a template sequence, including the algorithm used.
func DNA ¶
func DNA(template, query wtype.DNASequence, alignmentMatrix ScoringMatrix) (alignment Result, err error)
DNA aligns two DNA sequences using a specified scoring algorithm. It returns an alignment description, or an error if the scoring matrix is not square or if the sequence data types or alphabets do not match.
Algorithms available are:
- Fitted: a modified Needleman-Wunsch algorithm which finds a local region of the reference with high similarity to the query.
- FittedAffine: the affine gap penalty variant of Fitted, which also finds a local region of the reference with high similarity to the query.
- NW: the Needleman-Wunsch algorithm.
- NWAffine: the affine gap penalty Needleman-Wunsch algorithm.
- SW1 and SW2: the Smith-Waterman algorithm.
- SWAffine: the affine gap penalty Smith-Waterman algorithm.
Alignment of the reverse complement of the query sequence is also attempted, and if it produces more matches the reverse alignment is returned.
In the resulting alignment, mismatches are represented by lower case letters and gaps by the GAP character "-".
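The forward-versus-reverse choice described above can be sketched in plain Go. Everything here is an illustrative assumption: reverseComplement, ungappedMatches and pickOrientation are hypothetical helpers, and the naive ungapped comparison merely stands in for a real aligner so that the example stays self-contained.

```go
package main

import "fmt"

// reverseComplement returns the reverse complement of an upper-case DNA
// string. Letters other than A, C, G and T are copied unchanged.
func reverseComplement(seq string) string {
	comp := map[byte]byte{'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}
	out := make([]byte, len(seq))
	for i := 0; i < len(seq); i++ {
		b := seq[len(seq)-1-i]
		if c, ok := comp[b]; ok {
			b = c
		}
		out[i] = b
	}
	return string(out)
}

// ungappedMatches is a deliberately naive stand-in for an aligner: it slides
// the query along the template and returns the highest count of identical
// letters found at any offset.
func ungappedMatches(template, query string) int {
	best := 0
	for off := 0; off+len(query) <= len(template); off++ {
		n := 0
		for i := 0; i < len(query); i++ {
			if template[off+i] == query[i] {
				n++
			}
		}
		if n > best {
			best = n
		}
	}
	return best
}

// pickOrientation returns the query orientation that yields more matches,
// and whether that orientation is the reverse complement.
func pickOrientation(template, query string) (string, bool) {
	rev := reverseComplement(query)
	if ungappedMatches(template, rev) > ungappedMatches(template, query) {
		return rev, true
	}
	return query, false
}

func main() {
	oriented, reversed := pickOrientation("AAGGCTTACC", "TAAGCC")
	fmt.Println(oriented, reversed)
}
```

The forward orientation is kept on a tie, which matches the wording above that the reverse alignment is returned only when its number of matches is higher.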
func DNAFwd ¶
func DNAFwd(template, query wtype.DNASequence, alignmentMatrix ScoringMatrix) (Result, error)
DNAFwd returns an alignment of a query sequence to a template sequence in the forward frame of the template, using a specified scoring algorithm
func DNARev ¶
func DNARev(template, query wtype.DNASequence, alignmentMatrix ScoringMatrix) (alignment Result, err error)
DNARev returns the alignment of a query sequence to a template sequence in the reverse frame of the template, using a specified scoring algorithm
func DNASet ¶
func DNASet(query wtype.DNASequence, templates []wtype.DNASequence, alignmentMatrix ScoringMatrix, maxResults int) ([]Result, error)
DNASet aligns a query to a collection (or database) of sequences, testing both forward and reverse directions. It returns the top scoring alignment results found in rank order, up to a specified number.
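The ranking step can be pictured with a small sketch. The hit type and topHits function are hypothetical; the sketch assumes that a higher score ranks first and that a negative maxResults means no limit, neither of which is confirmed by the documentation above.

```go
package main

import (
	"fmt"
	"sort"
)

// hit is a hypothetical stand-in for one alignment result.
type hit struct {
	Template string
	Score    int
}

// topHits sorts hits by descending score and keeps at most maxResults of them.
func topHits(hits []hit, maxResults int) []hit {
	ranked := append([]hit(nil), hits...) // copy so the caller's slice is untouched
	sort.SliceStable(ranked, func(i, j int) bool {
		return ranked[i].Score > ranked[j].Score
	})
	if maxResults >= 0 && len(ranked) > maxResults {
		ranked = ranked[:maxResults]
	}
	return ranked
}

func main() {
	hits := []hit{
		{Template: "plasmidA", Score: 12},
		{Template: "plasmidB", Score: 40},
		{Template: "plasmidC", Score: 27},
	}
	for _, h := range topHits(hits, 2) {
		fmt.Println(h.Template, h.Score)
	}
}
```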
func (Result) Coverage ¶
Coverage returns the proportion of matching nucleotides in the alignment relative to the template sequence. The value returned is between 0 and 1, where 1 = 100% and 0 = 0%.
func (Result) Identity ¶
Identity returns the proportion of matching nucleotides of the query in the template sequence. The value returned is between 0 and 1, where 1 = 100% and 0 = 0%.
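One plausible reading of Coverage and Identity is a match count divided by a sequence length. The sketch below assumes, without confirmation from the source, that Identity divides by the query length and Coverage by the template length; ratio is a hypothetical helper.

```go
package main

import "fmt"

// ratio returns part/whole as a fraction, or 0 when whole is not positive.
func ratio(part, whole int) float64 {
	if whole <= 0 {
		return 0
	}
	return float64(part) / float64(whole)
}

func main() {
	matches := 7      // matched columns in the alignment
	queryLen := 10    // assumed denominator for identity
	templateLen := 28 // assumed denominator for coverage
	fmt.Printf("identity %.2f, coverage %.2f\n",
		ratio(matches, queryLen), ratio(matches, templateLen))
}
```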
func (Result) LongestContinuousSequence ¶
func (r Result) LongestContinuousSequence() wtype.DNASequence
LongestContinuousSequence returns the longest unbroken chain of matches as a DNA sequence.
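Finding the longest unbroken chain of matches amounts to a single scan over the aligned columns. The following sketch works on plain aligned strings rather than on a Result; longestRun is a hypothetical helper and assumes upper-case input with "-" for gaps.

```go
package main

import "fmt"

// longestRun returns the longest stretch of consecutive columns in which both
// aligned strings hold the same non-gap letter, plus its starting column.
// The column index refers to the aligned strings, not the original template.
func longestRun(template, query string) (string, int) {
	bestStart, bestLen := 0, 0
	start, length := 0, 0
	for i := 0; i < len(template) && i < len(query); i++ {
		if template[i] == query[i] && template[i] != '-' {
			if length == 0 {
				start = i // a new run begins here
			}
			length++
			if length > bestLen {
				bestStart, bestLen = start, length
			}
		} else {
			length = 0 // a mismatch or gap breaks the run
		}
	}
	return template[bestStart : bestStart+bestLen], bestStart
}

func main() {
	run, col := longestRun("GCTTTTTTAT", "GGG-TTTTAT")
	fmt.Println(run, col)
}
```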
func (Result) Matches ¶
Matches returns the number of matched nucleotides between the aligned query sequence and aligned template sequence.
func (Result) Mismatches ¶
Mismatches returns the number of mismatched nucleotides between the aligned query sequence and aligned template sequence.
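The match, mismatch and gap counts can be pictured as a tally over aligned columns. This is a sketch under assumptions, not the package's code: tally is a hypothetical helper that expects two upper-case aligned strings of equal length with "-" marking gaps.

```go
package main

import "fmt"

// tally classifies each aligned column as a gap (either side holds "-"),
// a match (same letter) or a mismatch (different letters).
func tally(template, query string) (matches, mismatches, gaps int) {
	for i := 0; i < len(template) && i < len(query); i++ {
		switch {
		case template[i] == '-' || query[i] == '-':
			gaps++
		case template[i] == query[i]:
			matches++
		default:
			mismatches++
		}
	}
	return matches, mismatches, gaps
}

func main() {
	m, x, g := tally("GCTTTTTTAT", "GGG-TTTTAT")
	fmt.Println(m, x, g)
}
```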
func (Result) Positions ¶
func (r Result) Positions() (result sequences.SearchResult)
Positions returns a SearchResult detailing the positions in the template sequence of the longest continuous matching sequence from the alignment.
type ScoringMatrix ¶
type ScoringMatrix interface {
Align(reference, query align.AlphabetSlicer) ([]feat.Pair, error)
}
ScoringMatrix implements the align.Aligner interface of the biogo/align package. An align.Aligner aligns the sequence data of two type-matching Slicers, returning an ordered slice of features describing matching and mismatching segments. The sequences to be aligned must have a valid gap letter in the first position of their alphabet; the alphabets {DNA,RNA}{gapped,redundant} and Protein provided by the biogo/alphabet package satisfy this.