Documentation
¶
Overview ¶
Package poly provides the core Sequence struct and methods for working with DNA, RNA, and Amino Acid sequences in Poly.
Each Sequence struct can be broken down into the following:
Meta: Author, Date, Description, Name, Version, etc. Features: A list of Feature structs containing feature locations and other information. Sequece: The actual sequence string.
As well as other info like source file checksums, etc.
This package will likely be overhauled before 1.0 release to be more flexible and robust. So be on the lookup for breaking changes in future releases.
Index ¶
Examples ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Feature ¶
type Feature struct {
Name string //Seqid in gff, name in gbk
//gff specific
Source string `json:"source"`
Type string `json:"type"`
Score string `json:"score"`
Strand string `json:"strand"`
Phase string `json:"phase"`
Attributes map[string]string `json:"attributes"`
GbkLocationString string `json:"gbk_location_string"`
Sequence string `json:"sequence"`
SequenceLocation Location `json:"sequence_location"`
SequenceHash string `json:"sequence_hash"`
Description string `json:"description"`
SequenceHashFunction string `json:"hash_function"`
ParentSequence *Sequence `json:"-"`
}
Feature holds a single annotation in a struct. from https://github.com/blachlylab/gff3/blob/master/gff3.go
func (Feature) GetSequence ¶
GetSequence is a method wrapper to get a Feature's sequence. Mutates with Sequence.
Example ¶
// Sequence for greenflourescent protein (GFP) that we're using as test data for this example. gfpSequence := "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTTAATGGGCACAAATTTTCTGTCAGTGGAGAGGGTGAAGGTGATGCTACATACGGAAAGCTTACCCTTAAATTTATTTGCACTACTGGAAAACTACCTGTTCCATGGCCAACACTTGTCACTACTTTCTCTTATGGTGTTCAATGCTTTTCCCGTTATCCGGATCATATGAAACGGCATGACTTTTTCAAGAGTGCCATGCCCGAAGGTTATGTACAGGAACGCACTATATCTTTCAAAGATGACGGGAACTACAAGACGCGTGCTGAAGTCAAGTTTGAAGGTGATACCCTTGTTAATCGTATCGAGTTAAAAGGTATTGATTTTAAAGAAGATGGAAACATTCTCGGACACAAACTCGAGTACAACTATAACTCACACAATGTATACATCACGGCAGACAAACAAAAGAATGGAATCAAAGCTAACTTCAAAATTCGCCACAACATTGAAGATGGATCCGTTCAACTAGCAGACCATTATCAACAAAATACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTACCTGTCGACACAATCTGCCCTTTCGAAAGATCCCAACGAAAAGCGTGACCACATGGTCCTTCTTGAGTTTGTAACTGCTGCTGGGATTACACATGGCATGGATGAGCTCTACAAATAA" // initialize sequence and feature structs. var sequence Sequence var feature Feature // set the initialized sequence struct's sequence. sequence.Sequence = gfpSequence // Set the initialized feature name and sequence location. feature.Name = "Green Flourescent Protein" feature.SequenceLocation.Start = 0 feature.SequenceLocation.End = len(sequence.Sequence) // Add the GFP feature to the sequence struct. sequence.AddFeature(&feature) // get the GFP feature sequence string from the sequence struct. featureSequence := feature.GetSequence() // check to see if the feature was inserted properly into the sequence. fmt.Println(gfpSequence == featureSequence)
Output: true
type Location ¶
type Location struct {
Start int `json:"start"`
End int `json:"end"`
Complement bool `json:"complement"`
Join bool `json:"join"`
FivePrimePartial bool `json:"five_prime_partial"`
ThreePrimePartial bool `json:"three_prime_partial"`
SubLocations []Location `json:"sub_locations"`
}
Location holds nested location info for sequence region.
type Locus ¶
type Locus struct {
Name string `json:"name"`
SequenceLength string `json:"sequence_length"`
MoleculeType string `json:"molecule_type"`
GenbankDivision string `json:"genbank_division"`
ModificationDate string `json:"modification_date"`
SequenceCoding string `json:"sequence_coding"`
Circular bool `json:"circular"`
Linear bool `json:"linear"`
}
Locus holds Locus information in a Meta struct.
type Meta ¶
type Meta struct {
Name string `json:"name"`
GffVersion string `json:"gff_version"`
RegionStart int `json:"region_start"`
RegionEnd int `json:"region_end"`
Size int `json:"size"`
Type string `json:"type"`
Date string `json:"date"`
Definition string `json:"definition"`
Accession string `json:"accession"`
Version string `json:"version"`
Keywords string `json:"keywords"`
Organism string `json:"organism"`
Source string `json:"source"`
Origin string `json:"origin"`
Locus Locus `json:"locus"`
References []Reference `json:"references"`
Other map[string]string `json:"other"`
}
Meta Holds all the meta information of an Sequence struct.
type Reference ¶
type Reference struct {
Index string `json:"index"`
Authors string `json:"authors"`
Title string `json:"title"`
Journal string `json:"journal"`
PubMed string `json:"pub_med"`
Remark string `json:"remark"`
Range string `json:"range"`
}
Reference holds information one reference in a Meta struct.
type Sequence ¶
type Sequence struct {
Meta Meta `json:"meta"`
Description string `json:"description"`
SequenceHash string `json:"sequence_hash"`
SequenceHashFunction string `json:"hash_function"`
Sequence string `json:"sequence"`
Features []Feature `json:"features"`
CheckSum [32]byte `json:"checkSum"` // blake3 checksum of the parsed file itself. Useful for if you want to check if incoming genbank/gff files are different.
}
Sequence holds all sequence information in a single struct.
func (*Sequence) AddFeature ¶
AddFeature is the canonical way to add a Feature into a Sequence struct. Appending a Feature struct directly to Sequence.Feature's will break .GetSequence() method.
Example ¶
// Sequence for greenflourescent protein (GFP) that we're using as test data for this example. gfpSequence := "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTTAATGGGCACAAATTTTCTGTCAGTGGAGAGGGTGAAGGTGATGCTACATACGGAAAGCTTACCCTTAAATTTATTTGCACTACTGGAAAACTACCTGTTCCATGGCCAACACTTGTCACTACTTTCTCTTATGGTGTTCAATGCTTTTCCCGTTATCCGGATCATATGAAACGGCATGACTTTTTCAAGAGTGCCATGCCCGAAGGTTATGTACAGGAACGCACTATATCTTTCAAAGATGACGGGAACTACAAGACGCGTGCTGAAGTCAAGTTTGAAGGTGATACCCTTGTTAATCGTATCGAGTTAAAAGGTATTGATTTTAAAGAAGATGGAAACATTCTCGGACACAAACTCGAGTACAACTATAACTCACACAATGTATACATCACGGCAGACAAACAAAAGAATGGAATCAAAGCTAACTTCAAAATTCGCCACAACATTGAAGATGGATCCGTTCAACTAGCAGACCATTATCAACAAAATACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTACCTGTCGACACAATCTGCCCTTTCGAAAGATCCCAACGAAAAGCGTGACCACATGGTCCTTCTTGAGTTTGTAACTGCTGCTGGGATTACACATGGCATGGATGAGCTCTACAAATAA" // initialize sequence and feature structs. var sequence Sequence var feature Feature // set the initialized sequence struct's sequence. sequence.Sequence = gfpSequence // Set the initialized feature name and sequence location. feature.Name = "Green Flourescent Protein" feature.SequenceLocation.Start = 0 feature.SequenceLocation.End = len(sequence.Sequence) // Add the GFP feature to the sequence struct. sequence.AddFeature(&feature) // get the GFP feature sequence string from the sequence struct. featureSequence := feature.GetSequence() // check to see if the feature was inserted properly into the sequence. fmt.Println(gfpSequence == featureSequence)
Output: true