Documentation ¶
Overview ¶
Package poly is a Go package for engineering organisms.
Poly can be used in two ways.
- As a Go library where you have finer control and can make magical things happen.
- As a command line utility where you can bash script your way to greatness and make DNA go brrrrrrrr.
Installation ¶
These instructions assume that you already have a working Go environment. If not, see:
https://golang.org/doc/install
Building Poly CLI and package from scratch:
git clone https://github.com/TimothyStiles/poly.git && cd poly && go build ./... && go install ./...
Installing the latest release of poly as a Go package:
go get github.com/TimothyStiles/poly
For CLI-only instructions, please check out: https://pkg.go.dev/github.com/TimothyStiles/poly/poly
Index ¶
- func BuildFASTA(sequence Sequence) []byte
- func BuildGbk(sequence Sequence) []byte
- func BuildGff(sequence Sequence) []byte
- func ComplementBase(basePair rune) rune
- func MarmurDoty(sequence string) float64
- func MeltingTemp(sequence string) float64
- func Optimize(aminoAcids string, codonTable CodonTable) string
- func ReverseComplement(sequence string) string
- func RotateSequence(sequence string) string
- func SantaLucia(sequence string, ...) (meltingTemp, dH, dS float64)
- func Translate(sequence string, codonTable CodonTable) string
- func WriteFASTA(sequence Sequence, path string)
- func WriteGbk(sequence Sequence, path string)
- func WriteGff(sequence Sequence, path string)
- func WriteJSON(sequence Sequence, path string)
- type AminoAcid
- type Codon
- type CodonTable
- type Feature
- type Location
- type Locus
- type Meta
- type Reference
- type Sequence
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func BuildFASTA ¶ added in v0.3.0
BuildFASTA builds a FASTA string from a Sequence struct.
Example ¶
sequence := ReadFASTA("data/base.fasta") // get example data
fasta := BuildFASTA(sequence) // build a fasta byte array
firstLine := string(bytes.Split(fasta, []byte("\n"))[0])
fmt.Println(firstLine)
Output: >gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
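As a rough illustration of what building a FASTA record involves, the sketch below writes a ">" header line followed by the sequence wrapped to a fixed width. The helper name buildFASTA, its parameters, and the wrapping width are assumptions made for this example; it does not use poly's Sequence struct and is not poly's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// buildFASTA is a hypothetical helper (not poly's BuildFASTA). It formats one
// record as a ">" header line followed by the sequence wrapped at width columns.
func buildFASTA(description, sequence string, width int) []byte {
	if width <= 0 {
		width = 80 // assumed default line width for this sketch
	}
	var b strings.Builder
	b.WriteString(">" + description + "\n")
	for start := 0; start < len(sequence); start += width {
		end := start + width
		if end > len(sequence) {
			end = len(sequence)
		}
		b.WriteString(sequence[start:end] + "\n")
	}
	return []byte(b.String())
}

func main() {
	fasta := buildFASTA("example record", "ATGGCTAGCAAAGGAGAA", 10)
	fmt.Print(string(fasta))
}
```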
func BuildGbk ¶ added in v0.2.0
BuildGbk builds a GBK string to be written out to db or file.
Example ¶
sequence := ReadGbk("data/puc19.gbk")
gbkBytes := BuildGbk(sequence)
testSequence := ParseGbk(gbkBytes)
fmt.Println(testSequence.Meta.Locus.ModificationDate)
Output: 22-OCT-2019
func BuildGff ¶
BuildGff takes an Annotated sequence and returns a byte array representing a gff to be written out.
Example ¶
sequence := ReadGff("data/ecoli-mg1655-short.gff")
gffBytes := BuildGff(sequence)
reparsedSequence := ParseGff(gffBytes)
fmt.Println(reparsedSequence.Meta.Name)
Output: U00096.3
func ComplementBase ¶ added in v0.3.0
ComplementBase accepts a base pair and returns its complement base pair.
func MarmurDoty ¶ added in v0.3.0
MarmurDoty calculates the melting point of an extremely short DNA sequence (<15 bp) using a modified Marmur Doty formula [Marmur J & Doty P (1962). Determination of the base composition of deoxyribonucleic acid from its thermal denaturation temperature. J Mol Biol, 5, 109-118.]
Example ¶
sequenceString := "ACGTCCGGACTT"
meltingTemp := MarmurDoty(sequenceString)
fmt.Println(meltingTemp)
Output: 31
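The arithmetic behind this example can be sketched as follows, assuming the commonly cited modified form Tm = 2(A+T) + 4(G+C) - 7. That formula is an assumption about what "modified Marmur Doty" means here, although it does reproduce the output of 31 for the sequence above (5 A/T bases and 7 G/C bases). The function name marmurDoty is a stand-in for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// marmurDoty is a stand-in sketch, not poly's MarmurDoty. It assumes the
// modified formula Tm = 2*(A+T) + 4*(G+C) - 7, in degrees Celsius.
func marmurDoty(sequence string) float64 {
	s := strings.ToUpper(sequence)
	at := strings.Count(s, "A") + strings.Count(s, "T")
	gc := strings.Count(s, "G") + strings.Count(s, "C")
	return float64(2*at + 4*gc - 7)
}

func main() {
	// ACGTCCGGACTT has A=2, T=3, G=3, C=4, so 2*5 + 4*7 - 7 = 31.
	fmt.Println(marmurDoty("ACGTCCGGACTT"))
}
```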
func MeltingTemp ¶ added in v0.3.0
MeltingTemp calls SantaLucia with default inputs for primer and salt concentration.
Example ¶
sequenceString := "GTAAAACGACGGCCAGT" // M13 fwd
expectedTM := 52.8
meltingTemp := MeltingTemp(sequenceString)
withinMargin := math.Abs(expectedTM-meltingTemp)/expectedTM >= 0.02
fmt.Println(withinMargin)
Output: false
func Optimize ¶ added in v0.3.0
func Optimize(aminoAcids string, codonTable CodonTable) string
Optimize takes an amino acid sequence and a CodonTable and returns an optimized codon sequence.
Example ¶
gfpTranslation := "MASKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTFSYGVQCFSRYPDHMKRHDFFKSAMPEGYVQERTISFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK*"
sequence := ReadGbk("data/puc19.gbk")
codonTable := GetCodonTable(11)
optimizationTable := sequence.GetOptimizationTable(codonTable)
optimizedSequence := Optimize(gfpTranslation, optimizationTable)
optimizedSequenceTranslation := Translate(optimizedSequence, optimizationTable)
fmt.Println(optimizedSequenceTranslation == gfpTranslation)
Output: true
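One way to picture codon optimization is as a weighted random choice among synonymous codons for each amino acid, followed by the same round-trip check the example uses. The sketch below assumes that approach; the types, the toy table, and its weights are invented for illustration and are not poly's CodonTable or its actual algorithm.

```go
package main

import (
	"fmt"
	"math/rand"
	"strings"
)

// weightedCodon pairs a codon with a usage weight. Hypothetical type for this
// sketch; it is not poly's Codon struct.
type weightedCodon struct {
	triplet string
	weight  int
}

// toyTable maps amino acids to synonymous codons. The weights are invented.
var toyTable = map[rune][]weightedCodon{
	'M': {{"ATG", 1}},
	'A': {{"GCT", 3}, {"GCC", 1}},
	'S': {{"AGC", 2}, {"TCT", 2}},
	'*': {{"TAA", 1}},
}

// pickCodon draws one codon with probability proportional to its weight.
func pickCodon(codons []weightedCodon, rng *rand.Rand) string {
	total := 0
	for _, c := range codons {
		total += c.weight
	}
	n := rng.Intn(total)
	for _, c := range codons {
		if n < c.weight {
			return c.triplet
		}
		n -= c.weight
	}
	return codons[len(codons)-1].triplet
}

// optimize back-translates a protein into DNA using weighted random codons.
// It returns an error for amino acids missing from toyTable.
func optimize(aminoAcids string, seed int64) (string, error) {
	rng := rand.New(rand.NewSource(seed))
	var b strings.Builder
	for _, aa := range aminoAcids {
		codons, ok := toyTable[aa]
		if !ok {
			return "", fmt.Errorf("no codons for amino acid %q", aa)
		}
		b.WriteString(pickCodon(codons, rng))
	}
	return b.String(), nil
}

// translate maps each triplet back to its amino acid using toyTable.
func translate(dna string) string {
	lookup := make(map[string]rune)
	for aa, codons := range toyTable {
		for _, c := range codons {
			lookup[c.triplet] = aa
		}
	}
	var b strings.Builder
	for i := 0; i+3 <= len(dna); i += 3 {
		b.WriteRune(lookup[dna[i:i+3]])
	}
	return b.String()
}

func main() {
	dna, err := optimize("MAS*", 1)
	if err != nil {
		panic(err)
	}
	// Whatever codons were drawn, translating back should recover the protein.
	fmt.Println(translate(dna) == "MAS*")
}
```

Whichever synonymous codons are drawn, translating the result recovers the input protein, which is the property the example above checks.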
func ReverseComplement ¶ added in v0.1.0
ReverseComplement takes the reverse complement of a sequence.
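Since no example accompanies ComplementBase or ReverseComplement, here is a minimal sketch of the operation: complement each base, then reverse the order. The function below is a stand-in written for this illustration. How poly treats lowercase or ambiguous bases is not stated here, so passing unknown characters through unchanged is an assumption.

```go
package main

import "fmt"

// complements covers the four DNA bases in both cases. Anything else is
// passed through unchanged, which is an assumption made for this sketch.
var complements = map[byte]byte{
	'A': 'T', 'T': 'A', 'G': 'C', 'C': 'G',
	'a': 't', 't': 'a', 'g': 'c', 'c': 'g',
}

// reverseComplement is a stand-in for illustration, not poly's function.
func reverseComplement(sequence string) string {
	out := make([]byte, len(sequence))
	for i := 0; i < len(sequence); i++ {
		base := sequence[i]
		complement, ok := complements[base]
		if !ok {
			complement = base
		}
		// Write the complement at the mirrored position to reverse the strand.
		out[len(sequence)-1-i] = complement
	}
	return string(out)
}

func main() {
	fmt.Println(reverseComplement("ATGC")) // prints GCAT
}
```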
func RotateSequence ¶
RotateSequence rotates circular sequences to a deterministic point.
Example ¶
sequence := ReadGbk("data/puc19.gbk")
sequenceLength := len(sequence.Sequence)
testSequence := sequence.Sequence[sequenceLength/2:] + sequence.Sequence[0:sequenceLength/2]
fmt.Println(RotateSequence(sequence.Sequence) == RotateSequence(testSequence))
Output: true
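One way to obtain a deterministic rotation point is to pick the lexicographically smallest rotation, so that every rotation of the same circular sequence maps to the same string. Whether RotateSequence uses this exact criterion is an assumption; the naive sketch below compares all rotations in quadratic time (Booth's algorithm finds the same rotation in linear time).

```go
package main

import "fmt"

// rotateToLeast returns the lexicographically smallest rotation of a circular
// sequence. Hypothetical helper for illustration, not poly's RotateSequence.
func rotateToLeast(sequence string) string {
	if len(sequence) == 0 {
		return sequence
	}
	doubled := sequence + sequence // every rotation is a window of this string
	best := sequence
	for i := 1; i < len(sequence); i++ {
		candidate := doubled[i : i+len(sequence)]
		if candidate < best {
			best = candidate
		}
	}
	return best
}

func main() {
	// TACAGAT is GATTACA rotated by three bases, so both normalize identically.
	fmt.Println(rotateToLeast("GATTACA") == rotateToLeast("TACAGAT"))
}
```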
func SantaLucia ¶ added in v0.3.0
func SantaLucia(sequence string, primerConcentration, saltConcentration, magnesiumConcentration float64) (meltingTemp, dH, dS float64)
SantaLucia calculates the melting point of a short DNA sequence (15-200 bp), using the Nearest Neighbors method [SantaLucia, J. (1998) PNAS, doi:10.1073/pnas.95.4.1460]
Example ¶
sequenceString := "ACGATGGCAGTAGCATGC" //"GTAAAACGACGGCCAGT" // M13 fwd
testCPrimer := 0.1e-6 // primer concentration
testCNa := 350e-3 // salt concentration
testCMg := 0.0 // magnesium concentration
expectedTM := 62.7 // roughly what we're expecting with a margin of error
meltingTemp, _, _ := SantaLucia(sequenceString, testCPrimer, testCNa, testCMg)
withinMargin := math.Abs(expectedTM-meltingTemp)/expectedTM >= 0.02 // checking margin of error
fmt.Println(withinMargin)
Output: false
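The melting temperature examples compare a result against an expected value using a relative error. Note that the expression is true when the error is 2% or more, so the printed false means the result is inside the margin, despite the variable name. The sketch below isolates that check; the helper name exceedsMargin and the sample temperatures are made up for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// exceedsMargin reports whether actual differs from expected by at least the
// given relative margin (0.02 means 2%). Hypothetical helper; expected must
// be nonzero.
func exceedsMargin(expected, actual, margin float64) bool {
	return math.Abs(expected-actual)/expected >= margin
}

func main() {
	// 62.0 is within 2% of 62.7, while 60.0 is not. Both values are invented.
	fmt.Println(exceedsMargin(62.7, 62.0, 0.02))
	fmt.Println(exceedsMargin(62.7, 60.0, 0.02))
}
```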
func Translate ¶ added in v0.3.0
func Translate(sequence string, codonTable CodonTable) string
Translate translates a codon sequence to an amino acid sequence.
Example ¶
gfpTranslation := "MASKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTFSYGVQCFSRYPDHMKRHDFFKSAMPEGYVQERTISFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYITADKQKNGIKANFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK*"
gfpDnaSequence := "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTTAATGGGCACAAATTTTCTGTCAGTGGAGAGGGTGAAGGTGATGCTACATACGGAAAGCTTACCCTTAAATTTATTTGCACTACTGGAAAACTACCTGTTCCATGGCCAACACTTGTCACTACTTTCTCTTATGGTGTTCAATGCTTTTCCCGTTATCCGGATCATATGAAACGGCATGACTTTTTCAAGAGTGCCATGCCCGAAGGTTATGTACAGGAACGCACTATATCTTTCAAAGATGACGGGAACTACAAGACGCGTGCTGAAGTCAAGTTTGAAGGTGATACCCTTGTTAATCGTATCGAGTTAAAAGGTATTGATTTTAAAGAAGATGGAAACATTCTCGGACACAAACTCGAGTACAACTATAACTCACACAATGTATACATCACGGCAGACAAACAAAAGAATGGAATCAAAGCTAACTTCAAAATTCGCCACAACATTGAAGATGGATCCGTTCAACTAGCAGACCATTATCAACAAAATACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTACCTGTCGACACAATCTGCCCTTTCGAAAGATCCCAACGAAAAGCGTGACCACATGGTCCTTCTTGAGTTTGTAACTGCTGCTGGGATTACACATGGCATGGATGAGCTCTACAAATAA"
testTranslation := Translate(gfpDnaSequence, GetCodonTable(11)) // need to specify which codons map to which amino acids per NCBI table
fmt.Println(gfpTranslation == testTranslation)
Output: true
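Translation itself amounts to reading the DNA three bases at a time and looking each triplet up in a codon table. The sketch below uses a tiny hand-written table with only the codons it needs (their amino acid assignments match the standard genetic code). The function name, the 'X' placeholder for unknown codons, and ignoring a trailing partial codon are assumptions of this illustration rather than poly's behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// toyCodons holds just enough of the standard genetic code for the demo.
var toyCodons = map[string]rune{
	"ATG": 'M', "GCT": 'A', "AGC": 'S', "AAA": 'K', "TAA": '*',
}

// translate reads the sequence codon by codon. Codons missing from toyCodons
// become 'X', and a trailing partial codon is ignored (both are assumptions).
func translate(sequence string) string {
	s := strings.ToUpper(sequence)
	var protein strings.Builder
	for i := 0; i+3 <= len(s); i += 3 {
		aminoAcid, ok := toyCodons[s[i:i+3]]
		if !ok {
			aminoAcid = 'X'
		}
		protein.WriteRune(aminoAcid)
	}
	return protein.String()
}

func main() {
	// The first four codons of the GFP sequence above, plus a stop codon.
	fmt.Println(translate("ATGGCTAGCAAATAA")) // prints MASK*
}
```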
func WriteFASTA ¶ added in v0.3.0
WriteFASTA writes a Sequence struct out to FASTA.
Example ¶
sequence := ReadFASTA("data/base.fasta") // get example data
WriteFASTA(sequence, "data/test.fasta") // write it out again
testSequence := ReadFASTA("data/test.fasta") // read it in again
os.Remove("data/test.fasta") // getting rid of test file
fmt.Println(testSequence.Features[0].Description)
Output: gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
func WriteGbk ¶ added in v0.2.0
WriteGbk takes a Sequence struct and a path string and writes out a gbk to that path.
Example ¶
sequence := ReadGbk("data/puc19.gbk")
WriteGbk(sequence, "data/test.gbk")
testSequence := ReadGbk("data/test.gbk")
os.Remove("data/test.gbk")
fmt.Println(testSequence.Meta.Locus.ModificationDate)
Output: 22-OCT-2019
func WriteGff ¶
WriteGff takes a Sequence struct and a path string and writes out a gff to that path.
Example ¶
sequence := ReadGff("data/ecoli-mg1655-short.gff")
WriteGff(sequence, "data/test.gff")
testSequence := ReadGff("data/test.gff")
os.Remove("data/test.gff")
fmt.Println(testSequence.Meta.Name)
Output: U00096.3
func WriteJSON ¶
WriteJSON writes a Sequence struct out to JSON.
Example ¶
sequence := ReadJSON("data/sample.json")
WriteJSON(sequence, "data/test.json")
testSequence := ReadJSON("data/test.json")
os.Remove("data/test.json")
fmt.Println(testSequence.Meta.Source)
Output: Saccharomyces cerevisiae (baker's yeast)
Types ¶
type AminoAcid ¶ added in v0.3.0
AminoAcid holds information for an amino acid and related codons in a struct.
type CodonTable ¶ added in v0.3.0
CodonTable holds information for a codon table.
func GetCodonTable ¶ added in v0.5.0
func GetCodonTable(index int) CodonTable
GetCodonTable takes the index of desired NCBI codon table and returns it.
func (CodonTable) OptimizeTable ¶ added in v0.6.0
func (codonTable CodonTable) OptimizeTable(sequence string) CodonTable
OptimizeTable weights each codon in a codon table according to input string codon frequency. This function actually mutates the CodonTable struct itself.
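Weighting by codon frequency starts with counting how often each codon occurs in the input. The sketch below shows only that counting step, under the assumption that the input is an in-frame coding sequence. The helper name countCodons is hypothetical, and how poly stores the resulting weights in a CodonTable is not shown.

```go
package main

import (
	"fmt"
	"strings"
)

// countCodons tallies in-frame triplets of a coding sequence. Hypothetical
// helper for illustration; a trailing partial codon is ignored.
func countCodons(codingSequence string) map[string]int {
	s := strings.ToUpper(codingSequence)
	counts := make(map[string]int)
	for i := 0; i+3 <= len(s); i += 3 {
		counts[s[i:i+3]]++
	}
	return counts
}

func main() {
	counts := countCodons("ATGGCTGCTTAA") // ATG GCT GCT TAA
	fmt.Println(counts["GCT"], counts["ATG"], counts["TAA"])
}
```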
type Feature ¶
type Feature struct {
Name string //Seqid in gff, name in gbk
//gff specific
Source string `json:"source"`
Type string `json:"type"`
Score string `json:"score"`
Strand string `json:"strand"`
Phase string `json:"phase"`
Attributes map[string]string `json:"attributes"`
GbkLocationString string `json:"gbk_location_string"`
Sequence string `json:"sequence"`
SequenceLocation Location `json:"sequence_location"`
SequenceHash string `json:"sequence_hash"`
Description string `json:"description"`
SequenceHashFunction string `json:"hash_function"`
ParentSequence *Sequence `json:"-"`
}
Feature holds a single annotation in a struct. From https://github.com/blachlylab/gff3/blob/master/gff3.go
func (Feature) GetSequence ¶ added in v0.3.0
GetSequence is a method wrapper to get a Feature's sequence. Mutates with Sequence.
type Location ¶ added in v0.1.0
type Location struct {
Start int `json:"start"`
End int `json:"end"`
Complement bool `json:"complement"`
Join bool `json:"join"`
FivePrimePartial bool `json:"five_prime_partial"`
ThreePrimePartial bool `json:"three_prime_partial"`
SubLocations []Location `json:"sub_locations"`
}
Location holds nested location info for sequence region.
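To illustrate how a nested location could be resolved against a parent sequence, the sketch below walks SubLocations recursively and applies Complement last. The coordinate convention (0-based, end-exclusive), the order of operations, and the helper names are all assumptions made for this example; the struct is a trimmed copy of Location, and the source does not specify these semantics.

```go
package main

import "fmt"

// location is a trimmed copy of the Location struct above.
type location struct {
	Start, End   int
	Complement   bool
	Join         bool
	SubLocations []location
}

// reverseComplement handles uppercase A, C, G, T only (enough for the demo).
func reverseComplement(s string) string {
	pairs := map[byte]byte{'A': 'T', 'T': 'A', 'G': 'C', 'C': 'G'}
	out := make([]byte, len(s))
	for i := 0; i < len(s); i++ {
		out[len(s)-1-i] = pairs[s[i]]
	}
	return string(out)
}

// extract resolves a location against parent. Assumptions: Start and End are
// 0-based and end-exclusive, sub-locations are concatenated in order, and
// Complement is applied after joining. Bounds are not checked.
func extract(parent string, loc location) string {
	var out string
	if len(loc.SubLocations) == 0 {
		out = parent[loc.Start:loc.End]
	} else {
		for _, sub := range loc.SubLocations {
			out += extract(parent, sub)
		}
	}
	if loc.Complement {
		out = reverseComplement(out)
	}
	return out
}

func main() {
	parent := "AAACCCGGGTTT"
	joined := location{Join: true, SubLocations: []location{
		{Start: 0, End: 3},
		{Start: 6, End: 9},
	}}
	fmt.Println(extract(parent, joined)) // prints AAAGGG
}
```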
type Locus ¶
type Locus struct {
Name string `json:"name"`
SequenceLength string `json:"sequence_length"`
MoleculeType string `json:"molecule_type"`
GenbankDivision string `json:"genbank_division"`
ModificationDate string `json:"modification_date"`
SequenceCoding string `json:"sequence_coding"`
Circular bool `json:"circular"`
Linear bool `json:"linear"`
}
Locus holds Locus information in a Meta struct.
type Meta ¶
type Meta struct {
Name string `json:"name"`
GffVersion string `json:"gff_version"`
RegionStart int `json:"region_start"`
RegionEnd int `json:"region_end"`
Size int `json:"size"`
Type string `json:"type"`
Date string `json:"date"`
Definition string `json:"definition"`
Accession string `json:"accession"`
Version string `json:"version"`
Keywords string `json:"keywords"`
Organism string `json:"organism"`
Source string `json:"source"`
Origin string `json:"origin"`
Locus Locus `json:"locus"`
References []Reference `json:"references"`
Other map[string]string `json:"other"`
}
Meta holds all the meta information of a Sequence struct.
type Reference ¶
type Reference struct {
Index string `json:"index"`
Authors string `json:"authors"`
Title string `json:"title"`
Journal string `json:"journal"`
PubMed string `json:"pub_med"`
Remark string `json:"remark"`
Range string `json:"range"`
}
Reference holds information for one reference in a Meta struct.
type Sequence ¶
type Sequence struct {
Meta Meta `json:"meta"`
Description string `json:"description"`
SequenceHash string `json:"sequence_hash"`
SequenceHashFunction string `json:"hash_function"`
Sequence string `json:"sequence"`
Features []Feature `json:"features"`
}
Sequence holds all sequence information in a single struct.
func ParseFASTA ¶ added in v0.3.0
ParseFASTA parses a Sequence struct from a FASTA file and adds appropriate pointers to the structs.
Example ¶
file, _ := ioutil.ReadFile("data/base.fasta")
sequence := ParseFASTA(file)
fmt.Println(sequence.Features[0].Description)
Output: gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
func ParseGbk ¶
ParseGbk takes in the contents of a gbk/gb/genbank file (a byte slice in the example below) and parses it into a Sequence object.
Example ¶
file, _ := ioutil.ReadFile("data/puc19.gbk")
sequence := ParseGbk(file)
fmt.Println(sequence.Meta.Locus.ModificationDate)
Output: 22-OCT-2019
func ParseGff ¶
ParseGff takes in the contents of a gffv3 file (a byte slice in the example below) and parses it into a Sequence object.
Example ¶
file, _ := ioutil.ReadFile("data/ecoli-mg1655-short.gff")
sequence := ParseGff(file)
fmt.Println(sequence.Meta.Name)
Output: U00096.3
func ParseJSON ¶ added in v0.1.0
ParseJSON parses a Sequence JSON file and adds appropriate pointers to struct.
Example ¶
file, _ := ioutil.ReadFile("data/sample.json")
sequence := ParseJSON(file)
fmt.Println(sequence.Meta.Source)
Output: Saccharomyces cerevisiae (baker's yeast)
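The phrase "adds appropriate pointers" makes sense given that ParentSequence is tagged `json:"-"` in the Feature struct: the pointer is never written to JSON, so it has to be restored after decoding. The sketch below shows one way to do that with encoding/json. The trimmed types and the parseJSON helper are stand-ins written for this illustration, not poly's code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// feature and sequence are trimmed stand-ins for poly's Feature and Sequence.
type feature struct {
	Name   string    `json:"name"`
	Parent *sequence `json:"-"` // skipped by encoding/json, like ParentSequence
}

type sequence struct {
	Sequence string    `json:"sequence"`
	Features []feature `json:"features"`
}

// parseJSON decodes a sequence and then points every feature back at it,
// because the parent pointer is not part of the serialized form.
func parseJSON(data []byte) (*sequence, error) {
	var seq sequence
	if err := json.Unmarshal(data, &seq); err != nil {
		return nil, err
	}
	for i := range seq.Features {
		seq.Features[i].Parent = &seq
	}
	return &seq, nil
}

func main() {
	original := &sequence{Sequence: "ATGC", Features: []feature{{Name: "f1"}}}
	original.Features[0].Parent = original

	data, err := json.Marshal(original) // the Parent field is omitted here
	if err != nil {
		panic(err)
	}
	restored, err := parseJSON(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(restored.Features[0].Parent == restored)
}
```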
func ReadFASTA ¶ added in v0.3.0
ReadFASTA reads a Sequence struct from a FASTA file.
Example ¶
ExampleReadFASTA shows basic usage for ReadFASTA
sequence := ReadFASTA("data/base.fasta")
fmt.Println(sequence.Features[0].Description)
Output: gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
func ReadGbk ¶
ReadGbk reads a Gbk from path and parses into an Annotated sequence struct.
Example ¶
sequence := ReadGbk("data/puc19.gbk")
fmt.Println(sequence.Meta.Locus.ModificationDate)
Output: 22-OCT-2019
func ReadGff ¶
ReadGff takes in a filepath for a .gffv3 file and parses it into an Annotated Sequence struct.
Example ¶
sequence := ReadGff("data/ecoli-mg1655-short.gff")
fmt.Println(sequence.Meta.Name)
Output: U00096.3
func ReadJSON ¶
ReadJSON reads a Sequence JSON file.
Example ¶
sequence := ReadJSON("data/sample.json")
fmt.Println(sequence.Meta.Source)
Output: Saccharomyces cerevisiae (baker's yeast)
func (*Sequence) AddFeature ¶ added in v0.4.0
AddFeature is the canonical way to add a Feature into a Sequence struct. Appending a Feature struct directly to Sequence.Features will break the .GetSequence() method.
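The reason a direct append breaks GetSequence is easiest to see with simplified types: a feature can only look up its bases through a pointer to its parent sequence, and only the add method sets that pointer. The sketch below assumes this mechanism; the lowercase types, field names, and method signatures are hypothetical and do not reproduce poly's actual definitions.

```go
package main

import (
	"errors"
	"fmt"
)

// sequence and feature are simplified, hypothetical versions of poly's types.
type sequence struct {
	Sequence string
	Features []feature
}

type feature struct {
	Name       string
	Start, End int       // assumed 0-based, end-exclusive
	parent     *sequence // set only by addFeature
}

// addFeature links the feature to its parent before storing it.
func (s *sequence) addFeature(f feature) {
	f.parent = s
	s.Features = append(s.Features, f)
}

// getSequence needs the parent pointer to find the feature's bases.
func (f feature) getSequence() (string, error) {
	if f.parent == nil {
		return "", errors.New("feature has no parent sequence")
	}
	return f.parent.Sequence[f.Start:f.End], nil
}

func main() {
	seq := &sequence{Sequence: "ATGGCTAGC"}
	seq.addFeature(feature{Name: "linked", Start: 0, End: 3})

	// Appending directly skips the parent link.
	seq.Features = append(seq.Features, feature{Name: "orphan", Start: 3, End: 6})

	for _, f := range seq.Features {
		bases, err := f.getSequence()
		fmt.Println(f.Name, bases, err)
	}
}
```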
func (Sequence) GetOptimizationTable ¶ added in v0.6.0
func (sequence Sequence) GetOptimizationTable(codonTable CodonTable) CodonTable
GetOptimizationTable is a Sequence method that takes a CodonTable and weights it to be used to optimize inserts.
func (Sequence) GetSequence ¶ added in v0.3.0
GetSequence is a method to get the full sequence of an annotated sequence
Directories ¶
| Path | Synopsis |
|---|---|
| poly | Poly command line utility installation instructions. Mac OSX: `brew install timothystiles/poly/poly`. Linux (deb/rpm): download the .deb or .rpm from the releases page https://github.com/TimothyStiles/poly/releases and install with `dpkg -i` or `rpm -i` respectively. Windows: coming soon... |