Documentation ¶
Overview ¶
Package fasta contains fasta parsers and writers.
Fasta is a flat text file format developed in 1985 to store nucleotide and amino acid sequences. It is extremely simple and well supported across many languages. However, this simplicity means that annotation of genetic objects is not supported.
This package provides a parser and a writer for working with Fasta-formatted genetic sequences.
Example (Basic) ¶
This example shows how to open a file with the fasta parser. The sequences within that file can then be analyzed further with different software.
package main

import (
	"fmt"

	"github.com/TimothyStiles/poly/io/fasta"
)

func main() {
	fastas := fasta.Read("data/base.fasta")
	fmt.Println(fastas[1].Sequence)
}
Output: ADQLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQDMINEVDADGNGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISAAELRHVMTNLGEKLTDEEVDEMIREADIDGDGQVNYEEFVQMMTAK*
Functions ¶
func Build ¶
Build converts a slice of Fasta structs into a byte slice of Fasta-formatted text.
Example ¶
ExampleBuild shows basic usage for Build.
fastas := Read("data/base.fasta") // get example data
fasta := Build(fastas) // build a fasta byte array
firstLine := string(bytes.Split(fasta, []byte("\n"))[0])
fmt.Println(firstLine)
Output: >gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
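To make the serialization step concrete, here is a minimal, self-contained sketch of how records with a name and a sequence could be turned into Fasta text. The record type, the buildFasta helper, and the line-wrapping width are all hypothetical and chosen for illustration; this is not the package's Build implementation, and Build may format its output differently.

```go
package main

import (
	"bytes"
	"fmt"
)

// record is a hypothetical stand-in for the package's Fasta type,
// which the documentation describes as a Name plus a Sequence.
type record struct {
	Name     string
	Sequence string
}

// buildFasta writes each record as a ">Name" header line followed by the
// sequence wrapped at width characters per line. Wrapping is an assumption
// of this sketch, not documented behavior of Build.
func buildFasta(records []record, width int) []byte {
	if width <= 0 {
		width = 80 // fall back to a common line width
	}
	var buf bytes.Buffer
	for _, r := range records {
		buf.WriteString(">" + r.Name + "\n")
		for start := 0; start < len(r.Sequence); start += width {
			end := start + width
			if end > len(r.Sequence) {
				end = len(r.Sequence)
			}
			buf.WriteString(r.Sequence[start:end] + "\n")
		}
	}
	return buf.Bytes()
}

func main() {
	out := buildFasta([]record{{Name: "seq1", Sequence: "ACGTACGTAC"}}, 4)
	fmt.Print(string(out))
}
```

With a width of 4, the ten-base sequence is written as a header line followed by three sequence lines.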
func ParseConcurrent ¶
ParseConcurrent concurrently parses Fasta data from an io.Reader and sends the resulting Fasta structs to a channel.
func ReadConcurrent ¶
ReadConcurrent concurrently reads a flat Fasta file into a Fasta channel.
Example ¶
ExampleReadConcurrent shows how to use the concurrent parser for uncompressed fasta files.
fastas := make(chan Fasta, 100)
go ReadConcurrent("data/base.fasta", fastas)
var name string
for fasta := range fastas {
	name = fasta.Name
}
fmt.Println(name)
Output: MCHU - Calmodulin - Human, rabbit, bovine, rat, and chicken
func ReadGzConcurrent ¶
ReadGzConcurrent concurrently reads a gzipped Fasta file into a Fasta channel.
Example ¶
ExampleReadGzConcurrent shows how to use the concurrent parser for larger files.
fastas := make(chan Fasta, 1000)
go ReadGzConcurrent("data/uniprot_1mb_test.fasta.gz", fastas)
var name string
for fasta := range fastas {
	name = fasta.Name
}
fmt.Println(name)
Output: sp|P86857|AGP_MYTCA Alanine and glycine-rich protein (Fragment) OS=Mytilus californianus OX=6549 PE=1 SV=1
func Write ¶
Write writes a slice of Fasta structs to a file at the given path.
Example ¶
ExampleWrite shows basic usage of the writer.
fastas := Read("data/base.fasta") // get example data
Write(fastas, "data/test.fasta") // write it out again
testSequence := Read("data/test.fasta") // read it in again
os.Remove("data/test.fasta") // getting rid of test file
fmt.Println(testSequence[0].Name)
Output: gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
Types ¶
type Fasta ¶
Fasta is a struct representing a single Fasta file element with a Name and its corresponding Sequence.
func Parse ¶
Parse parses a given Fasta file into a slice of Fasta structs. Internally, it uses ParseConcurrent.
Example ¶
ExampleParse shows basic usage for Parse.
file, _ := os.Open("data/base.fasta")
fastas := Parse(file)
fmt.Println(fastas[0].Name)
Output: gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
func Read ¶
Read reads a file into a slice of Fasta structs.
Example ¶
ExampleRead shows basic usage for Read.
fastas := Read("data/base.fasta")
fmt.Println(fastas[0].Name)
Output: gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
func ReadGz ¶
ReadGz reads a gzipped file into a slice of Fasta structs.
Example ¶
ExampleReadGz shows basic usage for ReadGz on a gzip'd file.
fastas := ReadGz("data/uniprot_1mb_test.fasta.gz")
var name string
for _, fasta := range fastas {
	name = fasta.Name
}
fmt.Println(name)
Output: sp|P86857|AGP_MYTCA Alanine and glycine-rich protein (Fragment) OS=Mytilus californianus OX=6549 PE=1 SV=1