Documentation ¶
Index ¶
- func Build2dSlice(rows int, cols int) [][]string
- func ChunkGenome(records <-chan fastx.Record, winSize int, winStride int, chunkSize int) <-chan Chunk
- func ConsumeChunks(chunks <-chan Chunk, metrics []string, refProfile map[int]KmerProfile, ...) chan ChunkResult
- func MakeRange(start, end, step int) []int
- func MinInt(x, y int) int
- func SeqATSkew(seq *seq.Seq) float64
- func SeqEntropy(seq *seq.Seq) float64
- func SeqGC(seq *seq.Seq) float64
- func SeqGCSkew(seq *seq.Seq) float64
- func SeqKmerDiv(seq *seq.Seq, ref KmerProfile, distMetric string) float64
- func StreamGenome(fasta string, bufSize int) <-chan fastx.Record
- type Chunk
- type ChunkResult
- type KmerProfile
Functions ¶
func Build2dSlice ¶
func Build2dSlice(rows int, cols int) [][]string
Build2dSlice builds a 2D slice of strings with the given number of rows and columns.
func ChunkGenome ¶
func ChunkGenome(records <-chan fastx.Record, winSize int, winStride int, chunkSize int) <-chan Chunk
ChunkGenome receives fastx.Record values from a channel and produces a channel of record chunks. Each chunk contains multiple windows. chunkSize is given in number of windows; winSize and winStride are given in base pairs.
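The relationship between window size, stride and chunk size can be illustrated with a minimal sketch. The helpers windowStarts and groupStarts below are hypothetical and not part of this package. The sketch assumes that only full-length windows are kept, which the documentation does not confirm.

```go
package main

import "fmt"

// windowStarts returns the 0-based start of every full window of winSize bp,
// placed every winStride bp along a sequence of seqLen bp.
// Hypothetical helper: dropping a trailing partial window is an assumption.
func windowStarts(seqLen, winSize, winStride int) []int {
	starts := []int{}
	if winSize <= 0 || winStride <= 0 {
		return starts
	}
	for s := 0; s+winSize <= seqLen; s += winStride {
		starts = append(starts, s)
	}
	return starts
}

// groupStarts splits window starts into groups of at most chunkSize windows,
// mirroring the idea that a chunk holds several windows.
func groupStarts(starts []int, chunkSize int) [][]int {
	groups := [][]int{}
	if chunkSize <= 0 {
		return groups
	}
	for i := 0; i < len(starts); i += chunkSize {
		end := i + chunkSize
		if end > len(starts) {
			end = len(starts)
		}
		groups = append(groups, starts[i:end])
	}
	return groups
}

func main() {
	starts := windowStarts(100, 20, 10)
	fmt.Println(starts)
	fmt.Println(groupStarts(starts, 4))
}
```

With a 100 bp sequence, 20 bp windows and a 10 bp stride, this yields nine overlapping windows, which are split into groups of at most four.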
func ConsumeChunks ¶
func ConsumeChunks(chunks <-chan Chunk, metrics []string, refProfile map[int]KmerProfile, distMetric string) chan ChunkResult
ConsumeChunks computes window-based statistics in chunks and stores them in a ChunkResult struct.
func MakeRange ¶
func MakeRange(start, end, step int) []int
MakeRange generates a slice of ints from start to end, where consecutive values are spaced by step.
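A function with this behaviour might look like the sketch below. The documentation does not say whether end is included, so this hypothetical makeRange assumes a half-open interval [start, end) and a positive step. The package's own implementation may differ.

```go
package main

import "fmt"

// makeRange is a hypothetical stand-in for MakeRange.
// Assumptions: end is excluded, and a non-positive step yields an empty slice.
func makeRange(start, end, step int) []int {
	out := []int{}
	if step <= 0 {
		return out
	}
	for v := start; v < end; v += step {
		out = append(out, v)
	}
	return out
}

func main() {
	fmt.Println(makeRange(0, 10, 3)) // [0 3 6 9]
}
```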
func MinInt ¶
func MinInt(x, y int) int
MinInt returns the smaller of two integers. If both are equal, the second input is returned.
func SeqEntropy ¶
func SeqEntropy(seq *seq.Seq) float64
SeqEntropy computes the Shannon entropy (information entropy) of a DNA sequence. The value returned is between 0 and 1.
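To show how Shannon entropy can land in the 0 to 1 range, here is a minimal sketch that operates on a plain string rather than a *seq.Seq. The function normalizedEntropy is hypothetical. It assumes that only A, C, G and T are counted and that the result is divided by 2 bits, the maximum entropy for four symbols. The documentation does not state which normalization the package uses.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// normalizedEntropy computes H = -sum(p * log2(p)) over base frequencies and
// divides by log2(4) = 2 so the result lies in [0, 1].
// Assumption: characters other than A, C, G, T are ignored.
func normalizedEntropy(dna string) float64 {
	counts := map[rune]float64{}
	total := 0.0
	for _, b := range strings.ToUpper(dna) {
		switch b {
		case 'A', 'C', 'G', 'T':
			counts[b]++
			total++
		}
	}
	if total == 0 {
		return 0
	}
	h := 0.0
	for _, c := range counts {
		p := c / total
		h -= p * math.Log2(p)
	}
	return h / 2
}

func main() {
	fmt.Println(normalizedEntropy("AAAAAAAA")) // a single base: lowest entropy
	fmt.Println(normalizedEntropy("ACGTACGT")) // uniform bases: highest entropy
}
```

A homopolymer scores 0, while a sequence with equal proportions of all four bases scores 1.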
func SeqKmerDiv ¶
func SeqKmerDiv(seq *seq.Seq, ref KmerProfile, distMetric string) float64
SeqKmerDiv computes the k-mer profile of the input sequence, then computes its distance to a reference k-mer profile.
Types ¶
type Chunk ¶
type Chunk struct {
	ID      []byte
	BpStart int
	BpEnd   int
	Seq     *seq.Seq
	Starts  []int
	// contains filtered or unexported fields
}
Chunk is a piece of a fastx.Record sequence, together with the associated ID and genomic coordinates. It also contains the indices of the sliding windows in which statistics will be computed.
type ChunkResult ¶
ChunkResult stores the computed statistics of windows from a chunk in the form of a table. Each row is a window, each column is a feature.
type KmerProfile ¶
KmerProfile stores k-mers and their frequencies for a given k-mer length.
func FastaToKmers ¶
func FastaToKmers(fasta string, k int) KmerProfile
FastaToKmers reads all records in a FASTA file and computes the file's k-mer profile.
func NewKmerProfile ¶
func NewKmerProfile(k int) KmerProfile
NewKmerProfile is a helper function that generates an empty k-mer profile.
func (*KmerProfile) CountsToFreqs ¶
func (p *KmerProfile) CountsToFreqs()
CountsToFreqs transforms counts in a KmerProfile into frequencies.
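The conversion from counts to frequencies can be sketched on a plain map. KmerProfile's internal layout is not exported in this documentation, so the map type and the countsToFreqs helper below are assumptions for illustration only.

```go
package main

import "fmt"

// countsToFreqs divides every count by the total so the values sum to 1.
// Hypothetical helper working on a bare map, not on KmerProfile itself.
func countsToFreqs(counts map[string]float64) {
	total := 0.0
	for _, c := range counts {
		total += c
	}
	if total == 0 {
		return // nothing to normalize
	}
	for kmer, c := range counts {
		counts[kmer] = c / total
	}
}

func main() {
	profile := map[string]float64{"AA": 3, "AC": 1}
	countsToFreqs(profile)
	fmt.Println(profile["AA"], profile["AC"]) // 0.75 0.25
}
```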
func (*KmerProfile) GetSeqKmers ¶
func (p *KmerProfile) GetSeqKmers(sq *seq.Seq)
GetSeqKmers computes the k-mer profile of a sequence and increments the counts in the KmerProfile accordingly.
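Incrementing counts from a sequence can be pictured as sliding a window of length k over it. The sketch below uses a hypothetical countKmers helper on a plain string and map. It does no reverse-complement or ambiguous-base handling, and whether the package does any is not stated here.

```go
package main

import "fmt"

// countKmers adds one to counts[kmer] for every overlapping k-mer in s.
// Existing counts are kept, so the function can be called on several sequences.
func countKmers(counts map[string]int, s string, k int) {
	if k <= 0 {
		return
	}
	for i := 0; i+k <= len(s); i++ {
		counts[s[i:i+k]]++
	}
}

func main() {
	counts := map[string]int{}
	countKmers(counts, "ACGTAC", 2)
	fmt.Println(counts) // map[AC:2 CG:1 GT:1 TA:1]
}
```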
func (*KmerProfile) KmerCosDist ¶
func (p *KmerProfile) KmerCosDist(ref KmerProfile) float64
KmerCosDist computes the cosine distance between a reference k-mer profile and another profile. The reference is assumed to include all k-mers present in the profile.
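For reference, cosine distance is commonly defined as one minus the cosine similarity of two frequency vectors. The sketch below applies that definition to plain maps. The cosineDist helper and the map representation are assumptions, and returning 1 for an all-zero vector is a choice made for this example only.

```go
package main

import (
	"fmt"
	"math"
)

// cosineDist returns 1 - (p.r) / (|p| * |r|), iterating over the k-mers of ref.
// This relies on ref containing every k-mer present in profile.
func cosineDist(profile, ref map[string]float64) float64 {
	var dot, normP, normR float64
	for kmer, r := range ref {
		p := profile[kmer] // 0 when the k-mer is absent from profile
		dot += p * r
		normP += p * p
		normR += r * r
	}
	if normP == 0 || normR == 0 {
		return 1 // assumption: treat an empty vector as maximally distant
	}
	return 1 - dot/(math.Sqrt(normP)*math.Sqrt(normR))
}

func main() {
	ref := map[string]float64{"AA": 0.5, "AC": 0.5}
	fmt.Println(cosineDist(map[string]float64{"AA": 0.5, "AC": 0.5}, ref))
	fmt.Println(cosineDist(map[string]float64{"AA": 1}, ref))
}
```

Iterating over the reference keys is only safe because of the stated precondition: a k-mer found in the profile but missing from the reference would be silently ignored.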
func (*KmerProfile) KmerEuclDist ¶
func (p *KmerProfile) KmerEuclDist(ref KmerProfile) float64
KmerEuclDist computes the Euclidean distance between a reference k-mer profile and another profile. The reference is assumed to include all k-mers present in the profile.
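Euclidean distance between two profiles is the square root of the summed squared differences per k-mer. A minimal sketch on plain maps follows. The euclidDist helper is hypothetical and is not the package's implementation.

```go
package main

import (
	"fmt"
	"math"
)

// euclidDist returns sqrt(sum((p - r)^2)) over the k-mers of ref.
// As above, ref is assumed to contain every k-mer present in profile.
func euclidDist(profile, ref map[string]float64) float64 {
	sum := 0.0
	for kmer, r := range ref {
		d := profile[kmer] - r // a missing k-mer counts as 0
		sum += d * d
	}
	return math.Sqrt(sum)
}

func main() {
	profile := map[string]float64{"AA": 3}
	ref := map[string]float64{"AA": 0, "AC": 4}
	fmt.Println(euclidDist(profile, ref)) // 5
}
```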