motif

package
v1.0.1
Published: May 1, 2024 License: BSD-3-Clause Imports: 13 Imported by: 0

Documentation

Overview

Package motif provides functions for reading, writing, and manipulating position matrices for transcription factor binding site motif analysis.

Functions

func ApproxEquals added in v1.0.1

func ApproxEquals(alpha, beta string, epsilon float64) bool

ApproxEquals reports whether the floating-point numbers in two files are equal to within the specified epsilon.

func BuildKmerHash added in v1.0.1

func BuildKmerHash(p PositionMatrix, thresholdProportion float64) map[uint64]float64

BuildKmerHash produces a hash mapping 2bit encoded kmer sequences to their corresponding motif score for a PositionMatrix. Only kmers with a motif score above an input thresholdProportion are stored in the output map.
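As an illustration of the 2-bit k-mer keys mentioned above, here is a minimal sketch of packing a k-mer into a uint64. The helper name encode2bit and the mapping A=0, C=1, G=2, T=3 are assumptions made for this example and are not taken from the package source.

```go
package main

import "fmt"

// encode2bit packs a DNA k-mer into a uint64 using two bits per base.
// The mapping A=0, C=1, G=2, T=3 is an assumption of this sketch.
// ok is false if the k-mer is longer than 32 bases or contains any
// character other than A, C, G, or T.
func encode2bit(kmer string) (code uint64, ok bool) {
	if len(kmer) > 32 {
		return 0, false
	}
	for i := 0; i < len(kmer); i++ {
		var b uint64
		switch kmer[i] {
		case 'A':
			b = 0
		case 'C':
			b = 1
		case 'G':
			b = 2
		case 'T':
			b = 3
		default:
			return 0, false
		}
		// Shift previous bases left and append the new base.
		code = code<<2 | b
	}
	return code, true
}

func main() {
	code, ok := encode2bit("ACGT")
	fmt.Println(code, ok) // prints "27 true"
}
```

Keys built this way are only comparable between k-mers of the same length, since "A" and "AA" both encode to 0.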

func ConsensusSequence

func ConsensusSequence(input PositionMatrix, tieBreak bool) fasta.Fasta

ConsensusSequence takes a PositionMatrix and returns the consensus motif sequence as a fasta. It works for PFM, PPM, and PWM inputs because, in all three, the consensus base at each position is the row with the maximum value in that column. If tieBreak is true, a tie within a column is resolved by choosing one of the tied bases at random.
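The column-maximum rule can be sketched without the package types. The function below works on a bare [][]float64 and returns a string. The name consensus is hypothetical, and resolving ties to the first base in A, C, G, T order is an assumption of this sketch, not necessarily what the package does when tieBreak is false.

```go
package main

import "fmt"

// consensus returns, for each column of mat, the base whose row holds the
// largest value. mat is indexed [row][column] with rows ordered A, C, G, T.
// Ties go to the earliest base in that order (an assumption of this sketch).
func consensus(mat [][]float64) string {
	const bases = "ACGT"
	if len(mat) != 4 {
		return ""
	}
	out := make([]byte, len(mat[0]))
	for col := range mat[0] {
		best := 0
		for row := 1; row < 4; row++ {
			if mat[row][col] > mat[best][col] {
				best = row
			}
		}
		out[col] = bases[best]
	}
	return string(out)
}

func main() {
	// A made-up position frequency matrix with four columns.
	pfm := [][]float64{
		{8, 1, 0, 2}, // A
		{1, 7, 1, 2}, // C
		{0, 1, 9, 3}, // G
		{1, 1, 0, 3}, // T
	}
	fmt.Println(consensus(pfm)) // prints "ACGG"
}
```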

func ConsensusSequences

func ConsensusSequences(input []PositionMatrix, tieBreak bool) []fasta.Fasta

ConsensusSequences converts a slice of PositionMatrix structs into a slice of Fasta structs representing the consensus sequences for each matrix.

func MatchComp added in v1.0.1

func MatchComp(s MatchCompSettings)

func RapidMatch added in v1.0.1

func RapidMatch(motifs []PositionMatrix, records []fasta.Fasta, propMatch float64, outFile string, outputAsProportion bool)

RapidMatch performs genome-wide scans for TF motif occurrences in an input genome in fasta format. propMatch specifies the motif score threshold for a match, as a proportion of the consensus score. If outputAsProportion is true, scores are reported as a proportion of the consensus score.
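To make the propMatch threshold concrete, the sketch below computes a consensus score as the sum of column maxima and tests a window score against propMatch times that value. The helper names, the use of >= for the comparison, and the assumption of a positive consensus score are all illustrative choices, not details confirmed by the package.

```go
package main

import "fmt"

// consensusScore sums the maximum value of each column, which is the best
// score any sequence can achieve against mat (indexed [row][column]).
func consensusScore(mat [][]float64) float64 {
	if len(mat) == 0 {
		return 0
	}
	total := 0.0
	for col := range mat[0] {
		best := mat[0][col]
		for row := 1; row < len(mat); row++ {
			if mat[row][col] > best {
				best = mat[row][col]
			}
		}
		total += best
	}
	return total
}

// matchProportion returns score as a proportion of the consensus score and
// whether it meets the propMatch threshold. It assumes consensus > 0.
func matchProportion(score, consensus, propMatch float64) (float64, bool) {
	return score / consensus, score >= propMatch*consensus
}

func main() {
	// A made-up weight matrix with three columns (rows A, C, G, T).
	pwm := [][]float64{
		{2, -1, 0},
		{-1, 2, -1},
		{0, -1, 2},
		{-1, 0, -1},
	}
	best := consensusScore(pwm)
	fmt.Println(best) // prints "6"
	fmt.Println(matchProportion(4.5, best, 0.8))
}
```

Reporting the proportion instead of the raw score makes hits comparable across motifs of different lengths and information content.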

func ScoreWindow added in v1.0.1

func ScoreWindow(pm PositionMatrix, seq []dna.Base, alnStart int) (float64, int, bool)

ScoreWindow returns the motif score of a PositionMatrix against a []dna.Base, starting at position alnStart. The first return value (float64) is the motif score. The second (int) is the end alnPos index of the window. The third (bool) is false if the sequence could not be scored (it contains Ns, or the window ran off the end of the sequence).
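A simplified sketch of window scoring with the same three-value return shape follows. It uses a plain string in place of []dna.Base, ignores alignment gaps, sums matrix entries (appropriate for a log-odds PWM), and reports the end index inclusively. Each of these is an assumption made for illustration, and scoreWindow is a hypothetical helper, not the package function.

```go
package main

import (
	"fmt"
	"strings"
)

// scoreWindow sums the matrix entries selected by the bases of seq, starting
// at start and spanning one base per matrix column. mat is indexed
// [row][column] with rows ordered A, C, G, T. It returns the score, the index
// of the last base in the window, and false if the window runs past the end
// of seq or contains a character other than A, C, G, or T.
func scoreWindow(mat [][]float64, seq string, start int) (float64, int, bool) {
	if len(mat) != 4 {
		return 0, start, false
	}
	width := len(mat[0])
	end := start + width - 1
	if start < 0 || end >= len(seq) {
		return 0, end, false
	}
	score := 0.0
	for col := 0; col < width; col++ {
		row := strings.IndexByte("ACGT", seq[start+col])
		if row < 0 {
			return 0, end, false // for example, an N in the window
		}
		score += mat[row][col]
	}
	return score, end, true
}

func main() {
	// A made-up weight matrix with three columns.
	pwm := [][]float64{
		{2, -1, 0},  // A
		{-1, 2, -1}, // C
		{0, -1, 2},  // G
		{-1, 0, -1}, // T
	}
	fmt.Println(scoreWindow(pwm, "TTACGNA", 2)) // window "ACG": prints "6 4 true"
	fmt.Println(scoreWindow(pwm, "TTACGNA", 4)) // window "GNA": not scorable
}
```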

func WriteJaspar

func WriteJaspar(filename string, records []PositionMatrix)

WriteJaspar writes a slice of PositionMatrix structs to an output filename in JASPAR format.

func WritePositionMatrixJaspar

func WritePositionMatrixJaspar(file *fileio.EasyWriter, m PositionMatrix)

WritePositionMatrixJaspar writes an individual PositionMatrix struct to a fileio.EasyWriter in JASPAR format.

Types

type MatchCompSettings added in v1.0.1

type MatchCompSettings struct {
	MotifFile          string
	MotifType          string
	Records            []fasta.Fasta
	PropMatch          float64
	ChromName          string
	OutFile            string
	Pseudocounts       float64
	ResidualWindowSize int
	RefStart           int
	OutputAsProportion bool
	EnforceStrandMatch bool
	ResidualFilter     float64
	GcContent          float64
}

type PositionMatrix

type PositionMatrix struct {
	Id   string
	Name string
	Type PositionMatrixType
	Mat  [][]float64
}

PositionMatrix is a struct encoding a position frequency/probability/weight matrix. Mat[row][column]. Mat rows 0, 1, 2, and 3 correspond to base identities A, C, G, and T, respectively. Mat columns correspond to position in a motif. So Mat[2][4] in a PPM would correspond to the probability of a G in the 5th position of a motif.
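The row and column convention can be illustrated with a local stand-in type. The struct below mirrors only the Id, Name, and Mat fields, and the row constants and matrix values are invented for this example.

```go
package main

import "fmt"

// positionMatrix is a local stand-in mirroring part of the documented struct;
// it is not the package type itself.
type positionMatrix struct {
	Id   string
	Name string
	Mat  [][]float64
}

// Row indices in the documented A, C, G, T order (names are hypothetical).
const (
	rowA = iota
	rowC
	rowG
	rowT
)

// exampleMatrix builds a made-up PPM with five columns; each column sums to 1.
func exampleMatrix() positionMatrix {
	return positionMatrix{
		Id:   "EX0001.1", // invented identifier
		Name: "example",
		Mat: [][]float64{
			{0.7, 0.1, 0.1, 0.25, 0.1}, // A
			{0.1, 0.7, 0.1, 0.25, 0.1}, // C
			{0.1, 0.1, 0.7, 0.25, 0.7}, // G
			{0.1, 0.1, 0.1, 0.25, 0.1}, // T
		},
	}
}

func main() {
	ppm := exampleMatrix()
	// Mat[2][4]: probability of a G at the 5th position of the motif.
	fmt.Println(ppm.Mat[rowG][4]) // prints "0.7"
	// The motif length is the number of columns.
	fmt.Println(len(ppm.Mat[0])) // prints "5"
}
```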

func CopyPositionMatrix

func CopyPositionMatrix(input PositionMatrix) PositionMatrix

CopyPositionMatrix provides a memory copy of an input PositionMatrix struct.

func NextPositionMatrix

func NextPositionMatrix(file *fileio.EasyReader, t PositionMatrixType) (PositionMatrix, bool)

NextPositionMatrix reads and parses a single PositionMatrix record from an input EasyReader. The returned bool is true when the file has been fully read.

func PfmSliceToPpmSlice

func PfmSliceToPpmSlice(input []PositionMatrix, pseudocount float64) []PositionMatrix

PfmSliceToPpmSlice creates a slice of position probability matrices from a slice of position frequency matrices.

func PfmToPpm

func PfmToPpm(input PositionMatrix, pseudocount float64) PositionMatrix

PfmToPpm creates a position probability matrix from an input position frequency matrix. Pseudocounts may be applied for Laplace smoothing. The input float represents the value added to each cell of the PFM. More info on pseudocounts: https://doi.org/10.1093/nar/gkn1019
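A minimal sketch of the conversion on a bare [][]float64 is given below, assuming each cell becomes (count + pseudocount) divided by the column total after the pseudocount has been added to every cell. The function name pfmToPpm and the example counts are hypothetical.

```go
package main

import "fmt"

// pfmToPpm converts counts to probabilities one column at a time:
// p = (count + pseudocount) / sum over the column of (count + pseudocount).
// The input is indexed [row][column] and is assumed to be rectangular.
func pfmToPpm(pfm [][]float64, pseudocount float64) [][]float64 {
	ppm := make([][]float64, len(pfm))
	for row := range pfm {
		ppm[row] = make([]float64, len(pfm[row]))
	}
	if len(pfm) == 0 {
		return ppm
	}
	for col := range pfm[0] {
		total := 0.0
		for row := range pfm {
			total += pfm[row][col] + pseudocount
		}
		if total == 0 {
			continue // empty column with no pseudocount: leave zeros
		}
		for row := range pfm {
			ppm[row][col] = (pfm[row][col] + pseudocount) / total
		}
	}
	return ppm
}

func main() {
	// One column with 8 observations of A and none of the other bases.
	pfm := [][]float64{{8}, {0}, {0}, {0}}
	ppm := pfmToPpm(pfm, 1)
	fmt.Println(ppm[0][0]) // (8+1)/(8+4) prints "0.75"
}
```

With a pseudocount of 0 the unobserved bases would get probability 0, which becomes negative infinity in any later log-odds transform. Smoothing avoids that.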

func PpmSliceToPwmSlice

func PpmSliceToPwmSlice(input []PositionMatrix, gcContent float64) []PositionMatrix

PpmSliceToPwmSlice creates a slice of position weight matrices from a slice of position probability matrices.

func PpmToPwm

func PpmToPwm(input PositionMatrix, gcContent float64) PositionMatrix

PpmToPwm creates a position weight matrix from an input position probability matrix.
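The documentation does not state the formula, so the following is only a sketch of one common definition: each weight is log2(p / background), with background probabilities derived from gcContent as G = C = gcContent/2 and A = T = (1 - gcContent)/2. Both the log base and the background model are assumptions, and ppmToPwm is a hypothetical helper.

```go
package main

import (
	"fmt"
	"math"
)

// ppmToPwm converts probabilities to log-odds weights, log2(p / background).
// ppm is indexed [row][column] with rows ordered A, C, G, T. The background
// model (G = C = gc/2, A = T = (1-gc)/2) is an assumption of this sketch.
// A probability of 0 yields negative infinity, so smoothed input is expected.
func ppmToPwm(ppm [][]float64, gcContent float64) [][]float64 {
	if len(ppm) != 4 {
		return nil
	}
	background := [4]float64{
		(1 - gcContent) / 2, // A
		gcContent / 2,       // C
		gcContent / 2,       // G
		(1 - gcContent) / 2, // T
	}
	pwm := make([][]float64, 4)
	for row := range ppm {
		pwm[row] = make([]float64, len(ppm[row]))
		for col, p := range ppm[row] {
			pwm[row][col] = math.Log2(p / background[row])
		}
	}
	return pwm
}

func main() {
	// One made-up column; with gcContent 0.5 every background value is 0.25.
	ppm := [][]float64{{0.5}, {0.25}, {0.125}, {0.125}}
	pwm := ppmToPwm(ppm, 0.5)
	fmt.Println(pwm[0][0], pwm[1][0], pwm[2][0]) // prints "1 0 -1"
}
```

Positive weights mark bases that are enriched relative to background, and negative weights mark depleted ones.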

func PwmSliceToPpmSlice added in v1.0.1

func PwmSliceToPpmSlice(input []PositionMatrix) []PositionMatrix

func PwmToPpm added in v1.0.1

func PwmToPpm(input PositionMatrix) PositionMatrix

PwmToPpm creates a position probability matrix from an input position weight matrix.

func ReadJaspar

func ReadJaspar(filename string, Type string) []PositionMatrix

ReadJaspar parses a slice of PositionMatrix structs from an input file in JASPAR format.

func ReverseComplement

func ReverseComplement(input PositionMatrix) PositionMatrix

ReverseComplement creates a new PositionMatrix representing the reverse complement position matrix of the input matrix.
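With rows ordered A, C, G, T, reverse complementing a matrix amounts to reversing the column order and swapping the A and T rows and the C and G rows. A minimal sketch on a bare [][]float64, assuming a rectangular matrix with four rows, follows; reverseComplement here is a hypothetical stand-in, not the package function.

```go
package main

import "fmt"

// reverseComplement returns a new matrix in which row r of the output is
// row 3-r of the input (A<->T, C<->G) with its columns reversed.
// mat is indexed [row][column] and is assumed to be rectangular.
// The input is not modified.
func reverseComplement(mat [][]float64) [][]float64 {
	rows := len(mat)
	out := make([][]float64, rows)
	for row := range mat {
		cols := len(mat[row])
		out[row] = make([]float64, cols)
		for col := 0; col < cols; col++ {
			out[row][col] = mat[rows-1-row][cols-1-col]
		}
	}
	return out
}

func main() {
	// A made-up matrix whose consensus is "AAC".
	mat := [][]float64{
		{1, 1, 0}, // A
		{0, 0, 1}, // C
		{0, 0, 0}, // G
		{0, 0, 0}, // T
	}
	rc := reverseComplement(mat)
	// The reverse complement of "AAC" is "GTT".
	fmt.Println(rc[2]) // G row: prints "[1 0 0]"
	fmt.Println(rc[3]) // T row: prints "[0 1 1]"
}
```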

func ReverseComplementAll

func ReverseComplementAll(input []PositionMatrix) []PositionMatrix

ReverseComplementAll creates a new slice of PositionMatrix structs, where each entry is the reverse complement position matrix of the corresponding index of the input slice of PositionMatrix structs.

type PositionMatrixType

type PositionMatrixType byte
const (
	Frequency   PositionMatrixType = 0
	Probability PositionMatrixType = 1
	Weight      PositionMatrixType = 2
	None        PositionMatrixType = 3
)

func StringToPositionMatrixType

func StringToPositionMatrixType(s string) PositionMatrixType

StringToPositionMatrixType parses a PositionMatrixType from an input string.
