genome

package
v0.0.0-...-7a0a068 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 7, 2023 License: AGPL-3.0 Imports: 14 Imported by: 0

Documentation

Overview

Package genome represents a genome, relative to a tile library, using Go data structures. It provides functions to import and export the data within genomes, and to create new Genome data structures in memory. It is intended to be used together with the tilelibrary and structures packages.

Constants

This section is empty.

Variables

var ErrInvalidGenome = errors.New("not a valid genome file")

ErrInvalidGenome is an error for when a file that is expected to be a .genome file is not one.

var ErrNoLibraryAttached = errors.New("genome has no library attached")

ErrNoLibraryAttached is an error for when a genome does not have a library in its Library field but needs one.

Functions

func WriteGenomesPathToNumpy

func WriteGenomesPathToNumpy(genomes []*Genome, filepath string, path int) error

WriteGenomesPathToNumpy writes multiple genomes' worth of path information to a numpy file.

Types

type Genome

type Genome struct {
	Paths [][]Path // Paths represents a genome through its paths. Two phases are present here (path and counterpart path).
	// contains filtered or unexported fields
}

Genome is a struct to represent a genome. It contains a pointer to its reference library, which allows for easy tiling. The genome is split into paths, phases, and steps, and at each step there is potentially a tile variant of bases. If Paths[a][b][c] = d, then in the genome at path a, phase b, and step c, the variant is variant number d in the Library at path a and step c. Using the reference library, we can refer to the tile variant at each step using its tile variant number in the library, or using -1 to represent that there's no tile there due to a spanning tile variant.
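The indexing convention Paths[a][b][c] = d can be illustrated with a small standalone sketch. Step and Path are redeclared locally to mirror the types on this page, and variantAt is a hypothetical bounds-checked accessor, not part of the package. The sample values are made up for illustration.

```go
package main

import "fmt"

// Step and Path mirror the type declarations documented on this page.
type Step int
type Path []Step

// variantAt is a hypothetical helper: it returns the variant number stored
// at the given path, phase and step, and false if any index is out of range.
func variantAt(paths [][]Path, path, phase, step int) (Step, bool) {
	if path < 0 || path >= len(paths) {
		return 0, false
	}
	if phase < 0 || phase >= len(paths[path]) {
		return 0, false
	}
	if step < 0 || step >= len(paths[path][phase]) {
		return 0, false
	}
	return paths[path][phase][step], true
}

func main() {
	// One path with two phases and three steps each (illustrative values).
	paths := [][]Path{
		{
			{0, 2, -1}, // phase 0; -1 marks a step covered by a spanning tile
			{1, 1, 3},  // phase 1
		},
	}
	v, ok := variantAt(paths, 0, 1, 2)
	fmt.Println(v, ok) // prints "3 true"
}
```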

func New

func New(library *tilelibrary.Library) *Genome

New initializes a Genome. The library may be nil if it should not be set yet; it can be set manually later.

func ReadGenomeFromFile

func ReadGenomeFromFile(filepath string) (*Genome, error)

ReadGenomeFromFile reads a text file containing genome information. The current file suffix is .genome, so make sure all genomes written to disk have this suffix.

func (*Genome) Add

func (g *Genome) Add(directory string) error

Add puts the contents of a directory of FastJ files into a given Genome.

func (*Genome) AddFastJ

func (g *Genome) AddFastJ(filepath string) error

AddFastJ puts the contents of a FastJ into a Genome. Works with both gzipped and non-gzipped FastJ files.

func (*Genome) AssignLibrary

func (g *Genome) AssignLibrary(library *tilelibrary.Library) error

AssignLibrary assigns an existing genome to a library. This library must match the ID found in the libraryID field of g.

func (*Genome) Liftover

func (g *Genome) Liftover(destination *tilelibrary.Library) error

Liftover runs a liftover operation on a genome by changing what library the genome is attached to.

func (*Genome) ReadGenomePathNumpy

func (g *Genome) ReadGenomePathNumpy(filepath string) error

ReadGenomePathNumpy reads one path's worth of information from a numpy file. This path should be assigned to a path of a genome.

func (*Genome) WriteNumpy

func (g *Genome) WriteNumpy(directory string, genomePath int) error

WriteNumpy writes the values of a path of a genome to a numpy array. It alternates between each phase for each step. For example, if path 24 had two steps (0 and 1) all with complete tiles, and on phase 0 the step values were 0 and 2, and on phase 1 the step values were 1 and 1, the numpy array would be [0 1 2 1], since it writes out the data for step 0 first, and then writes out the values for step 1.
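The step-by-step alternation between phases can be reproduced with a short standalone sketch. It only demonstrates the ordering of values and does not write a numpy file. interleavePhases is a hypothetical helper, Step and Path are redeclared locally, and the assumption that all phases have the same number of steps is mine.

```go
package main

import (
	"errors"
	"fmt"
)

// Step and Path mirror the type declarations documented on this page.
type Step int
type Path []Step

// interleavePhases is a hypothetical helper that flattens the phases of one
// path step by step: all phases of step 0, then all phases of step 1, etc.
// It assumes every phase has the same number of steps.
func interleavePhases(phases []Path) ([]int, error) {
	if len(phases) == 0 {
		return nil, nil
	}
	steps := len(phases[0])
	for _, p := range phases {
		if len(p) != steps {
			return nil, errors.New("phases have different numbers of steps")
		}
	}
	out := make([]int, 0, steps*len(phases))
	for s := 0; s < steps; s++ {
		for _, p := range phases {
			out = append(out, int(p[s]))
		}
	}
	return out, nil
}

func main() {
	// The example from the text: phase 0 holds 0 and 2, phase 1 holds 1 and 1.
	phases := []Path{{0, 2}, {1, 1}}
	flat, err := interleavePhases(phases)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(flat) // prints "[0 1 2 1]"
}
```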

func (*Genome) WriteToFile

func (g *Genome) WriteToFile(filename string) error

WriteToFile writes a genome to a list format of indices relative to its reference library. It alternates between each phase for each step. For example, if path 24 had two steps (0 and 1) all with complete tiles, and on phase 0 the step values were 0 and 2, and on phase 1 the step values were 1 and 1, the values would be written in the order 0 1 2 1, since it writes out the data for step 0 first, and then writes out the values for step 1. WriteToFile will not work if the genome does not have a reference library (nil reference).

type Path

type Path []Step

Path is a type to represent a path of the genome, as a slice of steps.

type Step

type Step int // -1 for a skipped step, and any other integer refers to the tile variant number in the reference library.

Step is a type to represent a step within a path, which can take on a specific tile variant.
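Because -1 is a sentinel rather than a variant number, code that walks a Path generally needs to treat it separately. A minimal sketch, with Step and Path redeclared locally and countSteps as a hypothetical helper:

```go
package main

import "fmt"

// Step and Path mirror the type declarations documented on this page.
type Step int
type Path []Step

// countSteps is a hypothetical helper that counts how many steps carry a
// tile variant number and how many are skipped (-1).
func countSteps(p Path) (tiles, skipped int) {
	for _, s := range p {
		if s == -1 {
			skipped++
		} else {
			tiles++
		}
	}
	return tiles, skipped
}

func main() {
	// Illustrative path: steps 1 and 2 are skipped.
	p := Path{4, -1, -1, 0, 2}
	tiles, skipped := countSteps(p)
	fmt.Println(tiles, skipped) // prints "3 2"
}
```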
