
Package indexsplit


License: MIT


Package indexsplit quickly generates regions of roughly equal size (measured by amount of data) across a cohort. It does this by reading the BAM (or CRAM) index and using the file offsets as a proxy for the amount of data in each bin. It sums the values in these bins across all samples. This gives a good estimate of the actual number of reads in a region without having to parse the BAM file itself.

A common use is to generate regions for parallelizing variant calling fairly, by splitting into `N` regions with approximately equal amounts of data **across the cohort**.
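The idea of summing per-bin sizes across samples and cutting wherever the running total reaches `total/N` can be illustrated with a minimal sketch. This is not the package's implementation: `region`, `splitEven`, the bin width, and the sample values are all hypothetical, and a simple greedy cut is assumed.

```go
package main

import "fmt"

// region is a hypothetical stand-in for the package's Chunk type.
type region struct {
	Start, End int
	Sum        float64 // amount of data assigned to this region
}

// splitEven walks fixed-width bins (binSize bases each) whose values are the
// per-bin data amounts already summed across samples. It closes a region
// whenever the running total reaches total/n, and puts whatever remains
// into the final region. It may return fewer than n regions.
func splitEven(bins []float64, binSize, n int) []region {
	if n < 1 || len(bins) == 0 {
		return nil
	}
	var total float64
	for _, b := range bins {
		total += b
	}
	target := total / float64(n)

	var out []region
	cur := region{}
	for i, b := range bins {
		cur.Sum += b
		cur.End = (i + 1) * binSize
		last := i == len(bins)-1
		if last || (cur.Sum >= target && len(out) < n-1) {
			out = append(out, cur)
			cur = region{Start: cur.End}
		}
	}
	return out
}

func main() {
	// Made-up per-bin sizes for two samples; their sum is the cohort signal.
	a := []float64{3, 1, 2, 4, 1, 4}
	b := []float64{2, 1, 1, 5, 1, 5}
	bins := make([]float64, len(a))
	for i := range a {
		bins[i] = a[i] + b[i]
	}
	// 16384 is an assumed bin width chosen for illustration.
	for _, r := range splitEven(bins, 16384, 3) {
		fmt.Printf("%d-%d\t%.0f\n", r.Start, r.End, r.Sum)
	}
}
```

With these made-up values the three regions carry 10, 11, and 9 units of data, even though they span 3, 2, and 1 bins respectively, which is the sense in which the split is even by data rather than by length.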


func Main

func Main()

Main is called from the goleft dispatcher.

func Split

func Split(paths []string, refs []*sam.Reference, N int, probs map[string]*interval.IntTree) chan Chunk

Split takes paths of BAMs or CRAIs and generates `N` `Chunks`.

type Chunk

type Chunk struct {
	Chrom  string
	Start  int
	End    int
	Sum    float64 // amount of data in this Chunk
	Splits int     // number of splits
}

Chunk is a region of the genome created by `Split`.

func (Chunk) String

func (c Chunk) String() string
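Because `Split` returns a `chan Chunk`, a caller would typically range over the channel and turn each `Chunk` into a region for a downstream tool. The sketch below shows that consumption pattern only. It redeclares `Chunk` locally with the documented fields, uses a hypothetical `fakeSplit` producer in place of the real `Split`, assumes the channel is closed once all chunks are sent, and uses a `chrom:start-end` format that is an illustrative choice rather than the documented output of `String`.

```go
package main

import "fmt"

// Chunk mirrors the fields documented for indexsplit.Chunk. It is redeclared
// here only so the sketch compiles without the real package.
type Chunk struct {
	Chrom  string
	Start  int
	End    int
	Sum    float64 // amount of data in this Chunk
	Splits int     // number of splits
}

// fakeSplit is a hypothetical stand-in for Split: it sends fixed chunks on a
// channel and closes the channel when done (closing is an assumption about
// how the real function behaves).
func fakeSplit() chan Chunk {
	ch := make(chan Chunk)
	go func() {
		defer close(ch)
		ch <- Chunk{Chrom: "chr1", Start: 0, End: 49152, Sum: 10}
		ch <- Chunk{Chrom: "chr1", Start: 49152, End: 81920, Sum: 11}
	}()
	return ch
}

// regions drains the channel and renders each chunk as chrom:start-end.
func regions(ch chan Chunk) []string {
	var out []string
	for c := range ch {
		out = append(out, fmt.Sprintf("%s:%d-%d", c.Chrom, c.Start, c.End))
	}
	return out
}

func main() {
	for _, r := range regions(fakeSplit()) {
		fmt.Println(r)
	}
}
```

Each resulting string could then be handed to one worker in a parallel variant-calling run, so that workers receive comparable amounts of data.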
