rebase

package
v0.29.2 Latest
Warning

This package is not in the latest version of its module.

Published: Dec 1, 2023 License: MIT Imports: 4 Imported by: 0

Documentation

Overview

Package rebase contains a parser for the REBASE data dump format #31.

In order to effectively simulate cloning reactions, we need to know how each restriction enzyme in the reaction functions. This data can be derived, in bulk, from the REBASE database.

REBASE is an amazing resource run by New England Biolabs listing essentially every known restriction enzyme. In particular, this parser parses the REBASE data dump format #31, which is what Bioperl uses.

https://bioperl.org/howtos/Restriction_Enzyme_Analysis_HOWTO.html
http://rebase.neb.com/rebase/rebase.f31.html

The actual data dump itself is linked here and updated once a month: http://rebase.neb.com/rebase/link_withrefm

The header of this file gives a wonderful explanation of its structure. Here is the header with the commercial suppliers format and an example enzyme.

```
REBASE version 104                                   withrefm.104

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
REBASE, The Restriction Enzyme Database   http://rebase.neb.com
Copyright (c)  Dr. Richard J. Roberts, 2021.   All rights reserved.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Rich Roberts Mar 31 2021

<ENZYME NAME>   Restriction enzyme name.
<ISOSCHIZOMERS> Other enzymes with this specificity.
<RECOGNITION SEQUENCE>

These are written from 5' to 3', only one strand being given.
If the point of cleavage has been determined, the precise site
is marked with ^.  For enzymes such as HgaI, MboII etc., which
cleave away from their recognition sequence the cleavage sites
are indicated in parentheses.

For example HgaI GACGC (5/10) indicates cleavage as follows:
                5' GACGCNNNNN^      3'
                3' CTGCGNNNNNNNNNN^ 5'

In all cases the recognition sequences are oriented so that
the cleavage sites lie on their 3' side.

REBASE Recognition sequences representations use the standard
abbreviations (Eur. J. Biochem. 150: 1-5, 1985) to represent
ambiguity.
                R = G or A
                Y = C or T
                M = A or C
                K = G or T
                S = G or C
                W = A or T
                B = not A (C or G or T)
                D = not C (A or G or T)
                H = not G (A or C or T)
                V = not T (A or C or G)
                N = A or C or G or T

ENZYMES WITH UNUSUAL CLEAVAGE PROPERTIES:

Enzymes that cut on both sides of their recognition sequences,
such as BcgI, Bsp24I, CjeI and CjePI, have 4 cleavage sites
each instead of 2.

Bsp24I
          5'      ^NNNNNNNNGACNNNNNNTGGNNNNNNNNNNNN^   3'
          3' ^NNNNNNNNNNNNNCTGNNNNNNACCNNNNNNN^        5'

This will be described in some REBASE reports as:

             Bsp24I (8/13)GACNNNNNNTGG(12/7)

<METHYLATION SITE>

The site of methylation by the cognate methylase when known
is indicated X(Y) or X,X2(Y,Y2), where X is the base within
the recognition sequence that is modified.  A negative number
indicates the complementary strand, numbered from the 5' base
of that strand, and Y is the specific type of methylation
involved:
               (6) = N6-methyladenosine
               (5) = 5-methylcytosine
               (4) = N4-methylcytosine

If the methylation information is different for the 3' strand,
X2 and Y2 are given as well.

<MICROORGANISM> Organism from which this enzyme had been isolated.
<SOURCE>        Either an individual or a National Culture Collection.
<COMMERCIAL AVAILABILITY>

Each commercial source of restriction enzymes and/or methylases
listed in REBASE is assigned a single character abbreviation
code.  For example:

K        Takara (1/98)
M        Boehringer Mannheim (10/97)
N        New England Biolabs (4/98)

The date in parentheses indicates the most recent update of
that organization's listings in REBASE.

<REFERENCES>
only the primary references for the isolation and/or purification
of the restriction enzyme or methylase, the determination of
the recognition sequence and cleavage site or the methylation
specificity are given.

REBASE codes for commercial sources of enzymes

B        Life Technologies (3/21)
C        Minotech Biotechnology (3/21)
E        Agilent Technologies (8/20)
I        SibEnzyme Ltd. (3/21)
J        Nippon Gene Co., Ltd. (3/21)
K        Takara Bio Inc. (6/18)
M        Roche Applied Science (4/18)
N        New England Biolabs (3/21)
O        Toyobo Biochemicals (8/14)
Q        Molecular Biology Resources - CHIMERx (3/21)
R        Promega Corporation (11/20)
S        Sigma Chemical Corporation (3/21)
V        Vivantis Technologies (1/18)
X        EURx Ltd. (1/21)
Y        SinaClon BioScience Co. (1/18)

<1>AaaI
<2>XmaIII,BseX3I,BsoDI,BstZI,EagI,EclXI,Eco52I,SenPT16I,TauII,Tsp504I
<3>C^GGCCG
<4>
<5>Acetobacter aceti ss aceti
<6>M. Fukaya
<7>
<8>Tagami, H., Tayama, K., Tohyama, T., Fukaya, M., Okumura, H., Kawamura, Y., Horinouchi, S., Beppu, T., (1988) FEMS Microbiol. Lett., vol. 56, pp. 161-166.

```
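The ambiguity codes listed in the header map naturally onto regular-expression character classes. The following is a minimal sketch of that idea and is not part of this package: the names `iupacClasses` and `siteRegexp` are hypothetical, and the sketch assumes the recognition sequence has already been stripped of `^` markers and parenthesized cut positions.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// iupacClasses maps each base or ambiguity code from the REBASE header
// to a regular-expression character class.
// Hypothetical helper for illustration; not part of package rebase.
var iupacClasses = map[rune]string{
	'A': "A", 'C': "C", 'G': "G", 'T': "T",
	'R': "[GA]", 'Y': "[CT]", 'M': "[AC]", 'K': "[GT]",
	'S': "[GC]", 'W': "[AT]",
	'B': "[CGT]", 'D': "[AGT]", 'H': "[ACT]", 'V': "[ACG]",
	'N': "[ACGT]",
}

// siteRegexp compiles a bare recognition sequence (letters only) into a
// regular expression that matches it on the given strand.
func siteRegexp(site string) (*regexp.Regexp, error) {
	var b strings.Builder
	for _, base := range strings.ToUpper(site) {
		class, ok := iupacClasses[base]
		if !ok {
			return nil, fmt.Errorf("unknown base code %q", base)
		}
		b.WriteString(class)
	}
	return regexp.Compile(b.String())
}

func main() {
	// Bsp24I recognition sequence from the header, without cut positions.
	re, err := siteRegexp("GACNNNNNNTGG")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(re.MatchString("TTGACAAAAAATGGTT"))
}
```

This only searches the strand it is given; a real cloning simulation would also need to check the reverse complement.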

Example (Basic)

This example reads a REBASE data dump into an enzymeMap and prints the AarI recognition sequence.

package main

import (
	"fmt"

	"github.com/TimothyStiles/poly/io/rebase"
)

func main() {
	enzymeMap, _ := rebase.Read("data/rebase_test.txt")
	fmt.Println(enzymeMap["AarI"].RecognitionSequence)
}
Output:

CACCTGC(4/8)
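The output above uses the notation described in the REBASE header, where the parenthesized pair gives the cut positions on each strand relative to the recognition sequence. As a rough sketch of how a caller might split such a string, the hypothetical helper `splitCutSite` below handles only the `SITE(top/bottom)` form; it is not part of this package and does not cover `^` markers or enzymes such as Bsp24I that cut on both sides.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// cutSitePattern matches a recognition sequence followed by a single
// "(top/bottom)" pair, for example "CACCTGC(4/8)" or "GACGC (5/10)".
var cutSitePattern = regexp.MustCompile(`^([A-Za-z]+)\s*\((-?\d+)/(-?\d+)\)$`)

// splitCutSite is a hypothetical helper that separates the bare site
// from its two cut positions.
func splitCutSite(s string) (site string, top, bottom int, err error) {
	m := cutSitePattern.FindStringSubmatch(s)
	if m == nil {
		return "", 0, 0, fmt.Errorf("unsupported recognition sequence %q", s)
	}
	if top, err = strconv.Atoi(m[2]); err != nil {
		return "", 0, 0, err
	}
	if bottom, err = strconv.Atoi(m[3]); err != nil {
		return "", 0, 0, err
	}
	return m[1], top, bottom, nil
}

func main() {
	site, top, bottom, err := splitCutSite("CACCTGC(4/8)")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(site, top, bottom) // prints "CACCTGC 4 8"
}
```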

Constants

This section is empty.

Variables

This section is empty.

Functions

func Export

func Export(enzymeMap map[string]Enzyme) ([]byte, error)

Export serializes the enzyme map to JSON and returns the encoded bytes.

Example
package main

import (
	"fmt"

	"github.com/TimothyStiles/poly/io/rebase"
)

func main() {
	enzymeMap, _ := rebase.Read("data/rebase_test.txt")
	enzymeJSON, _ := rebase.Export(enzymeMap)
	fmt.Println(string(enzymeJSON)[:100])
}
Output:

{"AaaI":{"name":"AaaI","isoschizomers":["XmaIII","BseX3I","BsoDI","BstZI","EagI","EclXI","Eco52I","S
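The truncated output above suggests the JSON is an object keyed by enzyme name. A consumer could decode it back into Go values along these lines. This is a sketch only: the `Enzyme` type is a local copy of the struct documented under Types so that the example compiles without the package, `decodeEnzymes` is a hypothetical helper, and the sample JSON is hand-written rather than real Export output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Enzyme is a local copy of the documented rebase.Enzyme struct, so the
// sketch is self-contained. In real code, use rebase.Enzyme instead.
type Enzyme struct {
	Name                   string   `json:"name"`
	Isoschizomers          []string `json:"isoschizomers"`
	RecognitionSequence    string   `json:"recognitionSequence"`
	MethylationSite        string   `json:"methylationSite"`
	MicroOrganism          string   `json:"microorganism"`
	Source                 string   `json:"source"`
	CommercialAvailability []string `json:"commercialAvailability"`
	References             string   `json:"references"`
}

// decodeEnzymes is a hypothetical helper that turns exported JSON back
// into a map keyed by enzyme name.
func decodeEnzymes(data []byte) (map[string]Enzyme, error) {
	var enzymes map[string]Enzyme
	if err := json.Unmarshal(data, &enzymes); err != nil {
		return nil, err
	}
	return enzymes, nil
}

func main() {
	// Hand-written sample in the shape shown above; not real Export output.
	data := []byte(`{"AaaI":{"name":"AaaI","isoschizomers":["XmaIII","EagI"],"recognitionSequence":"C^GGCCG"}}`)
	enzymes, err := decodeEnzymes(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(enzymes["AaaI"].RecognitionSequence)
}
```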

func Parse

func Parse(file io.Reader) (map[string]Enzyme, error)

Parse parses a REBASE data dump read from file into a map of enzymes keyed by enzyme name.

func Read

func Read(path string) (map[string]Enzyme, error)

Read opens the REBASE data dump at path and returns its contents as an enzymeMap.

Example
package main

import (
	"fmt"

	"github.com/TimothyStiles/poly/io/rebase"
)

func main() {
	enzymeMap, _ := rebase.Read("data/rebase_test.txt")
	fmt.Println(enzymeMap["AarI"].MicroOrganism)
}
Output:

Arthrobacter aurescens SS2-322

Types

type Enzyme

type Enzyme struct {
	Name                   string   `json:"name"`
	Isoschizomers          []string `json:"isoschizomers"`
	RecognitionSequence    string   `json:"recognitionSequence"`
	MethylationSite        string   `json:"methylationSite"`
	MicroOrganism          string   `json:"microorganism"`
	Source                 string   `json:"source"`
	CommercialAvailability []string `json:"commercialAvailability"`
	References             string   `json:"references"`
}

Enzyme represents a single enzyme within the REBASE database.
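The REBASE header assigns each commercial supplier a single-character code. Assuming the entries of CommercialAvailability are those codes (an assumption, not something this page confirms), a lookup table can turn them into readable names. The `suppliers` map and `supplierNames` function below are hypothetical and simply reproduce the supplier table from REBASE version 104 shown in the overview; later dumps may list different suppliers.

```go
package main

import "fmt"

// suppliers reproduces the supplier codes from the REBASE version 104
// header. Hypothetical table for illustration; not part of package rebase.
var suppliers = map[string]string{
	"B": "Life Technologies",
	"C": "Minotech Biotechnology",
	"E": "Agilent Technologies",
	"I": "SibEnzyme Ltd.",
	"J": "Nippon Gene Co., Ltd.",
	"K": "Takara Bio Inc.",
	"M": "Roche Applied Science",
	"N": "New England Biolabs",
	"O": "Toyobo Biochemicals",
	"Q": "Molecular Biology Resources - CHIMERx",
	"R": "Promega Corporation",
	"S": "Sigma Chemical Corporation",
	"V": "Vivantis Technologies",
	"X": "EURx Ltd.",
	"Y": "SinaClon BioScience Co.",
}

// supplierNames translates supplier codes into names, keeping unknown
// codes visible rather than dropping them silently.
func supplierNames(codes []string) []string {
	names := make([]string, 0, len(codes))
	for _, code := range codes {
		if name, ok := suppliers[code]; ok {
			names = append(names, name)
		} else {
			names = append(names, "unknown supplier "+code)
		}
	}
	return names
}

func main() {
	for _, name := range supplierNames([]string{"N", "K"}) {
		fmt.Println(name)
	}
}
```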
