rat

package module
v1.1.0
Published: Apr 21, 2015 License: MIT Imports: 5 Imported by: 0

README

rat - random access tar

rat is an extension to the classical tar archive, focused on allowing constant-time random file access with a linear increase in memory consumption. The tar (tape archive) format was originally developed to write and read streamed sources, which makes random access to its content very inefficient.

Based on our benchmarks, rat is 4x to 60x faster than a classic tar file on SSD and HDD when reading a single file from an archive.

Any tar file produced by rat is compatible with standard tar implementations.
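
The idea behind indexed access can be sketched with the standard library alone. The example below is not rat's index format or implementation; it is a minimal illustration, assuming a hypothetical in-memory index (the names span, countingReader, buildIndex and readAt are invented here) that maps each file name to the offset and size of its data, so that a later read can jump straight to the bytes instead of scanning every earlier entry.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

// span records where one file's data lives inside the archive (hypothetical type).
type span struct{ offset, size int64 }

// countingReader tracks how many bytes have been consumed from r so far.
type countingReader struct {
	r io.Reader
	n int64
}

func (c *countingReader) Read(p []byte) (int, error) {
	n, err := c.r.Read(p)
	c.n += int64(n)
	return n, err
}

// buildTar creates a small in-memory tar archive from name/content pairs.
func buildTar(files [][2]string) []byte {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, f := range files {
		hdr := &tar.Header{Name: f[0], Typeflag: tar.TypeReg, Mode: 0644, Size: int64(len(f[1]))}
		if err := tw.WriteHeader(hdr); err != nil {
			panic(err)
		}
		if _, err := tw.Write([]byte(f[1])); err != nil {
			panic(err)
		}
	}
	if err := tw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

// buildIndex makes one sequential pass and remembers, for every regular file,
// the position reached right after its header, which is where its data starts.
func buildIndex(archive []byte) map[string]span {
	cr := &countingReader{r: bytes.NewReader(archive)}
	tr := tar.NewReader(cr)
	idx := map[string]span{}
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			break
		}
		if err != nil {
			panic(err)
		}
		if hdr.Typeflag == tar.TypeReg {
			idx[hdr.Name] = span{offset: cr.n, size: hdr.Size}
		}
	}
	return idx
}

// readAt fetches a file through the index, without touching earlier entries.
func readAt(archive []byte, idx map[string]span, name string) (string, bool) {
	s, ok := idx[name]
	if !ok {
		return "", false
	}
	section := io.NewSectionReader(bytes.NewReader(archive), s.offset, s.size)
	data, err := io.ReadAll(section)
	if err != nil {
		panic(err)
	}
	return string(data), true
}

func main() {
	archive := buildTar([][2]string{{"a.txt", "alpha"}, {"foo.txt", "foo"}})
	idx := buildIndex(archive)
	content, _ := readAt(archive, idx, "foo.txt")
	fmt.Println(content) // Prints: foo
}
```

The map lookup is constant time, and the index grows by one entry per file, which matches the trade-off described above: faster reads in exchange for memory that grows linearly with the number of entries.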

Installation

The recommended way to install rat:

go get -u github.com/mcuadros/go-rat/...

Example

Import the package:

import "github.com/mcuadros/go-rat"

Converting a standard tar file to a rat file:

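src, err := os.Open("standard.file.tar")
if err != nil {
    panic(err)
}
defer src.Close()

dst, err := os.Create("extended.rat.file.tar")
if err != nil {
    panic(err)
}
defer dst.Close()

// Copy the tar from src to dst, adding the rat signature.
if err := rat.AddIndexToTar(src, dst); err != nil {
    panic(err)
}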

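Reading a specific file from a rat file:

archive, _ := os.Open("extended.rat.file.tar")
defer archive.Close()

// ReadFile is a method of rat.Reader, so wrap the file first.
r, _ := rat.NewReader(archive)
content, _ := r.ReadFile("foo.txt")
fmt.Println(string(content))
// Prints: foo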

Benchmark Results

These are some of the benchmark results on different storage systems.

Fixture name explanation: 5_1.0KB_102KB means a tar containing 5 files with sizes between 1.0KB and 102KB.
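
The naming scheme can be read mechanically. As a small sketch (the fixture type and parseFixture function are hypothetical helpers, not part of rat), a name splits on underscores into a file count, a minimum size and a maximum size:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// fixture holds the three parts of a benchmark fixture name (hypothetical type).
type fixture struct {
	Files   int    // number of files in the tar
	MinSize string // smallest file size, e.g. "1.0KB"
	MaxSize string // largest file size, e.g. "102KB"
}

// parseFixture splits a name such as "5_1.0KB_102KB" into its parts.
func parseFixture(name string) (fixture, error) {
	parts := strings.Split(name, "_")
	if len(parts) != 3 {
		return fixture{}, fmt.Errorf("fixture %q: want 3 parts, got %d", name, len(parts))
	}
	n, err := strconv.Atoi(parts[0])
	if err != nil {
		return fixture{}, fmt.Errorf("fixture %q: bad file count: %w", name, err)
	}
	return fixture{Files: n, MinSize: parts[1], MaxSize: parts[2]}, nil
}

func main() {
	f, err := parseFixture("60_1.0MB_21MB")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d files between %s and %s\n", f.Files, f.MinSize, f.MaxSize)
	// Prints: 60 files between 1.0MB and 21MB
}
```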

SSD                   TAR (ns)    RAT (ns)   times
5_1.0KB_102KB           367838       77236    4.76
100_1.0KB_102KB        5925036      350116   16.92
1000_1.0KB_102KB      58735369     3503317   16.77
6000_1.0KB_102KB     349484665    20064072   17.42
60_1.0MB_21MB        146302392     3402651   43.00

HDD                   TAR (ns)    RAT (ns)   times
5_1.0KB_102KB           253406       54472    4.65
100_1.0KB_102KB        3682796      282085   13.06
1000_1.0KB_102KB      37834628     2396239   15.79
6000_1.0KB_102KB     210841382    13913158   15.15
60_1.0MB_21MB        166405959     2783659   59.78

GlusterFS             TAR (ns)    RAT (ns)   times
5_1.0KB_102KB           293252      130652    2.24
100_1.0KB_102KB        4292723      362399   11.85
1000_1.0KB_102KB      39632581     4468976    8.87
6000_1.0KB_102KB    2413057504    16586371  145.48
60_1.0MB_21MB        623461320   112529704    5.54
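
The times column appears to be the TAR timing divided by the RAT timing. A quick check of that reading against two rows (speedup is a helper name chosen here for illustration):

```go
package main

import "fmt"

// speedup returns how many times faster the RAT read was,
// given both timings in nanoseconds.
func speedup(tarNs, ratNs int64) float64 {
	return float64(tarNs) / float64(ratNs)
}

func main() {
	// SSD, 5_1.0KB_102KB
	fmt.Printf("%.2f\n", speedup(367838, 77236)) // Prints: 4.76
	// HDD, 60_1.0MB_21MB
	fmt.Printf("%.2f\n", speedup(166405959, 2783659)) // Prints: 59.78
}
```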

License

MIT, see LICENSE

Documentation

Constants

const IndexVersion int64 = 1

Variables

var (
	IndexSignature              = []byte{'R', 'A', 'T'}
	UnsuportedIndex             = errors.New("Unsuported tar file")
	UnableToSerializeIndexEntry = errors.New("Unable to serialize: invalid content")
)

var (
	FileNotFound = errors.New("File not found")
	NotRegFile   = errors.New("This is not a regular file")
)

Functions

func AddIndexToTar added in v1.1.0

func AddIndexToTar(input io.Reader, output io.Writer) error

AddIndexToTar reads a standard tar file from input and writes to output a new tar file that carries the rat signature.

func Newindex

func Newindex() *index

Types

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

A Reader provides random access to the contents of a rat archive. You can use archive/tar.Reader with any rat file for sequential access.
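
For contrast, here is what sequential access looks like with the standard archive/tar.Reader. This is a sketch on an in-memory archive built for the example (buildTar and findSequential are illustrative helpers, not part of rat); it counts how many headers must be read before the wanted file is reached, which is the cost that random access avoids.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

// buildTar creates a small in-memory tar archive from name/content pairs.
func buildTar(files [][2]string) []byte {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, f := range files {
		hdr := &tar.Header{Name: f[0], Typeflag: tar.TypeReg, Mode: 0644, Size: int64(len(f[1]))}
		if err := tw.WriteHeader(hdr); err != nil {
			panic(err)
		}
		if _, err := tw.Write([]byte(f[1])); err != nil {
			panic(err)
		}
	}
	if err := tw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

// findSequential walks the archive from the start until it reaches name.
// It returns the file content and the number of headers it had to read.
func findSequential(archive []byte, name string) (string, int, error) {
	tr := tar.NewReader(bytes.NewReader(archive))
	scanned := 0
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return "", scanned, fmt.Errorf("%s: not found", name)
		}
		if err != nil {
			return "", scanned, err
		}
		scanned++
		if hdr.Name == name {
			data, err := io.ReadAll(tr)
			return string(data), scanned, err
		}
	}
}

func main() {
	archive := buildTar([][2]string{{"a.txt", "alpha"}, {"b.txt", "beta"}, {"foo.txt", "foo"}})
	content, scanned, err := findSequential(archive, "foo.txt")
	if err != nil {
		panic(err)
	}
	fmt.Println(content, scanned) // Prints: foo 3
}
```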

func NewReader

func NewReader(r io.ReadSeeker) (*Reader, error)

NewReader creates a new Reader reading from r.

func (*Reader) GetNames

func (r *Reader) GetNames(onlyRegFiles bool) []string

GetNames returns the names of all entries in the rat signature. Set onlyRegFiles to true to return only regular files.

func (*Reader) ReadFile

func (r *Reader) ReadFile(file string) ([]byte, error)

ReadFile returns the content of a file. If the entry is not a regular file, the error NotRegFile is returned; if the file does not exist, FileNotFound is returned.
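
Callers will usually want to tell these two errors apart. The sketch below shows the general sentinel-error pattern using local stand-ins: errFileNotFound, errNotRegFile, the entry type and readFile are all defined here for illustration and only mimic the contract described above. With the real package, the comparison would be made against the exported FileNotFound and NotRegFile values instead.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the package's FileNotFound and NotRegFile values.
var (
	errFileNotFound = errors.New("File not found")
	errNotRegFile   = errors.New("This is not a regular file")
)

// entry is a hypothetical in-memory archive entry.
type entry struct {
	regular bool
	content []byte
}

// readFile mimics the documented contract: content for regular files,
// one sentinel error for missing names and another for non-regular entries.
func readFile(entries map[string]entry, name string) ([]byte, error) {
	e, ok := entries[name]
	if !ok {
		return nil, errFileNotFound
	}
	if !e.regular {
		return nil, errNotRegFile
	}
	return e.content, nil
}

// describe turns the outcome of readFile into a short message.
func describe(entries map[string]entry, name string) string {
	content, err := readFile(entries, name)
	switch {
	case errors.Is(err, errFileNotFound):
		return name + ": missing"
	case errors.Is(err, errNotRegFile):
		return name + ": not a regular file"
	case err != nil:
		return name + ": " + err.Error()
	}
	return name + ": " + string(content)
}

func main() {
	entries := map[string]entry{
		"foo.txt": {regular: true, content: []byte("foo")},
		"dir/":    {regular: false},
	}
	for _, name := range []string{"foo.txt", "dir/", "bar.txt"} {
		fmt.Println(describe(entries, name))
	}
}
```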

type Writer

type Writer struct {
	// contains filtered or unexported fields
}

A Writer provides sequential writing of a rat archive. Writer has exactly the same interface as archive/tar.Writer (http://golang.org/pkg/archive/tar/#Writer).
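
Because the method set is shared, code can be written against a small interface and work with either writer. The sketch below uses only the standard library; tarWriter, writeFile and roundTrip are names invented for this example. Passing a rat Writer where tar.Writer is used here should work given the signatures listed in this documentation, but that is an assumption and is not exercised in the example.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

// tarWriter lists the four methods documented for Writer (hypothetical interface).
type tarWriter interface {
	WriteHeader(hdr *tar.Header) error
	Write(b []byte) (int, error)
	Flush() error
	Close() error
}

// The standard library writer satisfies the interface; checked at compile time.
var _ tarWriter = (*tar.Writer)(nil)

// writeFile adds one regular file through any tarWriter.
func writeFile(w tarWriter, name, content string) error {
	hdr := &tar.Header{Name: name, Typeflag: tar.TypeReg, Mode: 0644, Size: int64(len(content))}
	if err := w.WriteHeader(hdr); err != nil {
		return err
	}
	_, err := w.Write([]byte(content))
	return err
}

// roundTrip writes one file into an in-memory archive and reads it back.
func roundTrip(name, content string) (string, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	if err := writeFile(tw, name, content); err != nil {
		return "", err
	}
	if err := tw.Close(); err != nil {
		return "", err
	}
	tr := tar.NewReader(&buf)
	if _, err := tr.Next(); err != nil {
		return "", err
	}
	data, err := io.ReadAll(tr)
	return string(data), err
}

func main() {
	got, err := roundTrip("foo.txt", "foo")
	if err != nil {
		panic(err)
	}
	fmt.Println(got) // Prints: foo
}
```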

func NewWriter

func NewWriter(w io.Writer) *Writer

NewWriter creates a new Writer writing to w.

func (*Writer) Close

func (w *Writer) Close() error

Close closes the tar archive, flushes any unwritten data to the underlying writer, and writes the rat signature at the end of the output.
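
Writing extra data at the end of the output need not disturb ordinary tar readers. The sketch below is an illustration under an assumption: it appends arbitrary bytes (not rat's real signature) after a finished archive and shows that the standard archive/tar.Reader stops at the end-of-archive marker and lists the entries as usual. The helpers buildTar, appendTrailer and listNames are invented for this example.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
)

// buildTar creates an in-memory tar archive holding a single regular file.
func buildTar(name, content string) []byte {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	hdr := &tar.Header{Name: name, Typeflag: tar.TypeReg, Mode: 0644, Size: int64(len(content))}
	if err := tw.WriteHeader(hdr); err != nil {
		panic(err)
	}
	if _, err := tw.Write([]byte(content)); err != nil {
		panic(err)
	}
	if err := tw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

// appendTrailer returns a copy of archive followed by trailer.
// The trailer is a placeholder and does not follow rat's actual format.
func appendTrailer(archive, trailer []byte) []byte {
	out := make([]byte, 0, len(archive)+len(trailer))
	out = append(out, archive...)
	return append(out, trailer...)
}

// listNames reads the archive sequentially and returns the entry names.
func listNames(archive []byte) ([]string, error) {
	tr := tar.NewReader(bytes.NewReader(archive))
	var names []string
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		names = append(names, hdr.Name)
	}
}

func main() {
	archive := appendTrailer(buildTar("foo.txt", "foo"), []byte("PLACEHOLDER-TRAILER"))
	names, err := listNames(archive)
	if err != nil {
		panic(err)
	}
	fmt.Println(names) // Prints: [foo.txt]
}
```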

func (*Writer) Flush

func (w *Writer) Flush() error

Flush finishes writing the current file (optional).

func (*Writer) Write

func (w *Writer) Write(b []byte) (int, error)

Write writes to the current entry in the tar archive.

func (*Writer) WriteHeader

func (w *Writer) WriteHeader(hdr *tar.Header) error

WriteHeader writes hdr and prepares to accept the file's contents.
