xrs

package module
v0.0.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 17, 2019 License: MIT Imports: 4 Imported by: 0

README

X-Reed-Solomon

GoDoc MIT licensed Build Status Go Report Card

Introduction:

Getting Started

  • Make sure you have read the papers.

  • XRS splits row vector into two equal parts.

e.g. 10+4:

+---------+
| a1 | b1 |
+---------+
| a2 | b2 |
+---------+
| a3 | b3 |
+---------+
    ...
+---------+
| a10| b10|
+---------+
| a11| b11|
+---------+
| a12| b12|
+---------+
| a13| b13|
+---------+
  • So it's important to choose a fit size for reading/write disks efficiently.
  • APIs are almost as same as normal Reed-Solomon Erasure Codes.

Performance

Performance depends mainly on:

  • CPU instruction extension.

  • Number of data/parity row vectors.

Platform:

MacBook Pro 15-inch, 2017 (Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz)

All test run on a single Core.

RS means Reed-Solomon Codes(for comparing), the RS lib is here

Encode:

I/O = (data + parity) * vector_size / cost

Base means no SIMD.

Data Parity Vector size RS I/O (MB/S) XRS I/O (MB/S)
12 4 4KB 12658.00 10895.15
12 4 1MB 8989.67 7530.84
12 4 8MB 8509.06 6579.53
Reconstruct:

Need Data = Data size need read in reconstruction

I/O = (need_data + reconstruct_data_num * vector_size) / cost

Data Parity Vector size Reconstruct Data Num RS Need Data XRS Need Data RS Cost XRS Cost RS I/O (MB/S) XRS I/O (MB/S)
12 4 4KB 1 48KB 34KB 2140 ns/op 3567 ns/op 24885.17 10334.99
12 4 4KB 2 48KB 48KB 3395 ns/op 5940 ns/op 16890.41 9654.17
12 4 4KB 3 48KB 48KB 4746 ns/op 7525 ns/op 12945.61 8164.76
12 4 4KB 4 48KB 48KB 5958 ns/op 8851 ns/op 10999.75 7404.41
Update:

I/O = (2 + parity_num + parity_num) * vector_size / cost

Data Parity Vector size RS I/O (MB/S) XRS I/O (MB/S)
12 4 4KB 32739.22 26312.14
Replace:

I/O = (parity_num + parity_num + replace_data_num) * vector_size / cost

Data Parity Vector size Replace Data Num RS I/O (MB/S) XRS I/O (MB/S)
12 4 4KB 1 63908.06 44082.57
12 4 4KB 2 39966.65 26554.30
12 4 4KB 3 30007.81 19583.16
12 4 4KB 4 25138.38 16636.82
12 4 4KB 5 21261.91 14301.15
12 4 4KB 6 19833.14 13121.98
12 4 4KB 7 18395.47 12028.10
12 4 4KB 8 17364.02 11300.55

PS:

And we must know the benchmark test is quite different with encoding/decoding in practice. Because in benchmark test loops, the CPU Cache may help a lot.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type XRS

type XRS struct {
	// RS is the backend of XRS>
	RS *rs.RS
	// XORSet shows how XRS combines sub-vectors by xor.
	//
	// Key: Parity index(except first parity).
	// Value: Data indexes.
	XORSet map[int][]int
}

XRS X-Reed-Solomon Codes receiver.

func New

func New(dataNum, parityNum int) (x *XRS, err error)

New create an XRS with specific data & parity numbers.

Warn: parityNum can't be 1.

func (*XRS) Encode

func (x *XRS) Encode(vects [][]byte) (err error)

Encode encodes data for generating parity. Write parity vectors into vects[r.DataNum:].

func (*XRS) GetNeedVects

func (x *XRS) GetNeedVects(needReconst int) (aNeed, bNeed []int, err error)

GetNeedVects receives needReconst index(it must be data vector) returns a_vectors' indexes and b_parity_vectors' indexes for reconstructing needReconst. It's used for ReconstOne to read correct vectors for saving I/O.

bNeed always has two elements, the first one is DataNum.

func (*XRS) Reconst

func (x *XRS) Reconst(vects [][]byte, dpHas, needReconst []int) (err error)

Reconst reconstructs missing vectors, vects: All vectors, len(vects) = dataNum + parityNum. dpHas: Survived data & parity index, need dataNum indexes at least. needReconst: Vectors indexes which need to be reconstructed.

Warn: If there is only one needReconst, it will call ReconstOne, so make sure you have correct data, if there is only one vectors need to repair.

e.g: in 3+2, the whole index: [0,1,2,3,4], if vects[0,4] are lost & they need to be reconstructed (Maybe you only need vects[0], so the needReconst should be [0], but not [0,4]). the "dpHas" will be [1,2,3] ,and you must be sure that vects[1] vects[2] vects[3] have correct data, results will be written into vects[0]&vects[4] directly.

func (*XRS) ReconstOne

func (x *XRS) ReconstOne(vects [][]byte, needReconst int) (err error)

ReconstOne reconstruct one data vector, it saves I/O. Make sure you have some specific vectors data. ( you can get the vectors' indexes from GetNeedVects)

func (*XRS) Replace

func (x *XRS) Replace(data [][]byte, replaceRows []int, parity [][]byte) (err error)

Replace replaces oldData vectors with 0 or replaces 0 with newData vectors.

In practice, If len(replaceRows) > dataNum-parityNum, it's better to use Encode, because Replace need to read len(replaceRows) + parityNum vectors, if replaceRows are too many, the cost maybe larger than Encode (Encode only need read dataNum). Think about an EC compute node, and dataNum+parityNum data nodes model.

It's used in two situations: 1. We didn't have enough data for filling in a stripe, but still did ec encode, we need replace several zero vectors with new vectors which have data after we get enough data finally. 2. After compact, we may have several useless vectors in a stripe, we need replaces these useless vectors with zero vectors for free space.

Warn: data's index & replaceRows must has the same sort.

func (*XRS) Update

func (x *XRS) Update(oldData, newData []byte, row int, parity [][]byte) (err error)

Update updates parity_data when one data_vect changes. row: It's the new data's index in the whole vectors.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL