Shardmap

package module
v0.0.0-...-5cd9a77 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 26, 2018 License: MIT Imports: 2 Imported by: 0

README

Golangs built in map sharded!

For better concurrency.

The whole point of this project was to assist me in a scenario where map access I/O became a bottleneck.

( godoc: https://godoc.org/git.zutto.fi/zutto/shardedmap )

1. Installation

go get git.zutto.fi/zutto/shardedmap

2. Usage
Intialization is easy as:

shardmap := shardedmap.NewShardMap(24)

Wondering what that 24 is? It's the amount of 'shards' shardedmap should initialize. General rule of thumb: Lower your I/O is, the less shards you need. Each shard holds their own mutex lock for rwd access.

Adding item to shardmap
type demo struct {
   str1 string
   str2 string
}

var data interface{} = demo{"Hello", "World"}

shardmap.Set("I am a string key", &data)

Input data type is interface{}, key datatype is always a string.

Reading items from shardmap
retrievedData := shardmap.Get("I am a string key")
Deleting items from shardmap
shardmap.Delete("I am a string key")
for more documentation, please look at godoc or the source code!
3. Performance

There are benchmarks comparing normal map vs shardedmap included in the shardmap_test.go file. Generally, shardedmap only becomes faster the more I/O you require, and amount of shards you are willing to create. Also, don't forget that slices exist! use slices if you can!!

Heres the benchmarks, i ran them on t460p laptop w/i7-6700HQ CPU @ 2.60GHz. Your mileage may vary, and the performance difference is best seen in large data!

$ go test -bench=.
goos: linux
goarch: amd64
pkg: git.zutto.fi/zutto/shardedmap
BenchmarkShardedMapWrite-8               1000000              2047 ns/op
BenchmarkShardedMapRead-8                1000000              1218 ns/op
BenchmarkShardedMapWR-8                   500000              3197 ns/op
BenchmarkShardedMapWRForceGet-8           300000              4242 ns/op
BenchmarkShardedMapDel-8                 1000000              1420 ns/op
BenchmarkPlainMapWrite-8                  500000              5481 ns/op
BenchmarkPlainMapRead-8                  1000000              2363 ns/op
BenchmarkPlainMapDel-8                    500000              4518 ns/op

Shards are stored in a slice. The slice where your data is added to is computed by djbhash - djbhash is dubbed as:

"It is one of the most efficient hash functions ever published."

Djbhash is very simple and insecure hash, its sole purpose is to split the I/O access across the shards, thus why fast hashing function was deemed necessary.

4. Limitations

Limitations mainly come from golangs std rwmutex & slices & maps. If you find more limitations in this project, do please let me know!

----------------

TODO:

  • license
  • documentation & readme
  • clean the code
  • more tests?
  • better error handling graceful error handling to be done.
  • look for more entrypoints for fuzzing
  • documentation & readme of the sub-libraries (shardreduce & shardquest)

This project is licensed under MIT license.

Also if you use shardedmap somewhere, I would be delighted to hear about it!

Support available via issues (and all other means you might have to contact me), commercial support available upon request.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Shard

type Shard struct {
	Lock        sync.RWMutex
	InternalMap map[string]*interface{}
}

type ShardMap

type ShardMap struct {
	Keys int
	// contains filtered or unexported fields
}

func NewShardMap

func NewShardMap(shards int) ShardMap

NewShardMap initializes new Shardmap

func (*ShardMap) Delete

func (a *ShardMap) Delete(key string)

Delete deletes an antry from the sharded map - if the entry doesnt exist, it does nothing.

func (*ShardMap) DjbHash

func (a *ShardMap) DjbHash(inp string) uint32

DjbHash is for sharding the map. according to internets, this is fastest hashing algorithm ever made. we dont need security, we need distribution which this provides for us.

func (*ShardMap) DumpKeys

func (a *ShardMap) DumpKeys()

DumpKeys dumps the keys and number of shard they are stored at, for debugging purposes.

func (*ShardMap) ForceGet

func (a *ShardMap) ForceGet(key string) *interface{}

ForceGet is for concurrent write-read. if you expect a entry to be written on another goroutine - this function is handy for waiting that entry to appear Do note - theres the loop of this as of now, and this will at minimum do 2 Get calls.

func (*ShardMap) Get

func (a *ShardMap) Get(key string) *interface{}

Get returns entry from the sharded map

func (*ShardMap) LockGet

func (a *ShardMap) LockGet(key string) *interface{}

LockGet is yet another concurrent helper function. Lockget is for using sync.rwmutex.lock instead of sync.rwmutex.rlock for reading.

func (*ShardMap) RAW

func (a *ShardMap) RAW() *[]*Shard

RAW gives you access to the underlying shardmap slice

func (*ShardMap) Set

func (a *ShardMap) Set(key string, data *interface{})

Set sets an entry into the sharded map

func (*ShardMap) SetIfNotExist

func (a *ShardMap) SetIfNotExist(key string, data *interface{}) bool

SetIfNotExist is also another concurrency helper function. SetIfNotExist will set value if it does not exist yet, otherwise it will do nothing the function will return true on success, and false if the key already exists.

Directories

Path Synopsis

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL