README

Go Reference Go Report Card

go-defaultdict

Go implementation of Python's defaultdict, in a way that's both thread-safe and memory efficient.

Overview

Underneath it pairs a sync.Map with a sync.Pool, and removed all direct store/write accesses to the map. As a result, the only way to mutate the map is through Load/Get, (which either create a new value for you if this is the first access to the key, or return the value created by a previous Load/Get), then mutate the value returned directly (in a thread-safe way).

Here are 2 example usages:

  1. To implement a rowlock. See my rowlock package for detailed example.

  2. To implement a concurrent counter-by-key. See package example or below for details.

Example Code

Here's a step-by-step example to create a concurrent counter-by-key.

First, create a generator, which simply returns an *int64 so it can be used by atomic int64 functions:

generator := func() defaultdict.Pointer {
  return new(int64)
}

Then, create the map:

m := defaultdict.New(generator)

When you need to add the counter, get by key then use atomic.AddInt64:

atomic.AddInt64(m.Get(key).(*int64), 1)

When you need to get the counter value, just get by key then use atomic.LoadInt64:

fmt.Printf(
  "Key %v was added %d times\n",
  key,
  atomic.LoadInt64(
    m.Get(key).(*int64),
  ),
)

Or use Range:

m.Range(func(key defaultdict.Comparable, value defaultdict.Pointer) bool {
  fmt.Printf("Key %v was added %d times\n", key, atomic.LoadInt64(value.(*int64)))
  return true
})
Expand ▾ Collapse ▴

Documentation

Overview

Package defaultdict implements Python's defaultdict, in a way that's both thread-safe and memory-efficient.

There are two example use cases for it:

1. To implement a row lock, that every row (key) has its own lock.

2. To implement a concurrent counter, that every key has its own atomic int as the counter.

Example

This example demonstrates how to use defaultdict to implement a thread-safe counter.

Code:

package main

import (
	"fmt"
	"go.yhsif.com/defaultdict"
	"sync"
	"sync/atomic"
)

func main() {
	generator := func() defaultdict.Pointer {
		// Just create a new *int64 so it can be used as atomic int64.
		return new(int64)
	}
	m := defaultdict.New(generator)
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		for j := 0; j < i; j++ {
			wg.Add(1)
			go func(key defaultdict.Comparable) {
				defer wg.Done()
				atomic.AddInt64(m.Get(key).(*int64), 1)
			}(fmt.Sprintf("key-%d", i))
		}
	}

	wg.Wait()
	m.Range(func(key defaultdict.Comparable, value defaultdict.Pointer) bool {
		fmt.Printf("Key %v was added %d times\n", key, atomic.LoadInt64(value.(*int64)))
		return true
	})

}

Key key-1 was added 1 times
Key key-2 was added 2 times
Key key-3 was added 3 times
Key key-4 was added 4 times
Key key-5 was added 5 times
Key key-6 was added 6 times
Key key-7 was added 7 times
Key key-8 was added 8 times
Key key-9 was added 9 times

Index

Examples

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Comparable

type Comparable = interface{}

Comparable is the key type of a Map.

It's used for documentation purpose only.

See https://golang.org/ref/spec#Comparison_operators for more info.

type Generator

type Generator func() Pointer

Generator defines the function used to generate the default value of the map.

func SharedPoolMapGenerator

func SharedPoolMapGenerator(g Generator) Generator

SharedPoolMapGenerator creates a Generator that returns a Map using g as the Generator.

It's different from just:

func() defaultdict.Pointer {
  return defaultdict.New(g)
}

That the Map returned by the same SharedPoolMapGenerator shares the same underlying pool, so it's more memory efficient when used as the second (or third, etc.) layer of a Map.

This is an example of running the benchmark test with go1.15.6:

$ go test -bench . -benchmem
goos: linux
goarch: amd64
pkg: go.yhsif.com/defaultdict
BenchmarkSharedPoolMapGenerator/shared-4                    9459            121219 ns/op            1093 B/op          5 allocs/op
BenchmarkSharedPoolMapGenerator/naive-4                     9246            123866 ns/op            3289 B/op         14 allocs/op
PASS
ok      go.yhsif.com/defaultdict        2.322s
Example

This example demonstrates how to use SharedPoolGenerator to implement a thread-safe counter with 2 layers of keys.

Code:

package main

import (
	"fmt"
	"go.yhsif.com/defaultdict"
	"sync"
	"sync/atomic"
)

func main() {
	generator := defaultdict.SharedPoolMapGenerator(func() defaultdict.Pointer {
		// Just create a new *int64 so it can be used as atomic int64.
		return new(int64)
	})
	m := defaultdict.New(generator)
	var wg sync.WaitGroup
	for i := 1; i < 4; i++ {
		for j := 1; j < 4; j++ {
			for k := 0; k < i*j; k++ {
				wg.Add(1)
				go func(key1, key2 defaultdict.Comparable) {
					defer wg.Done()
					atomic.AddInt64(m.Get(key1).(defaultdict.Map).Get(key2).(*int64), 1)
				}(fmt.Sprintf("key1-%d", i), fmt.Sprintf("key2-%d", j))
			}
		}
	}

	wg.Wait()
	m.Range(func(key1 defaultdict.Comparable, value1 defaultdict.Pointer) bool {
		m := value1.(defaultdict.Map)
		m.Range(func(key2 defaultdict.Comparable, value2 defaultdict.Pointer) bool {
			fmt.Printf("%v/%v was added %d times\n", key1, key2, atomic.LoadInt64(value2.(*int64)))
			return true
		})
		fmt.Println()
		return true
	})

}

key1-1/key2-1 was added 1 times
key1-1/key2-2 was added 2 times
key1-1/key2-3 was added 3 times

key1-2/key2-1 was added 2 times
key1-2/key2-2 was added 4 times
key1-2/key2-3 was added 6 times

key1-3/key2-1 was added 3 times
key1-3/key2-2 was added 6 times
key1-3/key2-3 was added 9 times

func (Generator) ToPool

func (g Generator) ToPool() *sync.Pool

ToPool creates a *sync.Pool from this Generator.

type Map

type Map interface {
	// Same as in sync.Map, see above for notes about the bool returns.
	Delete(key Comparable)
	Load(key Comparable) (Pointer, bool)
	LoadAndDelete(key Comparable) (Pointer, bool)
	Range(f func(key Comparable, value Pointer) bool)

	// Same as Load, just without the bool return.
	Get(key Comparable) Pointer
}

Map defines a map.

There are a few slight differences in Load and LoadAndDelete comparing to sync.Map:

1. The value type is guaranteed to be the same as the type returned by the Generator used to create this DefaultDict, never nil, even if this is a new key.

2. The bool return being false means that the value is directly from the Generator.

func New

func New(g Generator) Map

New creates a new DefaultDict.

It pairs a sync.Map with a sync.Pool under the hood.

type Pointer

type Pointer = interface{}

Pointer is the value type of a Map.

It's used for documentation purpose only.

In a Map, all values should be pointers, and those pointers should be safe to be mutated concurrently, for the following reasons:

1, A Map is for concurrent mutations (if you only need concurrent read-only access, a builtin map would suffice)

2. There's no Store function provided by Map interface. All mutations are done by Get/Load then mutate the returned value directly.

As an example, you can use *int64 as Pointer, and do mutations via atomic int64 operations. But slices, even though they are pointers, should not be used here directly. You usually need to pair slice with a lock, for example:

type myValue struct{
  lock sync.Lock
  // Note that this must be the pointer to the slice,
  // because append calls could change the slice itself.
  slice *[]int
}

func (m *myValue) Append(i int) {
  m.lock.Lock()
  defer m.lock.Unlock()
  *m.slice = append(*m.slice, i)
}

func myValueGenerator() defaultdict.Pointer {
  return &myValue{
    slice: new([]int),
  }
}