batch

package module
v1.4.2
Published: May 12, 2025 License: MIT Imports: 7 Imported by: 0

README

Go Batch

Zero-dependency batch processing utilities for Go projects.

This library provides a general batch processor that can be applied to various use cases such as bulk database inserts, bulk enqueueing, precomputing reports, and more.

Usage

Requires Go 1.24+.

go get github.com/mawngo/go-batch
Example
package main

import (
	"github.com/mawngo/go-batch"
	"sync/atomic"
	"time"
)

func main() {
	sum := int32(0)
	// First create a batch.Processor by specifying the batch initializer and merger.
	//
	// Initializer will be called to create a new batch, 
	// here the batch.InitSlice[int] will create a slice.
	// Merger will be called to add an item to a batch, 
	// here the batch.AddToSlice[int] will add the item to the slice.
	//
	// A batch can be anything: slice, map, struct, channel, ...
	// The library already defines some built-in initializers and mergers for common data types,
	// but you can always define your own initializer and merger.
	processor := batch.NewProcessor(batch.InitSlice[int], batch.AddToSlice[int]).
		// Configure the processor.
		// The batch will be processed when the max item count is reached 
		// or the max wait is reached.
		Configure(batch.WithMaxConcurrency(5), batch.WithMaxItem(10),
			batch.WithMaxWait(30*time.Second))

	// Start the processor by specifying a handler to process the batch, 
	// and optionally error handlers.
	// This will create a batch.RunningProcessor that can accept items.
	runningProcessor := processor.Run(summing(&sum))

	for i := 0; i < 1_000_000; i++ {
		// Add an item to the processor.
		runningProcessor.Put(1)
	}
	// Remember to close the running processor before your application stops.
	// Closing will force the processor to process the left-over items; 
	// any item added after closing is not guaranteed to be processed.
	runningProcessor.MustClose()
	if sum != 1_000_000 {
		panic("sum is not 1_000_000")
	}
}

func summing(p *int32) batch.ProcessBatchFn[[]int] {
	return func(ints []int, _ int64) error {
		for _, num := range ints {
			atomic.AddInt32(p, int32(num))
		}
		return nil
	}
}

More usage examples can be found in the tests and examples.

Context And Cancellation

This library provides both non-context (XXX) and context (XXXContext) method variants. It is recommended to use the context variants, as the non-context variants can block indefinitely (except for Close).
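
As a rough sketch of the context variants (using the same import path as the example above; the one-second timeout is an arbitrary choice for illustration):

package main

import (
	"context"
	"fmt"
	"time"

	"github.com/mawngo/go-batch"
)

func main() {
	p := batch.NewSliceProcessor[int]().
		Configure(batch.WithMaxItem(100), batch.WithMaxWait(time.Second)).
		RunIgnoreError(func(ints []int, _ int64) {
			fmt.Println("processed", len(ints), "items")
		})

	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	// PutContext returns false when the item could not be added
	// before the context was canceled.
	if !p.PutContext(ctx, 1) {
		fmt.Println("item was not added")
	}

	// CloseContext bounds how long shutdown may take; leftover items
	// may remain unprocessed if the deadline is exceeded.
	if err := p.CloseContext(ctx); err != nil {
		fmt.Println("close error:", err)
	}
}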


There is a Java version of this library.

Documentation

Index

Constants

const Disabled = -1

Disabled is a special value for WithMaxWait. Deprecated: use Unset.

const Unlimited = -1

Unlimited is a special value for WithMaxItem. Deprecated: use Unset.

const Unset = -1

Unset is a special value for various Option functions, usually meaning unrestricted, unlimited, or disabled. Read the doc of the corresponding function to know exactly what this value does.
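
For example, a sketch of a processor with no item cap that relies purely on the wait timer (the 5-second interval is arbitrary):

// No item limit; the batch is flushed only by the 5-second timer.
setup := batch.NewSliceProcessor[string]().
	Configure(batch.WithMaxItem(batch.Unset), batch.WithMaxWait(5*time.Second))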

Variables

This section is empty.

Functions

func AddToSlice

func AddToSlice[T any](b []T, item T) []T

AddToSlice is a MergeToBatchFn that adds an item to a slice.

func InitChan

func InitChan[T any](i int64) chan T

InitChan is an InitBatchFn that allocates a channel. It should not be used with an unbounded processor (maxItem < 0).

func InitMap

func InitMap[K comparable, V any](i int64) map[K]V

InitMap is an InitBatchFn that allocates a map.

func InitSlice

func InitSlice[T any](i int64) []T

InitSlice is an InitBatchFn that allocates a slice.

func InitType

func InitType[T any](_ int64) T

InitType is an InitBatchFn that allocates a zero value of type T.

func LoggingErrorHandler

func LoggingErrorHandler[B any](_ B, count int64, err error) error

LoggingErrorHandler is the default error handler, always included in the RecoverBatchFn chain unless disabled.

func NewErrorWithRemaining

func NewErrorWithRemaining[B any](err error, remainBatch B, count int64) error

NewErrorWithRemaining creates a *Error with remaining items.

Types

type Combine

type Combine[T any] = CombineFn[T]

Combine is an alias for CombineFn for backward compatibility. Deprecated: use CombineFn.

type CombineFn added in v1.4.2

type CombineFn[T any] func(T, T) T

CombineFn is a function to combine two values into one.

type Error

type Error[B any] struct {
	// Cause is the error cause. If not specified, nil will be passed to the next error handler.
	Cause error
	// RemainingBatch is the batch to pass to the next handler. The RemainingCount must be specified.
	RemainingBatch B
	// RemainingCount is the number of items to pass to the next handler.
	// If RemainingCount = 0 and Cause != nil, the original batch and count are passed to the next handler.
	RemainingCount int64
}

Error is an error wrapper that supports passing remaining items to the RecoverBatchFn.
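
A sketch of how a ProcessBatchFn might report a partially failed batch so that the next RecoverBatchFn in the chain receives only the unprocessed items (insertOne is a hypothetical helper standing in for a real write):

package main

import (
	"errors"
	"fmt"

	"github.com/mawngo/go-batch"
)

// insertOne is a hypothetical helper that fails on a specific item.
func insertOne(item string) error {
	if item == "bad" {
		return errors.New("cannot insert " + item)
	}
	return nil
}

func main() {
	p := batch.NewSliceProcessor[string]().
		Configure(batch.WithMaxItem(100)).
		Run(
			// On the first failure, wrap the error so the recovery chain
			// receives only the items that were not handled yet.
			func(items []string, _ int64) error {
				for i, item := range items {
					if err := insertOne(item); err != nil {
						remaining := items[i:]
						return batch.NewErrorWithRemaining(err, remaining, int64(len(remaining)))
					}
				}
				return nil
			},
			// RecoverBatchFn: receives the remaining batch and count.
			// Returning nil stops the recovery chain here.
			func(items []string, count int64, err error) error {
				fmt.Printf("re-queueing %d items after error: %v\n", count, err)
				return nil
			},
		)

	p.PutAll([]string{"a", "bad", "b"})
	p.MustClose()
}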

func (*Error[B]) Error

func (e *Error[B]) Error() string

func (*Error[B]) String

func (e *Error[B]) String() string

type ExtractFn added in v1.4.2

type ExtractFn[T any, V any] func(T) V

ExtractFn is a function to extract a value from an item.

type Extractor

type Extractor[T any, V any] = ExtractFn[T, V]

Extractor is an alias for ExtractFn for backward compatibility. Deprecated: use ExtractFn.

type InitBatchFn

type InitBatchFn[B any] func(int64) B

InitBatchFn is a function to create an empty batch.

type MapRunner added in v1.2.0

type MapRunner[K comparable, T any] interface {
	Runner[T, map[K]T]
}

MapRunner is shorthand for a Runner that merges items into a map.

type MergeToBatchFn

type MergeToBatchFn[B any, T any] func(B, T) B

MergeToBatchFn is a function to add an item to a batch.

func AddSelfToMapUsing

func AddSelfToMapUsing[T any, K comparable](keyExtractor ExtractFn[T, K]) MergeToBatchFn[map[K]T, T]

AddSelfToMapUsing creates a MergeToBatchFn that adds the item itself to a map using the key ExtractFn.

func AddToMapUsing

func AddToMapUsing[T any, K comparable, V any](keyExtractor ExtractFn[T, K], valueExtractor ExtractFn[T, V]) MergeToBatchFn[map[K]V, T]

AddToMapUsing creates a MergeToBatchFn that adds an item to a map using the key and value ExtractFn.

func MergeSelfToMapUsing

func MergeSelfToMapUsing[T any, K comparable](keyExtractor ExtractFn[T, K], combiner CombineFn[T]) MergeToBatchFn[map[K]T, T]

MergeSelfToMapUsing creates a MergeToBatchFn that adds the item itself to a map using the key ExtractFn and applies the CombineFn if the key is duplicated. The original value is passed as the first parameter to the CombineFn.

func MergeToMapUsing

func MergeToMapUsing[T any, K comparable, V any](keyExtractor ExtractFn[T, K], valueExtractor ExtractFn[T, V], combiner CombineFn[V]) MergeToBatchFn[map[K]V, T]

MergeToMapUsing creates a MergeToBatchFn that adds an item to a map using the key and value ExtractFn and applies the CombineFn if the key is duplicated. The original value is passed as the first parameter to the CombineFn.
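
A sketch of MergeSelfToMapUsing with a hypothetical event type keyed by ID, where duplicate keys have their counts summed (note the original, already merged value arrives as the first CombineFn argument):

package main

import (
	"fmt"

	"github.com/mawngo/go-batch"
)

// event is a hypothetical item type used for illustration.
type event struct {
	ID    string
	Count int
}

func main() {
	p := batch.NewProcessor(
		batch.InitMap[string, event],
		// Key by event ID; when a key repeats, sum the counts.
		batch.MergeSelfToMapUsing(
			func(e event) string { return e.ID },
			func(old, next event) event {
				old.Count += next.Count
				return old
			},
		),
	).Configure(batch.WithMaxItem(100)).
		RunIgnoreError(func(m map[string]event, _ int64) {
			fmt.Println("flushing", len(m), "distinct events")
		})

	p.Put(event{ID: "a", Count: 1})
	p.Put(event{ID: "a", Count: 2}) // merged with the previous "a"
	p.MustClose()
}

NewIdentityMapProcessor below provides a shorthand constructor for this same key-and-combine pattern.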

type Option

type Option func(*ProcessorConfig)

Option applies an option to ProcessorConfig.

func WithAggressiveMode

func WithAggressiveMode() Option

WithAggressiveMode enables aggressive mode. In this mode, the processor does not wait for maxWait or maxItems to be reached; it keeps processing items and only merges them into a batch when needed (for example, when the concurrency limit is reached or the dispatcher thread is busy). The maxItems configured by WithMaxItem still controls the maximum number of items the processor can hold before blocking. WithBlockWhileProcessing is ignored in this mode.

func WithBlockWhileProcessing

func WithBlockWhileProcessing() Option

WithBlockWhileProcessing makes the processor block while processing items. If concurrency is enabled, the processor only blocks when max concurrency is reached. This option has no effect if the processor is in aggressive mode.

func WithDisabledDefaultProcessErrorLog

func WithDisabledDefaultProcessErrorLog() Option

WithDisabledDefaultProcessErrorLog disables the default error logging when a batch processing error occurs.

func WithHardMaxWait

func WithHardMaxWait(wait time.Duration) Option

WithHardMaxWait sets the max waiting time before the processor handles the batch anyway. Unlike WithMaxWait, the batch will be processed even if it is empty, which is preferable if the processor must perform some periodic task. You should configure ONLY WithMaxWait OR WithHardMaxWait, NOT BOTH.
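
For instance, a processor that should run some periodic work every minute even when no items arrive might look like this sketch (syncReport is a hypothetical periodic task; the interval is arbitrary):

// Flush every minute even if the batch is empty, so the handler can
// perform its periodic work regardless of traffic.
p := batch.NewSliceProcessor[string]().
	Configure(batch.WithHardMaxWait(time.Minute), batch.WithMaxItem(1000)).
	RunIgnoreError(func(items []string, _ int64) {
		syncReport(items) // hypothetical periodic task; may receive an empty slice
	})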

func WithMaxCloseWait

func WithMaxCloseWait(wait time.Duration) Option

WithMaxCloseWait sets the max waiting time when closing the processor.

func WithMaxConcurrency

func WithMaxConcurrency[I Size](concurrency I) Option

WithMaxConcurrency sets the max number of goroutines this processor can create when processing items. Supports 0 (run on the dispatcher goroutine) or a fixed number. Passing -1 Unset (unlimited) to this function has the same effect as passing math.MaxInt64.

func WithMaxItem

func WithMaxItem[I Size](maxItem I) Option

WithMaxItem sets the max number of items this processor can hold before blocking. Supports a fixed number or -1 Unset (unlimited). When set to unlimited, the processor never blocks, and the batch handling behavior depends on WithMaxWait. When set to 0, the processor is DISABLED and items are processed directly on the caller thread without batching.

func WithMaxWait

func WithMaxWait(wait time.Duration) Option

WithMaxWait sets the max waiting time before the processor handles the batch anyway. If the batch is empty, it is skipped. The max wait is counted from the last processed time, not as a fixed period. Accepts 0 (no wait), -1 Unset (wait until maxItems is reached), or a time.Duration. If set to -1 Unset and maxItems is unlimited, the processor keeps processing whenever possible without waiting for anything.

type ProcessBatchFn

type ProcessBatchFn[B any] func(B, int64) error

ProcessBatchFn is a function to process a batch.

type ProcessBatchIgnoreErrorFn

type ProcessBatchIgnoreErrorFn[B any] func(B, int64)

type ProcessorConfig

type ProcessorConfig struct {
	// contains filtered or unexported fields
}

ProcessorConfig holds the configurable options of a processor.

type ProcessorSetup

type ProcessorSetup[T any, B any] struct {
	ProcessorConfig
	// contains filtered or unexported fields
}

ProcessorSetup is a batch processor in the setup phase (not running). You cannot put items into this processor; use Run to create a RunningProcessor that can accept items. See ProcessorConfig for available options.

func NewIdentityMapProcessor

func NewIdentityMapProcessor[T any, K comparable](keyExtractor ExtractFn[T, K], combiner CombineFn[T]) ProcessorSetup[T, map[K]T]

NewIdentityMapProcessor prepares a processor backed by a map, using the item itself as the value without extracting.

func NewMapProcessor

func NewMapProcessor[T any, K comparable, V any](keyExtractor ExtractFn[T, K], valueExtractor ExtractFn[T, V], combiner CombineFn[V]) ProcessorSetup[T, map[K]V]

NewMapProcessor prepares a processor backed by a map.

func NewProcessor

func NewProcessor[T any, B any](init InitBatchFn[B], merge MergeToBatchFn[B, T]) ProcessorSetup[T, B]

NewProcessor creates a ProcessorSetup using the specified functions. See ProcessorSetup.Configure and Option for available configuration. The resulting ProcessorSetup is in the setup state; call ProcessorSetup.Run with a handler to create a RunningProcessor that can accept items. It is recommended to set at least maxWait via WithMaxWait or maxItem via WithMaxItem. By default, the processor operates similarly to aggressive mode; use Configure to change its behavior.
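
A sketch with a fully custom batch type, since the batch does not have to be a slice or map (the totals struct is a hypothetical running aggregate):

package main

import (
	"fmt"
	"time"

	"github.com/mawngo/go-batch"
)

// totals is a hypothetical custom batch: an aggregate instead of a collection.
type totals struct {
	Sum   int
	Count int
}

func main() {
	p := batch.NewProcessor(
		// Initializer: start from an empty aggregate.
		func(_ int64) totals { return totals{} },
		// Merger: fold each item into the aggregate.
		func(b totals, item int) totals {
			b.Sum += item
			b.Count++
			return b
		},
	).Configure(batch.WithMaxItem(1000), batch.WithMaxWait(time.Second)).
		RunIgnoreError(func(b totals, _ int64) {
			fmt.Printf("flushed %d items, sum=%d\n", b.Count, b.Sum)
		})

	for i := 0; i < 10; i++ {
		p.Put(i)
	}
	p.MustClose()
}

Since the zero value of totals is a valid empty batch, the built-in InitType could likely replace the hand-written initializer here.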

func NewReplaceIdentityMapProcessor

func NewReplaceIdentityMapProcessor[T any, K comparable](keyExtractor ExtractFn[T, K]) ProcessorSetup[T, map[K]T]

NewReplaceIdentityMapProcessor prepares a processor backed by a map, using the item itself as the value without extracting. A ProcessorSetup created by this constructor handles duplicated keys by keeping only the last value.

func NewReplaceMapProcessor

func NewReplaceMapProcessor[T any, K comparable, V any](keyExtractor ExtractFn[T, K], valueExtractor ExtractFn[T, V]) ProcessorSetup[T, map[K]V]

NewReplaceMapProcessor prepares a processor backed by a map. A ProcessorSetup created by this constructor handles duplicated keys by keeping only the last value.

func NewSliceProcessor

func NewSliceProcessor[T any]() ProcessorSetup[T, []T]

NewSliceProcessor prepares a processor backed by a slice.

func (ProcessorSetup[T, B]) Configure

func (p ProcessorSetup[T, B]) Configure(options ...Option) ProcessorSetup[T, B]

Configure applies Options to this processor. Each Configure call creates a new processor.

func (ProcessorSetup[T, B]) Run

func (p ProcessorSetup[T, B]) Run(process ProcessBatchFn[B], errorHandlers ...RecoverBatchFn[B]) *RunningProcessor[T, B]

Run creates a RunningProcessor that can accept items. It accepts a ProcessBatchFn and a RecoverBatchFn chain to run on error.

func (ProcessorSetup[T, B]) RunIgnoreError

func (p ProcessorSetup[T, B]) RunIgnoreError(process ProcessBatchIgnoreErrorFn[B]) *RunningProcessor[T, B]

func (ProcessorSetup[T, B]) WithSplitter

func (p ProcessorSetup[T, B]) WithSplitter(split SplitBatchFn[B]) ProcessorSetup[T, B]

WithSplitter splits the batch into multiple smaller batches. When concurrency > 0 and a SplitBatchFn is set, the processor will split the batch and process it across multiple threads; otherwise the batch will be processed on a single thread, blocking when the concurrency limit is reached. This configuration may be beneficial if you have a very large batch that can be split into smaller batches and processed in parallel.
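
A sketch combining WithSplitter with the slice helpers documented below, cutting each flushed batch into chunks of at most 100 items for concurrent processing (chunk size and concurrency are arbitrary):

p := batch.NewSliceProcessor[string]().
	Configure(batch.WithMaxItem(10_000), batch.WithMaxConcurrency(4)).
	// Split each flushed batch into chunks of at most 100 items,
	// processed by up to 4 goroutines.
	WithSplitter(batch.SplitSliceSizeLimit[string](100)).
	RunIgnoreError(func(chunk []string, _ int64) {
		// Each invocation receives one chunk rather than the whole batch.
		_ = chunk
	})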

type RecoverBatchFn

type RecoverBatchFn[B any] func(B, int64, error) error

RecoverBatchFn is a function to handle a failed batch. Each RecoverBatchFn can in turn return an error to trigger the next RecoverBatchFn in the chain. A RecoverBatchFn must never panic.

type Runner

type Runner[T any, B any] interface {
	// Put adds an item to the processor.
	// This method can block until the processor is available for processing a new batch,
	// and may block indefinitely.
	Put(item T)
	// PutAll adds all items to the processor.
	PutAll(items []T)
	// PutContext adds an item to the processor.
	// If the context is canceled and the item was not added, this method returns false.
	// The context only controls the put step; after the item is added to the processor,
	// the processing will not be canceled by this context.
	PutContext(ctx context.Context, item T) bool
	// PutAllContext adds all items to the processor.
	// If the context is canceled, this method returns the number of items added to the processor.
	PutAllContext(ctx context.Context, items []T) int
	// ApproxItemCount returns the approximate number of items currently in the processor.
	ApproxItemCount() int64
	// ItemCount returns the number of items currently in the processor.
	ItemCount() int64
	// ItemCountContext returns the number of items currently in the processor.
	// If the context is canceled, this method returns the approximate item count and false.
	ItemCountContext(ctx context.Context) (int64, bool)
	// Close stops the processor.
	// The implementation of this method may vary, but it must never wait indefinitely.
	Close() error
	// CloseContext stops the processor.
	// This method may process the leftover batch on the caller thread.
	// The context can be used to provide a deadline for this method.
	CloseContext(ctx context.Context) error
	// StopContext stops the processor.
	// This method does not process the leftover batch.
	StopContext(ctx context.Context) error
	// DrainContext forces processing until the batch is empty.
	// This method may process the batch on the caller thread.
	// The context can be used to provide a deadline for this method.
	DrainContext(ctx context.Context) error
	// FlushContext forces processing of the current batch.
	// This method may process the batch on the caller thread.
	// The context can be used to provide a deadline for this method.
	FlushContext(ctx context.Context) error
	// Flush forces processing of the current batch.
	// This method may process the batch on the caller thread.
	Flush()
	// MustClose stops the processor and panics if there is any error.
	// This method should only be used in tests.
	MustClose()
}

Runner provides common methods of a RunningProcessor.

type RunningProcessor

type RunningProcessor[T any, B any] struct {
	ProcessorSetup[T, B]
	// contains filtered or unexported fields
}

RunningProcessor is a processor that is running and can process items.

func (*RunningProcessor[T, B]) ApproxItemCount

func (p *RunningProcessor[T, B]) ApproxItemCount() int64

ApproxItemCount returns the approximate number of items currently in the processor. This method does not block, so the counter may not be accurate.

func (*RunningProcessor[T, B]) Close

func (p *RunningProcessor[T, B]) Close() error

Close stops the processor. This method will process the leftover batch on the caller thread. Returns an error if maxCloseWait has passed. The timeout can be configured by WithMaxCloseWait. See getCloseMaxWait for details.

func (*RunningProcessor[T, B]) CloseContext

func (p *RunningProcessor[T, B]) CloseContext(ctx context.Context) error

CloseContext stops the processor. This method will process the leftover batch on the caller thread. The context can be used to provide a deadline for this method.

func (*RunningProcessor[T, B]) DrainContext

func (p *RunningProcessor[T, B]) DrainContext(ctx context.Context) error

DrainContext forces processing until the batch is empty. This method always processes the batch on the caller thread. The context can be used to provide a deadline for this method.

func (*RunningProcessor[T, B]) Flush

func (p *RunningProcessor[T, B]) Flush()

Flush forces processing of the current batch. This method may process the batch on the caller thread, depending on the concurrency and blocking settings. It is recommended to use [FlushContext] instead.

func (*RunningProcessor[T, B]) FlushContext

func (p *RunningProcessor[T, B]) FlushContext(ctx context.Context) error

FlushContext forces processing of the current batch. This method may process the batch on the caller thread, depending on the concurrency and blocking settings. The context can be used to provide a deadline for this method.

func (*RunningProcessor[T, B]) IsDisabled

func (p *RunningProcessor[T, B]) IsDisabled() bool

IsDisabled reports whether the processor is disabled. A disabled processor does no batching; instead, processing is executed on the caller thread. All other settings are ignored when the processor is disabled.

func (*RunningProcessor[T, B]) ItemCount

func (p *RunningProcessor[T, B]) ItemCount() int64

ItemCount returns the number of items currently in the processor. This method blocks the processor for accurate counting. It is recommended to use [ItemCountContext] instead.

func (*RunningProcessor[T, B]) ItemCountContext added in v1.3.0

func (p *RunningProcessor[T, B]) ItemCountContext(ctx context.Context) (int64, bool)

ItemCountContext returns the number of items currently in the processor. If the context is canceled, this method returns the approximate item count and false.

func (*RunningProcessor[T, B]) MustClose

func (p *RunningProcessor[T, B]) MustClose()

MustClose stops the processor without a deadline.

func (*RunningProcessor[T, B]) Put

func (p *RunningProcessor[T, B]) Put(item T)

Put adds an item to the processor. This method can block until the processor is available for processing a new batch. It is recommended to use [PutContext] instead.

func (*RunningProcessor[T, B]) PutAll added in v1.4.0

func (p *RunningProcessor[T, B]) PutAll(items []T)

PutAll adds all items to the processor. This method blocks until all items have been put into the processor. It is recommended to use [PutAllContext] instead.

func (*RunningProcessor[T, B]) PutAllContext added in v1.4.0

func (p *RunningProcessor[T, B]) PutAllContext(ctx context.Context, items []T) int

PutAllContext adds all items to the processor. If the context is canceled, this method returns the number of items added to the processor. The processing order is the same as the input list, so the return value can also be used to determine the next item to process if you want to retry or continue.
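
A sketch of resuming after a canceled PutAllContext, assuming p, ctx, and items already exist in the surrounding code (retryLater is a hypothetical retry hook):

added := p.PutAllContext(ctx, items)
if added < len(items) {
	// The context was canceled part-way through; items[added:] were not
	// accepted and can be retried or handed off elsewhere.
	retryLater(items[added:])
}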

func (*RunningProcessor[T, B]) PutContext added in v1.3.0

func (p *RunningProcessor[T, B]) PutContext(ctx context.Context, item T) bool

PutContext adds an item to the processor.

func (*RunningProcessor[T, B]) StopContext added in v1.1.0

func (p *RunningProcessor[T, B]) StopContext(ctx context.Context) error

StopContext stops the processor. This method does not process the leftover batch.

type Size

type Size interface {
	~int | ~int32 | ~int64
}

Size is a constraint that permits int, int32, and int64 types.

type SliceRunner added in v1.2.0

type SliceRunner[T any] interface {
	Runner[T, []T]
}

SliceRunner is shorthand for a Runner that merges items into a slice.
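
Because these are interfaces, code can depend on a SliceRunner instead of the concrete *RunningProcessor, which makes it easy to swap in a fake for tests; a sketch (indexQueue is a hypothetical consumer):

// indexQueue only needs the Runner behaviour, so any processor that
// batches strings into slices can be injected.
type indexQueue struct {
	runner batch.SliceRunner[string]
}

func (q *indexQueue) Enqueue(ctx context.Context, id string) bool {
	return q.runner.PutContext(ctx, id)
}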

type SplitBatchFn

type SplitBatchFn[B any] func(B, int64) []B

SplitBatchFn is a function to split a batch into multiple smaller batches. A SplitBatchFn must never panic.

func SplitSliceEqually

func SplitSliceEqually[T any, I Size](numberOfChunk I) SplitBatchFn[[]T]

SplitSliceEqually creates a SplitBatchFn that splits a slice into multiple equal chunks.

func SplitSliceSizeLimit

func SplitSliceSizeLimit[T any, I Size](maxSizeOfChunk I) SplitBatchFn[[]T]

SplitSliceSizeLimit creates a SplitBatchFn that splits a slice into multiple chunks of limited size.

Directories

maps (command)
slices (command)
