hit

package
v1.3.0 Latest
Warning

This package is not in the latest version of its module.
Published: Mar 30, 2026 License: GPL-3.0 Imports: 3 Imported by: 0

Documentation

Overview

Package hit provides hash-based utilities for dsco's deduplication and caching systems.

Overview

The hit package offers efficient hash-based operations that support dsco's need for detecting duplicate configurations, caching computed values, and ensuring consistent behavior across multiple configuration processing cycles. It provides fast, collision-resistant hashing suitable for configuration-scale data.

Core Functions

The package provides hash computation and comparison utilities optimized for dsco's data structures and usage patterns.

## Hash Computation

Generate consistent hash values for various Go types:

// Basic types
hashValue := hit.Hash("my-string")
hashValue = hit.Hash(42)
hashValue = hit.Hash(true)

// Complex types
config := MyConfig{Host: "localhost", Port: 8080}
hashValue = hit.Hash(config)

// Slices and maps
hashValue = hit.Hash([]string{"a", "b", "c"})
hashValue = hit.Hash(map[string]int{"key": 123})

## Content Addressing

Create content-addressable identifiers for configuration objects:

configID := hit.ContentID(configStruct)
// Use configID as a stable identifier for this configuration

if hit.ContentID(newConfig) == configID {
	// Configuration unchanged, can reuse cached results
}
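
The signature and algorithm of ContentID are not shown on this page. As a minimal sketch of the idea, assuming the identifier is a SHA-256 digest of the value's JSON encoding, a content ID could be derived as follows. The names `contentID` and `Config` are hypothetical and used only for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Config is a hypothetical configuration struct used only for illustration.
type Config struct {
	Host string `json:"host"`
	Port int    `json:"port"`
}

// contentID is a hypothetical stand-in for hit.ContentID. It hashes the JSON
// encoding of v, which is deterministic for structs (fields are emitted in
// declaration order) and for maps (encoding/json sorts map keys).
func contentID(v any) (string, error) {
	data, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	a, _ := contentID(Config{Host: "localhost", Port: 8080})
	b, _ := contentID(Config{Host: "localhost", Port: 8080})
	c, _ := contentID(Config{Host: "localhost", Port: 8081})
	fmt.Println(a == b) // equal content yields the same ID
	fmt.Println(a == c) // different content yields a different ID
}
```

Because the ID depends only on content, it can be stored and compared later without keeping the original value around.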

Deduplication Support

The package enables efficient deduplication of configuration sources:

## Layer Deduplication

Prevent duplicate layers from being registered:

type layerRegistry struct {
	registered map[uint64]bool // Using hit.Hash values
}

func (r *layerRegistry) Register(layer Layer) error {
	layerHash := hit.Hash(layer.Identifier())
	if r.registered[layerHash] {
		return fmt.Errorf("layer already registered: %s", layer.Identifier())
	}
	r.registered[layerHash] = true
	return nil
}

## Value Deduplication

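Detect values that are provided more than once and keep only the first occurrence of each:

// detectDuplicateValues returns values with repeated content removed,
// preserving the order of first occurrences.
func detectDuplicateValues(values []ConfigValue) []ConfigValue {
	seen := make(map[uint64]bool)
	unique := make([]ConfigValue, 0, len(values))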

	for _, value := range values {
		hash := hit.Hash(value.Content)
		if !seen[hash] {
			seen[hash] = true
			unique = append(unique, value)
		}
	}

	return unique
}

Caching Integration

The package supports dsco's caching mechanisms:

## Model Caching

Cache computed struct models based on type signatures:

type ModelCache struct {
	cache map[uint64]Model
}

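func (c *ModelCache) GetOrBuild(t reflect.Type) Model {
	// Caveat: reflect.Type.String may shorten package paths and is not
	// guaranteed to be unique among types, so distinct types can share a key.
	typeHash := hit.Hash(t.String()) // Type signature hash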

	if model, exists := c.cache[typeHash]; exists {
		return model
	}

	model := buildModel(t)
	c.cache[typeHash] = model
	return model
}

## Value Conversion Caching

Cache expensive type conversions:

type ConversionCache struct {
	conversions map[uint64]interface{}
}

func (c *ConversionCache) Convert(value string, targetType reflect.Type) (interface{}, error) {
	cacheKey := hit.Hash(struct{
		Value string
		Type  string
	}{
		Value: value,
		Type:  targetType.String(),
	})

	if cached, exists := c.conversions[cacheKey]; exists {
		return cached, nil
	}

	converted, err := performConversion(value, targetType)
	if err != nil {
		return nil, err
	}

	c.conversions[cacheKey] = converted
	return converted, nil
}

Hash Algorithm

The package uses a fast, collision-resistant hash algorithm suitable for configuration processing:

## Algorithm Choice

- **Speed**: Optimized for frequent hash computations during configuration processing
- **Distribution**: Good distribution characteristics for typical configuration data
- **Stability**: Hash values remain consistent across program runs
- **Collision Resistance**: Sufficient resistance for configuration-scale datasets
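
The page does not name the algorithm. The stability requirement does rule out hashes that are seeded per process, such as the standard library's hash/maphash. As a minimal sketch, assuming an unseeded 64-bit FNV-1a hash (an illustrative choice, not necessarily what hit uses), a stable hash looks like this. The function `stableHash` is hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// stableHash returns the 64-bit FNV-1a hash of s. FNV-1a takes no seed, so
// the same input produces the same value in every process and on every run.
func stableHash(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s)) // Write on a hash.Hash never returns an error
	return h.Sum64()
}

func main() {
	fmt.Println(stableHash("layer-a") == stableHash("layer-a")) // same input, same hash
	fmt.Println(stableHash("layer-a") == stableHash("layer-b")) // different input, different hash
}
```

FNV-1a is fast and well distributed for short keys, but it is not cryptographic; it would not suit an adversarial setting.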

## Data Serialization

Complex Go types are serialized consistently before hashing:

// Struct serialization preserves field order and types
type Config struct {
	Host string `json:"host"`
	Port int    `json:"port"`
}

// These produce the same hash:
config1 := Config{Host: "localhost", Port: 8080}
config2 := Config{Host: "localhost", Port: 8080}

// These produce different hashes:
config3 := Config{Host: "localhost", Port: 8081}

## Type Handling

Different Go types are handled appropriately:

- **Basic types**: Direct value hashing
- **Strings**: UTF-8 byte sequence hashing
- **Structs**: Field-by-field hashing with type information
- **Slices/Arrays**: Element hashing with length and type
- **Maps**: Key-value pair hashing with deterministic ordering
- **Pointers**: Dereference and hash pointed-to value
- **Interfaces**: Hash concrete type and value
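
Maps are the subtle case in the list above, because Go randomizes map iteration order. One common way to get deterministic ordering is to sort the keys before hashing. The sketch below assumes that approach and uses FNV-1a for brevity; `hashStringIntMap` is a hypothetical helper, not part of the hit API.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
	"sort"
)

// hashStringIntMap hashes a map[string]int independently of Go's randomized
// map iteration order by visiting keys in sorted order. Each key is
// length-prefixed so that adjacent entries cannot run together.
func hashStringIntMap(m map[string]int) uint64 {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := fnv.New64a()
	var buf [8]byte
	for _, k := range keys {
		binary.BigEndian.PutUint64(buf[:], uint64(len(k)))
		h.Write(buf[:])
		h.Write([]byte(k))
		binary.BigEndian.PutUint64(buf[:], uint64(m[k]))
		h.Write(buf[:])
	}
	return h.Sum64()
}

func main() {
	a := map[string]int{"port": 8080, "retries": 3}
	b := map[string]int{"retries": 3, "port": 8080}
	fmt.Println(hashStringIntMap(a) == hashStringIntMap(b)) // same entries, same hash
}
```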

Performance Characteristics

The package is optimized for dsco's usage patterns:

## Speed Optimization

- Fast hash computation for small to medium configuration objects
- Minimal memory allocation during hashing process
- Efficient serialization of Go data structures
- Batch processing support for multiple values

## Memory Usage

- Minimal memory overhead per hash operation
- No persistent memory usage (stateless operations)
- Efficient handling of large configuration objects
- Garbage collector friendly allocation patterns

## Scalability

The package scales well with:

- Configuration object size (up to several MB)
- Number of concurrent hash operations
- Frequency of hash computations
- Variety of Go types being hashed

Integration with dsco Components

## Layer System

Layers use the hit package for deduplication:

func (l *EnvLayer) register(to *layerBuilder) error {
	layerID := hit.Hash(struct{
		Type   string
		Prefix string
	}{
		Type:   "env",
		Prefix: l.prefix,
	})

	if to.hasLayer(layerID) {
		return fmt.Errorf("env layer with prefix '%s' already registered", l.prefix)
	}

	to.registerLayer(layerID, l)
	return nil
}

## Model System

Models use the hit package for caching:

func buildModelWithCache(t reflect.Type) Model {
	modelKey := hit.Hash(struct{
		Package string
		Name    string
		Fields  []string
	}{
		Package: t.PkgPath(),
		Name:    t.Name(),
		Fields:  extractFieldNames(t),
	})

	if cached := getFromCache(modelKey); cached != nil {
		return cached
	}

	model := buildModelFromType(t)
	saveToCache(modelKey, model)
	return model
}

## Value Processing

Values use the hit package for change detection:

func processConfigurationValues(values []Value) ProcessingResult {
	currentHash := hit.Hash(values)

	if currentHash == previousHash {
		// Configuration unchanged, return cached result
		return cachedResult
	}

	result := performProcessing(values)
	previousHash = currentHash
	cachedResult = result
	return result
}

Testing Coverage

This package maintains 100% test coverage, including:

- Hash computation for all supported Go types
- Hash consistency across multiple computations
- Hash distribution quality for typical configuration data
- Collision resistance testing with large datasets
- Performance benchmarking for various data sizes
- Memory usage validation for large objects
- Concurrent access safety testing

The test suite covers edge cases:

- Empty values and nil pointers
- Very large configuration objects
- Deeply nested data structures
- Hash collision probability estimation
- Cross-platform hash consistency

Thread Safety

All functions in the hit package are thread-safe:

- No shared mutable state
- Stateless hash computations
- Safe for concurrent use from multiple goroutines
- No coordination or locking required

This makes the package safe for use in dsco's concurrent configuration processing scenarios.
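
The property described above can be illustrated with a small sketch. It assumes a stateless hash function that creates its own hasher on each call; `hashString` and `hashAll` are hypothetical helpers built on the standard library, not hit functions.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// hashString is stateless: each call creates its own hasher, so concurrent
// callers share no mutable state.
func hashString(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

// hashAll hashes every input in its own goroutine. Each goroutine writes only
// to its own slot of the result slice, so no locking is required.
func hashAll(inputs []string) []uint64 {
	out := make([]uint64, len(inputs))
	var wg sync.WaitGroup
	for i, s := range inputs {
		wg.Add(1)
		go func(i int, s string) {
			defer wg.Done()
			out[i] = hashString(s)
		}(i, s)
	}
	wg.Wait()
	return out
}

func main() {
	hashes := hashAll([]string{"env", "file", "flags", "env"})
	fmt.Println(hashes[0] == hashes[3]) // the same input hashes equally in any goroutine
}
```

The only synchronization is the WaitGroup that waits for completion; the hashing itself needs none.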

Error Handling

The package is designed for robust operation:

- Hash functions never panic on valid Go values
- Graceful handling of complex or unusual data structures
- Consistent behavior for edge cases (nil, empty values)
- Deterministic error modes for invalid inputs

Future Extensions

The package design allows for enhancements:

## Algorithm Selection

Support for different hash algorithms based on use case:

hashValue := hit.HashWith(data, hit.AlgorithmFast)     // Speed priority
hashValue = hit.HashWith(data, hit.AlgorithmSecure)    // Security priority
hashValue = hit.HashWith(data, hit.AlgorithmBalanced) // Default

## Custom Serialization

Support for custom serialization of specific types:

hit.RegisterSerializer(MyCustomType{}, customSerializer)

## Hash Validation

Support for hash validation and integrity checking:

isValid := hit.ValidateHash(data, expectedHash)

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type HashProvider

type HashProvider[T hash.Hash] interface {
	// Get retrieves a hash instance from the pool or creates a new one.
	// The returned instance is ready for use and should be returned via PutBack
	// when done.
	Get() T

	// PutBack returns a hash instance to the pool for reuse.
	// This enables efficient resource management and reduces garbage collection
	// pressure.
	PutBack(h T)
}

HashProvider defines an interface for providing hash instances with pooling support. This generic interface allows efficient reuse of hash instances, reducing allocation overhead during repeated hash computations.

type IntNode

type IntNode struct {
	// contains filtered or unexported fields
}

IntNode represents a hash tree node that contains an integer value. It implements the MerkelNode interface and provides content-addressable hashing for integer data.

func NewIntNode

func NewIntNode(
	hashProvider HashProvider[hash.Hash],
	salt []byte,
	id string,
	value int,
) *IntNode

NewIntNode creates a new IntNode with the specified value and computes its hash. The hash is computed using the provided salt for security and collision resistance.

Parameters:

  • hashProvider: Provider for hash instances to avoid allocation overhead
  • salt: Random bytes to prevent hash collision attacks
  • id: Unique identifier for this node
  • value: The integer value to store and hash

Returns a new IntNode with computed hash value.
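
How the salt, ID, and value are combined is not documented here. One plausible scheme, shown purely as an assumption, is to write the salt and ID as length-prefixed fields followed by the integer in a fixed-width encoding, so that field boundaries are unambiguous. The function `saltedIntHash` is hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// saltedIntHash is a hypothetical illustration of hashing (salt, id, value).
// The salt and ID are length-prefixed, and the integer is written as eight
// big-endian bytes so the digest is the same on every platform.
func saltedIntHash(salt []byte, id string, value int) []byte {
	h := sha256.New()
	var n [8]byte
	for _, field := range [][]byte{salt, []byte(id)} {
		binary.BigEndian.PutUint64(n[:], uint64(len(field)))
		h.Write(n[:])
		h.Write(field)
	}
	binary.BigEndian.PutUint64(n[:], uint64(int64(value)))
	h.Write(n[:])
	return h.Sum(nil)
}

func main() {
	salt := []byte("example-salt")
	fmt.Printf("%x\n", saltedIntHash(salt, "port", 8080))
}
```

With a scheme like this, the same value hashed under a different salt yields an unrelated digest, which is what makes precomputed collisions harder to exploit.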

func (*IntNode) GetHash

func (n *IntNode) GetHash() []byte

func (*IntNode) GetID

func (n *IntNode) GetID() string

type MerkelNode

type MerkelNode interface {
	// GetID returns the unique identifier for this node.
	// The ID can be used to reference and locate the node within the tree
	// structure.
	GetID() string

	// GetHash returns the computed hash value for this node.
	// The hash represents the content and structure of the node and its
	// children.
	GetHash() []byte
}

MerkelNode defines the interface for nodes in a Merkle-like hash tree structure. Each node can provide its unique identifier and computed hash value, enabling efficient tree traversal and verification.
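
This page lists only leaf node types (IntNode and StringNode). To show how the interface supports verification of a whole tree, the sketch below assumes a parent node whose hash is computed over its children's hashes. The interface is copied locally so the example compiles on its own; `leaf`, `branch`, `newLeaf`, and `newBranch` are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// MerkelNode mirrors the interface documented above.
type MerkelNode interface {
	GetID() string
	GetHash() []byte
}

// leaf is a hypothetical leaf node holding the digest of a string value.
type leaf struct {
	id   string
	hash []byte
}

func newLeaf(id, value string) *leaf {
	sum := sha256.Sum256([]byte(value))
	return &leaf{id: id, hash: sum[:]}
}

func (l *leaf) GetID() string   { return l.id }
func (l *leaf) GetHash() []byte { return l.hash }

// branch is a hypothetical inner node. Its hash covers the hashes of its
// children in order, so a change in any child changes the branch hash.
type branch struct {
	id   string
	hash []byte
}

func newBranch(id string, children ...MerkelNode) *branch {
	h := sha256.New()
	for _, c := range children {
		h.Write(c.GetHash()) // child digests have a fixed size, so no separator is needed
	}
	return &branch{id: id, hash: h.Sum(nil)}
}

func (b *branch) GetID() string   { return b.id }
func (b *branch) GetHash() []byte { return b.hash }

func main() {
	root := newBranch("root", newLeaf("host", "localhost"), newLeaf("port", "8080"))
	fmt.Printf("%s %x\n", root.GetID(), root.GetHash())
}
```

Comparing two root hashes is enough to tell whether anything beneath them differs, without visiting every node.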

type NodeID

type NodeID string

NodeID represents a unique identifier for hash nodes in the Merkle-like tree structure. It provides a string-based identifier that can be used to reference and locate specific nodes within the hash tree.

type StringNode

type StringNode struct {
	// contains filtered or unexported fields
}

StringNode represents a hash tree node that contains a string value. It implements the MerkelNode interface and provides content-addressable hashing for string data.

func NewStringNode

func NewStringNode(
	hashProvider HashProvider[hash.Hash],
	salt []byte,
	id string,
	value string,
) *StringNode

NewStringNode creates a new StringNode with the specified value and computes its hash. The hash is computed using the provided salt for security and collision resistance.

Parameters:

  • hashProvider: Provider for hash instances to avoid allocation overhead
  • salt: Random bytes to prevent hash collision attacks
  • id: Unique identifier for this node
  • value: The string value to store and hash

Returns a new StringNode with computed hash value.

func (*StringNode) GetHash

func (n *StringNode) GetHash() []byte

func (*StringNode) GetID

func (n *StringNode) GetID() string

Directories

Path Synopsis
Package hprovider provides hash instance pooling for efficient hash computation in dsco's hit package.