Documentation ¶
Overview ¶
Package blockstore defines the core types, interfaces, and errors for DittoFS block storage. It is the single source of truth for FileBlock, BlockState, ContentHash, and BlockSize -- shared across metadata stores, local stores, syncer, and remote block stores.
Sub-packages:
- local: LocalStore interface for on-node storage (memory + disk)
- remote: RemoteStore interface for durable backend storage (S3, etc.)
- sync: Syncer for local-to-remote transfer orchestration
- engine: BlockStore engine composing local store, syncer, and metadata
- gc: Block garbage collection
- storetest: Conformance test suites for FileBlockStore implementations
Index ¶
- Constants
- Variables
- func ClampToInt64(v uint64) int64
- func FormatBytes(b uint64) string
- func FormatStoreKey(payloadID string, blockIdx uint64) string
- func KeyBelongsToFile(key, payloadID string) bool
- func ParseBlockIdx(key, payloadID string) uint64
- func ParseStoreKey(storeKey string) (payloadID string, blockIdx uint64, ok bool)
- func ValidateRetentionPolicy(policy RetentionPolicy, ttl time.Duration) error
- type BlockState
- type BlockStoreError
- type ContentHash
- type DeducedDefaults
- type FileBlock
- type FileBlockStore
- type FlushResult
- type Flusher
- type Reader
- type RetentionPolicy
- type Stats
- type Store
- type SystemDetector
- type Writer
Constants ¶
const (
	MinLocalStoreSize  uint64 = 256 << 20 // 256 MiB
	MinReadBufferSize  int64  = 64 << 20  // 64 MiB
	MinParallelSyncs          = 4
	MinParallelFetches        = 8
	DefaultPrefetchWorkers    = 4
)
Minimum floor values for deduced defaults.
const BlockSize = 8 * 1024 * 1024
BlockSize is the size of a single block (8 MiB). This is the single source of truth -- all packages should reference this constant instead of defining their own copies.
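Because BlockSize is fixed, a byte offset into a payload maps to a block index and an intra-block offset with plain integer arithmetic. A minimal sketch (the helper names here are illustrative, not part of this package):

```go
package main

import "fmt"

// BlockSize mirrors the package constant: 8 MiB per block.
const BlockSize = 8 * 1024 * 1024

// blockIndex returns which block a byte offset falls into.
func blockIndex(offset uint64) uint64 { return offset / BlockSize }

// blockOffset returns the offset within that block.
func blockOffset(offset uint64) uint64 { return offset % BlockSize }

func main() {
	off := uint64(20 << 20) // 20 MiB into the payload
	fmt.Println(blockIndex(off), blockOffset(off)) // 2 4194304
}
```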
const HashSize = 32
HashSize is the size of content hashes (SHA-256 = 32 bytes).
Variables ¶
var (
	// ErrContentNotFound indicates the requested content does not exist.
	//
	// Protocol Mapping:
	// - NFS: NFS3ErrNoEnt (2)
	// - SMB: STATUS_OBJECT_NAME_NOT_FOUND
	// - HTTP: 404 Not Found
	ErrContentNotFound = errors.New("content not found")

	// ErrContentExists indicates content with this ID already exists.
	//
	// Protocol Mapping:
	// - NFS: NFS3ErrExist (17)
	// - SMB: STATUS_OBJECT_NAME_COLLISION
	// - HTTP: 409 Conflict
	ErrContentExists = errors.New("content already exists")

	// ErrInvalidOffset indicates the offset is invalid for the operation.
	//
	// Protocol Mapping:
	// - NFS: NFS3ErrInval (22)
	// - SMB: STATUS_INVALID_PARAMETER
	// - HTTP: 416 Range Not Satisfiable
	ErrInvalidOffset = errors.New("invalid offset")

	// ErrInvalidSize indicates the size parameter is invalid.
	//
	// Protocol Mapping:
	// - NFS: NFS3ErrInval (22)
	// - SMB: STATUS_INVALID_PARAMETER
	ErrInvalidSize = errors.New("invalid size")

	// ErrStorageFull indicates the storage backend has no available space.
	//
	// This is a transient error - it may succeed after cleanup.
	//
	// Protocol Mapping:
	// - NFS: NFS3ErrNoSpc (28)
	// - SMB: STATUS_DISK_FULL
	// - HTTP: 507 Insufficient Storage
	ErrStorageFull = errors.New("storage full")

	// ErrQuotaExceeded indicates a storage quota has been exceeded.
	//
	// Protocol Mapping:
	// - NFS: NFS3ErrDQuot (69)
	// - SMB: STATUS_QUOTA_EXCEEDED
	// - HTTP: 507 Insufficient Storage
	ErrQuotaExceeded = errors.New("quota exceeded")

	// ErrIntegrityCheckFailed indicates content integrity verification failed.
	//
	// Protocol Mapping:
	// - NFS: NFS3ErrIO (5)
	// - SMB: STATUS_DATA_CHECKSUM_ERROR
	// - HTTP: 500 Internal Server Error
	ErrIntegrityCheckFailed = errors.New("integrity check failed")

	// ErrReadOnly indicates the content store is read-only.
	//
	// Protocol Mapping:
	// - NFS: NFS3ErrRoFs (30)
	// - SMB: STATUS_MEDIA_WRITE_PROTECTED
	// - HTTP: 403 Forbidden
	ErrReadOnly = errors.New("content store is read-only")

	// ErrNotSupported indicates the operation is not supported.
	//
	// Protocol Mapping:
	// - NFS: NFS3ErrNotSupp (10004)
	// - SMB: STATUS_NOT_SUPPORTED
	// - HTTP: 501 Not Implemented
	ErrNotSupported = errors.New("operation not supported")

	// ErrConcurrentModification indicates content was modified concurrently.
	//
	// Callers should retry with fresh data.
	//
	// Protocol Mapping:
	// - NFS: NFS3ErrStale (70) or NFS3ErrJukebox (10008)
	// - SMB: STATUS_FILE_LOCK_CONFLICT
	// - HTTP: 409 Conflict or 412 Precondition Failed
	ErrConcurrentModification = errors.New("concurrent modification detected")

	// ErrInvalidPayloadID indicates the PayloadID format is invalid.
	//
	// Protocol Mapping:
	// - NFS: NFS3ErrBadHandle (10001)
	// - SMB: STATUS_INVALID_PARAMETER
	// - HTTP: 400 Bad Request
	ErrInvalidPayloadID = errors.New("invalid content ID")

	// ErrTooLarge indicates the content or operation is too large.
	//
	// Protocol Mapping:
	// - NFS: NFS3ErrFBig (27)
	// - SMB: STATUS_FILE_TOO_LARGE
	// - HTTP: 413 Payload Too Large
	ErrTooLarge = errors.New("content too large")

	// ErrUnavailable indicates the storage backend is temporarily unavailable.
	//
	// This is a transient error - retrying may succeed.
	//
	// Protocol Mapping:
	// - NFS: NFS3ErrJukebox (10008)
	// - SMB: STATUS_DEVICE_NOT_READY
	// - HTTP: 503 Service Unavailable
	ErrUnavailable = errors.New("storage unavailable")

	// ErrBlockNotFound is returned when a requested block doesn't exist in the
	// remote block store.
	ErrBlockNotFound = errors.New("block not found")

	// ErrStoreClosed is returned when operations are attempted on a closed store.
	ErrStoreClosed = errors.New("store is closed")

	// ErrInvalidHash is returned when a hash string is malformed.
	ErrInvalidHash = errors.New("invalid content hash format")

	// ErrFileBlockNotFound is returned when a file block is not found.
	ErrFileBlockNotFound = errors.New("file block not found")

	// ErrRemoteUnavailable is returned when a block is expected in the remote store
	// but the remote store is currently unreachable. Protocol handlers should
	// map this to appropriate I/O error codes (NFS3ERR_IO, NFS4ERR_IO,
	// STATUS_UNEXPECTED_IO_ERROR).
	//
	// The error is intentionally returned early (before attempting network I/O)
	// when the health monitor reports the remote as unhealthy, avoiding network
	// timeouts.
	//
	// Protocol Mapping:
	// - NFS: NFS3ErrIO (5) / NFS4ERR_IO (5)
	// - SMB: STATUS_UNEXPECTED_IO_ERROR (0xC00000E9)
	// - HTTP: 503 Service Unavailable
	ErrRemoteUnavailable = errors.New("remote store unavailable")
)
Standard block store errors. Protocol handlers should check for these errors and map them to appropriate protocol-specific error codes.
Functions ¶
func ClampToInt64 ¶
func ClampToInt64(v uint64) int64
ClampToInt64 safely converts a uint64 to int64, clamping at math.MaxInt64.
func FormatBytes ¶
func FormatBytes(b uint64) string
FormatBytes formats a byte count as a human-readable string (e.g., "2 GiB", "512 MiB").
func FormatStoreKey ¶
func FormatStoreKey(payloadID string, blockIdx uint64) string
FormatStoreKey returns the block store key (S3 object key) for a block. Format: "{payloadID}/block-{blockIdx}".
func KeyBelongsToFile ¶
func KeyBelongsToFile(key, payloadID string) bool
KeyBelongsToFile checks if a store key belongs to the given payloadID. Store key format: "{payloadID}/block-{blockIdx}".
func ParseBlockIdx ¶
func ParseBlockIdx(key, payloadID string) uint64
ParseBlockIdx extracts the block index from a store key for a known payloadID. Returns 0 if the key format is invalid.
func ParseStoreKey ¶
func ParseStoreKey(storeKey string) (payloadID string, blockIdx uint64, ok bool)
ParseStoreKey extracts the payloadID and block index from a store key. Store key format: "{payloadID}/block-{blockIdx}". Returns ("", 0, false) if the key format is invalid.
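FormatStoreKey, KeyBelongsToFile, and ParseStoreKey all share the documented "{payloadID}/block-{blockIdx}" layout. A minimal reimplementation of that format for illustration (these helpers are stand-ins, not the package's functions):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// formatStoreKey builds "{payloadID}/block-{blockIdx}" per the documented layout.
func formatStoreKey(payloadID string, blockIdx uint64) string {
	return fmt.Sprintf("%s/block-%d", payloadID, blockIdx)
}

// parseStoreKey reverses formatStoreKey; ok is false for malformed keys.
func parseStoreKey(key string) (payloadID string, blockIdx uint64, ok bool) {
	i := strings.LastIndex(key, "/block-")
	if i < 0 {
		return "", 0, false
	}
	idx, err := strconv.ParseUint(key[i+len("/block-"):], 10, 64)
	if err != nil {
		return "", 0, false
	}
	return key[:i], idx, true
}

func main() {
	key := formatStoreKey("abc123", 7)
	fmt.Println(key) // abc123/block-7
	id, idx, ok := parseStoreKey(key)
	fmt.Println(id, idx, ok) // abc123 7 true
}
```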
func ValidateRetentionPolicy ¶ added in v0.9.2
func ValidateRetentionPolicy(policy RetentionPolicy, ttl time.Duration) error
ValidateRetentionPolicy checks that the policy and TTL combination is valid. TTL mode requires a positive duration. Pin and LRU modes accept any TTL value.
Types ¶
type BlockState ¶
type BlockState uint8
BlockState represents the lifecycle state of a FileBlock.
State machine: Dirty -> Local -> Syncing -> Remote
- Dirty (0): Receiving writes, NOT syncable. Zero value is safe default for legacy blocks deserialized without this field.
- Local (1): Complete block on local disk, eligible for sync to remote. Set when the next block starts receiving writes, or when DataSize == BlockSize.
- Syncing (2): Sync to remote store in progress. Reverts to Local on failure.
- Remote (3): Confirmed in remote block store. Eligible for local eviction.
Write-after-sync resets: Remote -> Dirty (clears Hash + BlockStoreKey).
const (
	BlockStateDirty   BlockState = 0 // Receiving writes, NOT syncable
	BlockStateLocal   BlockState = 1 // Complete, on disk, eligible for sync to remote
	BlockStateSyncing BlockState = 2 // Sync to remote in progress
	BlockStateRemote  BlockState = 3 // Confirmed in remote block store
)
func (BlockState) String ¶
func (s BlockState) String() string
String returns the string representation of BlockState.
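The state machine above (including the Syncing-to-Local failure path and the Remote-to-Dirty write-after-sync reset) can be encoded as a small transition table. A self-contained sketch, not part of the package:

```go
package main

import "fmt"

type BlockState uint8

const (
	BlockStateDirty BlockState = iota
	BlockStateLocal
	BlockStateSyncing
	BlockStateRemote
)

// validNext encodes the documented transitions: the happy path
// Dirty -> Local -> Syncing -> Remote, plus Syncing -> Local on sync
// failure and Remote -> Dirty on write-after-sync.
var validNext = map[BlockState][]BlockState{
	BlockStateDirty:   {BlockStateLocal},
	BlockStateLocal:   {BlockStateSyncing},
	BlockStateSyncing: {BlockStateRemote, BlockStateLocal},
	BlockStateRemote:  {BlockStateDirty},
}

func canTransition(from, to BlockState) bool {
	for _, s := range validNext[from] {
		if s == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canTransition(BlockStateDirty, BlockStateLocal))  // true
	fmt.Println(canTransition(BlockStateDirty, BlockStateRemote)) // false
}
```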
type BlockStoreError ¶
type BlockStoreError struct {
// Op describes the operation that failed: "upload", "download", "dedup", or "gc".
Op string
// Share is the affected share.
Share string
// PayloadID is the content identifier of the affected payload.
PayloadID string
// BlockIdx is the block index within the chunk that failed.
BlockIdx uint32
// Size is the data size involved in the operation (bytes).
Size int64
// Duration is how long the operation ran before failing.
Duration time.Duration
// Retries is the number of retry attempts made before the final failure.
Retries int
// Backend identifies the storage backend type: "s3" or "memory".
Backend string
// Err is the wrapped sentinel error (e.g., ErrContentNotFound, ErrUnavailable).
Err error
}
BlockStoreError wraps sentinel block store errors with structured debugging context.
It provides rich operational metadata for diagnosing block storage issues without losing compatibility with errors.Is() checks on the underlying sentinel. For example:
err := NewBlockStoreError("upload", "/archive", "abc123", 5, "s3", ErrUnavailable)
errors.Is(err, ErrUnavailable) // true
Fields capture the operation type, affected share, payload identifier, block index, backend type, and the wrapped sentinel error. Optional fields (Size, Duration, Retries) can be set after construction for performance debugging.
func NewBlockStoreError ¶
func NewBlockStoreError(op, share, payloadID string, blockIdx uint32, backend string, err error) *BlockStoreError
NewBlockStoreError creates a BlockStoreError wrapping the given sentinel error with operational context. Optional fields (Size, Duration, Retries) default to zero and can be set on the returned pointer after construction.
func (*BlockStoreError) Error ¶
func (e *BlockStoreError) Error() string
Error returns a human-readable description of the block store error including the operation, underlying error, and key context fields.
func (*BlockStoreError) Unwrap ¶
func (e *BlockStoreError) Unwrap() error
Unwrap returns the underlying sentinel error, enabling errors.Is() and errors.As() to match through BlockStoreError wrapping.
type ContentHash ¶
ContentHash represents a SHA-256 hash of content.
func ParseContentHash ¶
func ParseContentHash(s string) (ContentHash, error)
ParseContentHash parses a hex-encoded hash string.
func (ContentHash) IsZero ¶
func (h ContentHash) IsZero() bool
IsZero returns true if the hash is all zeros (uninitialized).
func (ContentHash) String ¶
func (h ContentHash) String() string
String returns the hex-encoded hash string.
type DeducedDefaults ¶
type DeducedDefaults struct {
LocalStoreSize uint64 // 25% of memory, floor 256 MiB
ReadBufferSize int64 // 12.5% of memory, floor 64 MiB
MaxPendingSize uint64 // 50% of LocalStoreSize
ParallelSyncs int // max(4, cpus)
ParallelFetches int // max(8, cpus*2)
PrefetchWorkers int // fixed at DefaultPrefetchWorkers
// contains filtered or unexported fields
}
DeducedDefaults holds block store sizing values derived from system resources.
func DeduceDefaults ¶
func DeduceDefaults(d SystemDetector) *DeducedDefaults
DeduceDefaults derives block store sizing from detected system resources.
func (*DeducedDefaults) HitFloors ¶
func (d *DeducedDefaults) HitFloors() []string
HitFloors returns a list of human-readable descriptions for any deduced values that were clamped to their minimum floor. An empty slice means no floors were hit. Only reports values that were actually clamped (not those that naturally computed to the minimum).
func (*DeducedDefaults) String ¶
func (d *DeducedDefaults) String() string
String returns a human-readable summary of deduced defaults.
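The field comments above spell out the deduction formulas. A sketch of that arithmetic under the documented percentages and floors (field types are simplified here; the real DeduceDefaults also tracks which floors were hit):

```go
package main

import "fmt"

const (
	minLocalStoreSize uint64 = 256 << 20 // 256 MiB floor
	minReadBufferSize uint64 = 64 << 20  // 64 MiB floor
)

func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// deduce applies the documented formulas to detected memory and CPU count:
// 25% of memory (floor 256 MiB), 12.5% of memory (floor 64 MiB),
// 50% of LocalStoreSize, max(4, cpus), max(8, cpus*2).
func deduce(memBytes uint64, cpus int) (local, read, pending uint64, syncs, fetches int) {
	local = memBytes / 4
	if local < minLocalStoreSize {
		local = minLocalStoreSize
	}
	read = memBytes / 8
	if read < minReadBufferSize {
		read = minReadBufferSize
	}
	pending = local / 2
	syncs = maxInt(4, cpus)
	fetches = maxInt(8, cpus*2)
	return
}

func main() {
	l, r, p, s, f := deduce(8<<30, 8) // 8 GiB RAM, 8 CPUs
	fmt.Println(l>>20, r>>20, p>>20, s, f) // 2048 1024 1024 8 16
}
```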
type FileBlock ¶
type FileBlock struct {
// ID is a stable UUID for this block.
ID string
// Hash is the SHA-256 of block data. Zero value means pending/incomplete.
Hash ContentHash
// DataSize is the actual bytes written in this block.
DataSize uint32
// LocalPath is the local file path. Empty means not stored locally.
LocalPath string
// BlockStoreKey is the opaque key in the remote block store (S3 key, FS path, etc.).
// Empty means not synced to remote.
BlockStoreKey string
// RefCount is the number of files referencing this block.
RefCount uint32
// LastAccess is used for LRU eviction.
LastAccess time.Time
// CreatedAt is when the block was created.
CreatedAt time.Time
// State is the block lifecycle state (Dirty -> Local -> Syncing -> Remote).
// Zero value (Dirty) is the safe default for legacy blocks.
State BlockState `json:"state"`
}
FileBlock is the single block entity in DittoFS. Content-addressed: blocks with the same hash are shared across files for dedup.
Lifecycle:
- Created on write: ID=uuid, LocalPath=path, State=Dirty
- Local: block is complete (next block started or DataSize==BlockSize)
- Syncing: sync to remote store in progress
- Remote: BlockStoreKey set after background sync to remote store
- Remote + local: both LocalPath and BlockStoreKey set, State=Remote
- Evicted: LocalPath cleared, data only in remote store
func NewFileBlock ¶
NewFileBlock creates a new pending FileBlock with the given ID and local path.
func (*FileBlock) HasLocalFile ¶ added in v0.9.4
HasLocalFile returns true if the block exists in the local store.
func (*FileBlock) IsDirty ¶
IsDirty returns true if the block is receiving writes and not yet complete.
func (*FileBlock) IsFinalized ¶
IsFinalized returns true if the block's hash has been computed.
type FileBlockStore ¶
type FileBlockStore interface {
// GetFileBlock retrieves a file block by its ID.
// Returns ErrFileBlockNotFound if not found.
GetFileBlock(ctx context.Context, id string) (*FileBlock, error)
// PutFileBlock stores or updates a file block.
PutFileBlock(ctx context.Context, block *FileBlock) error
// DeleteFileBlock removes a file block by its ID.
// Returns ErrFileBlockNotFound if not found.
DeleteFileBlock(ctx context.Context, id string) error
// IncrementRefCount atomically increments a block's RefCount.
IncrementRefCount(ctx context.Context, id string) error
// DecrementRefCount atomically decrements a block's RefCount.
// Returns the new count. When 0, the block is a GC candidate.
DecrementRefCount(ctx context.Context, id string) (uint32, error)
// FindFileBlockByHash looks up a finalized block by its content hash.
// Returns nil without error if not found (used for dedup checks).
FindFileBlockByHash(ctx context.Context, hash ContentHash) (*FileBlock, error)
// ListLocalBlocks returns blocks that are in Local state (complete, on disk,
// not yet synced to remote) and older than the given duration.
// If limit > 0, at most limit blocks are returned. If limit <= 0, all are returned.
ListLocalBlocks(ctx context.Context, olderThan time.Duration, limit int) ([]*FileBlock, error)
// ListRemoteBlocks returns blocks that are both stored locally and confirmed
// in remote store, ordered by LRU (oldest LastAccess first), up to limit.
ListRemoteBlocks(ctx context.Context, limit int) ([]*FileBlock, error)
// ListUnreferenced returns blocks with RefCount=0, up to limit.
// These are candidates for garbage collection.
ListUnreferenced(ctx context.Context, limit int) ([]*FileBlock, error)
// ListFileBlocks returns all blocks belonging to a file, ordered by block index.
// Block IDs follow the format "{payloadID}/{blockIdx}", so this method returns
// all blocks whose ID starts with "{payloadID}/".
// Returns empty slice (not nil) if no blocks found.
ListFileBlocks(ctx context.Context, payloadID string) ([]*FileBlock, error)
}
FileBlockStore defines operations for content-addressed file block management.
FileBlock is the single block entity in DittoFS. Each block is content-addressed by its SHA-256 hash and reference-counted for dedup and GC.
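Because FindFileBlockByHash returns nil without error on a miss, a dedup path falls out naturally: reuse an existing finalized block (bumping its RefCount) or insert a new one. A toy in-memory sketch of that flow (illustrative only; the real stores implement the full FileBlockStore interface):

```go
package main

import "fmt"

type fileBlock struct {
	ID       string
	Hash     string // hex digest stand-in for ContentHash
	RefCount uint32
}

type memStore struct{ blocks map[string]*fileBlock }

// findByHash mimics FindFileBlockByHash: nil means "not found", which is
// not an error for dedup checks.
func (s *memStore) findByHash(hash string) *fileBlock {
	for _, b := range s.blocks {
		if b.Hash == hash {
			return b // dedup hit
		}
	}
	return nil
}

// putOrDedup reuses an existing block with the same hash (incrementing its
// RefCount) or inserts the new block with RefCount 1.
func (s *memStore) putOrDedup(b *fileBlock) *fileBlock {
	if hit := s.findByHash(b.Hash); hit != nil {
		hit.RefCount++
		return hit
	}
	b.RefCount = 1
	s.blocks[b.ID] = b
	return b
}

func main() {
	s := &memStore{blocks: map[string]*fileBlock{}}
	a := s.putOrDedup(&fileBlock{ID: "p1/0", Hash: "abc"})
	b := s.putOrDedup(&fileBlock{ID: "p2/0", Hash: "abc"}) // same content
	fmt.Println(a == b, a.RefCount) // true 2
}
```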
type FlushResult ¶
type FlushResult struct {
// Finalized indicates all blocks have been synced to the backend store.
Finalized bool
}
FlushResult indicates the outcome of a flush operation.
type Flusher ¶
type Flusher interface {
// Flush ensures all dirty data for a payload is persisted.
Flush(ctx context.Context, payloadID string) (*FlushResult, error)
// DrainAllUploads waits for all pending uploads to complete.
DrainAllUploads(ctx context.Context) error
}
Flusher defines flush/sync operations on the block store.
type Reader ¶
type Reader interface {
// ReadAt reads data from storage at the given offset into dest.
ReadAt(ctx context.Context, payloadID string, data []byte, offset uint64) (int, error)
// ReadAtWithCOWSource reads data with copy-on-write source fallback.
ReadAtWithCOWSource(ctx context.Context, payloadID, cowSource string, data []byte, offset uint64) (int, error)
// GetSize returns the stored size of a payload.
GetSize(ctx context.Context, payloadID string) (uint64, error)
// Exists checks whether a payload exists.
Exists(ctx context.Context, payloadID string) (bool, error)
}
Reader defines read operations on the block store.
type RetentionPolicy ¶ added in v0.9.2
type RetentionPolicy string
RetentionPolicy controls how blocks are retained on local storage. String-typed for GORM compatibility and JSON serialization.
const (
	// RetentionPin keeps blocks stored locally indefinitely (no eviction).
	RetentionPin RetentionPolicy = "pin"
	// RetentionTTL evicts blocks after a configurable time-to-live.
	RetentionTTL RetentionPolicy = "ttl"
	// RetentionLRU evicts least-recently-used blocks when space is needed.
	// This is the default policy for backward compatibility.
	RetentionLRU RetentionPolicy = "lru"
)
func ParseRetentionPolicy ¶ added in v0.9.2
func ParseRetentionPolicy(s string) (RetentionPolicy, error)
ParseRetentionPolicy parses a string into a RetentionPolicy. Empty or blank input defaults to LRU for backward compatibility. Parsing is case-insensitive.
func (RetentionPolicy) IsValid ¶ added in v0.9.2
func (p RetentionPolicy) IsValid() bool
IsValid returns true if the retention policy is a recognized value.
func (RetentionPolicy) String ¶ added in v0.9.2
func (p RetentionPolicy) String() string
String returns the string representation of the retention policy.
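The documented parse behavior (case-insensitive, empty or blank input defaulting to LRU) can be sketched as follows; this is a re-creation of the stated contract, not the package's code:

```go
package main

import (
	"fmt"
	"strings"
)

// parseRetentionPolicy mirrors the documented behavior: case-insensitive,
// with empty/blank input defaulting to "lru" for backward compatibility.
func parseRetentionPolicy(s string) (string, error) {
	p := strings.ToLower(strings.TrimSpace(s))
	switch p {
	case "":
		return "lru", nil
	case "pin", "ttl", "lru":
		return p, nil
	}
	return "", fmt.Errorf("unknown retention policy %q", s)
}

func main() {
	p, _ := parseRetentionPolicy("  TTL ")
	fmt.Println(p) // ttl
	q, _ := parseRetentionPolicy("")
	fmt.Println(q) // lru
}
```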
type Stats ¶
type Stats struct {
TotalSize uint64 // Total storage capacity in bytes
UsedSize uint64 // Space consumed by content in bytes
AvailableSize uint64 // Remaining available space in bytes
ContentCount uint64 // Total number of content items
AverageSize uint64 // Average size of content items in bytes
}
Stats contains storage statistics.
type Store ¶
type Store interface {
Reader
Writer
Flusher
// Stats returns storage statistics.
Stats() (*Stats, error)
// HealthCheck verifies the store is operational.
HealthCheck(ctx context.Context) error
// Start initializes the store and starts background goroutines.
Start(ctx context.Context) error
// Close releases resources held by the store.
Close() error
}
Store is the composed block store interface that combines all sub-interfaces with lifecycle and health operations.
type SystemDetector ¶
SystemDetector provides system resource information for deduction. This mirrors sysinfo.Detector but lives in pkg/blockstore to avoid importing internal/ from pkg/. The sysinfo.Detector satisfies this interface structurally (duck typing).
type Writer ¶
type Writer interface {
// WriteAt writes data to storage at the given offset.
WriteAt(ctx context.Context, payloadID string, data []byte, offset uint64) error
// Truncate changes the size of a payload.
Truncate(ctx context.Context, payloadID string, newSize uint64) error
// Delete removes all data for a payload.
Delete(ctx context.Context, payloadID string) error
// CopyPayload duplicates all blocks from srcPayloadID to dstPayloadID.
// For remote-backed stores this leverages server-side copy (e.g., S3 CopyObject)
// to avoid data transfer. Local blocks are copied via read+write.
// Returns the number of blocks copied and any error encountered.
CopyPayload(ctx context.Context, srcPayloadID, dstPayloadID string) (int, error)
}
Writer defines write operations on the block store.
Directories ¶

| Path | Synopsis |
|---|---|
| engine | Package engine provides the BlockStore orchestrator that composes local store, remote store, and syncer into the blockstore.Store interface. |
| gc | Package gc implements garbage collection for orphan blocks in the block store. |
| local | Package local defines the LocalStore interface for on-node block storage. |
| local/fs | Package fs implements a filesystem-based local block store for DittoFS. |
| local/localtest | Package localtest provides a conformance test suite for local.LocalStore implementations. |
| local/memory | Package memory provides a pure in-memory LocalStore implementation. |
| readbuffer | Package readbuffer provides an in-memory read buffer for hot blocks and a sequential prefetch system for the DittoFS BlockStore engine. |
| remote | Package remote defines the RemoteStore interface for durable block storage backends (S3, filesystem, memory). |
| remote/memory | Package memory provides an in-memory RemoteStore implementation for testing. |
| remote/remotetest | Package remotetest provides a conformance test suite for remote.RemoteStore implementations. |
| remote/s3 | Package s3 provides an S3-backed RemoteStore implementation. |
| sync | Package sync implements local-to-remote transfer orchestration. |