hash

Version: v0.0.1
Published: Sep 30, 2025 License: MIT Imports: 6 Imported by: 1

README

Hash Library

A Go library for SHA256 hash generation and manipulation with a focus on convenience, performance, and type safety.

Installation

go get ella.to/hash

Quick Start

package main

import (
    "fmt"
    "ella.to/hash"
)

func main() {
    // Generate hash from bytes
    h := hash.FromBytes([]byte("hello world"))
    fmt.Println(h.String()) // sha256-b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
    
    // Short representation for logs
    fmt.Println(h.Short()) // fcde9 (last 5 characters of the string form)
}

Usage Examples

Basic Hash Generation

// From byte slice
data := []byte("hello world")
h := hash.FromBytes(data)

// From string (convert to bytes first)
h = hash.FromBytes([]byte("hello world"))

// From file or any io.Reader
file, err := os.Open("data.txt")
if err != nil {
    log.Fatal(err)
}
defer file.Close()

h, err = hash.FromReader(file)
if err != nil {
    log.Fatal(err)
}

Streaming with TeeReader

When you need both the data and its hash without loading everything into memory:

// Using the FromTeeReader function
file, err := os.Open("large-file.dat")
if err != nil {
    log.Fatal(err)
}
defer file.Close()

teeReader, hashFunc := hash.FromTeeReader(file)
// io.ReadAll is used here for brevity; read in chunks to keep memory use low
data, err := io.ReadAll(teeReader)
if err != nil {
    log.Fatal(err)
}
hashValue := hashFunc() // call only after the reader is fully consumed

// Using the TeeReader struct for more control.
// This is a separate snippet: it needs a reader that has not been consumed yet.
tr := hash.NewTeeReader(file)
buffer := make([]byte, 4096)
for {
    n, err := tr.Read(buffer)
    if n > 0 {
        // Process buffer[:n]
    }
    if err == io.EOF {
        break
    }
    if err != nil {
        log.Fatal(err)
    }
}
hashValue = tr.Hash()

String Parsing and Validation

hashStr := "sha256-b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9"

// Parse from string
h, err := hash.ParseFromString(hashStr)
if err != nil {
    log.Fatal(err)
}

// Parse from raw bytes
rawBytes := make([]byte, 32) // 32 bytes for SHA256
h, err = hash.ParseFromBytes(rawBytes)
if err != nil {
    log.Fatal(err)
}

JSON Marshaling/Unmarshaling

The Hash type implements encoding.TextMarshaler and encoding.TextUnmarshaler:

type Document struct {
    Content string    `json:"content"`
    Hash    hash.Hash `json:"hash"`
}

doc := Document{
    Content: "hello world",
    Hash:    hash.FromBytes([]byte("hello world")),
}

// Marshal to JSON
jsonData, err := json.Marshal(doc)
if err != nil {
    log.Fatal(err)
}

// Unmarshal from JSON
var newDoc Document
err = json.Unmarshal(jsonData, &newDoc)
if err != nil {
    log.Fatal(err)
}

Utility Functions

// Format hash bytes to string (handles nil gracefully)
var hashBytes []byte
fmt.Println(hash.Format(hashBytes)) // "nil"

hashBytes = make([]byte, 32)
fmt.Println(hash.Format(hashBytes)) // "sha256-" followed by 64 zeros

// Print the short form of the hash followed by additional info
hash.Print(os.Stdout, hashBytes, "file processed successfully")
// Writes the short hash (last 5 characters), then the message

Thread Safety

Hash generation and Hash value methods are safe for concurrent use; TeeReader instances are not:

  • Hash generation functions can be called concurrently
  • Hash value methods (String, Short, etc.) are safe for concurrent access
  • TeeReader instances should not be shared between goroutines (standard io.Reader practice)

// Safe concurrent usage
var wg sync.WaitGroup
results := make([]hash.Hash, 100)

for i := 0; i < 100; i++ {
    wg.Add(1)
    go func(index int) {
        defer wg.Done()
        results[index] = hash.FromBytes([]byte(fmt.Sprintf("data-%d", index)))
    }(i)
}
wg.Wait()

Error Handling

The library provides detailed error messages for common failure cases:

// Invalid string format
_, err := hash.ParseFromString("invalid-hash")
// Error: hash: invalid hash string length 12, expected 71

// Invalid hex encoding (correct length and prefix, non-hex payload)
_, err = hash.ParseFromString("sha256-" + strings.Repeat("z", 64))
// Error: hash: invalid hexadecimal encoding: encoding/hex: invalid byte: U+007A 'z'

// Wrong byte size
_, err = hash.ParseFromBytes(make([]byte, 16))
// Error: hash: invalid hash size 16, expected 32

Constants

const (
    StringSize = 71  // Total length of string representation
    ByteSize   = 32  // Size of hash in bytes (SHA256 = 256 bits = 32 bytes)
)

Best Practices

  1. Use TeeReader for large files: it yields both the content and its hash in a single pass
  2. Validate inputs: always check errors when parsing hashes from external sources
  3. Use Short() for logging: the short representation saves space in logs
  4. Process concurrently: hash generation is safe for concurrent use, so independent inputs can be hashed in parallel goroutines
  5. Stay memory-efficient: prefer the streaming functions (FromReader, FromTeeReader) for large datasets

Common Patterns

File Integrity Verification

func verifyFile(filename string, expectedHash string) error {
    file, err := os.Open(filename)
    if err != nil {
        return err
    }
    defer file.Close()
    
    actualHash, err := hash.FromReader(file)
    if err != nil {
        return err
    }
    
    expected, err := hash.ParseFromString(expectedHash)
    if err != nil {
        return err
    }
    
    if actualHash.String() != expected.String() {
        return fmt.Errorf("hash mismatch: expected %s, got %s", 
            expected.String(), actualHash.String())
    }
    
    return nil
}

License

This project is licensed under the MIT License - see the LICENSE file for details.

Documentation

Overview

Package hash provides utilities for SHA256 hash generation and manipulation. It offers a custom Hash type with convenient string formatting, parsing, and various input sources including bytes, readers, and tee readers.

The package standardizes hash representation with a "sha256-" prefix and provides consistent error handling and validation.

Constants

const (

	// StringSize is the total length of a hash in string format including the header
	// Format: "sha256-" (7 chars) + hex encoded hash (64 chars) = 71 chars total
	StringSize = 64 + len(hashHeader)

	// ByteSize is the size of a SHA256 hash in bytes (32 bytes = 256 bits)
	ByteSize = 32
)
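
The arithmetic behind StringSize can be checked with the standard library alone. The following is a minimal sketch, not the package's implementation: prefix, stringSize, and formatSum are hypothetical names, and only crypto/sha256 and encoding/hex are used.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Hypothetical names that mirror the documented format.
const (
	prefix     = "sha256-"
	stringSize = len(prefix) + 2*sha256.Size // 7 + 64 = 71
)

// formatSum hashes data and renders the digest in the documented text form:
// the prefix followed by 64 lowercase hexadecimal characters.
func formatSum(data []byte) string {
	sum := sha256.Sum256(data)
	return prefix + hex.EncodeToString(sum[:])
}

func main() {
	s := formatSum([]byte("hello world"))
	fmt.Println(s)
	fmt.Println(len(s) == stringSize) // true
}
```

Each byte becomes two hexadecimal characters, which is where the 64 in the constant comes from.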

Functions

func Format

func Format(value []byte) string

Format is a utility function that converts raw hash bytes to their string representation. It handles the nil case gracefully by returning "nil".

This function is useful for logging, debugging, and displaying hash values in a consistent format throughout an application.

Parameters:

value: raw hash bytes (typically 32 bytes for SHA256, but can be nil)

Returns:

string: formatted hash with "sha256-" prefix, or "nil" if input is nil

Example:

Format(nil) -> "nil"
Format(hashBytes) -> "sha256-a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3"
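
The nil handling described above can be sketched with the standard library. This is an illustration under assumptions, not the package's code: formatBytes is a hypothetical name, and how the real Format treats an empty but non-nil slice is not stated in the documentation.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// formatBytes is a hypothetical stand-in for the documented behaviour:
// nil input yields "nil", anything else is prefixed and hex encoded.
func formatBytes(value []byte) string {
	if value == nil {
		return "nil"
	}
	return "sha256-" + hex.EncodeToString(value)
}

func main() {
	fmt.Println(formatBytes(nil))              // nil
	fmt.Println(formatBytes(make([]byte, 32))) // sha256- followed by 64 zeros
}
```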

func FromTeeReader

func FromTeeReader(r io.Reader) (io.Reader, func() Hash)

FromTeeReader creates a TeeReader that allows reading data while simultaneously computing its hash. This is more memory-efficient than reading all data into memory first when you need both the data and its hash.

Returns:

io.Reader: A reader that provides the same data as the input reader
func() Hash: A function that returns the computed hash (call after reading is complete)

The hash function should only be called after all data has been read from the returned reader.
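
One way to get this reader-plus-closure shape is the standard library's io.TeeReader feeding a SHA256 hasher. The sketch below assumes that approach; teeSum and readWithDigest are hypothetical names, and the package may be implemented differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// teeSum is a hypothetical helper: every byte read from the returned reader
// is also written to a SHA256 hasher, and the closure reports the digest.
func teeSum(r io.Reader) (io.Reader, func() []byte) {
	h := sha256.New()
	return io.TeeReader(r, h), func() []byte { return h.Sum(nil) }
}

// readWithDigest drains the tee reader first, then asks for the digest.
func readWithDigest(input string) (data string, digest string, err error) {
	r, sum := teeSum(strings.NewReader(input))
	b, err := io.ReadAll(r)
	if err != nil {
		return "", "", err
	}
	return string(b), hex.EncodeToString(sum()), nil
}

func main() {
	data, digest, err := readWithDigest("hello world")
	if err != nil {
		panic(err)
	}
	fmt.Println(data, digest)
}
```

In this sketch, calling the closure before the reader is drained would return the digest of only the bytes read so far, which is why the order matters.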

func Print

func Print(w io.Writer, hash []byte, args ...interface{})

Print is a utility function similar to fmt.Fprint that writes a formatted hash value followed by additional arguments to the specified writer.

The hash is displayed in its short form (last 5 characters) followed by the additional arguments. This is useful for logging and debugging where you want to include hash information with other data. Optimized to avoid slice append allocation.

Parameters:

w: destination writer (e.g., os.Stdout, log file, buffer)
hash: raw hash bytes (should be 32 bytes for SHA256)
args: additional arguments to print after the hash

Example output: "a27ae hello world"

Types

type Hash

type Hash []byte

Hash is a custom type that wraps a byte slice representing a SHA256 hash value. It provides convenient methods for formatting, parsing, and marshaling/unmarshaling. The underlying byte slice should always be exactly 32 bytes (256 bits) for SHA256.

func FromBytes

func FromBytes(content []byte) Hash

FromBytes computes the SHA256 hash of the provided byte slice. This is the most basic hash generation function and is safe for concurrent use. It accepts any byte slice, including nil or empty slices.

func FromBytesReuse

func FromBytesReuse(content []byte, output []byte) Hash

FromBytesReuse computes the SHA256 hash using a pre-allocated output slice. This version allows reusing an existing 32-byte slice to avoid allocation. The output slice must be exactly 32 bytes or it will be reallocated.

This is an optimization for hot paths where allocation overhead matters.
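
The reuse pattern can be illustrated with crypto/sha256 directly. This is a sketch of the idea under stated assumptions, not the library's code: sumInto is a hypothetical helper that follows the documented rule of reallocating when the output slice is not exactly 32 bytes.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// sumInto writes the SHA256 digest of content into out when out is exactly
// 32 bytes long; otherwise it allocates a fresh 32-byte slice.
func sumInto(content, out []byte) []byte {
	if len(out) != sha256.Size {
		out = make([]byte, sha256.Size)
	}
	sum := sha256.Sum256(content) // fixed-size array, no heap allocation needed
	copy(out, sum[:])
	return out
}

func main() {
	buf := make([]byte, sha256.Size) // allocated once, reused for every item
	for _, item := range []string{"a", "b", "c"} {
		digest := sumInto([]byte(item), buf)
		fmt.Printf("%s -> %x\n", item, digest)
	}
}
```

Because the buffer is shared, each call overwrites the previous digest; copy the result if it has to outlive the next call.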

func FromReader

func FromReader(r io.Reader) (Hash, error)

FromReader computes the SHA256 hash by reading all data from the provided io.Reader. This function is useful for hashing data from files, network streams, or any io.Reader. The reader is consumed entirely during this operation.

Returns an error if reading from the reader fails.
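
Hashing a reader without buffering its whole content is commonly done by copying into a hasher. The sketch below shows that technique with the standard library; sumReader and hexSumOfString are hypothetical names and this is not presented as the package's implementation.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// sumReader consumes r entirely and returns the SHA256 digest of its content.
// io.Copy streams the data in chunks, so memory use stays bounded.
func sumReader(r io.Reader) ([]byte, error) {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return nil, err
	}
	return h.Sum(nil), nil
}

// hexSumOfString is a small convenience wrapper used for the demonstration.
func hexSumOfString(s string) (string, error) {
	sum, err := sumReader(strings.NewReader(s))
	if err != nil {
		return "", err
	}
	return hex.EncodeToString(sum), nil
}

func main() {
	digest, err := hexSumOfString("hello world")
	if err != nil {
		panic(err)
	}
	fmt.Println(digest)
}
```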

func ParseFromBytes

func ParseFromBytes(hash []byte) (Hash, error)

ParseFromBytes validates and creates a Hash from raw bytes. The input must be exactly 32 bytes (SHA256 hash size).

This function performs validation to ensure the byte slice represents a valid SHA256 hash. It returns an error if the input is nil, empty, or not exactly 32 bytes in length.

Parameters:

hash: byte slice that should contain exactly 32 bytes

Returns:

Hash: the validated hash value
error: validation error if input is invalid

func ParseFromString

func ParseFromString(value string) (Hash, error)

ParseFromString parses a hash from its string representation back to a Hash value. The input string must be in the exact format produced by Hash.String(): "sha256-" followed by 64 hexadecimal characters (lowercase).

This function performs comprehensive validation:

  • Checks total string length (must be exactly StringSize)
  • Validates the "sha256-" prefix
  • Validates hexadecimal encoding
  • Ensures the decoded bytes are exactly 32 bytes

Parameters:

value: string representation of hash (e.g., "sha256-a665a45920...")

Returns:

Hash: the parsed hash value
error: parsing/validation error if input is invalid

Performance note: Uses efficient string slicing instead of string replacement
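
The validation order described above can be sketched as follows. This is a hypothetical parser written against the standard library, not the package's source: parseString, prefix, and stringSize are illustrative names, the error wording is invented, and this version does not reject uppercase hexadecimal.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

const (
	prefix     = "sha256-"
	stringSize = len(prefix) + 64
)

// parseString checks length, then prefix, then hex encoding.
func parseString(value string) ([]byte, error) {
	if len(value) != stringSize {
		return nil, fmt.Errorf("invalid hash string length %d, expected %d", len(value), stringSize)
	}
	if value[:len(prefix)] != prefix {
		return nil, fmt.Errorf("invalid hash prefix %q", value[:len(prefix)])
	}
	// Slice past the prefix instead of building a new string.
	raw, err := hex.DecodeString(value[len(prefix):])
	if err != nil {
		return nil, fmt.Errorf("invalid hexadecimal encoding: %w", err)
	}
	return raw, nil // 64 hex characters always decode to 32 bytes
}

func main() {
	raw, err := parseString("sha256-b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9")
	fmt.Println(len(raw), err)

	_, err = parseString("invalid-hash")
	fmt.Println(err)
}
```

Checking the length first makes the later slicing safe, since the prefix and payload offsets are then known to be in range.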

func (Hash) MarshalText

func (h Hash) MarshalText() ([]byte, error)

MarshalText implements the encoding.TextMarshaler interface. It returns the string representation of the hash as bytes. Optimized to avoid string allocation.

This method enables automatic marshaling to JSON, YAML, and other text formats.

func (Hash) Short

func (h Hash) Short() string

Short returns the last 5 characters of the hash string representation. This is useful for displaying abbreviated hash values in logs or UI. Optimized to work directly on bytes without creating full string.

Example: If hash is "sha256-a665a4...7a27ae3", this returns "27ae3"
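
Working "directly on bytes" can be illustrated like this: five hexadecimal characters are covered by the final three bytes of the digest, so only those need encoding. The sketch is an assumption about one possible approach; shortOf and shortOfString are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// shortOf returns the last 5 hex characters of digest. The final 3 bytes
// encode to 6 hex characters, so the first of those is dropped.
// Inputs shorter than 3 bytes are encoded whole.
func shortOf(digest []byte) string {
	if len(digest) < 3 {
		return hex.EncodeToString(digest)
	}
	tail := hex.EncodeToString(digest[len(digest)-3:])
	return tail[1:]
}

// shortOfString hashes s with SHA256 and abbreviates the result.
func shortOfString(s string) string {
	sum := sha256.Sum256([]byte(s))
	return shortOf(sum[:])
}

func main() {
	// SHA256 of "123" is the a665a4...7a27ae3 value used in the examples.
	fmt.Println(shortOfString("123")) // 27ae3
}
```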

func (Hash) String

func (h Hash) String() string

String returns the full string representation of the hash with the "sha256-" prefix. The format is: "sha256-" followed by the lowercase hexadecimal representation. This method is safe for concurrent use.

Example: "sha256-a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3"

func (*Hash) UnmarshalText

func (h *Hash) UnmarshalText(text []byte) error

UnmarshalText implements the encoding.TextUnmarshaler interface. It parses a hash from its string representation back into a Hash value. The input must be in the format returned by String(). Optimized to work directly on bytes without string conversion.

This method enables automatic unmarshaling from JSON, YAML, and other text formats.
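
How the two text interfaces plug into encoding/json can be shown with a stand-in type. The sketch below is an assumption-laden illustration: digest, record, and roundTrip are hypothetical, and the package's own Hash type is not used.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// digest is a hypothetical stand-in type, not the library's Hash.
type digest []byte

// MarshalText renders the digest as "sha256-" plus lowercase hex.
func (d digest) MarshalText() ([]byte, error) {
	return []byte("sha256-" + hex.EncodeToString(d)), nil
}

// UnmarshalText parses the same form, decoding straight from the byte slice.
func (d *digest) UnmarshalText(text []byte) error {
	const prefix = "sha256-"
	if len(text) != len(prefix)+64 || string(text[:len(prefix)]) != prefix {
		return fmt.Errorf("invalid hash text %q", text)
	}
	raw := make([]byte, 32)
	if _, err := hex.Decode(raw, text[len(prefix):]); err != nil {
		return err
	}
	*d = raw
	return nil
}

type record struct {
	Sum digest `json:"sum"`
}

// roundTrip marshals a record to JSON and reads it back.
func roundTrip(sum []byte) (string, []byte, error) {
	encoded, err := json.Marshal(record{Sum: sum})
	if err != nil {
		return "", nil, err
	}
	var out record
	if err := json.Unmarshal(encoded, &out); err != nil {
		return "", nil, err
	}
	return string(encoded), out.Sum, nil
}

func main() {
	sum := sha256.Sum256([]byte("hello world"))
	js, back, err := roundTrip(sum[:])
	fmt.Println(js, len(back), err)
}
```

Without MarshalText, encoding/json would emit a []byte-based type as a base64 string, so implementing the interface is what keeps the "sha256-" form in the JSON output.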

type TeeReader

type TeeReader struct {
	// contains filtered or unexported fields
}

TeeReader provides an alternative implementation of io.TeeReader specifically designed for hash computation. It reads from an underlying reader while simultaneously writing the data to a hash function.

This struct is useful when you need more control over the reading process compared to the standard library's io.TeeReader.

func NewTeeReader

func NewTeeReader(r io.Reader) *TeeReader

NewTeeReader creates a new TeeReader that reads from the provided io.Reader while simultaneously computing the SHA256 hash of the data.

This is useful when you need to:

  • Read data from a source and compute its hash without storing all data in memory
  • Process streaming data where you need both the content and its hash
  • Implement efficient file copying with integrity verification

Usage pattern:

teeReader := NewTeeReader(file)
data, err := io.ReadAll(teeReader)  // or read in chunks
hash := teeReader.Hash()

func (TeeReader) Hash

func (r TeeReader) Hash() Hash

Hash returns the computed hash value of all data that has been read so far. This method should typically be called only after all data has been read from the TeeReader (i.e., after Read returns io.EOF).

It's safe to call this method multiple times; each call returns the hash of all data read up to that point.

func (*TeeReader) Read

func (r *TeeReader) Read(b []byte) (int, error)

Read implements the io.Reader interface. It reads data from the underlying reader and simultaneously writes it to the hasher for hash computation.

The method handles the case where n > 0 bytes are read, even if an error occurs. It returns io.EOF only when no bytes are read and the underlying reader returns EOF.
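
The Read behaviour described above can be sketched with a small wrapper type. This is a minimal illustration under assumptions, not the package's TeeReader: hashingReader, newHashingReader, Sum, and chunkedDigest are hypothetical names, and hash in the import list is the standard library package of that name.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"hash" // standard library hash interfaces, not the package documented here
	"io"
	"strings"
)

// hashingReader forwards reads to src and feeds every byte read to h.
type hashingReader struct {
	src io.Reader
	h   hash.Hash
}

func newHashingReader(r io.Reader) *hashingReader {
	return &hashingReader{src: r, h: sha256.New()}
}

// Read hashes the n bytes that were read even when err is non-nil.
func (r *hashingReader) Read(p []byte) (int, error) {
	n, err := r.src.Read(p)
	if n > 0 {
		r.h.Write(p[:n]) // hash.Hash documents that Write never returns an error
	}
	return n, err
}

// Sum returns the digest of everything read so far; it can be called repeatedly.
func (r *hashingReader) Sum() []byte { return r.h.Sum(nil) }

// chunkedDigest reads s in chunks of the given positive size and returns the hex digest.
func chunkedDigest(s string, chunk int) (string, error) {
	r := newHashingReader(strings.NewReader(s))
	buf := make([]byte, chunk)
	for {
		_, err := r.Read(buf)
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", err
		}
	}
	return hex.EncodeToString(r.Sum()), nil
}

func main() {
	digest, err := chunkedDigest("hello world", 4)
	fmt.Println(digest, err)
}
```

Hashing before inspecting the error matters because an io.Reader may return data together with an error on the same call.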
