phash

package module
Version: v0.0.0-...-c28d3f6
Published: Feb 7, 2026 License: MIT Imports: 17 Imported by: 0

README

go-phash

A small, dependency-light Go library and CLI for computing 64-bit perceptual hashes (pHash) of images. Useful for near-duplicate detection, similarity search, and basic image de-duplication workflows.

Highlights

  • Classic 64-bit pHash pipeline (32x32 resize → grayscale → DCT → median threshold → 64-bit hash).
  • CLI that hashes a file/URL or compares two images with Hamming distance.
  • Robust decoding helpers with JPEG EXIF orientation handling.
  • Built-in WebP support (decode and lossless encode).
  • Simple image utilities (grayscale, resize, downscale).

Install

go get github.com/kotylevskiy/go-phash

CLI Usage

Hash a single image (file path or URL):

go run ./cmd/phash test_data/sweater-thumb.jpg

Output is a 16-hex-digit hash:

fa85955a872769cb

Compare two images and get Hamming distance (distance only):

go run ./cmd/phash image-a.jpg image-b.jpg

Output format (single line):

<distance>

Build the CLI:

go build -o phash ./cmd/phash

Library Usage

package main

import (
	"fmt"
	"os"

	"github.com/kotylevskiy/go-phash"
)

func main() {
	f, err := os.Open("test_data/sweater-thumb.jpg")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	img, _, err := phash.DecodeAny(f)
	if err != nil {
		panic(err)
	}

	h := phash.PHash(img)
	fmt.Printf("%016x\n", h)
}

API Overview

Core hashing:

  • PHash(image.Image) uint64 computes the 64-bit perceptual hash.
  • HammingDistance(a, b uint64) int compares two hashes.

Decoding helpers:

  • DecodeAny(io.Reader) (image.Image, string, error) reads all bytes, decodes, and applies JPEG EXIF orientation.
  • DownloadAndDecodeAny(context.Context, string) (image.Image, string, error) fetches over HTTP and decodes.
  • DownloadAndDecodeAnyWithLimit(context.Context, string, int64) (image.Image, string, error) with size cap.

Encoding helper:

  • EncodeWebPLossless(io.Writer, image.Image) error encodes lossless WebP.

Image utilities:

  • Grayscale(image.Image) *image.Gray
  • Resize(image.Image, uint32, uint32) image.Image
  • DownscaleByLargestSide(image.Image, uint32) image.Image

Supported Image Formats

Decode (registered by default):

  • JPEG, PNG, GIF, WebP (via golang.org/x/image/webp).

Encode:

  • WebP (lossless) via EncodeWebPLossless.

EXIF Orientation

When decoding JPEGs, the EXIF orientation tag is applied automatically, so hashes stay stable for photos that differ only in their orientation tag.

Testing

go test ./...

Notes

  • PHash(nil) returns 0.
  • Hashes are 64-bit values typically rendered as 16 hex characters with %016x.
  • The CLI accepts http:// and https:// URLs as inputs.
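If hashes are stored or exchanged as text, the 16-character form can be converted back to a uint64 with the standard library. The sketch below is a small round trip; formatHash and parseHash are hypothetical helper names, not functions exported by this package.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatHash renders a hash as 16 zero-padded lowercase hex characters.
func formatHash(h uint64) string {
	return fmt.Sprintf("%016x", h)
}

// parseHash converts the 16-character hex form back to a uint64.
// Requiring exactly 16 characters is a choice made for this sketch.
func parseHash(s string) (uint64, error) {
	if len(s) != 16 {
		return 0, fmt.Errorf("hash %q: want 16 hex digits, got %d characters", s, len(s))
	}
	return strconv.ParseUint(s, 16, 64)
}

func main() {
	h, err := parseHash("fa85955a872769cb")
	if err != nil {
		panic(err)
	}
	fmt.Println(formatHash(h))
}
```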

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func DecodeAny

func DecodeAny(r io.Reader) (image.Image, string, error)

DecodeAny reads all bytes (so it works with non-seekable readers), decodes, and applies EXIF orientation. It returns the decoded image and the detected format string ("jpeg", "png", "gif", "webp", ...). Errors are returned as DecodeError with Op "read" or "decode".

func DownloadAndDecodeAny

func DownloadAndDecodeAny(ctx context.Context, url string) (image.Image, string, error)

DownloadAndDecodeAny fetches a remote image over HTTP, decodes it, and applies EXIF orientation. Errors are returned as DecodeError with Op "request", "http", "http status", or "decode".

func DownloadAndDecodeAnyWithLimit

func DownloadAndDecodeAnyWithLimit(ctx context.Context, url string, maxBytes int64) (image.Image, string, error)

DownloadAndDecodeAnyWithLimit fetches a remote image over HTTP with the download capped at maxBytes, decodes it, and applies EXIF orientation. Errors are returned as DecodeError with Op "request", "http", "http status", or "decode".

func DownscaleByLargestSide

func DownscaleByLargestSide(src image.Image, maxSide uint32) image.Image

DownscaleByLargestSide scales the image down so the largest side is at most maxSide, preserving aspect ratio. If no downscale is needed, it returns src. If src is nil or maxSide is 0, it returns src.
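The size calculation behind this rule is simple arithmetic. Here is a minimal sketch of it, assuming round-to-nearest and a one-pixel minimum; fitLargestSide is a hypothetical name and the package's exact rounding may differ.

```go
package main

import "fmt"

// fitLargestSide returns the dimensions of a w x h image after scaling it
// down so that its largest side is at most maxSide, keeping the aspect ratio.
// It returns the input unchanged when no downscale is needed or maxSide is 0.
func fitLargestSide(w, h, maxSide int) (int, int) {
	largest := w
	if h > largest {
		largest = h
	}
	if maxSide <= 0 || largest <= maxSide {
		return w, h
	}
	// Integer round-to-nearest (an assumption of this sketch).
	nw := (w*maxSide + largest/2) / largest
	nh := (h*maxSide + largest/2) / largest
	// Very thin images must keep at least one pixel per side.
	if nw < 1 {
		nw = 1
	}
	if nh < 1 {
		nh = 1
	}
	return nw, nh
}

func main() {
	fmt.Println(fitLargestSide(4000, 3000, 1000))
	fmt.Println(fitLargestSide(100, 50, 200))
}
```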

func EncodeWebPLossless

func EncodeWebPLossless(w io.Writer, img image.Image) error

EncodeWebPLossless encodes img as a lossless WebP (VP8L) stream using the pure-Go encoder. It returns EncodeError for a nil writer or image, or when the encoder fails.

func Grayscale

func Grayscale(src image.Image) *image.Gray

Grayscale converts any image.Image to *image.Gray. It uses the standard luminance conversion (sRGB).

func HammingDistance

func HammingDistance(a, b uint64) int

HammingDistance returns the number of differing bits between two 64-bit hashes.
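Counting differing bits is usually done by XOR-ing the two values and counting the set bits of the result. A minimal stdlib sketch of that technique follows; hamming is a local stand-in written for illustration, not the package's source.

```go
package main

import (
	"fmt"
	"math/bits"
)

// hamming counts the bit positions in which a and b differ.
func hamming(a, b uint64) int {
	// XOR leaves a 1 exactly where the inputs disagree.
	return bits.OnesCount64(a ^ b)
}

func main() {
	a := uint64(0xfa85955a872769cb)
	b := a ^ 0b101 // flip two bits of a
	fmt.Println(hamming(a, a), hamming(a, b))
}
```

A distance of 0 means identical hashes and 64 means every bit differs. The cutoff below which two images count as near-duplicates is application-specific and is not defined by this package.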

func PHash

func PHash(image image.Image) uint64

PHash computes a classic 64-bit perceptual hash (pHash).

Pipeline (classic 64-bit):

  1. Resize to 32x32
  2. Grayscale
  3. 2D DCT (N=32), keep top-left 8x8 coefficients
  4. Median of 63 coefficients excluding DC
  5. Build 64-bit hash: bit=1 if coeff>median, with DC bit forced to 0

func Resize

func Resize(src image.Image, dstW, dstH uint32) image.Image

Resize resizes src to (dstW, dstH) using a quality-preserving strategy.

  • If dstW==0 or dstH==0, the missing dimension is derived so the aspect ratio is preserved.
  • Downscale: progressive halving using CatmullRom, then a final CatmullRom pass to the exact size.
  • Upscale: a single ApproxBiLinear pass (smoother, fewer halos).

Dependencies are limited to official repositories: the standard library and golang.org/x/image/draw.
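The size bookkeeping for this strategy can be sketched without any scaler. The code below only plans the sizes; the pixel resampling itself (CatmullRom or ApproxBiLinear from golang.org/x/image/draw) is not shown. Both function names are hypothetical, and the stopping rule (halve while both sides stay at or above the target) and the treatment of both targets being zero are assumptions of this sketch.

```go
package main

import "fmt"

// resolveSize fills in a zero target dimension from the source aspect ratio.
// Returning the source size when both targets are zero is an assumption.
func resolveSize(srcW, srcH, dstW, dstH int) (int, int) {
	if srcW < 1 || srcH < 1 {
		return dstW, dstH
	}
	switch {
	case dstW == 0 && dstH == 0:
		return srcW, srcH
	case dstW == 0:
		dstW = (srcW*dstH + srcH/2) / srcH // round to nearest
	case dstH == 0:
		dstH = (srcH*dstW + srcW/2) / srcW
	}
	if dstW < 1 {
		dstW = 1
	}
	if dstH < 1 {
		dstH = 1
	}
	return dstW, dstH
}

// halvingSteps lists the sizes visited when downscaling (w, h) to
// (dstW, dstH): repeated halvings, then one final pass to the exact size.
func halvingSteps(w, h, dstW, dstH int) [][2]int {
	if dstW < 1 || dstH < 1 {
		return nil
	}
	var steps [][2]int
	for w/2 >= dstW && h/2 >= dstH {
		w, h = w/2, h/2
		steps = append(steps, [2]int{w, h})
	}
	// Skip the final pass when halving already hit the target exactly.
	if w != dstW || h != dstH {
		steps = append(steps, [2]int{dstW, dstH})
	}
	return steps
}

func main() {
	w, h := resolveSize(4000, 3000, 1000, 0)
	fmt.Println(w, h)
	fmt.Println(halvingSteps(4000, 3000, 32, 32))
}
```

Halving in stages keeps each resampling pass close to a 2:1 ratio, which tends to alias less than a single large reduction.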

Types

type DecodeError

type DecodeError struct {
	Op  DecodeOp
	Err error
}

func (DecodeError) Error

func (e DecodeError) Error() string

Error formats DecodeError as "op: err" (or "op" when Err is nil).

type DecodeOp

type DecodeOp string

DecodeOp names the stage at which a DecodeError occurred: HTTP setup, HTTP status, IO reads, or image decoding. DecodeError values are returned by the helpers in decode.go to avoid raw fmt.Errorf strings.

const (
	DecodeOpRequest    DecodeOp = "request"
	DecodeOpHTTP       DecodeOp = "http"
	DecodeOpHTTPStatus DecodeOp = "http status"
	DecodeOpRead       DecodeOp = "read"
	DecodeOpDecode     DecodeOp = "decode"
)

type EncodeError

type EncodeError struct {
	Op  EncodeOp
	Err error
}

func (EncodeError) Error

func (e EncodeError) Error() string

Error formats EncodeError as "op: err" (or "op" when Err is nil).

type EncodeOp

type EncodeOp string

EncodeOp names the operation that failed in an EncodeError. EncodeError values describe failures when encoding images and are returned by the helpers in encode.go to avoid raw fmt.Errorf strings.

const (
	EncodeOpWebP EncodeOp = "webp encode"
)

Directories

Path         Synopsis
cmd/phash    command
