phash

v0.0.0-...-e7061e0
Published: Feb 20, 2026 License: MIT Imports: 17 Imported by: 0

README

go-phash

A minimal-dependency pure-Go library and CLI for computing 64-bit perceptual hashes (pHash) of images. Useful for near-duplicate detection, similarity search, and basic image de-duplication workflows.

Highlights

  • Classic 64-bit pHash pipeline (32x32 resize → grayscale → DCT → median threshold → 64-bit hash).
  • CLI that hashes a file/URL or compares two images with Hamming distance.
  • Robust decoding helpers with JPEG EXIF orientation handling.
  • Built-in WebP decode support.
  • Pure-Go, minimal dependencies (no native/CGo requirements).
  • Simple image utilities (grayscale, resize, downscale).

Install

go get github.com/enot-style/go-phash

CLI Usage

Hash a single image (file path or URL):

go run ./cmd/phash test_data/sweater-thumb.jpg

Output is a 16-hex-digit hash:

fa85955a872769cb

Compare two images and get Hamming distance (distance only):

go run ./cmd/phash image-a.jpg image-b.jpg

Output format (single line):

<distance>

Build the CLI:

go build -o phash ./cmd/phash

Library Usage

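package main

import (
	"fmt"
	"os"

	phash "github.com/enot-style/go-phash"
)

func main() {
	// Open the image file to hash.
	f, err := os.Open("test_data/sweater-thumb.jpg")
	if err != nil {
		panic(err)
	}
	defer f.Close()

	// Decode the image; the second return value is the detected format.
	img, _, err := phash.DecodeAny(f)
	if err != nil {
		panic(err)
	}

	// Compute the 64-bit hash and print it as 16 zero-padded hex digits.
	h := phash.PHash(img)
	fmt.Printf("%016x\n", h)
}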

API Overview

Core hashing:

  • PHash(image.Image) uint64 computes the 64-bit perceptual hash.
  • HammingDistance(a, b uint64) int compares two hashes.

Decoding helpers:

  • DecodeAny(io.Reader) (image.Image, string, error) reads all bytes, decodes, and applies JPEG EXIF orientation.
  • DownloadAndDecodeAny(context.Context, string) (image.Image, string, error) fetches over HTTP and decodes.
  • DownloadAndDecodeAnyWithLimit(context.Context, string, int64) (image.Image, string, error) with size cap.

Image utilities:

  • Grayscale(image.Image) *image.Gray
  • Resize(image.Image, uint32, uint32) image.Image
  • DownscaleByLargestSide(image.Image, uint32) image.Image

Supported Image Formats

Decode (registered by default):

  • JPEG, PNG, GIF, BMP (via golang.org/x/image/bmp), and WebP (via golang.org/x/image/webp).

EXIF Orientation

When decoding JPEGs, EXIF orientation is applied automatically, so a photo whose rotation is recorded only in its EXIF metadata hashes the same as its upright equivalent.
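To illustrate what applying an orientation means, here is a minimal sketch that handles only the three pure rotations (EXIF values 3, 6, and 8). The names applyOrientation and twoPixelRow are hypothetical and not part of this package; the library's own handling is not shown in this documentation and may differ, for example by also covering the mirrored orientations.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// applyOrientation returns src transformed according to an EXIF orientation
// value. Only the pure rotations (3, 6, 8) are handled in this sketch; every
// other value returns src unchanged.
func applyOrientation(src image.Image, orientation int) image.Image {
	b := src.Bounds()
	w, h := b.Dx(), b.Dy()
	switch orientation {
	case 3: // rotate 180 degrees
		dst := image.NewRGBA(image.Rect(0, 0, w, h))
		for y := 0; y < h; y++ {
			for x := 0; x < w; x++ {
				dst.Set(x, y, src.At(b.Min.X+w-1-x, b.Min.Y+h-1-y))
			}
		}
		return dst
	case 6: // rotate 90 degrees clockwise; width and height swap
		dst := image.NewRGBA(image.Rect(0, 0, h, w))
		for y := 0; y < w; y++ {
			for x := 0; x < h; x++ {
				dst.Set(x, y, src.At(b.Min.X+y, b.Min.Y+h-1-x))
			}
		}
		return dst
	case 8: // rotate 90 degrees counter-clockwise; width and height swap
		dst := image.NewRGBA(image.Rect(0, 0, h, w))
		for y := 0; y < w; y++ {
			for x := 0; x < h; x++ {
				dst.Set(x, y, src.At(b.Min.X+w-1-y, b.Min.Y+x))
			}
		}
		return dst
	default:
		return src
	}
}

// twoPixelRow builds a 2x1 test image: red on the left, blue on the right.
func twoPixelRow() *image.RGBA {
	img := image.NewRGBA(image.Rect(0, 0, 2, 1))
	img.Set(0, 0, color.RGBA{R: 255, A: 255})
	img.Set(1, 0, color.RGBA{B: 255, A: 255})
	return img
}

func main() {
	out := applyOrientation(twoPixelRow(), 6)
	fmt.Println(out.Bounds().Dx(), out.Bounds().Dy())
}
```

After a 90-degree clockwise rotation the left pixel of the row ends up at the top of the resulting column.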

Testing

go test ./...

Notes

  • As a practical rule of thumb, images with pHash Hamming distance <= 6 can usually be considered similar.
  • Hashes are 64-bit values typically rendered as 16 hex characters with %016x.
  • The CLI accepts http:// and https:// URLs as inputs.
  • PHash(nil) returns 0.
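When hashes are stored as text, the hex form needs to round-trip back to uint64. The following sketch uses only the standard library; formatHash and parseHash are hypothetical helper names, not functions of this package.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatHash renders a 64-bit hash as 16 zero-padded lowercase hex digits.
func formatHash(h uint64) string {
	return fmt.Sprintf("%016x", h)
}

// parseHash converts a 16-character hex string back into a 64-bit hash.
func parseHash(s string) (uint64, error) {
	if len(s) != 16 {
		return 0, fmt.Errorf("hash must be 16 hex characters, got %d", len(s))
	}
	return strconv.ParseUint(s, 16, 64)
}

func main() {
	h, err := parseHash("fa85955a872769cb")
	if err != nil {
		panic(err)
	}
	fmt.Println(formatHash(h))
}
```

Zero padding matters here: without it, hashes with leading zero bits would produce shorter strings and break fixed-width storage or sorting.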

Tip: pHash is mostly shape/structure-driven (grayscale), so images with the same content but different colors can still look "very similar" by hash.

Examples from test_data of images that differ only in color yet hash very close together:

  • test_data/tblue.jpeg vs test_data/tgray.jpeg -> Hamming distance 2
  • test_data/kblue.webp vs test_data/kyellow.jpeg -> Hamming distance 3

If color matters, run go-phash first and add go-colorsim as a second-step color similarity check.

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func DecodeAny

func DecodeAny(r io.Reader) (image.Image, string, error)

DecodeAny reads all bytes (so it works with non-seekable readers), decodes, and applies EXIF orientation. It returns the decoded image and the detected format string ("jpeg", "png", "gif", "webp", ...). Errors are returned as DecodeError with Op "read" or "decode".

func DownloadAndDecodeAny

func DownloadAndDecodeAny(ctx context.Context, url string) (image.Image, string, error)

DownloadAndDecodeAny fetches a remote image over HTTP, decodes it, and applies EXIF orientation. Errors are returned as DecodeError with Op "request", "http", "http status", or "decode".

func DownloadAndDecodeAnyWithLimit

func DownloadAndDecodeAnyWithLimit(ctx context.Context, url string, maxBytes int64) (image.Image, string, error)

DownloadAndDecodeAnyWithLimit fetches a remote image over HTTP subject to a size cap of maxBytes, decodes it, and applies EXIF orientation. Errors are returned as DecodeError with Op "request", "http", "http status", or "decode".
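One common way to enforce a byte cap on an untrusted stream is to read at most one byte past the limit and reject the input if that extra byte arrives. The sketch below shows the idea with the standard library only. readCapped, capBytes, and errTooLarge are hypothetical names, and this documentation does not say whether the library enforces its cap this way or how it reports an oversized body.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// errTooLarge is a hypothetical sentinel for input that exceeds the cap.
var errTooLarge = errors.New("input exceeds size cap")

// readCapped reads r fully but refuses inputs longer than maxBytes.
// It assumes maxBytes is well below math.MaxInt64 so maxBytes+1 cannot overflow.
func readCapped(r io.Reader, maxBytes int64) ([]byte, error) {
	// Read one byte past the cap so an oversized input can be told apart
	// from one that is exactly maxBytes long.
	data, err := io.ReadAll(io.LimitReader(r, maxBytes+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > maxBytes {
		return nil, errTooLarge
	}
	return data, nil
}

// capBytes is a convenience wrapper for in-memory data, used by the demo.
func capBytes(data []byte, maxBytes int64) ([]byte, error) {
	return readCapped(bytes.NewReader(data), maxBytes)
}

func main() {
	_, err := capBytes([]byte("hello"), 4)
	fmt.Println(err)
}
```

In an HTTP client the same function could wrap a response body, so memory use stays bounded even when the server sends no Content-Length header or an inaccurate one.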

func DownscaleByLargestSide

func DownscaleByLargestSide(src image.Image, maxSide uint32) image.Image

DownscaleByLargestSide scales the image down so the largest side is at most maxSide, preserving aspect ratio. If no downscale is needed, it returns src. If src is nil or maxSide is 0, it returns src.
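The size arithmetic behind this rule can be sketched as follows. fitLargestSide is a hypothetical helper that works on dimensions only; the rounding mode and the minimum side of 1 pixel are assumptions of this sketch, not documented behavior of the library.

```go
package main

import (
	"fmt"
	"math"
)

// fitLargestSide returns the dimensions of a w x h image after scaling it
// down so that its largest side is at most maxSide, preserving aspect ratio.
// If maxSide is 0 or the image already fits, the input size is returned.
func fitLargestSide(w, h, maxSide uint32) (uint32, uint32) {
	if maxSide == 0 || (w <= maxSide && h <= maxSide) {
		return w, h
	}
	if w >= h {
		// Width is the largest side: pin it to maxSide and scale the height.
		nh := uint32(math.Round(float64(h) * float64(maxSide) / float64(w)))
		if nh == 0 {
			nh = 1 // keep at least one pixel (assumption)
		}
		return maxSide, nh
	}
	// Height is the largest side: pin it to maxSide and scale the width.
	nw := uint32(math.Round(float64(w) * float64(maxSide) / float64(h)))
	if nw == 0 {
		nw = 1
	}
	return nw, maxSide
}

func main() {
	w, h := fitLargestSide(4000, 3000, 1000)
	fmt.Println(w, h)
}
```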

func Grayscale

func Grayscale(src image.Image) *image.Gray

Grayscale converts any image.Image to *image.Gray. Uses standard luminance conversion (sRGB).
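As a rough equivalent using only the standard library, the sketch below converts pixel by pixel through color.GrayModel, which applies fixed luma weights to the red, green, and blue channels. toGray and solid are hypothetical names, and the exact coefficients this package uses may differ from the standard library's.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// toGray converts any image to *image.Gray. image.Gray.Set converts each
// pixel through color.GrayModel, so no explicit weights appear here.
func toGray(src image.Image) *image.Gray {
	b := src.Bounds()
	dst := image.NewGray(b)
	for y := b.Min.Y; y < b.Max.Y; y++ {
		for x := b.Min.X; x < b.Max.X; x++ {
			dst.Set(x, y, src.At(x, y))
		}
	}
	return dst
}

// solid builds a 1x1 opaque image of the given color, used by the demo.
func solid(r, g, b uint8) *image.RGBA {
	img := image.NewRGBA(image.Rect(0, 0, 1, 1))
	img.Set(0, 0, color.RGBA{R: r, G: g, B: b, A: 255})
	return img
}

func main() {
	fmt.Println(toGray(solid(255, 255, 255)).GrayAt(0, 0).Y)
}
```

Because green carries the largest weight and blue the smallest, two images that differ only in hue can map to similar gray levels, which is one reason the hash is largely insensitive to color.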

func HammingDistance

func HammingDistance(a, b uint64) int

HammingDistance returns the number of differing bits between two 64-bit hashes.
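Counting differing bits is typically done by XOR-ing the two values and counting the set bits of the result. The sketch below shows that approach with math/bits; hamming and similar are hypothetical names, and the library's own implementation is not shown in this documentation.

```go
package main

import (
	"fmt"
	"math/bits"
)

// hamming returns the number of bit positions in which a and b differ.
func hamming(a, b uint64) int {
	// XOR leaves a 1 exactly where the inputs differ; OnesCount64 counts them.
	return bits.OnesCount64(a ^ b)
}

// similar reports whether two hashes are within maxDist bits of each other.
func similar(a, b uint64, maxDist int) bool {
	return hamming(a, b) <= maxDist
}

func main() {
	a, b := uint64(0xfa85955a872769cb), uint64(0xfa85955a872769c8)
	fmt.Println(hamming(a, b), similar(a, b, 6))
}
```

The threshold passed to similar is a tuning knob: a smaller value reduces false matches at the cost of missing more heavily edited copies.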

func PHash

func PHash(image image.Image) uint64

PHash computes a classic 64-bit perceptual hash (pHash).

Pipeline (classic 64-bit):

  1. Resize to 32x32
  2. Grayscale
  3. 2D DCT (N=32), keep top-left 8x8 coefficients
  4. Median of 63 coefficients excluding DC
  5. Build 64-bit hash: bit=1 if coeff>median, with DC bit forced to 0
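Steps 3 to 5 can be sketched on a 32x32 grayscale matrix that has already been resized. All names below are hypothetical. The orthonormal DCT scaling and the bit order (row-major, DC coefficient in the most significant bit) are assumptions of this sketch, so its output is not expected to match hashes produced by PHash.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

const (
	size = 32 // side of the resized grayscale input
	keep = 8  // side of the low-frequency block that is kept
)

// alpha is the orthonormal DCT-II scale factor (an assumption of this sketch).
func alpha(k int) float64 {
	if k == 0 {
		return math.Sqrt(1.0 / size)
	}
	return math.Sqrt(2.0 / size)
}

// lowFreqDCT computes only the top-left keep x keep coefficients of the
// 2D DCT-II of px, where u indexes vertical and v horizontal frequency.
func lowFreqDCT(px *[size][size]float64) [keep][keep]float64 {
	var out [keep][keep]float64
	for u := 0; u < keep; u++ {
		for v := 0; v < keep; v++ {
			var sum float64
			for y := 0; y < size; y++ {
				cy := math.Cos(float64(2*y+1) * float64(u) * math.Pi / (2 * size))
				for x := 0; x < size; x++ {
					cx := math.Cos(float64(2*x+1) * float64(v) * math.Pi / (2 * size))
					sum += px[y][x] * cy * cx
				}
			}
			out[u][v] = alpha(u) * alpha(v) * sum
		}
	}
	return out
}

// hashFromCoeffs thresholds the coefficients against the median of the 63
// non-DC values. The DC coefficient always contributes a 0 bit.
func hashFromCoeffs(c [keep][keep]float64) uint64 {
	ac := make([]float64, 0, keep*keep-1)
	for u := 0; u < keep; u++ {
		for v := 0; v < keep; v++ {
			if u != 0 || v != 0 {
				ac = append(ac, c[u][v])
			}
		}
	}
	sort.Float64s(ac)
	median := ac[len(ac)/2] // middle of 63 sorted values

	var h uint64
	for u := 0; u < keep; u++ {
		for v := 0; v < keep; v++ {
			h <<= 1
			if (u != 0 || v != 0) && c[u][v] > median {
				h |= 1
			}
		}
	}
	return h
}

// sketchPHash runs steps 3 to 5 on an already resized grayscale matrix.
func sketchPHash(px *[size][size]float64) uint64 {
	return hashFromCoeffs(lowFreqDCT(px))
}

// gradient builds a synthetic test pattern for the demo.
func gradient() [size][size]float64 {
	var px [size][size]float64
	for y := 0; y < size; y++ {
		for x := 0; x < size; x++ {
			px[y][x] = float64(x * y % 256)
		}
	}
	return px
}

func main() {
	g := gradient()
	fmt.Printf("%016x\n", sketchPHash(&g))
}
```

Because a bit is set only when a coefficient is strictly greater than the median of 63 values, at most 31 bits can be set in a hash built this way.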

func Resize

func Resize(src image.Image, dstW, dstH uint32) image.Image

Resize resizes src to (dstW, dstH) using a quality-preserving strategy.

  • If dstW==0 or dstH==0, aspect ratio is preserved.
  • Downscale: progressive halving using CatmullRom, then final CatmullRom to exact size.
  • Upscale: single ApproxBiLinear pass (smoother, fewer halos).

Official repos only: stdlib + golang.org/x/image/draw.
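The size planning implied by this strategy can be sketched without touching any pixels. targetSize and halvingSteps are hypothetical helpers; the rounding, the stop condition for halving, and the handling of both target dimensions being zero are assumptions, since the documentation does not spell them out.

```go
package main

import (
	"fmt"
	"math"
)

// targetSize fills in a zero target dimension so the aspect ratio of the
// source is preserved. If both are zero, the source size is returned
// (an assumption of this sketch).
func targetSize(srcW, srcH, dstW, dstH uint32) (uint32, uint32) {
	switch {
	case dstW == 0 && dstH == 0:
		return srcW, srcH
	case dstW == 0:
		return uint32(math.Round(float64(srcW) * float64(dstH) / float64(srcH))), dstH
	case dstH == 0:
		return dstW, uint32(math.Round(float64(srcH) * float64(dstW) / float64(srcW)))
	default:
		return dstW, dstH
	}
}

// halvingSteps lists the intermediate sizes of a progressive resize: halve
// while the result stays at least as large as the target, then finish with
// one pass to the exact size. An upscale yields a single step.
func halvingSteps(srcW, srcH, dstW, dstH uint32) [][2]uint32 {
	var steps [][2]uint32
	w, h := srcW, srcH
	for w/2 > 0 && h/2 > 0 && w/2 >= dstW && h/2 >= dstH {
		w, h = w/2, h/2
		steps = append(steps, [2]uint32{w, h})
	}
	if w != dstW || h != dstH {
		steps = append(steps, [2]uint32{dstW, dstH})
	}
	return steps
}

func main() {
	w, h := targetSize(1000, 800, 100, 0)
	fmt.Println(halvingSteps(1000, 800, w, h))
}
```

Halving in stages keeps each resampling pass close to a 2:1 ratio, which is where kernels such as Catmull-Rom tend to behave well; a single large reduction would skip over most source pixels.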

Types

type DecodeError

type DecodeError struct {
	Op  DecodeOp
	Err error
}

func (DecodeError) Error

func (e DecodeError) Error() string

Error formats DecodeError as "op: err" (or "op" when Err is nil).

type DecodeOp

type DecodeOp string

DecodeError describes failures in HTTP setup, HTTP status, IO reads, or image decoding. Returned by the helpers in decode.go to avoid raw fmt.Errorf strings.

const (
	DecodeOpRequest    DecodeOp = "request"
	DecodeOpHTTP       DecodeOp = "http"
	DecodeOpHTTPStatus DecodeOp = "http status"
	DecodeOpRead       DecodeOp = "read"
	DecodeOpDecode     DecodeOp = "decode"
)
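A caller can branch on the failing stage by extracting the Op with errors.As. The sketch below uses local stand-in types that mirror the documented declarations so that it compiles on its own; in real code the package's DecodeError would be used instead. opOf is a hypothetical helper. The sketch also assumes the helpers return DecodeError by value; if they return a pointer, the errors.As target must be a *DecodeError.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins mirroring the documented types (not the real package).
type DecodeOp string

const (
	DecodeOpRequest    DecodeOp = "request"
	DecodeOpHTTP       DecodeOp = "http"
	DecodeOpHTTPStatus DecodeOp = "http status"
	DecodeOpRead       DecodeOp = "read"
	DecodeOpDecode     DecodeOp = "decode"
)

type DecodeError struct {
	Op  DecodeOp
	Err error
}

// Error formats the error as "op: err", or just "op" when Err is nil.
func (e DecodeError) Error() string {
	if e.Err == nil {
		return string(e.Op)
	}
	return string(e.Op) + ": " + e.Err.Error()
}

// opOf reports which stage failed, if err is or wraps a DecodeError.
func opOf(err error) (DecodeOp, bool) {
	var de DecodeError
	if errors.As(err, &de) {
		return de.Op, true
	}
	return "", false
}

func main() {
	cause := DecodeError{Op: DecodeOpDecode, Err: errors.New("unknown format")}
	err := fmt.Errorf("hashing a.jpg: %w", cause)
	if op, ok := opOf(err); ok {
		fmt.Println("failed during:", op)
	}
}
```

Matching on Op rather than on the message text keeps the check stable if the wording of the wrapped error changes.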

type EncodeError

type EncodeError struct {
	Op  EncodeOp
	Err error
}

func (EncodeError) Error

func (e EncodeError) Error() string

Error formats EncodeError as "op: err" (or "op" when Err is nil).

type EncodeOp

type EncodeOp string

EncodeError describes failures when encoding images. Returned by helpers in encode.go to avoid raw fmt.Errorf strings.

const (
	EncodeOpWebP EncodeOp = "webp encode"
)

Directories

  • cmd/phash (command)
