mosaic

Version: v1.0.1
Published: May 30, 2026 License: GPL-3.0 Imports: 12 Imported by: 0

README

mosaic

Turn a panning video into a wide panoramic mosaic — the "VideoBrush" strip-mosaicing technique (Peleg et al.) implemented in Go on top of GoCV / OpenCV.

How it works

The pipeline aligns consecutive frames and paints a panorama from thin per-frame strips:

  1. Extract & prepare — decode frames, trim black borders, and detect the dominant pan direction so motion is horizontal.
  2. Align — for each adjacent pair, detect Shi-Tomasi corners, track them with Lucas-Kanade optical flow, and fit a partial-affine transform via RANSAC. Transforms are reduced to horizontal translation and accumulated relative to a central reference frame.
  3. Warp — project every frame onto a shared canvas (bounded, parallel).
  4. Stitch — sweep a column offset across the aligned frames, painting the strip each frame contributes; optional feather-blending hides seams.
  5. Sequence — emit a static (forward + reverse ping-pong) or dynamic (forward "video brush") mosaic and write it as MP4.
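
To make step 2 concrete, here is a minimal sketch of how per-pair horizontal shifts could be accumulated relative to a central reference frame. It uses plain float64 values instead of GoCV matrices, and the name accumulateFromCenter and the sign convention (pairDX[i] places frame i+1 relative to frame i) are assumptions made for illustration, not the package's API.

```go
package main

import "fmt"

// accumulateFromCenter converts per-pair horizontal shifts into per-frame
// offsets measured from the middle frame. pairDX[i] is assumed to be the
// shift that places frame i+1 relative to frame i (hypothetical convention).
func accumulateFromCenter(pairDX []float64) (offsets []float64, ref int) {
	n := len(pairDX) + 1
	ref = n / 2
	offsets = make([]float64, n) // offsets[ref] stays 0
	// Frames after the reference: chain the shifts forward.
	for i := ref + 1; i < n; i++ {
		offsets[i] = offsets[i-1] + pairDX[i-1]
	}
	// Frames before the reference: chain the inverse shifts backward.
	for i := ref - 1; i >= 0; i-- {
		offsets[i] = offsets[i+1] - pairDX[i]
	}
	return offsets, ref
}

func main() {
	offsets, ref := accumulateFromCenter([]float64{10, 12, 11, 9})
	fmt.Println(ref, offsets) // prints: 2 [-22 -12 0 11 20]
}
```

Anchoring on the middle frame keeps the largest accumulated offset roughly half of what it would be when anchoring on the first frame, which is one plausible reason for choosing a central reference.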

Requirements

GoCV requires OpenCV (4.x) installed locally — see the GoCV install guide.

go get github.com/nit4y/mosaic

Usage

package main

import (
	"log/slog"
	"os"

	"github.com/nit4y/mosaic"
)

func main() {
	// The caller owns logging; pass nil (or verbose=false) to silence it.
	log := mosaic.NewLogger(slog.New(slog.NewTextHandler(os.Stdout, nil)), true)

	// Start from tuned defaults and override what you need.
	cfg := mosaic.DefaultConfig()
	cfg.FeatherWidth = 4

	// Static mosaics for every video in cfg.InputDir → cfg.OutputDir.
	if err := mosaic.GenerateVideos(cfg, log); err != nil {
		log.Error("generate failed", "error", err)
		os.Exit(1)
	}

	// Or a dynamic ("video brush") mosaic for a specific directory:
	// mosaic.GenerateVideosFromDir("clips", "out", mosaic.Dynamic, cfg, log)
}

Configuration

All tunables live in mosaic.Config (DefaultConfig() returns the tuned baseline). Highlights:

Field                             Purpose
FlattenVertical                   Flatten vertical drift into one band (horizontal pans).
FeatherWidth                      Width of the seam cross-fade; 0 = hard seams.
CropToCoveredBand                 Crop output to the fully-covered rows (drop wedges).
MaxWorkers                        Per-stage goroutine cap (0 = NumCPU).
VideoConcurrency                  How many videos to process at once.
OutputFPS, OutputLengthInSeconds  Output video timing.

LK / RANSAC / corner-detection parameters are exposed too — see the GoDoc.
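
As an illustration of what a FeatherWidth cross-fade means, the sketch below blends two pixel rows with a linear ramp around a seam column. How mosaic actually blends strips is not documented here, so the function featherRow, its centring of the ramp on the seam, and the half-pixel weight offset are all assumptions.

```go
package main

import "fmt"

// featherRow blends two equally long pixel rows around a seam column.
// Left of the seam the output comes from a, right of it from b; across
// `width` columns centred on the seam the two are mixed with a linear ramp.
// width <= 0 gives a hard seam. (Hypothetical helper, not the mosaic API.)
func featherRow(a, b []float64, seam, width int) []float64 {
	out := make([]float64, len(a))
	start := seam - width/2
	for x := range out {
		var wb float64 // weight of b at column x
		switch {
		case width <= 0:
			if x >= seam {
				wb = 1
			}
		case x < start:
			wb = 0
		case x >= start+width:
			wb = 1
		default:
			wb = (float64(x-start) + 0.5) / float64(width)
		}
		out[x] = (1-wb)*a[x] + wb*b[x]
	}
	return out
}

func main() {
	a := make([]float64, 8) // all 0
	b := []float64{100, 100, 100, 100, 100, 100, 100, 100}
	fmt.Println(featherRow(a, b, 4, 0)) // hard seam
	fmt.Println(featherRow(a, b, 4, 4)) // 4 px cross-fade
}
```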

Development

go test ./...          # unit + integration tests (needs OpenCV)
go test -race ./...    # race detector
golangci-lint run      # lint + format checks

License

GPL-3.0.

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func ApplyBlur

func ApplyBlur(img gocv.Mat, blurResolution float64) gocv.Mat

ApplyBlur downscales the image by blurResolution, then upscales it back to its original size, producing a simple blur.

func CalculateCanvasSize

func CalculateCanvasSize(frames []gocv.Mat, transforms []*gocv.Mat, refIndex int, lg *Logger) (int, int, int, int)

func CalculateTransformations

func CalculateTransformations(frames []gocv.Mat, cfg Config, lg *Logger) ([]*gocv.Mat, int)

CalculateTransformations computes cumulative homographies aligning each frame to the middle (reference) frame, then recenters them by the median vertical shift.

func DampYTranslation

func DampYTranslation(mat gocv.Mat, factor float64) gocv.Mat

DampYTranslation scales the ty component (element [1,2]) of a 3×3 affine homography by `factor`. factor=1.0 is a no-op; factor=0.0 removes vertical translation entirely. Mutates the input Mat in place and returns it (consistent with the other stabilize helpers).

func ExtractFrames

func ExtractFrames(videoPath string, lg *Logger) ([]gocv.Mat, error)

ExtractFrames extracts all frames from a video file and returns them as a slice of Mats.

func GenerateMosaicVideo

func GenerateMosaicVideo(videoPath, outputDir string, kind Kind, cfg Config, lg *Logger) error

GenerateMosaicVideo generates a panoramic mosaic video using a worker pool.

func GenerateVideoFromFrames

func GenerateVideoFromFrames(images []resJob, outputPath string, fps int, lg *Logger) error

GenerateVideoFromFrames converts a slice of Mats into an MP4 video file.

func GenerateVideos

func GenerateVideos(cfg Config, lg *Logger) error

GenerateVideos processes all .mp4 videos in the default input directory ("input/") and writes mosaics under "output/". Convenience wrapper for the CLI; tests and external callers should use GenerateVideosFromDir to keep paths injectable. lg may be nil (silent).

func GenerateVideosFromDir

func GenerateVideosFromDir(inputDir, outputDir string, kind Kind, cfg Config, lg *Logger) error

GenerateVideosFromDir generates a mosaic of the given Kind for every video in inputDir, writing each under outputDir/<video name>/. Videos are processed with bounded concurrency (Config.VideoConcurrency). Every video is attempted even if an earlier one fails, and the first error encountered is returned. lg is the caller-supplied logger (nil or non-verbose = silent).
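
The "bounded concurrency, attempt everything, return the first error" behaviour can be sketched with a buffered channel used as a semaphore. This is only an illustration: processAll, errBadClip, and the choice to report the first error in input order (rather than the first in time) are assumptions, not the package's implementation.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errBadClip is a hypothetical failure used by the demo.
var errBadClip = errors.New("bad clip")

// processAll runs work on every item with at most `limit` running at once.
// It attempts every item and then returns the first error in input order.
func processAll(items []string, limit int, work func(string) error) error {
	if limit < 1 {
		limit = 1
	}
	sem := make(chan struct{}, limit) // one slot per concurrent worker
	errs := make([]error, len(items))
	var wg sync.WaitGroup
	for i, item := range items {
		wg.Add(1)
		sem <- struct{}{} // blocks while `limit` workers are busy
		go func(i int, item string) {
			defer wg.Done()
			defer func() { <-sem }()
			errs[i] = work(item) // each goroutine writes its own slot
		}(i, item)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return err
		}
	}
	return nil
}

func main() {
	err := processAll([]string{"a.mp4", "b.mp4", "c.mp4"}, 2, func(name string) error {
		if name == "b.mp4" {
			return errBadClip
		}
		return nil
	})
	fmt.Println(err)
}
```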

func Median

func Median(xs []float64) float64

Median returns the median value of the input slice. If the slice is empty, it returns 0.
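
A minimal sketch of a median with the documented empty-slice behaviour follows. The lowercase name median is hypothetical, and two details are assumptions because the documentation does not state them: averaging the two middle values for even-length input, and leaving the caller's slice unsorted.

```go
package main

import (
	"fmt"
	"sort"
)

// median returns the middle value of xs, or 0 for an empty slice.
// For an even number of values it averages the two middle ones (assumption).
func median(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	sorted := append([]float64(nil), xs...) // copy so the input is untouched
	sort.Float64s(sorted)
	mid := len(sorted) / 2
	if len(sorted)%2 == 1 {
		return sorted[mid]
	}
	return (sorted[mid-1] + sorted[mid]) / 2
}

func main() {
	fmt.Println(median(nil))                   // prints: 0
	fmt.Println(median([]float64{3, 1, 2}))    // prints: 2
	fmt.Println(median([]float64{4, 1, 3, 2})) // prints: 2.5
}
```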

func RotateFrame

func RotateFrame(frame gocv.Mat, direction Direction) gocv.Mat

RotateFrame rotates a frame to align motion to the right.

func RotateFrameBack

func RotateFrameBack(frame gocv.Mat, direction Direction) gocv.Mat

RotateFrameBack reverts rotation applied for alignment.

func StabilizeHorizontalMotion

func StabilizeHorizontalMotion(matrix gocv.Mat) gocv.Mat

StabilizeHorizontalMotion removes rotational components from a 3×3 transform, preserving only horizontal translation.

func StabilizeScale

func StabilizeScale(mat gocv.Mat) gocv.Mat

StabilizeScale removes any scale from rows 0 and 1 of a 3×3 matrix, so after this call the diagonal entries are 1.0 (unit scale).

func StablizeTranslation

func StablizeTranslation(mat gocv.Mat, yDamping float64) gocv.Mat

StablizeTranslation reduces a homography to horizontal translation: it zeroes rotation, forces unit scale, and damps the vertical translation by yDamping (1.0 = keep ty as-is, 0.0 = remove it).
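
The reduction described above can be pictured on a plain 3×3 array. The sketch below is not the package's code: it uses a hypothetical mat3 type instead of gocv.Mat and returns a new value rather than mutating its input.

```go
package main

import "fmt"

// mat3 is a hypothetical stand-in for a 3×3 gocv.Mat of float64 values.
type mat3 [3][3]float64

// reduceToHorizontal keeps tx, scales ty by yDamping, and replaces the
// upper-left 2×2 block with the identity (no rotation or skew, unit scale).
func reduceToHorizontal(m mat3, yDamping float64) mat3 {
	return mat3{
		{1, 0, m[0][2]},
		{0, 1, m[1][2] * yDamping},
		{0, 0, 1},
	}
}

func main() {
	// A slightly rotated and scaled transform with tx=14.5, ty=3.
	m := mat3{
		{0.98, -0.2, 14.5},
		{0.2, 0.98, 3},
		{0, 0, 1},
	}
	fmt.Println(reduceToHorizontal(m, 1.0)) // ty kept as-is
	fmt.Println(reduceToHorizontal(m, 0.0)) // ty removed
}
```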

func StitchPanorama

func StitchPanorama(
	videoName string,
	warpedFrames []gocv.Mat,
	canvasWidth,
	canvasHeight,
	frameXOffset int,
	cfg Config,
	lg *Logger,
) gocv.Mat

StitchPanorama composes a single panorama image from a sequence of canvas-sized warped frames using the column-strip algorithm from the Python reference implementation in ref/ex4.py.

For each consecutive pair (prev, curr) we find the leftmost non-black column of each (L_prev, L_curr) and paint the column strip [L_prev+frameXOffset, L_curr+frameXOffset) of prev_warped onto the canvas at the same column range. Because each warped frame is already aligned to its target position on the canvas, the strip lands exactly where it needs to be — no horizontal accumulator, no overlay of full frames.

frameXOffset shifts which column slice of each frame contributes to the panorama. For dynamic mosaics, varying frameXOffset across the output sequence produces a time-evolving panorama. For static mosaics it is typically a small constant (e.g. Config.MinimalPixelColumnIndex).

We paint NO synthetic leading/trailing strips. The old code stretched the first/last frame across empty canvas to avoid black edges, which is exactly what produced the visible edge smear. The black margins left here (and the last frame's unpainted body) are removed downstream by cropping every panorama to the common content box (see buildSequence), giving clean rectangular edges instead of a smear.
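
The column-strip rule described above can be sketched on tiny grayscale grids. This is a simplified illustration under stated assumptions, not the StitchPanorama implementation: the gray type, newGray, leftmostContent, stitchStrips, and demoFrames are hypothetical names, frames are plain [][]uint8 rather than gocv.Mat, a pixel value of 0 counts as black, and there is no feathering.

```go
package main

import "fmt"

// gray is a hypothetical canvas-sized grayscale frame: gray[y][x], 0 = black.
type gray [][]uint8

func newGray(width, height int) gray {
	g := make(gray, height)
	for y := range g {
		g[y] = make([]uint8, width)
	}
	return g
}

// leftmostContent returns the first column holding a non-black pixel, or -1.
func leftmostContent(f gray) int {
	if len(f) == 0 {
		return -1
	}
	for x := 0; x < len(f[0]); x++ {
		for y := range f {
			if f[y][x] != 0 {
				return x
			}
		}
	}
	return -1
}

// stitchStrips paints, for each consecutive pair, the columns
// [L_prev+frameXOffset, L_curr+frameXOffset) of the previous frame.
func stitchStrips(warped []gray, width, height, frameXOffset int) gray {
	canvas := newGray(width, height)
	for i := 1; i < len(warped); i++ {
		prev := warped[i-1]
		lPrev, lCurr := leftmostContent(prev), leftmostContent(warped[i])
		if lPrev < 0 || lCurr < 0 {
			continue // an all-black frame contributes nothing
		}
		from, to := lPrev+frameXOffset, lCurr+frameXOffset
		if from < 0 {
			from = 0
		}
		if to > width {
			to = width
		}
		for y := 0; y < height; y++ {
			for x := from; x < to; x++ {
				canvas[y][x] = prev[y][x]
			}
		}
	}
	return canvas
}

// demoFrames builds three 4-column frames placed 2 columns apart.
func demoFrames(width, height int) []gray {
	var warped []gray
	for i, v := range []uint8{10, 20, 30} {
		f := newGray(width, height)
		for y := 0; y < height; y++ {
			for x := 2 * i; x < 2*i+4 && x < width; x++ {
				f[y][x] = v
			}
		}
		warped = append(warped, f)
	}
	return warped
}

func main() {
	canvas := stitchStrips(demoFrames(8, 1), 8, 1, 1)
	fmt.Println(canvas[0]) // prints: [0 10 10 20 20 0 0 0]
}
```

The output shows the behaviour the paragraph above describes: the leading column and the last frame's body stay black, ready to be cropped downstream.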

func ToHomogeneous

func ToHomogeneous(affine gocv.Mat) gocv.Mat

ToHomogeneous converts a 2×3 affine transformation Mat into a 3×3 homogeneous Mat. affine must be a Mat of size 2×3.
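
Converting a 2×3 affine matrix to homogeneous form appends the row [0 0 1]. The sketch below shows this on plain arrays; toHomogeneous and apply are hypothetical helpers and do not use gocv.Mat.

```go
package main

import "fmt"

// toHomogeneous appends the row [0 0 1] to a 2×3 affine matrix.
func toHomogeneous(a [2][3]float64) [3][3]float64 {
	return [3][3]float64{a[0], a[1], {0, 0, 1}}
}

// apply maps the point (x, y) through an affine 3×3 matrix whose last
// row is [0 0 1], so no perspective division is needed.
func apply(h [3][3]float64, x, y float64) (float64, float64) {
	return h[0][0]*x + h[0][1]*y + h[0][2], h[1][0]*x + h[1][1]*y + h[1][2]
}

func main() {
	affine := [2][3]float64{{1, 0, 5}, {0, 1, -2}} // translate by (5, -2)
	h := toHomogeneous(affine)
	fmt.Println(h)
	fmt.Println(apply(h, 3, 4)) // prints: 8 2
}
```

The practical benefit of the 3×3 form is that transforms can be chained by ordinary matrix multiplication, which a 2×3 matrix does not allow directly.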

func TrimBlackBorders

func TrimBlackBorders(img gocv.Mat, threshold uint8) gocv.Mat

TrimBlackBorders crops nearly black borders from an image and saves a debug crop.

Types

type Config

type Config struct {
	// BlurResolution is the downscale factor applied before Lucas-Kanade
	// optical flow; tracking is more stable on a slightly blurred image.
	BlurResolution float64

	// Lucas-Kanade optical flow parameters. Larger windows/levels and
	// tighter criteria trade speed for sub-pixel tracking accuracy, which
	// feeds the RANSAC rotation/translation estimate.
	LKWinSize         image.Point
	LKMaxLevel        int
	LKCriteria        gocv.TermCriteria
	LKFlags           int
	LKMinEigThreshold float64

	// RANSAC parameters for estimateAffinePartial2D in AlignImages.
	RansacThreshold     int     // px reprojection error threshold for inliers
	RansacConfidence    float64 // probability the estimate is correct
	RansacMaxIterations int     // max RANSAC iterations
	RansacFlag          int

	// Corner detection (Shi-Tomasi / GoodFeaturesToTrack) feeding LK.
	MaxCorners    int     // upper bound on detected corners per frame
	CornerQuality float64 // minimum corner quality (fraction of max response)
	CornerMinDist int     // px min distance between detected corners

	// MinimalPixelColumnIndex is the first column offset swept when
	// stitching panoramas (skips the extreme edge where alignment is weakest).
	MinimalPixelColumnIndex int

	// FlattenVertical, when true, zeroes the accumulated vertical
	// translation so every frame sits in one horizontal band (no diagonal
	// wedges) — for purely horizontal pans. When false it re-centers on the
	// median vertical drift, preserving genuine vertical motion.
	FlattenVertical bool

	// YTranslationDamping scales the per-pair vertical translation (ty) of
	// each homography inside AlignImages. 1.0 is the normal no-op value;
	// use FlattenVertical to control panorama vertical layout.
	YTranslationDamping float64

	// FeatherWidth is the px width of the linear cross-fade at each strip
	// seam in stitching. 0 = hard seams; a few px hides seam tearing.
	FeatherWidth int

	// CropToCoveredBand, when true, crops the output vertically to the band
	// of rows well-covered in every panorama, removing the diagonal wedges
	// that FlattenVertical=false can leave.
	CropToCoveredBand bool

	// CoverageThreshold is the minimum fraction of non-black pixels a row
	// must have (per panorama's content width) to be kept when
	// CropToCoveredBand is enabled.
	CoverageThreshold float64

	// Output video settings. Total output frames = OutputFPS * OutputLengthInSeconds.
	OutputFPS             int
	OutputLengthInSeconds int

	// Default I/O directories for the GenerateVideos convenience wrapper.
	InputDir  string
	OutputDir string

	// Concurrency guardrails. MaxWorkers caps goroutines per parallel stage
	// (0 = auto = runtime.NumCPU()). VideoConcurrency caps how many videos
	// are processed at once (default 1 = lightest on memory).
	MaxWorkers       int
	VideoConcurrency int
}

Config holds every tunable parameter for the mosaic pipeline. Start from DefaultConfig and override only the fields you care about, then pass the result to the Generate* functions. The zero value is NOT usable — always build on DefaultConfig.
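
To illustrate the row-coverage test behind CropToCoveredBand and CoverageThreshold, here is a simplified sketch for a single grayscale image. The helper coveredBand is hypothetical, and it makes two assumptions: coverage is measured against the full row width rather than each panorama's content width, and the kept band runs from the first to the last row that passes the threshold.

```go
package main

import "fmt"

// coveredBand returns the half-open row range [top, bottom) spanning the
// first through last rows whose fraction of non-black pixels is at least
// threshold. It returns (0, 0) when no row qualifies.
func coveredBand(img [][]uint8, threshold float64) (top, bottom int) {
	top = -1
	for y, row := range img {
		if len(row) == 0 {
			continue
		}
		filled := 0
		for _, v := range row {
			if v != 0 {
				filled++
			}
		}
		if float64(filled)/float64(len(row)) >= threshold {
			if top < 0 {
				top = y
			}
			bottom = y + 1
		}
	}
	if top < 0 {
		return 0, 0
	}
	return top, bottom
}

func main() {
	img := [][]uint8{
		{0, 0, 0, 0}, // coverage 0.00
		{9, 9, 9, 0}, // coverage 0.75
		{9, 9, 9, 9}, // coverage 1.00
		{9, 0, 0, 0}, // coverage 0.25
	}
	fmt.Println(coveredBand(img, 0.7)) // prints: 1 3
}
```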

func DefaultConfig

func DefaultConfig() Config

DefaultConfig returns the tuned baseline configuration. Override fields on the returned value to customise the pipeline.

type Direction

type Direction string

Direction is the dominant direction of camera motion across a clip. It is detected during alignment and used to orient frames before stitching.

const (
	Left  Direction = "left"
	Right Direction = "right"
	Up    Direction = "up"
	Down  Direction = "down"
)

func AlignImages

func AlignImages(img1, img2 gocv.Mat, calcDirection bool, cfg Config, lg *Logger) (*gocv.Mat, Direction)

AlignImages aligns img2 to img1 using Shi-Tomasi corner detection + Lucas-Kanade optical flow + RANSAC affine. Returns a 3×3 homogeneous Mat with horizontal-only motion (no rotation/skew, unit scale, Y-damped per config) and the motion direction.

func CalcMotionDirection

func CalcMotionDirection(pts1, pts2 []gocv.Point2f) Direction

CalcMotionDirection estimates the dominant motion direction from two corresponding slices of points.
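
One simple way to estimate a dominant direction from matched points is to compare the summed horizontal and vertical displacement. The sketch below does that with hypothetical point and direction types. It reports the direction in which the tracked features move (image y grows downward), and the use of a sum rather than a median is an assumption; the package may map feature motion to camera motion differently.

```go
package main

import (
	"fmt"
	"math"
)

// point and direction are hypothetical stand-ins for gocv.Point2f and
// mosaic.Direction.
type point struct{ X, Y float64 }
type direction string

// dominantDirection sums the displacement pts2[i]-pts1[i] over all pairs
// and names the axis and sign with the largest magnitude.
func dominantDirection(pts1, pts2 []point) direction {
	n := len(pts1)
	if len(pts2) < n {
		n = len(pts2)
	}
	var dx, dy float64
	for i := 0; i < n; i++ {
		dx += pts2[i].X - pts1[i].X
		dy += pts2[i].Y - pts1[i].Y
	}
	if math.Abs(dx) >= math.Abs(dy) {
		if dx >= 0 {
			return "right"
		}
		return "left"
	}
	if dy >= 0 {
		return "down"
	}
	return "up"
}

func main() {
	before := []point{{0, 0}, {10, 5}}
	after := []point{{4, 1}, {15, 5}}
	fmt.Println(dominantDirection(before, after)) // prints: right
}
```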

func DetectMotionDirection

func DetectMotionDirection(frames []gocv.Mat, cfg Config, lg *Logger) Direction

DetectMotionDirection detects the dominant motion direction in a video.

type Kind

type Kind int

Kind selects how the swept panoramas are turned into an output video.

const (
	// Static renders one panorama swept across column offsets and plays it
	// forward then reversed (ping-pong) for a seamless loop.
	Static Kind = iota

	// Dynamic renders the swept panoramas and plays them forward once — a
	// time-evolving "video brush" mosaic. This is the real dynamic path
	// from the reference implementation (the old code only changed the
	// output filename).
	Dynamic
)
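
The difference between the two kinds comes down to playback order. The sketch below expresses it as a list of panorama indices; playbackOrder is a hypothetical helper, and showing both end frames twice at the turning points is an assumption, since the documentation does not say whether the package drops them.

```go
package main

import "fmt"

// playbackOrder lists the indices of n swept panoramas in playback order.
// With pingPong set (the Static case) the forward pass is followed by the
// same frames reversed; otherwise (the Dynamic case) it plays forward once.
func playbackOrder(n int, pingPong bool) []int {
	if n < 0 {
		n = 0
	}
	order := make([]int, 0, 2*n)
	for i := 0; i < n; i++ {
		order = append(order, i)
	}
	if pingPong {
		for i := n - 1; i >= 0; i-- {
			order = append(order, i)
		}
	}
	return order
}

func main() {
	fmt.Println(playbackOrder(3, true))  // prints: [0 1 2 2 1 0]
	fmt.Println(playbackOrder(3, false)) // prints: [0 1 2]
}
```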

func (Kind) String

func (k Kind) String() string

String implements fmt.Stringer and doubles as the output file basename.

type Logger

type Logger struct {
	// contains filtered or unexported fields
}

Logger is the pipeline's verbosity-gated logging surface. The library never logs on its own: a caller builds a Logger with NewLogger, passing their own *slog.Logger and whether verbose logging is on, then hands it to the Generate* functions.

Every method funnels through enabled(), so a nil *Logger, a nil underlying logger, or verbose=false all make logging a no-op. That makes passing a Logger always safe (including nil) and logging fully opt-in.
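
The nil-safety described above relies on Go allowing method calls on a nil pointer receiver. The sketch below shows the pattern with a hypothetical gatedLogger type; it is an illustration of the idea, not the package's Logger, and newBufferLogger exists only for the demo.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
)

// gatedLogger is a hypothetical verbosity-gated wrapper around *slog.Logger.
type gatedLogger struct {
	inner   *slog.Logger
	verbose bool
}

// enabled is safe on a nil receiver because it checks l before any field.
func (l *gatedLogger) enabled() bool {
	return l != nil && l.inner != nil && l.verbose
}

// Info forwards to the wrapped logger only when logging is enabled.
func (l *gatedLogger) Info(msg string, args ...any) {
	if l.enabled() {
		l.inner.Info(msg, args...)
	}
}

// newBufferLogger builds a logger that writes text records into a buffer.
func newBufferLogger(verbose bool) (*gatedLogger, *bytes.Buffer) {
	buf := &bytes.Buffer{}
	return &gatedLogger{inner: slog.New(slog.NewTextHandler(buf, nil)), verbose: verbose}, buf
}

func main() {
	var none *gatedLogger
	none.Info("dropped") // nil receiver: no panic, no output

	quiet, quietBuf := newBufferLogger(false)
	quiet.Info("dropped too")
	fmt.Println(quietBuf.Len()) // prints: 0

	loud, loudBuf := newBufferLogger(true)
	loud.Info("aligned", "frames", 120)
	fmt.Print(loudBuf.String())
}
```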

func NewLogger

func NewLogger(logger *slog.Logger, verbose bool) *Logger

NewLogger returns a verbosity-gated logger wrapping the caller's *slog.Logger. If logger is nil or verbose is false, every method is a no-op. The returned *Logger is safe to pass and call even when nil.

func (*Logger) Debug

func (l *Logger) Debug(msg string, args ...any)

Debug logs at debug level when verbose logging is enabled, else no-ops.

func (*Logger) Error

func (l *Logger) Error(msg string, args ...any)

Error logs at error level when verbose logging is enabled, else no-ops. (Errors are also returned to the caller; this is diagnostic only.)

func (*Logger) Info

func (l *Logger) Info(msg string, args ...any)

Info logs at info level when verbose logging is enabled, else no-ops.

func (*Logger) Warn

func (l *Logger) Warn(msg string, args ...any)

Warn logs at warn level when verbose logging is enabled, else no-ops.

func (*Logger) With

func (l *Logger) With(args ...any) *Logger

With returns a Logger that adds the given key/value context to every subsequent message, preserving the verbose setting. It is nil-safe.
