ocr

v0.1.0
Published: Jun 24, 2026 License: MIT Imports: 12 Imported by: 0

README

togo-framework/ocr

Part of the togo framework.

Install

togo install togo-framework/ocr

togo · ocr

Image → text for togo apps. A swappable OCR driver, a Go API, and a REST endpoint. Built to sit under the togo AI part.

Pick the engine with OCR_DRIVER:

| Driver | How | Notes |
| --- | --- | --- |
| tesseract (default) | the local tesseract binary (stdin→stdout) | real OCR, offline; install Tesseract on the host. Options.lang (e.g. eng, ara, eng+ara). |
| ai | the togo ai plugin (a multimodal model) | best-effort: passes the image as a data-URL prompt. Needs a vision-capable provider; see the note below. |
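
To make the "stdin→stdout" mode of the tesseract driver concrete, here is a minimal sketch of piping image bytes through the local binary. It is not the plugin's actual driver code: `tesseractArgs` and `runTesseract` are hypothetical helper names, and the only thing taken from the table is that the image goes in on stdin, text comes out on stdout, and the language defaults to `eng`.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// tesseractArgs builds the CLI arguments for reading an image from stdin
// and writing recognized text to stdout. An empty lang falls back to "eng".
func tesseractArgs(lang string) []string {
	if lang == "" {
		lang = "eng"
	}
	return []string{"stdin", "stdout", "-l", lang}
}

// runTesseract pipes image bytes through the local tesseract binary.
// Hypothetical helper: the plugin's real driver may be structured differently.
func runTesseract(ctx context.Context, image []byte, lang string) (string, error) {
	cmd := exec.CommandContext(ctx, "tesseract", tesseractArgs(lang)...)
	cmd.Stdin = bytes.NewReader(image)
	var out, errBuf bytes.Buffer
	cmd.Stdout = &out
	cmd.Stderr = &errBuf
	if err := cmd.Run(); err != nil {
		return "", fmt.Errorf("tesseract: %w: %s", err, strings.TrimSpace(errBuf.String()))
	}
	return strings.TrimSpace(out.String()), nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: ocrdemo <image-file>")
		return
	}
	img, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	text, err := runTesseract(context.Background(), img, "eng")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(text)
}
```

Running this requires Tesseract and the relevant language data on the host, which is the same prerequisite the table lists for the default driver.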

Use it

Go:

import "github.com/togo-framework/ocr"

svc, ok := ocr.FromKernel(k)
if !ok {
	return errors.New("ocr: service not registered on the kernel")
}
text, err := svc.Extract(ctx, imageBytes, ocr.Options{Lang: "eng"})

REST: POST /api/ocr

# multipart
curl -X POST localhost:8080/api/ocr -F image=@scan.png
# or JSON (base64 or data-URL)
curl -X POST localhost:8080/api/ocr -H 'content-type: application/json' \
  -d '{"image":"<base64>","options":{"lang":"eng"}}'
# → {"text":"…"}
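
The JSON variant of the call can also be made from Go. The sketch below builds the same body as the curl example and decodes the `{"text":"…"}` reply. The struct and function names (`ocrRequest`, `buildBody`, `newOCRRequest`, `decodeText`) are illustrative, and the `mime` option is an assumption based on the JSON tags of `Options`; the endpoint's error format is not documented here, so it is not handled.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"net/http"
)

// ocrOptions mirrors the "options" object from the curl example.
type ocrOptions struct {
	Lang string `json:"lang,omitempty"`
	Mime string `json:"mime,omitempty"` // assumed from the Options JSON tags
}

// ocrRequest mirrors the JSON request body.
type ocrRequest struct {
	Image   string     `json:"image"`
	Options ocrOptions `json:"options"`
}

// ocrResponse mirrors the {"text":"…"} reply.
type ocrResponse struct {
	Text string `json:"text"`
}

// buildBody base64-encodes the image and marshals the request body.
func buildBody(image []byte, lang string) ([]byte, error) {
	return json.Marshal(ocrRequest{
		Image:   base64.StdEncoding.EncodeToString(image),
		Options: ocrOptions{Lang: lang},
	})
}

// newOCRRequest prepares a POST to baseURL + "/api/ocr".
func newOCRRequest(baseURL string, image []byte, lang string) (*http.Request, error) {
	body, err := buildBody(image, lang)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/ocr", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// decodeText extracts the "text" field from a response body.
func decodeText(data []byte) (string, error) {
	var out ocrResponse
	if err := json.Unmarshal(data, &out); err != nil {
		return "", err
	}
	return out.Text, nil
}

func main() {
	req, err := newOCRRequest("http://localhost:8080", []byte("hi"), "eng")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(req.Method, req.URL)
	// Against a running togo app, send req with http.DefaultClient.Do(req)
	// and pass the response body bytes to decodeText.
	text, _ := decodeText([]byte(`{"text":"hello"}`))
	fmt.Println(text)
}
```

The request is only constructed, not sent, so the sketch runs without a server.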

Add an engine

Implement ocr.Extractor, then call ocr.RegisterDriver("paddle", factory) from your plugin's init(). Setting OCR_DRIVER=paddle then selects it.
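
A rough sketch of that registration flow follows. So that it compiles without the togo modules, it declares local stand-ins (`Kernel`, `Options`, `Extractor`, `DriverFactory`, `RegisterDriver`) that mirror the documented signatures; in a real plugin these come from the `ocr` and `togo` packages. The `paddleDriver` body is a placeholder and the `resolve` and `extractWith` helpers are hypothetical, since the plugin's internal driver lookup is not shown in this documentation.

```go
package main

import (
	"context"
	"fmt"
	"os"
)

// Local stand-ins mirroring the documented types. In a real plugin use
// *togo.Kernel, ocr.Options, ocr.Extractor and ocr.DriverFactory instead.
type Kernel struct{}

type Options struct {
	Lang string
	Mime string
}

type Extractor interface {
	Extract(ctx context.Context, image []byte, opts Options) (string, error)
}

type DriverFactory func(k *Kernel) (Extractor, error)

// drivers is a stand-in registry; the real one lives inside package ocr.
var drivers = map[string]DriverFactory{}

func RegisterDriver(name string, f DriverFactory) { drivers[name] = f }

// paddleDriver is a hypothetical engine. It returns a placeholder string
// where a real driver would call its OCR backend.
type paddleDriver struct{}

func (paddleDriver) Extract(ctx context.Context, image []byte, opts Options) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err
	}
	if len(image) == 0 {
		return "", fmt.Errorf("paddle: empty image")
	}
	lang := opts.Lang
	if lang == "" {
		lang = "eng"
	}
	return fmt.Sprintf("paddle(%s): %d bytes", lang, len(image)), nil
}

// init registers the driver by name, as the README describes.
func init() {
	RegisterDriver("paddle", func(k *Kernel) (Extractor, error) {
		return paddleDriver{}, nil
	})
}

// resolve looks up a registered factory and builds the Extractor.
func resolve(name string) (Extractor, error) {
	f, ok := drivers[name]
	if !ok {
		return nil, fmt.Errorf("ocr: unknown driver %q", name)
	}
	return f(&Kernel{})
}

// extractWith resolves a driver by name and runs it on the image.
func extractWith(name string, image []byte) (string, error) {
	ex, err := resolve(name)
	if err != nil {
		return "", err
	}
	return ex.Extract(context.Background(), image, Options{})
}

func main() {
	name := os.Getenv("OCR_DRIVER")
	if name == "" {
		name = "paddle" // demo default; the plugin itself defaults to tesseract
	}
	fmt.Println(extractWith(name, []byte{1, 2, 3}))
}
```

Registering from init() means a blank import of the plugin package is enough to make the driver selectable by name.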

AI driver note: the ai plugin's Message.Content is currently plain text, so the image is embedded as a data-URL in the prompt. Robust vision OCR needs a multimodal provider whose driver forwards image data-URLs — or an enhancement to the ai plugin's Message to carry structured image parts. For reliable offline OCR, use tesseract.
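
As an illustration of what "embedded as a data-URL in the prompt" can look like, the sketch below encodes image bytes as a base64 data-URL and appends it to a plain-text prompt. `dataURL` and `buildPrompt` are hypothetical names, and the prompt wording is invented for the example; the ai driver's actual prompt is not documented here. The `image/png` fallback follows the documented default for `Options.Mime`.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// dataURL encodes image bytes as a base64 data-URL. An empty mime falls
// back to "image/png", matching the documented Options.Mime default.
func dataURL(mime string, image []byte) string {
	if mime == "" {
		mime = "image/png"
	}
	return "data:" + mime + ";base64," + base64.StdEncoding.EncodeToString(image)
}

// buildPrompt embeds the data-URL in a plain-text prompt. The wording is
// a made-up example, not the plugin's actual prompt.
func buildPrompt(mime string, image []byte) string {
	return "Transcribe all text in this image. Return only the text.\n" + dataURL(mime, image)
}

func main() {
	fmt.Println(buildPrompt("", []byte("hi")))
}
```

Because the image travels as text, a provider that does not parse data-URLs out of the prompt will see only a long base64 string, which is why the note calls this driver best-effort.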

MIT © togo


Documentation

Overview

Package ocr is togo's OCR (image→text) plugin. It exposes a swappable Extractor driver, a Go API ((*Service).Extract, with the service obtained via ocr.FromKernel), and a REST endpoint POST /api/ocr. Drivers register via ocr.RegisterDriver; pick one with OCR_DRIVER.

  • "tesseract" (default): real OCR via the local `tesseract` binary.
  • "ai": uses the togo `ai` plugin (a multimodal model) — best-effort; see README.

Install: `togo install togo-framework/ocr` (blank-import registers it).

Functions

func RegisterDriver

func RegisterDriver(name string, f DriverFactory)

RegisterDriver registers an OCR engine by name (call from a plugin's init()).

Types

type DriverFactory

type DriverFactory func(k *togo.Kernel) (Extractor, error)

DriverFactory builds an Extractor from the kernel (env-configured).

type Extractor

type Extractor interface {
	Extract(ctx context.Context, image []byte, opts Options) (string, error)
}

Extractor turns image bytes into text.

type Options

type Options struct {
	// Lang is a tesseract language code (e.g. "eng", "ara", "eng+ara"). Default "eng".
	Lang string `json:"lang,omitempty"`
	// Mime hints the image type for the ai driver (e.g. "image/png"). Default png.
	Mime string `json:"mime,omitempty"`
}

Options tune an extraction.

type Service

type Service struct {
	// contains filtered or unexported fields
}

Service is the ocr runtime stored on the kernel (k.Get("ocr")).

func FromKernel

func FromKernel(k *togo.Kernel) (*Service, bool)

FromKernel fetches the ocr service from the kernel container.

func (*Service) Driver

func (s *Service) Driver() string

Driver returns the active engine name.

func (*Service) Extract

func (s *Service) Extract(ctx context.Context, image []byte, opts Options) (string, error)

Extract runs OCR on the image bytes.
