Documentation
Overview
Package imgfeed loads images from files, byte slices, readers, URLs or image.Image values and normalizes them into a small, provider-agnostic Image that is ready to "feed" to a multimodal LLM.
It removes the three menial steps every Go LLM SDK leaves to the caller: reading the bytes, detecting the MIME type, and assembling the "data:<mime>;base64,<...>" URL. On top of that it can optionally downscale an image to a pixel or byte budget (see WithMaxDim and WithMaxBytes) and estimate the number of input tokens it will cost (see Image.EstimateTokens).
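For comparison, the three manual steps can be sketched with the standard library alone. This is an illustration of the work being replaced, not imgfeed's implementation; the helper names dataURL and fileDataURL are hypothetical.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net/http"
	"os"
)

// dataURL sniffs the MIME type from the leading bytes and assembles
// a "data:<mime>;base64,<...>" URL. Hypothetical helper, not part of imgfeed.
func dataURL(b []byte) string {
	mimeType := http.DetectContentType(b) // inspects at most the first 512 bytes
	return "data:" + mimeType + ";base64," + base64.StdEncoding.EncodeToString(b)
}

// fileDataURL adds the first step: reading the bytes from disk.
func fileDataURL(path string) (string, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return "", err
	}
	return dataURL(b), nil
}

func main() {
	// A bare GIF header stands in for real file contents.
	fmt.Println(dataURL([]byte("GIF89a"))) // data:image/gif;base64,R0lGODlh

	if _, err := fileDataURL("does-not-exist.png"); err != nil {
		fmt.Println("read failed:", err)
	}
}
```

Every caller of an LLM SDK ends up writing some variant of this, usually without the resizing or token estimation that the options below add.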
The core package has no LLM SDK dependencies. To turn an Image into a content part for a specific SDK, import one of the adapter subpackages:
- github.com/ultramcu/go-imgfeed/sashadapter (sashabaranov/go-openai)
- github.com/ultramcu/go-imgfeed/openaidapter (openai/openai-go)
- github.com/ultramcu/go-imgfeed/lcadapter (tmc/langchaingo)
- github.com/ultramcu/go-imgfeed/anthropicadapter (anthropics/anthropic-sdk-go)
- github.com/ultramcu/go-imgfeed/genaidapter (google.golang.org/genai, Gemini)
Each adapter imports only its own SDK, so importing the core (or one adapter) never pulls in the others.
Basic usage:
	img, err := imgfeed.FromFile("photo.png",
		imgfeed.WithMaxDim(1024),
		imgfeed.WithDetail(imgfeed.High))
	if err != nil {
		// handle error
	}
	url := img.DataURL()                  // ready for any image_url field
	cost := img.EstimateTokens("gpt-4o") // approximate input tokens
Index
- Variables
- type Detail
- type Format
- type Image
- func FromBytes(b []byte, opts ...Option) (*Image, error)
- func FromFile(path string, opts ...Option) (*Image, error)
- func FromImage(img image.Image, opts ...Option) (*Image, error)
- func FromReader(r io.Reader, opts ...Option) (*Image, error)
- func FromURL(ctx context.Context, rawURL string, opts ...Option) (*Image, error)
- type Option
Variables
	var (
		// ErrEmpty is returned when the source contains no data.
		ErrEmpty = errors.New("imgfeed: empty image data")
		// ErrNotImage is returned when the data is not a recognized image type.
		ErrNotImage = errors.New("imgfeed: data is not a recognized image type")
	)
Errors returned by the loaders.
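Sentinel errors like these are usually checked with errors.Is, which also matches when the error has been wrapped. The sketch below redeclares stand-in sentinels so that it compiles on its own; the load function and its toy check are hypothetical, and whether imgfeed wraps its errors is not stated here.

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-ins for imgfeed.ErrEmpty and imgfeed.ErrNotImage.
var (
	errEmpty    = errors.New("imgfeed: empty image data")
	errNotImage = errors.New("imgfeed: data is not a recognized image type")
)

// load is a hypothetical loader that wraps a sentinel with extra context.
func load(b []byte) error {
	if len(b) == 0 {
		return fmt.Errorf("load: %w", errEmpty)
	}
	if b[0] != 0x89 { // toy check for the first PNG byte, not real detection
		return fmt.Errorf("load: %w", errNotImage)
	}
	return nil
}

// classify maps an error to a short label using errors.Is.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, errEmpty):
		return "empty"
	case errors.Is(err, errNotImage):
		return "not an image"
	default:
		return "other"
	}
}

func main() {
	fmt.Println(classify(load(nil)))
	fmt.Println(classify(load([]byte("plain text"))))
	fmt.Println(classify(load([]byte{0x89, 'P', 'N', 'G'})))
}
```

With the real package the same switch would compare against imgfeed.ErrEmpty and imgfeed.ErrNotImage.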
Types
type Detail
type Detail string
Detail mirrors the OpenAI image "detail" hint, which controls how much detail the model extracts from an image and therefore how many tokens it costs. It is carried on the resulting Image, applied by the SDK adapters that support it, and used by Image.EstimateTokens.
type Format
type Format string
Format is an output image encoding used when an image must be re-encoded (because of resizing, a byte budget, or an explicit conversion) or when it is built from an image.Image. Only the lossless PNG and lossy JPEG encoders are supported.
type Image
	type Image struct {
		// Data holds the encoded image bytes (possibly re-encoded by resizing).
		Data []byte
		// MIME is the image media type, e.g. "image/png".
		MIME string
		// Width and Height are the pixel dimensions, or 0 if they could not be
		// determined.
		Width, Height int
		// Detail is the resolved detail hint (see [WithDetail]).
		Detail Detail
	}
Image is a normalized, ready-to-send image: the encoded bytes plus the detected MIME type, decoded dimensions (0 if unknown) and the chosen detail hint. Use Image.DataURL for a value to drop into any image_url field, or one of the adapter subpackages to build an SDK-specific content part.
func FromFile
FromFile loads an image from a file on disk. The file name is used as a fallback for MIME detection when the bytes themselves are ambiguous.
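One way such a fallback could work is to sniff the bytes first and consult the file extension only when sniffing does not produce an image type. The sketch below uses only the standard library; detectMIME is a hypothetical helper and the exact rules imgfeed applies may differ.

```go
package main

import (
	"fmt"
	"mime"
	"net/http"
	"path/filepath"
	"strings"
)

// detectMIME prefers content sniffing and falls back to the file extension
// when the bytes are not recognized as an image. Hypothetical helper.
func detectMIME(b []byte, name string) string {
	sniffed := http.DetectContentType(b)
	if strings.HasPrefix(sniffed, "image/") {
		return sniffed
	}
	byExt := mime.TypeByExtension(filepath.Ext(name))
	if strings.HasPrefix(byExt, "image/") {
		return byExt
	}
	return sniffed // neither source identified an image
}

func main() {
	// SVG is text, so byte sniffing alone does not report an image type.
	svg := []byte(`<svg xmlns="http://www.w3.org/2000/svg"/>`)
	fmt.Println(detectMIME(svg, "logo.svg"))

	// A GIF header is recognized regardless of the misleading file name.
	fmt.Println(detectMIME([]byte("GIF89a"), "upload.bin"))
}
```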
Example
Load an image from disk, downscale it so that neither side exceeds 1024 pixels, and inspect the result. The DataURL is ready to drop into any provider's image_url field; the adapter subpackages turn the Image into an SDK content part.
	img, err := imgfeed.FromFile("photo.png",
		imgfeed.WithMaxDim(1024),
		imgfeed.WithDetail(imgfeed.High))
	if err != nil {
		// handle error
		return
	}
	_ = img.DataURL() // "data:image/png;base64,..." for any image_url field
	fmt.Println(img.MIME, img.EstimateTokens("gpt-4o"))
func FromImage
FromImage encodes an in-memory image.Image. The output format defaults to PNG and can be set with WithFormat. Resizing and byte-budget options apply as usual.
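The default PNG path corresponds roughly to encoding the image.Image into a buffer with the standard image/png package. The sketch below shows that step in isolation; encodePNG, isPNG and blank are hypothetical helpers, and imgfeed's resizing and byte-budget handling are left out.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/png"
)

// pngSignature is the fixed 8-byte header that starts every PNG stream.
var pngSignature = []byte("\x89PNG\r\n\x1a\n")

// encodePNG serializes an in-memory image to PNG bytes.
func encodePNG(img image.Image) ([]byte, error) {
	var buf bytes.Buffer
	if err := png.Encode(&buf, img); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// isPNG reports whether b starts with the PNG signature.
func isPNG(b []byte) bool {
	return bytes.HasPrefix(b, pngSignature)
}

// blank returns a fully transparent RGBA image of the given size.
func blank(w, h int) image.Image {
	return image.NewRGBA(image.Rect(0, 0, w, h))
}

func main() {
	b, err := encodePNG(blank(4, 4))
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(len(b) > 0, isPNG(b))
}
```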
func FromReader
FromReader loads an image by reading r to completion.
func FromURL
FromURL fetches an image over HTTP(S) and loads it. The request honors ctx and the client set by WithHTTPClient (default http.DefaultClient).
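A context-aware fetch with an overridable client typically looks like the sketch below. It illustrates the pattern with net/http only; fetch and fetchAfterCancel are hypothetical helpers, and the status-code check is an assumption rather than documented imgfeed behavior.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
)

// fetch downloads rawURL with the given client, falling back to
// http.DefaultClient when client is nil. The request is bound to ctx.
func fetch(ctx context.Context, client *http.Client, rawURL string) ([]byte, error) {
	if client == nil {
		client = http.DefaultClient
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, rawURL, nil)
	if err != nil {
		return nil, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK { // assumption: non-200 is treated as an error
		return nil, fmt.Errorf("fetch %s: unexpected status %s", rawURL, resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// fetchAfterCancel shows that a context cancelled up front aborts the request.
func fetchAfterCancel(rawURL string) error {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	_, err := fetch(ctx, nil, rawURL)
	return err
}

func main() {
	err := fetchAfterCancel("https://example.com/cat.jpg")
	fmt.Println("request aborted:", err != nil)
}
```

In practice the caller would pass a context created with context.WithTimeout so that a slow image host cannot stall the request indefinitely.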
Example
Fetch a remote image and estimate what it will cost as input.
	img, err := imgfeed.FromURL(context.Background(),
		"https://example.com/cat.jpg",
		imgfeed.WithMaxDim(768))
	if err != nil {
		return
	}
	fmt.Printf("%dx%d ~%d tokens\n", img.Width, img.Height, img.EstimateTokens("gpt-4o"))
func (*Image) DataURL
DataURL returns the image as an RFC 2397 data URL, "data:<mime>;base64,<...>", suitable for any image_url field.
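As an illustration of where such a value ends up, the sketch below marshals a data URL into the JSON shape that OpenAI-style Chat Completions requests use for an image content part. The struct and function names are hypothetical and exist only for this example; the adapter subpackages build the equivalent SDK types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// imageURL and contentPart mirror the JSON layout of an "image_url"
// content part. Hypothetical types, not part of imgfeed.
type imageURL struct {
	URL    string `json:"url"`
	Detail string `json:"detail,omitempty"`
}

type contentPart struct {
	Type     string   `json:"type"`
	ImageURL imageURL `json:"image_url"`
}

// imagePart wraps a data URL and a detail hint in a content part.
func imagePart(dataURL, detail string) ([]byte, error) {
	return json.Marshal(contentPart{
		Type:     "image_url",
		ImageURL: imageURL{URL: dataURL, Detail: detail},
	})
}

func main() {
	b, err := imagePart("data:image/gif;base64,R0lGODlh", "high")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(b))
}
```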
func (*Image) EstimateTokens
EstimateTokens returns an approximate number of input tokens the image will cost for the given model, using OpenAI's tile-based image formula:
- Low detail costs a flat base amount.
- High/Auto detail first scales the image to fit within a 2048x2048 box, then scales it so that its shortest side is 768px, and charges the base amount plus a per-tile amount for each 512px tile.
It is an estimate; actual usage may differ slightly and varies by model. Unknown models fall back to the gpt-4o cost, and models whose name contains "mini" or "nano" use the scaled-up tier. If the image dimensions are unknown, the base cost is returned.
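The tile arithmetic described above can be sketched as follows. The constants 85 and 170 are OpenAI's published gpt-4o figures and are used here as assumptions, since this page does not list the values imgfeed applies per model. The sketch also assumes images are only ever scaled down, never up.

```go
package main

import (
	"fmt"
	"math"
)

const (
	baseTokens    = 85  // assumed flat base cost
	perTileTokens = 170 // assumed cost per 512px tile
)

// estimateTokens applies the two scaling steps and counts 512px tiles.
// Hypothetical helper; it does not model per-model tiers.
func estimateTokens(w, h int, lowDetail bool) int {
	if lowDetail || w <= 0 || h <= 0 {
		return baseTokens // low detail or unknown dimensions: base cost only
	}
	fw, fh := float64(w), float64(h)

	// Step 1: fit within a 2048x2048 box.
	if longest := math.Max(fw, fh); longest > 2048 {
		s := 2048 / longest
		fw, fh = fw*s, fh*s
	}
	// Step 2: bring the shortest side down to 768px (assumption: no upscaling).
	if shortest := math.Min(fw, fh); shortest > 768 {
		s := 768 / shortest
		fw, fh = fw*s, fh*s
	}
	tiles := int(math.Ceil(fw/512)) * int(math.Ceil(fh/512))
	return baseTokens + perTileTokens*tiles
}

func main() {
	fmt.Println(estimateTokens(1024, 1024, false)) // 768x768 -> 4 tiles -> 765
	fmt.Println(estimateTokens(2048, 4096, false)) // 768x1536 -> 6 tiles -> 1105
	fmt.Println(estimateTokens(2048, 4096, true))  // low detail -> 85
}
```

The 1024x1024 case works out as 85 + 170 * 4 = 765, and the 2048x4096 case as 85 + 170 * 6 = 1105.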
type Option
type Option func(*config)
Option customizes how an image is loaded and normalized.
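This is the functional options pattern: each option is a closure that mutates an unexported configuration struct. The sketch below shows the mechanics in isolation. The fields of config are hypothetical, because the real struct is unexported and not documented here, and withMaxDim and withMaxBytes are lower-cased stand-ins for the real options.

```go
package main

import "fmt"

// config is a hypothetical stand-in for imgfeed's unexported config.
type config struct {
	maxDim   int // 0 means no resize limit
	maxBytes int // 0 means no byte budget
}

// Option mutates a config, matching the signature shown above.
type Option func(*config)

func withMaxDim(px int) Option  { return func(c *config) { c.maxDim = px } }
func withMaxBytes(n int) Option { return func(c *config) { c.maxBytes = n } }

// newConfig starts from zero-value defaults and applies options in order,
// so a later option overrides an earlier one that sets the same field.
func newConfig(opts ...Option) config {
	var c config
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

func main() {
	c := newConfig(withMaxDim(1024), withMaxBytes(200_000), withMaxDim(768))
	fmt.Println(c.maxDim, c.maxBytes) // 768 200000
}
```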
func WithDetail
WithDetail sets the OpenAI detail hint (Auto, Low or High). The default is Auto. The value is stored on the Image, forwarded by the adapters that support it, and used by Image.EstimateTokens.
func WithFormat
WithFormat forces the output encoding (PNG or JPEG); the image is always re-encoded to this format. When unset, original bytes are preserved unless a resize or byte budget forces a re-encode, in which case the source format is kept where possible (otherwise PNG).
func WithHTTPClient
WithHTTPClient sets the HTTP client used by FromURL. It defaults to http.DefaultClient.
func WithJPEGQuality
WithJPEGQuality sets the JPEG quality (1-100) used when encoding to JPEG. The default is 85. Out-of-range values are reset to 85.
func WithMIME
WithMIME overrides MIME detection, e.g. when the bytes carry a format whose signature is not auto-detected.
func WithMaxBytes
WithMaxBytes ensures the encoded image stays at or below n bytes by progressively lowering JPEG quality and/or downscaling. It is best effort: if the floor is reached the smallest attempt is returned. A value <= 0 (the default) disables the limit.
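A best-effort byte budget of this kind can be sketched as a loop that re-encodes at decreasing JPEG quality and keeps the smallest attempt. The starting quality of 85, the step of 15 and the floor of 25 are assumptions chosen for illustration, and the downscaling half of the strategy is omitted; encodeWithinBudget and gradient are hypothetical helpers.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	"image/jpeg"
)

// encodeWithinBudget lowers JPEG quality step by step until the output fits
// in maxBytes. If no attempt fits, it returns the smallest attempt and false.
// A maxBytes <= 0 disables the limit, so the first attempt is returned.
func encodeWithinBudget(img image.Image, maxBytes int) ([]byte, bool, error) {
	var smallest []byte
	for q := 85; q >= 25; q -= 15 {
		var buf bytes.Buffer
		if err := jpeg.Encode(&buf, img, &jpeg.Options{Quality: q}); err != nil {
			return nil, false, err
		}
		if maxBytes <= 0 || buf.Len() <= maxBytes {
			return buf.Bytes(), true, nil
		}
		if smallest == nil || buf.Len() < len(smallest) {
			smallest = buf.Bytes()
		}
	}
	return smallest, false, nil
}

// gradient builds a simple test image.
func gradient(w, h int) image.Image {
	img := image.NewRGBA(image.Rect(0, 0, w, h))
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			img.Set(x, y, color.RGBA{R: uint8(x * 255 / w), G: uint8(y * 255 / h), B: 128, A: 255})
		}
	}
	return img
}

func main() {
	img := gradient(64, 64)
	b, ok, err := encodeWithinBudget(img, 1<<20)
	fmt.Println(len(b) <= 1<<20, ok, err)

	// An impossible budget still yields the smallest attempt.
	b, ok, err = encodeWithinBudget(img, 1)
	fmt.Println(len(b) > 1, ok, err)
}
```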
func WithMaxDim
WithMaxDim downscales the image so that neither side exceeds px pixels, preserving the aspect ratio. Images already within the bound are left untouched. A value <= 0 (the default) disables resizing.
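The target dimensions for such a bound follow from simple proportional arithmetic, sketched below. The fitWithin helper is hypothetical, and rounding to the nearest pixel is an assumption; the library may round differently.

```go
package main

import "fmt"

// fitWithin returns dimensions scaled down so that neither side exceeds
// maxDim, preserving the aspect ratio. Images already within the bound,
// or a maxDim <= 0, are returned unchanged.
func fitWithin(w, h, maxDim int) (int, int) {
	if maxDim <= 0 || (w <= maxDim && h <= maxDim) {
		return w, h
	}
	if w >= h {
		nh := (h*maxDim + w/2) / w // round to nearest pixel
		if nh < 1 {
			nh = 1
		}
		return maxDim, nh
	}
	nw := (w*maxDim + h/2) / h
	if nw < 1 {
		nw = 1
	}
	return nw, maxDim
}

func main() {
	fmt.Println(fitWithin(4000, 3000, 1024)) // 1024 768
	fmt.Println(fitWithin(3000, 4000, 1024)) // 768 1024
	fmt.Println(fitWithin(800, 600, 1024))   // 800 600
}
```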
Directories
| Path | Synopsis |
|---|---|
| anthropicadapter | Package anthropicadapter converts an imgfeed.Image into content blocks for the official SDK github.com/anthropics/anthropic-sdk-go. |
| genaidapter | Package genaidapter turns an imgfeed.Image into a content part for the Google Gen AI SDK (google.golang.org/genai), used by Gemini models. |
| lcadapter | Package lcadapter converts an imgfeed.Image into content parts for the LLM framework github.com/tmc/langchaingo. |
| openaidapter | Package openaidapter adapts an imgfeed.Image into content parts for the official OpenAI Go SDK (github.com/openai/openai-go/v3), targeting the Chat Completions multimodal message format. |
| sashadapter | Package sashadapter converts an imgfeed.Image into content parts for the community SDK github.com/sashabaranov/go-openai. |