go-embeddings

This project provides an implementation of API clients for fetching embeddings from various LLM providers.
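Currently supported APIs, as covered in the Environment variables section: OpenAI, Cohere, Google Vertex AI, Voyage AI, and AWS Bedrock.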
You can find sample programs that demonstrate how to use the client packages to fetch embeddings in the cmd directory of this project.
Finally, the document package provides simple document text splitters, heavily inspired by the popular LangChain framework.
It's essentially a Go rewrite of LangChain's character and recursive character text splitters, with minor modifications but more or less identical results.
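To illustrate the general idea behind recursive character splitting, here is a minimal sketch. It is not the document package's implementation or API: the function name splitRecursive, its parameters, and the separator list are all hypothetical, and chunk sizes are counted in bytes for simplicity. The text is split on the coarsest separator first, pieces are merged greedily up to the chunk size, and any piece that is still too long is split again with the next separator.

```go
package main

import (
	"fmt"
	"strings"
)

// splitRecursive is a hypothetical helper (not part of go-embeddings).
// It splits text on seps[0], greedily merges neighbouring pieces into
// chunks of at most chunkSize bytes, and recurses with the remaining
// separators on any single piece that is still too long.
func splitRecursive(text string, seps []string, chunkSize int) []string {
	if text == "" {
		return nil
	}
	if len(text) <= chunkSize {
		return []string{text}
	}
	if len(seps) == 0 {
		// No separators left: fall back to hard cuts.
		// Assumption: byte-based cuts, so multi-byte runes may be split.
		var out []string
		for len(text) > chunkSize {
			out = append(out, text[:chunkSize])
			text = text[chunkSize:]
		}
		if text != "" {
			out = append(out, text)
		}
		return out
	}

	sep := seps[0]
	var chunks []string
	cur := ""
	for _, piece := range strings.Split(text, sep) {
		if piece == "" {
			continue
		}
		if len(piece) > chunkSize {
			// Flush what we have, then split the oversized piece
			// with the finer-grained separators.
			if cur != "" {
				chunks = append(chunks, cur)
				cur = ""
			}
			chunks = append(chunks, splitRecursive(piece, seps[1:], chunkSize)...)
			continue
		}
		switch {
		case cur == "":
			cur = piece
		case len(cur)+len(sep)+len(piece) <= chunkSize:
			cur += sep + piece
		default:
			chunks = append(chunks, cur)
			cur = piece
		}
	}
	if cur != "" {
		chunks = append(chunks, cur)
	}
	return chunks
}

func main() {
	text := "First paragraph.\n\nSecond paragraph that is a little longer.\nWith a second line."
	// Separators ordered from coarsest to finest.
	for i, c := range splitRecursive(text, []string{"\n\n", "\n", " "}, 30) {
		fmt.Printf("%d: %q\n", i, c)
	}
}
```

This sketch omits chunk overlap, which splitters of this kind typically support, to keep the control flow easy to follow.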
Environment variables
[!NOTE]
Each client package lets you initialize a default API client for a specific embeddings provider by reading the API keys from environment variables.
Here's a list of the env vars for each supported client:
OpenAI
OPENAI_API_KEY: OpenAI API token
Cohere
COHERE_API_KEY: Cohere API token
Google Vertex AI
VERTEXAI_TOKEN: Google Vertex AI API token (can be fetched with gcloud auth print-access-token once you've authenticated)
VERTEXAI_MODEL_ID: Embeddings model (at the moment only textembedding-gecko@00 or multimodalembedding@001 are available)
GOOGLE_PROJECT_ID: Google Project ID
Voyage
VOYAGE_API_KEY: Voyage AI API key
AWS Bedrock
[!IMPORTANT]
You must enable access to Bedrock embedding models.
See here: https://docs.aws.amazon.com/bedrock/latest/userguide/model-access.html#add-model-access
Usual AWS env vars as read by the AWS SDKs, i.e. AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, etc.
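Before constructing a client, it can be handy to check that the variables listed in this section are actually set. The following is a minimal sketch using only the standard library. The provider labels, the requiredEnv table, and the missingEnv helper are hypothetical names chosen for illustration and are not part of the client packages. AWS Bedrock is left out on purpose, since the AWS SDKs can resolve credentials from several sources besides environment variables.

```go
package main

import (
	"fmt"
	"os"
)

// requiredEnv lists the environment variables documented above.
// The provider labels used as keys are hypothetical, not package names.
var requiredEnv = map[string][]string{
	"openai":   {"OPENAI_API_KEY"},
	"cohere":   {"COHERE_API_KEY"},
	"vertexai": {"VERTEXAI_TOKEN", "VERTEXAI_MODEL_ID", "GOOGLE_PROJECT_ID"},
	"voyage":   {"VOYAGE_API_KEY"},
}

// missingEnv returns the documented variables for provider that are
// unset or empty according to getenv. Passing the lookup function in
// (normally os.Getenv) keeps the helper easy to test.
func missingEnv(provider string, getenv func(string) string) []string {
	var missing []string
	for _, name := range requiredEnv[provider] {
		if getenv(name) == "" {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	for _, provider := range []string{"openai", "cohere", "vertexai", "voyage"} {
		if missing := missingEnv(provider, os.Getenv); len(missing) > 0 {
			fmt.Printf("%s: missing %v\n", provider, missing)
			continue
		}
		fmt.Printf("%s: all variables set\n", provider)
	}
}
```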
nix
The project provides a simple nix flake that leverages gomod2nix for consistent Go environments and builds.
To get started, just run
nix develop
and you'll be dropped into a development shell.
In addition, each command is exposed as a nix app, so you can run them as follows:
nix run ".#vertexai" -- -help
NOTE: gomod2nix vendors dependencies into the nix store, so every time you add a new dependency you must run gomod2nix generate, which updates gomod2nix.toml.
Contributions
Yes please!