openairt

package module

Version: v0.5.0 (latest) | Published: Feb 8, 2025 | License: MIT | Imports: 13 | Imported by: 6

Repository

github.com/WqyJh/go-openai-realtime

README ¶

OpenAI Realtime API SDK for Golang

This library provides unofficial Go clients for OpenAI Realtime API. We support all 9 client events and 28 server events.

Model Support:

gpt-4o-realtime-preview
gpt-4o-realtime-preview-2024-10-01

Installation

go get github.com/WqyJh/go-openai-realtime

Currently, go-openai-realtime requires Go version 1.19 or greater.

Usage

Connect to the OpenAI Realtime API. The default websocket library is coder/websocket.

package main

import (
	"context"
	"log"

	openairt "github.com/WqyJh/go-openai-realtime"
)

func main() {
	client := openairt.NewClient("your-api-key")
	conn, err := client.Connect(context.Background())
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
}

Use another WebSocket dialer

To use a different WebSocket library, such as gorilla/websocket, pass its dialer to Connect.

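package main

import (
	"context"
	"log"

	openairt "github.com/WqyJh/go-openai-realtime"
	gorilla "github.com/WqyJh/go-openai-realtime/contrib/ws-gorilla"
)

func main() {
	ctx := context.Background()
	client := openairt.NewClient("your-api-key")

	// Pass the gorilla-based dialer instead of the default one.
	dialer := gorilla.NewWebSocketDialer(gorilla.WebSocketOptions{})
	conn, err := client.Connect(ctx, openairt.WithDialer(dialer))
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
}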

Send message


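import (
	"context"
	"log"

	openairt "github.com/WqyJh/go-openai-realtime"
)

// updateSession (an illustrative helper) switches an established
// connection to text-only responses.
func updateSession(ctx context.Context, conn *openairt.Conn) {
	err := conn.SendMessage(ctx, &openairt.SessionUpdateEvent{
		Session: openairt.ClientSession{
			Modalities: []openairt.Modality{openairt.ModalityText},
		},
	})
	if err != nil {
		log.Printf("send session.update: %v", err)
	}
}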

Read message

ReadMessage blocks until the next message arrives on the connection, so call it from a dedicated goroutine. If the returned error is a *PermanentError, later reads on the same connection will not succeed: the connection is broken and should be closed, or has already been closed.

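	for {
		msg, err := conn.ReadMessage(ctx)
		if err != nil {
			var permanent *openairt.PermanentError
			if errors.As(err, &permanent) {
				// The connection is unusable; stop reading.
				return permanent.Err
			}
			// Temporary error: log it and keep reading.
			log.Printf("read message temporary error: %+v", err)
			continue
		}
		// handle message
		_ = msg // replace with your own handling
	}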
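The permanent-versus-temporary decision in the loop above relies on errors.As. The following is a minimal, standard-library-only sketch of that check. PermanentError and Permanent here are local stand-ins written for illustration (they mirror the documented Error and Unwrap methods), and errClosed, errMalformed, and shouldStop are hypothetical names, not part of the package.

```go
package main

import (
	"errors"
	"fmt"
)

// PermanentError is a local stand-in for the package's PermanentError type.
type PermanentError struct{ Err error }

func (e *PermanentError) Error() string { return e.Err.Error() }
func (e *PermanentError) Unwrap() error { return e.Err }

// Permanent wraps err so that callers can detect it with errors.As.
func Permanent(err error) error { return &PermanentError{Err: err} }

// Hypothetical errors used only for this demonstration.
var (
	errClosed    = errors.New("connection closed")
	errMalformed = errors.New("malformed payload")
)

// shouldStop reports whether a read loop should give up after err.
func shouldStop(err error) bool {
	var permanent *PermanentError
	return errors.As(err, &permanent)
}

func main() {
	wrapped := fmt.Errorf("read message: %w", Permanent(errClosed))
	fmt.Println(shouldStop(wrapped))      // permanent, even through extra wrapping
	fmt.Println(shouldStop(errMalformed)) // temporary: keep reading
}
```

Because errors.As walks the Unwrap chain, the check still works when the permanent error has been wrapped with additional context.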

Subscribe to events

ConnHandler is a helper that reads messages from the server in a standalone goroutine and calls the registered handlers.

Call openairt.NewConnHandler to create a ConnHandler, then call Start to start a new goroutine to read messages. You can specify only one handler to handle all events or specify multiple handlers. It's recommended to specify multiple handlers for different purposes. The registered handlers will be called in the order of registration.

	connHandler := openairt.NewConnHandler(ctx, conn, handler1, handler2, ...)
	connHandler.Start()

A handler is a function that handles a ServerEvent. Use event.ServerEventType() to determine the type of the event, then use a type assertion to access the event data.

	// Teletype response
	responseDeltaHandler := func(ctx context.Context, event openairt.ServerEvent) {
		switch event.ServerEventType() {
		case openairt.ServerEventTypeResponseTextDelta:
			// Print, not Printf: the delta is data, not a format string.
			fmt.Print(event.(openairt.ResponseTextDeltaEvent).Delta)
		}
	}
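To illustrate the "called in the order of registration" behaviour, here is a small standard-library-only sketch of a dispatch loop. ServerEvent and ServerEventHandler are simplified local stand-ins, and textDelta, dispatch, and collect are hypothetical names; the real ConnHandler may be implemented differently.

```go
package main

import (
	"context"
	"fmt"
)

// ServerEvent and ServerEventHandler are simplified local stand-ins for the
// package's types, kept only to show the dispatch order.
type ServerEvent interface{ ServerEventType() string }

type ServerEventHandler func(ctx context.Context, event ServerEvent)

// textDelta is a hypothetical event carrying a piece of response text.
type textDelta struct{ Delta string }

func (textDelta) ServerEventType() string { return "response.text.delta" }

// dispatch calls every handler with the event, in registration order.
func dispatch(ctx context.Context, event ServerEvent, handlers ...ServerEventHandler) {
	for _, h := range handlers {
		h(ctx, event)
	}
}

// collect runs two handlers over one event and returns what they recorded.
func collect(event ServerEvent) []string {
	var calls []string
	logger := func(_ context.Context, e ServerEvent) {
		calls = append(calls, "log:"+e.ServerEventType())
	}
	printer := func(_ context.Context, e ServerEvent) {
		if d, ok := e.(textDelta); ok { // comma-ok assertion avoids a panic
			calls = append(calls, "text:"+d.Delta)
		}
	}
	dispatch(context.Background(), event, logger, printer)
	return calls
}

func main() {
	fmt.Println(collect(textDelta{Delta: "Hello"}))
}
```

The comma-ok form of the type assertion is a defensive choice: a handler that receives an unexpected concrete type simply skips it instead of panicking.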

There's no need to stop the ConnHandler; it exits when the connection is closed. To wait for it to exit, receive from the channel returned by Err().

Note that receiving from the error channel blocks until the ConnHandler exits, so call conn.Close before the receive. Calling conn.Close only after the receive causes a deadlock.

	conn.Close()
	err = <-connHandler.Err()
	if err != nil {
		log.Printf("connection error: %v", err)
	}
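The ordering requirement (close first, then wait) can be reproduced without the library. In this sketch, fakeConn, startReader, and shutdown are hypothetical stand-ins: the goroutine only finishes once the connection is closed, just as a read loop would.

```go
package main

import "fmt"

// fakeConn is a hypothetical connection whose reads end once it is closed.
type fakeConn struct{ closed chan struct{} }

func (c *fakeConn) Close() { close(c.closed) }

// startReader mimics a handler goroutine: it exits only when the connection
// is closed, then reports on the returned channel.
func startReader(c *fakeConn) <-chan error {
	errCh := make(chan error, 1)
	go func() {
		<-c.closed // stands in for a blocking read loop
		errCh <- nil
	}()
	return errCh
}

// shutdown closes first, then waits. Swapping the last two statements would
// block forever, because nothing else closes the connection.
func shutdown() error {
	c := &fakeConn{closed: make(chan struct{})}
	errCh := startReader(c)
	c.Close()
	return <-errCh
}

func main() {
	fmt.Println("reader exited, err =", shutdown())
}
```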

WebSocket Adapter

The default WebSocket adapter is coder/websocket. There's also a gorilla/websocket adapter. You can easily implement your own adapter by implementing the WebSocketConn interface and WebSocketDialer interface.

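Supported adapters:

coder/websocket (default)
gorilla/websocket (github.com/WqyJh/go-openai-realtime/contrib/ws-gorilla)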

Documentation ¶

Index ¶

Constants
Variables
func GenerateID(prefix string, length int) string
func HTTPDo[Q any, R any](ctx context.Context, url string, req *Q, opts ...HTTPOption) (*R, error)
func MarshalClientEvent(event ClientEvent) ([]byte, error)
func Permanent(err error) error
type APIType
type AudioFormat
type CachedTokensDetails
type Client
- func NewClient(authToken string) *Client
- func NewClientWithConfig(config ClientConfig) *Client
- func (c *Client) Connect(ctx context.Context, opts ...ConnectOption) (*Conn, error)
- func (c *Client) CreateSession(ctx context.Context, req *CreateSessionRequest) (*CreateSessionResponse, error)
type ClientConfig
- func DefaultAzureConfig(apiKey, baseURL string) ClientConfig
- func DefaultConfig(authToken string) ClientConfig
- func (c ClientConfig) String() string
type ClientEvent
type ClientEventType
type ClientSecret
type ClientSession
type ClientTurnDetection
type ClientTurnDetectionType
type CoderWebSocketConn
- func (c *CoderWebSocketConn) Close() error
- func (c *CoderWebSocketConn) Ping(ctx context.Context) error
- func (c *CoderWebSocketConn) ReadMessage(ctx context.Context) (MessageType, []byte, error)
- func (c *CoderWebSocketConn) Response() *http.Response
- func (c *CoderWebSocketConn) WriteMessage(ctx context.Context, messageType MessageType, data []byte) error
type CoderWebSocketDialer
- func NewCoderWebSocketDialer(options CoderWebSocketOptions) *CoderWebSocketDialer
- func (d *CoderWebSocketDialer) Dial(ctx context.Context, url string, header http.Header) (WebSocketConn, error)
type CoderWebSocketOptions
type Conn
- func (c *Conn) Close() error
- func (c *Conn) Ping(ctx context.Context) error
- func (c *Conn) ReadMessage(ctx context.Context) (ServerEvent, error)
- func (c *Conn) ReadMessageRaw(ctx context.Context) ([]byte, error)
- func (c *Conn) SendMessage(ctx context.Context, msg ClientEvent) error
- func (c *Conn) SendMessageRaw(ctx context.Context, data []byte) error
type ConnHandler
- func NewConnHandler(ctx context.Context, conn *Conn, handlers ...ServerEventHandler) *ConnHandler
- func (c *ConnHandler) Err() <-chan error
- func (c *ConnHandler) Start()
type ConnectOption
- func WithDialer(dialer WebSocketDialer) ConnectOption
- func WithLogger(logger Logger) ConnectOption
- func WithModel(model string) ConnectOption
type Conversation
type ConversationCreatedEvent
type ConversationItemCreateEvent
- func (m ConversationItemCreateEvent) ClientEventType() ClientEventType
- func (m ConversationItemCreateEvent) MarshalJSON() ([]byte, error)
type ConversationItemCreatedEvent
type ConversationItemDeleteEvent
- func (m ConversationItemDeleteEvent) ClientEventType() ClientEventType
- func (m ConversationItemDeleteEvent) MarshalJSON() ([]byte, error)
type ConversationItemDeletedEvent
type ConversationItemInputAudioTranscriptionCompletedEvent
type ConversationItemInputAudioTranscriptionFailedEvent
type ConversationItemTruncateEvent
- func (m ConversationItemTruncateEvent) ClientEventType() ClientEventType
- func (m ConversationItemTruncateEvent) MarshalJSON() ([]byte, error)
type ConversationItemTruncatedEvent
type CreateSessionRequest
type CreateSessionResponse
type Error
type ErrorEvent
type EventBase
type HTTPOption
- func WithClient(client *http.Client) HTTPOption
- func WithHeaders(headers http.Header) HTTPOption
- func WithMethod(method string) HTTPOption
type InputAudioBufferAppendEvent
- func (m InputAudioBufferAppendEvent) ClientEventType() ClientEventType
- func (m InputAudioBufferAppendEvent) MarshalJSON() ([]byte, error)
type InputAudioBufferClearEvent
- func (m InputAudioBufferClearEvent) ClientEventType() ClientEventType
- func (m InputAudioBufferClearEvent) MarshalJSON() ([]byte, error)
type InputAudioBufferClearedEvent
type InputAudioBufferCommitEvent
- func (m InputAudioBufferCommitEvent) ClientEventType() ClientEventType
- func (m InputAudioBufferCommitEvent) MarshalJSON() ([]byte, error)
type InputAudioBufferCommittedEvent
type InputAudioBufferSpeechStartedEvent
type InputAudioBufferSpeechStoppedEvent
type InputAudioTranscription
type InputTokenDetails
type IntOrInf
- func (m IntOrInf) IsInf() bool
- func (m IntOrInf) MarshalJSON() ([]byte, error)
- func (m *IntOrInf) UnmarshalJSON(data []byte) error
type ItemStatus
type Logger
type MessageContentPart
type MessageContentType
type MessageItem
type MessageItemType
type MessageRole
type MessageType
type Modality
type NopLogger
- func (l NopLogger) Errorf(_ string, _ ...any)
- func (l NopLogger) Warnf(_ string, _ ...any)
type OutputTokenDetails
type PermanentError
- func (e *PermanentError) Error() string
- func (e *PermanentError) Is(target error) bool
- func (e *PermanentError) Unwrap() error
type RateLimit
type RateLimitsUpdatedEvent
type Response
type ResponseAudioDeltaEvent
type ResponseAudioDoneEvent
type ResponseAudioTranscriptDeltaEvent
type ResponseAudioTranscriptDoneEvent
type ResponseCancelEvent
- func (m ResponseCancelEvent) ClientEventType() ClientEventType
- func (m ResponseCancelEvent) MarshalJSON() ([]byte, error)
type ResponseContentPartAddedEvent
type ResponseContentPartDoneEvent
type ResponseCreateEvent
- func (m ResponseCreateEvent) ClientEventType() ClientEventType
- func (m ResponseCreateEvent) MarshalJSON() ([]byte, error)
type ResponseCreateParams
type ResponseCreatedEvent
type ResponseDoneEvent
type ResponseFunctionCallArgumentsDeltaEvent
type ResponseFunctionCallArgumentsDoneEvent
type ResponseMessageItem
type ResponseOutputItemAddedEvent
type ResponseOutputItemDoneEvent
type ResponseStatus
type ResponseTextDeltaEvent
type ResponseTextDoneEvent
type ServerEvent
- func UnmarshalServerEvent(data []byte) (ServerEvent, error)
type ServerEventBase
- func (m ServerEventBase) ServerEventType() ServerEventType
type ServerEventHandler
type ServerEventInterface
type ServerEventType
type ServerSession
type ServerToolChoice
- func (m ServerToolChoice) Get() ToolChoiceInterface
- func (m *ServerToolChoice) IsFunction() bool
- func (m *ServerToolChoice) UnmarshalJSON(data []byte) error
type ServerTurnDetection
type ServerTurnDetectionType
type SessionCreatedEvent
type SessionUpdateEvent
- func (m SessionUpdateEvent) ClientEventType() ClientEventType
- func (m SessionUpdateEvent) MarshalJSON() ([]byte, error)
type SessionUpdatedEvent
type StdLogger
- func (l StdLogger) Errorf(format string, v ...any)
- func (l StdLogger) Warnf(format string, v ...any)
type Tool
type ToolChoice
- func (t ToolChoice) ToolChoice()
type ToolChoiceInterface
type ToolChoiceString
- func (ToolChoiceString) ToolChoice()
type ToolFunction
type ToolType
type TurnDetectionParams
type TurnDetectionType
type Usage
type Voice
type WebSocketConn
type WebSocketDialer
- func DefaultDialer() WebSocketDialer

Constants ¶

const (
	GPT4oRealtimePreview             = "gpt-4o-realtime-preview"
	GPT4oRealtimePreview20241001     = "gpt-4o-realtime-preview-2024-10-01"
	GPT4oRealtimePreview20241217     = "gpt-4o-realtime-preview-2024-12-17"
	GPT4oMiniRealtimePreview         = "gpt-4o-mini-realtime-preview"
	GPT4oMiniRealtimePreview20241217 = "gpt-4o-mini-realtime-preview-2024-12-17"
)

const (
	// OpenaiRealtimeAPIURLv1 is the base URL for the OpenAI Realtime API.
	OpenaiRealtimeAPIURLv1 = "wss://api.openai.com/v1/realtime"

	// OpenaiAPIURLv1 is the base URL for the OpenAI API.
	OpenaiAPIURLv1 = "https://api.openai.com/v1"
)

Variables ¶

var (
	ErrUnsupportedMessageType = errors.New("unsupported message type")
)

Functions ¶

func GenerateID ¶

func GenerateID(prefix string, length int) string

GenerateID generates a random ID with a prefix and a specified length. The length of the returned ID is equal to the length parameter, therefore the prefix must be shorter than the length.
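The length rule can be made concrete with a small sketch. newID below is a hypothetical stand-in, not the package's implementation: it assumes an alphanumeric alphabet and, unlike GenerateID, returns an error when the prefix is not shorter than the requested length.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// alphabet is an assumed character set for the random part of the ID.
const alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

// newID pads prefix with random characters until the result is exactly
// length bytes long.
func newID(prefix string, length int) (string, error) {
	if len(prefix) >= length {
		return "", fmt.Errorf("prefix %q must be shorter than length %d", prefix, length)
	}
	buf := make([]byte, length-len(prefix))
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	for i, b := range buf {
		buf[i] = alphabet[int(b)%len(alphabet)] // slight modulo bias is acceptable in a sketch
	}
	return prefix + string(buf), nil
}

func main() {
	id, err := newID("evt_", 21)
	if err != nil {
		panic(err)
	}
	fmt.Println(id, len(id))
}
```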

func HTTPDo ¶ added in v0.5.0

func HTTPDo[Q any, R any](ctx context.Context, url string, req *Q, opts ...HTTPOption) (*R, error)

func MarshalClientEvent ¶

func MarshalClientEvent(event ClientEvent) ([]byte, error)

MarshalClientEvent marshals the client event to JSON.
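Every client event sent over the Realtime API carries a "type" field naming the event, which is presumably why each client event type has its own MarshalJSON method. The sketch below shows one way such a method can add the discriminator with encoding/json alone. sessionUpdate is a trimmed, hypothetical stand-in for SessionUpdateEvent, and the package's actual implementation may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sessionUpdate is a trimmed, hypothetical stand-in for SessionUpdateEvent.
type sessionUpdate struct {
	EventID string `json:"event_id,omitempty"`
	Session struct {
		Modalities []string `json:"modalities,omitempty"`
	} `json:"session"`
}

// MarshalJSON adds the "type" discriminator expected on the wire.
func (m sessionUpdate) MarshalJSON() ([]byte, error) {
	type plain sessionUpdate // same fields, no methods: avoids infinite recursion
	return json.Marshal(struct {
		Type string `json:"type"`
		plain
	}{Type: "session.update", plain: plain(m)})
}

// encode returns the JSON text for a text-only session update.
func encode() (string, error) {
	var ev sessionUpdate
	ev.Session.Modalities = []string{"text"}
	data, err := json.Marshal(ev)
	return string(data), err
}

func main() {
	s, err := encode()
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```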

func Permanent ¶ added in v0.2.0

func Permanent(err error) error

Permanent wraps the given err in a *PermanentError.

Types ¶

type APIType ¶

type APIType string

APIType is the type of API.

const (
	// APITypeOpenAI is the type of API for OpenAI.
	APITypeOpenAI APIType = "OPEN_AI"
	// APITypeAzure is the type of API for Azure.
	APITypeAzure APIType = "AZURE"
)

type AudioFormat ¶

type AudioFormat string

const (
	AudioFormatPcm16    AudioFormat = "pcm16"
	AudioFormatG711Ulaw AudioFormat = "g711_ulaw"
	AudioFormatG711Alaw AudioFormat = "g711_alaw"
)

type CachedTokensDetails ¶ added in v0.3.0

type CachedTokensDetails struct {
	TextTokens  int `json:"text_tokens"`
	AudioTokens int `json:"audio_tokens"`
}

type Client ¶

type Client struct {
	// contains filtered or unexported fields
}

Client is OpenAI Realtime API client.

func NewClient ¶

func NewClient(authToken string) *Client

NewClient creates new OpenAI Realtime API client.

func NewClientWithConfig ¶

func NewClientWithConfig(config ClientConfig) *Client

NewClientWithConfig creates new OpenAI Realtime API client for specified config.

func (*Client) Connect ¶

func (c *Client) Connect(ctx context.Context, opts ...ConnectOption) (*Conn, error)

Connect connects to the OpenAI Realtime API.

func (*Client) CreateSession ¶ added in v0.5.0

func (c *Client) CreateSession(ctx context.Context, req *CreateSessionRequest) (*CreateSessionResponse, error)

type ClientConfig ¶

type ClientConfig struct {
	BaseURL    string  // Base URL for the Realtime WebSocket API. Defaults to "wss://api.openai.com/v1/realtime"
	APIBaseURL string  // Base URL for the HTTP API. Defaults to "https://api.openai.com/v1"
	APIType    APIType // API type. Defaults to APITypeOpenAI
	APIVersion string  // required when APIType is APITypeAzure
	HTTPClient *http.Client
	// contains filtered or unexported fields
}

ClientConfig is the configuration for the client.

func DefaultAzureConfig ¶

func DefaultAzureConfig(apiKey, baseURL string) ClientConfig

DefaultAzureConfig creates a new ClientConfig with the given auth token and base URL. Defaults to using the Azure Realtime API.

func DefaultConfig ¶

func DefaultConfig(authToken string) ClientConfig

DefaultConfig creates a new ClientConfig with the given auth token. Defaults to using the OpenAI Realtime API.

func (ClientConfig) String ¶

func (c ClientConfig) String() string

String returns a string representation of the ClientConfig.

type ClientEvent ¶

type ClientEvent interface {
	ClientEventType() ClientEventType
}

ClientEvent is the interface for client event.

type ClientEventType ¶

type ClientEventType string

ClientEventType is the type of client event. See https://platform.openai.com/docs/guides/realtime/client-events

const (
	ClientEventTypeSessionUpdate            ClientEventType = "session.update"
	ClientEventTypeInputAudioBufferAppend   ClientEventType = "input_audio_buffer.append"
	ClientEventTypeInputAudioBufferCommit   ClientEventType = "input_audio_buffer.commit"
	ClientEventTypeInputAudioBufferClear    ClientEventType = "input_audio_buffer.clear"
	ClientEventTypeConversationItemCreate   ClientEventType = "conversation.item.create"
	ClientEventTypeConversationItemTruncate ClientEventType = "conversation.item.truncate"
	ClientEventTypeConversationItemDelete   ClientEventType = "conversation.item.delete"
	ClientEventTypeResponseCreate           ClientEventType = "response.create"
	ClientEventTypeResponseCancel           ClientEventType = "response.cancel"
)

type ClientSecret ¶ added in v0.5.0

type ClientSecret struct {
	// Ephemeral key usable in client environments to authenticate connections to the Realtime API. Use this in client-side environments rather than a standard API token, which should only be used server-side.
	Value string `json:"value"`
	// Timestamp for when the token expires. Currently, all tokens expire after one minute.
	ExpiresAt int64 `json:"expires_at"`
}
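Since ephemeral keys are short-lived, a caller may want to check ExpiresAt before handing a key to a client. The sketch below assumes ExpiresAt is a Unix timestamp in seconds; clientSecret is a local copy of the documented fields, and usable and the "ek_example" value are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// clientSecret mirrors the two documented ClientSecret fields.
type clientSecret struct {
	Value     string
	ExpiresAt int64 // assumed here to be Unix seconds
}

// usable reports whether the key is still valid at nowUnix with at least
// marginSec seconds to spare.
func usable(s clientSecret, nowUnix, marginSec int64) bool {
	return nowUnix+marginSec < s.ExpiresAt
}

func main() {
	now := time.Now().Unix()
	secret := clientSecret{Value: "ek_example", ExpiresAt: now + 60}
	fmt.Println(usable(secret, now, 10))
}
```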

type ClientSession ¶

type ClientSession struct {
	// The set of modalities the model can respond with. To disable audio, set this to ["text"].
	Modalities []Modality `json:"modalities,omitempty"`
	// The default system instructions prepended to model calls.
	Instructions string `json:"instructions,omitempty"`
	// The voice the model uses to respond - one of alloy, echo, or shimmer. Cannot be changed once the model has responded with audio at least once.
	Voice Voice `json:"voice,omitempty"`
	// The format of input audio. Options are "pcm16", "g711_ulaw", or "g711_alaw".
	InputAudioFormat AudioFormat `json:"input_audio_format,omitempty"`
	// The format of output audio. Options are "pcm16", "g711_ulaw", or "g711_alaw".
	OutputAudioFormat AudioFormat `json:"output_audio_format,omitempty"`
	// Configuration for input audio transcription. Can be set to `nil` to turn off.
	InputAudioTranscription *InputAudioTranscription `json:"input_audio_transcription,omitempty"`
	// Configuration for turn detection. Can be set to `nil` to turn off.
	TurnDetection *ClientTurnDetection `json:"turn_detection"`
	// Tools (functions) available to the model.
	Tools []Tool `json:"tools,omitempty"`
	// How the model chooses tools. Options are "auto", "none", "required", or specify a function.
	ToolChoice ToolChoiceInterface `json:"tool_choice,omitempty"`
	// Sampling temperature for the model.
	Temperature *float32 `json:"temperature,omitempty"`
	// Maximum number of output tokens for a single assistant response, inclusive of tool calls. Provide an integer between 1 and 4096 to limit output tokens, or "inf" for the maximum available tokens for a given model. Defaults to "inf".
	MaxOutputTokens IntOrInf `json:"max_response_output_tokens,omitempty"`
}
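TurnDetection is the only pointer field above whose tag lacks omitempty, so with plain encoding/json a nil value is sent as an explicit null rather than dropped, which matches the "set to nil to turn off" comment. The sketch below demonstrates that behaviour on a trimmed, hypothetical copy of two fields; it assumes no custom marshaling changes the outcome in the real type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// session is a trimmed, hypothetical copy of two ClientSession fields.
type session struct {
	Instructions  string         `json:"instructions,omitempty"`
	TurnDetection *turnDetection `json:"turn_detection"` // no omitempty: nil is sent as null
}

type turnDetection struct {
	Type string `json:"type"`
}

// encode marshals s and returns the JSON text.
func encode(s session) string {
	data, err := json.Marshal(s)
	if err != nil {
		panic(err)
	}
	return string(data)
}

func main() {
	fmt.Println(encode(session{}))
	fmt.Println(encode(session{TurnDetection: &turnDetection{Type: "server_vad"}}))
}
```

One consequence worth keeping in mind: a session update built from a zero-value struct still carries "turn_detection": null, so set the field explicitly if server VAD should stay on.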

type ClientTurnDetection ¶

type ClientTurnDetection struct {
	// Type of turn detection, only "server_vad" is currently supported.
	Type ClientTurnDetectionType `json:"type"`

	TurnDetectionParams
}

type ClientTurnDetectionType ¶

type ClientTurnDetectionType string

const (
	ClientTurnDetectionTypeServerVad ClientTurnDetectionType = "server_vad"
)

type CoderWebSocketConn ¶ added in v0.2.0

type CoderWebSocketConn struct {
	// contains filtered or unexported fields
}

CoderWebSocketConn is a WebSocket connection implementation based on coder/websocket.

func (*CoderWebSocketConn) Close ¶ added in v0.2.0

func (c *CoderWebSocketConn) Close() error

Close closes the WebSocket connection.

func (*CoderWebSocketConn) Ping ¶ added in v0.4.0

func (c *CoderWebSocketConn) Ping(ctx context.Context) error

Ping sends a ping message to the WebSocket connection. It blocks until the pong message is received or the ctx is done.

func (*CoderWebSocketConn) ReadMessage ¶ added in v0.2.0

func (c *CoderWebSocketConn) ReadMessage(ctx context.Context) (MessageType, []byte, error)

ReadMessage reads a message from the WebSocket connection.

The ctx can be used to cancel the read operation. If the ctx is canceled or timed out, the read operation is canceled and the connection is closed.

If the returned error is Permanent, the future read operations on the same connection will not succeed.

func (*CoderWebSocketConn) Response ¶ added in v0.2.0

func (c *CoderWebSocketConn) Response() *http.Response

Response returns the *http.Response of the WebSocket connection. Commonly used to get response headers.

func (*CoderWebSocketConn) WriteMessage ¶ added in v0.2.0

func (c *CoderWebSocketConn) WriteMessage(ctx context.Context, messageType MessageType, data []byte) error

WriteMessage writes a message to the WebSocket connection.

The ctx can be used to cancel the write operation. If the ctx is canceled or timed out, the write operation is canceled and the connection is closed.

If the returned error is Permanent, the future write operations on the same connection will not succeed.

type CoderWebSocketDialer ¶ added in v0.2.0

type CoderWebSocketDialer struct {
	// contains filtered or unexported fields
}

CoderWebSocketDialer is a WebSocket dialer implementation based on coder/websocket.

func NewCoderWebSocketDialer ¶ added in v0.2.0

func NewCoderWebSocketDialer(
	options CoderWebSocketOptions,
) *CoderWebSocketDialer

NewCoderWebSocketDialer creates a new CoderWebSocketDialer.

func (*CoderWebSocketDialer) Dial ¶ added in v0.2.0

func (d *CoderWebSocketDialer) Dial(ctx context.Context, url string, header http.Header) (WebSocketConn, error)

Dial establishes a new WebSocket connection to the given URL.

type CoderWebSocketOptions ¶ added in v0.2.0

type CoderWebSocketOptions struct {
	// ReadLimit is the maximum size of a message in bytes. -1 means no limit. Default is -1.
	ReadLimit int64
	// DialOptions is the options to pass to the websocket.Dial function.
	DialOptions *websocket.DialOptions
}

CoderWebSocketOptions is the options for CoderWebSocketConn.

type Conn ¶

type Conn struct {
	// contains filtered or unexported fields
}

Conn is a connection to the OpenAI Realtime API.

func (*Conn) Close ¶

func (c *Conn) Close() error

Close closes the connection.

func (*Conn) Ping ¶ added in v0.4.0

func (c *Conn) Ping(ctx context.Context) error

Ping sends a ping message to the WebSocket connection.
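One possible use of Ping is a periodic keepalive. The sketch below is written against a one-method interface so it runs without a network. pinger, keepAlive, flakyConn, and demo are hypothetical names, and the interval and timeout values are arbitrary; a *Conn would satisfy pinger through its documented Ping method.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// pinger is the single method this sketch needs.
type pinger interface {
	Ping(ctx context.Context) error
}

// keepAlive pings every interval, bounding each ping by timeout. It returns
// the first ping error, or ctx.Err() once ctx is done.
func keepAlive(ctx context.Context, p pinger, interval, timeout time.Duration) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			pingCtx, cancel := context.WithTimeout(ctx, timeout)
			err := p.Ping(pingCtx)
			cancel()
			if err != nil {
				return err
			}
		}
	}
}

// flakyConn is a hypothetical connection whose Ping fails on the third call.
type flakyConn struct{ calls int }

func (f *flakyConn) Ping(context.Context) error {
	f.calls++
	if f.calls >= 3 {
		return errors.New("pong not received")
	}
	return nil
}

// demo runs keepAlive against flakyConn and returns the number of pings sent.
func demo() int {
	conn := &flakyConn{}
	_ = keepAlive(context.Background(), conn, time.Millisecond, 50*time.Millisecond)
	return conn.calls
}

func main() {
	fmt.Println("pings sent before failure:", demo())
}
```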

func (*Conn) ReadMessage ¶

func (c *Conn) ReadMessage(ctx context.Context) (ServerEvent, error)

ReadMessage reads a server event from the server.

func (*Conn) ReadMessageRaw ¶ added in v0.3.1

func (c *Conn) ReadMessageRaw(ctx context.Context) ([]byte, error)

ReadMessageRaw reads a raw message from the server.

func (*Conn) SendMessage ¶

func (c *Conn) SendMessage(ctx context.Context, msg ClientEvent) error

SendMessage sends a client event to the server.

func (*Conn) SendMessageRaw ¶ added in v0.3.1

func (c *Conn) SendMessageRaw(ctx context.Context, data []byte) error

SendMessageRaw sends a raw message to the server.

type ConnHandler ¶

type ConnHandler struct {
	// contains filtered or unexported fields
}

ConnHandler is a handler for a connection to the OpenAI Realtime API. It reads messages from the server in a standalone goroutine and calls the registered handlers. It is the responsibility of the caller to call Start; ConnHandler has no Stop method and exits when the connection is closed. The handlers are called in the order they are registered. Users should not call ReadMessage directly when using ConnHandler.

func NewConnHandler ¶

func NewConnHandler(ctx context.Context, conn *Conn, handlers ...ServerEventHandler) *ConnHandler

NewConnHandler creates a new ConnHandler with the given context and connection.

func (*ConnHandler) Err ¶ added in v0.2.0

func (c *ConnHandler) Err() <-chan error

Err returns a channel that receives errors from the ConnHandler. This could be used to wait for the goroutine to exit. If you don't need to wait for the goroutine to exit, there's no need to call this. This must be called after the connection is closed, otherwise it will block indefinitely.

func (*ConnHandler) Start ¶

func (c *ConnHandler) Start()

Start starts the ConnHandler.

type ConnectOption ¶

type ConnectOption func(*connectOption)

func WithDialer ¶ added in v0.2.0

func WithDialer(dialer WebSocketDialer) ConnectOption

WithDialer sets the dialer for the connection.

func WithLogger ¶ added in v0.2.0

func WithLogger(logger Logger) ConnectOption

WithLogger sets the logger for the connection.

func WithModel ¶

func WithModel(model string) ConnectOption

WithModel sets the model to use for the connection.

type Conversation ¶

type Conversation struct {
	// The unique ID of the conversation.
	ID string `json:"id"`
	// The object type, must be "realtime.conversation".
	Object string `json:"object"`
}

type ConversationCreatedEvent ¶

type ConversationCreatedEvent struct {
	ServerEventBase
	// The conversation resource.
	Conversation Conversation `json:"conversation"`
}

ConversationCreatedEvent is the event for conversation created. Returned when a conversation is created. Emitted right after session creation. See https://platform.openai.com/docs/api-reference/realtime-server-events/conversation/created

type ConversationItemCreateEvent ¶

type ConversationItemCreateEvent struct {
	EventBase
	// The ID of the preceding item after which the new item will be inserted.
	PreviousItemID string `json:"previous_item_id,omitempty"`
	// The item to add to the conversation.
	Item MessageItem `json:"item"`
}

ConversationItemCreateEvent is the event for conversation item create. Send this event when adding an item to the conversation. See https://platform.openai.com/docs/api-reference/realtime-client-events/conversation/item/create

func (ConversationItemCreateEvent) ClientEventType ¶

func (m ConversationItemCreateEvent) ClientEventType() ClientEventType

func (ConversationItemCreateEvent) MarshalJSON ¶

func (m ConversationItemCreateEvent) MarshalJSON() ([]byte, error)

type ConversationItemCreatedEvent ¶

type ConversationItemCreatedEvent struct {
	ServerEventBase
	PreviousItemID string              `json:"previous_item_id,omitempty"`
	Item           ResponseMessageItem `json:"item"`
}

type ConversationItemDeleteEvent ¶

type ConversationItemDeleteEvent struct {
	EventBase
	// The ID of the item to delete.
	ItemID string `json:"item_id"`
}

ConversationItemDeleteEvent is the event for conversation item delete. Send this event when you want to remove any item from the conversation history. See https://platform.openai.com/docs/api-reference/realtime-client-events/conversation/item/delete

func (ConversationItemDeleteEvent) ClientEventType ¶

func (m ConversationItemDeleteEvent) ClientEventType() ClientEventType

func (ConversationItemDeleteEvent) MarshalJSON ¶

func (m ConversationItemDeleteEvent) MarshalJSON() ([]byte, error)

type ConversationItemDeletedEvent ¶

type ConversationItemDeletedEvent struct {
	ServerEventBase
	ItemID string `json:"item_id"` // The ID of the item that was deleted.
}

type ConversationItemInputAudioTranscriptionCompletedEvent ¶

type ConversationItemInputAudioTranscriptionCompletedEvent struct {
	ServerEventBase
	ItemID       string `json:"item_id"`
	ContentIndex int    `json:"content_index"`
	Transcript   string `json:"transcript"`
}

type ConversationItemInputAudioTranscriptionFailedEvent ¶

type ConversationItemInputAudioTranscriptionFailedEvent struct {
	ServerEventBase
	ItemID       string `json:"item_id"`
	ContentIndex int    `json:"content_index"`
	Error        Error  `json:"error"`
}

type ConversationItemTruncateEvent ¶

type ConversationItemTruncateEvent struct {
	EventBase
	// The ID of the assistant message item to truncate.
	ItemID string `json:"item_id"`
	// The index of the content part to truncate.
	ContentIndex int `json:"content_index"`
	// Inclusive duration up to which audio is truncated, in milliseconds.
	AudioEndMs int `json:"audio_end_ms"`
}

ConversationItemTruncateEvent is the event for conversation item truncate. Send this event when you want to truncate a previous assistant message’s audio. See https://platform.openai.com/docs/api-reference/realtime-client-events/conversation/item/truncate

func (ConversationItemTruncateEvent) ClientEventType ¶

func (m ConversationItemTruncateEvent) ClientEventType() ClientEventType

func (ConversationItemTruncateEvent) MarshalJSON ¶

func (m ConversationItemTruncateEvent) MarshalJSON() ([]byte, error)

type ConversationItemTruncatedEvent ¶

type ConversationItemTruncatedEvent struct {
	ServerEventBase
	ItemID       string `json:"item_id"`       // The ID of the assistant message item that was truncated.
	ContentIndex int    `json:"content_index"` // The index of the content part that was truncated.
	AudioEndMs   int    `json:"audio_end_ms"`  // The duration up to which the audio was truncated, in milliseconds.
}

type CreateSessionRequest ¶ added in v0.5.0

type CreateSessionRequest struct {
	ClientSession

	// The Realtime model used for this session.
	Model string `json:"model"`
}

type CreateSessionResponse ¶ added in v0.5.0

type CreateSessionResponse struct {
	ServerSession

	// Ephemeral key returned by the API.
	ClientSecret ClientSecret `json:"client_secret"`
}

type Error ¶

type Error struct {
	// A human-readable error message.
	Message string `json:"message,omitempty"`
	// The type of error (e.g., "invalid_request_error", "server_error").
	Type string `json:"type,omitempty"`
	// Error code, if any.
	Code string `json:"code,omitempty"`
	// Parameter related to the error, if any.
	Param string `json:"param,omitempty"`
	// The event_id of the client event that caused the error, if applicable.
	EventID string `json:"event_id,omitempty"`
}

type ErrorEvent ¶

type ErrorEvent struct {
	ServerEventBase
	// Details of the error.
	Error Error `json:"error"`
}

ErrorEvent is the event for error. Returned when an error occurs. See https://platform.openai.com/docs/api-reference/realtime-server-events/error

type EventBase ¶

type EventBase struct {
	// Optional client-generated ID used to identify this event.
	EventID string `json:"event_id,omitempty"`
}

EventBase is the base struct for all client events.

type HTTPOption ¶ added in v0.5.0

type HTTPOption func(*httpOption)

func WithClient ¶ added in v0.5.0

func WithClient(client *http.Client) HTTPOption

func WithHeaders ¶ added in v0.5.0

func WithHeaders(headers http.Header) HTTPOption

func WithMethod ¶ added in v0.5.0

func WithMethod(method string) HTTPOption

type InputAudioBufferAppendEvent ¶

type InputAudioBufferAppendEvent struct {
	EventBase
	Audio string `json:"audio"` // Base64-encoded audio bytes.
}

InputAudioBufferAppendEvent is the event for input audio buffer append. Send this event to append audio bytes to the input audio buffer. See https://platform.openai.com/docs/api-reference/realtime-client-events/input_audio_buffer/append
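The Audio field expects base64 text, not raw bytes. As a rough sketch, assuming the "pcm16" format means little-endian 16-bit samples, the value could be produced as follows. encodePCM16 is a hypothetical helper, not part of the package.

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"fmt"
)

// encodePCM16 packs samples as little-endian 16-bit PCM and base64-encodes
// them, producing a string suitable for the "audio" JSON field.
func encodePCM16(samples []int16) string {
	raw := make([]byte, 2*len(samples))
	for i, s := range samples {
		binary.LittleEndian.PutUint16(raw[2*i:], uint16(s))
	}
	return base64.StdEncoding.EncodeToString(raw)
}

func main() {
	fmt.Println(encodePCM16([]int16{0, 1, -1})) // prints "AAABAP//"
}
```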

func (InputAudioBufferAppendEvent) ClientEventType ¶

func (m InputAudioBufferAppendEvent) ClientEventType() ClientEventType

func (InputAudioBufferAppendEvent) MarshalJSON ¶

func (m InputAudioBufferAppendEvent) MarshalJSON() ([]byte, error)

type InputAudioBufferClearEvent ¶

type InputAudioBufferClearEvent struct {
	EventBase
}

InputAudioBufferClearEvent is the event for input audio buffer clear. Send this event to clear the audio bytes in the buffer. See https://platform.openai.com/docs/api-reference/realtime-client-events/input_audio_buffer/clear

func (InputAudioBufferClearEvent) ClientEventType ¶

func (m InputAudioBufferClearEvent) ClientEventType() ClientEventType

func (InputAudioBufferClearEvent) MarshalJSON ¶

func (m InputAudioBufferClearEvent) MarshalJSON() ([]byte, error)

type InputAudioBufferClearedEvent ¶

type InputAudioBufferClearedEvent struct {
	ServerEventBase
}

InputAudioBufferClearedEvent is the event for input audio buffer cleared. Returned when the input audio buffer is cleared by the client. See https://platform.openai.com/docs/api-reference/realtime-server-events/input_audio_buffer/cleared

type InputAudioBufferCommitEvent ¶

type InputAudioBufferCommitEvent struct {
	EventBase
}

InputAudioBufferCommitEvent is the event for input audio buffer commit. Send this event to commit audio bytes to a user message. See https://platform.openai.com/docs/api-reference/realtime-client-events/input_audio_buffer/commit

func (InputAudioBufferCommitEvent) ClientEventType ¶

func (m InputAudioBufferCommitEvent) ClientEventType() ClientEventType

func (InputAudioBufferCommitEvent) MarshalJSON ¶

func (m InputAudioBufferCommitEvent) MarshalJSON() ([]byte, error)

type InputAudioBufferCommittedEvent ¶

type InputAudioBufferCommittedEvent struct {
	ServerEventBase
	// The ID of the preceding item after which the new item will be inserted.
	PreviousItemID string `json:"previous_item_id,omitempty"`
	// The ID of the user message item that will be created.
	ItemID string `json:"item_id"`
}

InputAudioBufferCommittedEvent is the event for input audio buffer committed. Returned when an input audio buffer is committed, either by the client or automatically in server VAD mode. See https://platform.openai.com/docs/api-reference/realtime-server-events/input_audio_buffer/committed

type InputAudioBufferSpeechStartedEvent ¶

type InputAudioBufferSpeechStartedEvent struct {
	ServerEventBase
	// Milliseconds since the session started when speech was detected.
	AudioStartMs int64 `json:"audio_start_ms"`
	// The ID of the user message item that will be created when speech stops.
	ItemID string `json:"item_id"`
}

InputAudioBufferSpeechStartedEvent is the event for input audio buffer speech started. Returned in server turn detection mode when speech is detected. See https://platform.openai.com/docs/api-reference/realtime-server-events/input_audio_buffer/speech_started

type InputAudioBufferSpeechStoppedEvent ¶

type InputAudioBufferSpeechStoppedEvent struct {
	ServerEventBase
	// Milliseconds since the session started when speech stopped.
	AudioEndMs int64 `json:"audio_end_ms"`
	// The ID of the user message item that will be created.
	ItemID string `json:"item_id"`
}

InputAudioBufferSpeechStoppedEvent is the event for input audio buffer speech stopped. Returned in server turn detection mode when speech stops. See https://platform.openai.com/docs/api-reference/realtime-server-events/input_audio_buffer/speech_stopped
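
As a rough illustration of how the two VAD events relate, the sketch below decodes a speech_started and a speech_stopped payload and computes the length of the detected speech. The structs are local stand-ins that copy only the documented fields, the helper speechDurationMs and the sample payloads are hypothetical, and the assumption that both events carry the same item_id is inferred from the field comments above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local stand-ins mirroring the documented fields; not imported from the package.
type speechStarted struct {
	AudioStartMs int64  `json:"audio_start_ms"`
	ItemID       string `json:"item_id"`
}

type speechStopped struct {
	AudioEndMs int64  `json:"audio_end_ms"`
	ItemID     string `json:"item_id"`
}

// Hypothetical payloads, made up for this example.
var (
	sampleStart = []byte(`{"type":"input_audio_buffer.speech_started","audio_start_ms":1200,"item_id":"item_1"}`)
	sampleStop  = []byte(`{"type":"input_audio_buffer.speech_stopped","audio_end_ms":3450,"item_id":"item_1"}`)
)

// speechDurationMs decodes both payloads and returns the span between them.
// It returns an error if the payloads refer to different items.
func speechDurationMs(startJSON, stopJSON []byte) (int64, error) {
	var start speechStarted
	if err := json.Unmarshal(startJSON, &start); err != nil {
		return 0, err
	}
	var stop speechStopped
	if err := json.Unmarshal(stopJSON, &stop); err != nil {
		return 0, err
	}
	if start.ItemID != stop.ItemID {
		return 0, fmt.Errorf("item mismatch: %q vs %q", start.ItemID, stop.ItemID)
	}
	return stop.AudioEndMs - start.AudioStartMs, nil
}

func main() {
	d, err := speechDurationMs(sampleStart, sampleStop)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("speech lasted", d, "ms")
}
```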

type InputAudioTranscription ¶

type InputAudioTranscription struct {
	// The model used for transcription.
	Model string `json:"model"`
}

type InputTokenDetails ¶ added in v0.3.0

type InputTokenDetails struct {
	CachedTokens        int                 `json:"cached_tokens"`
	TextTokens          int                 `json:"text_tokens"`
	AudioTokens         int                 `json:"audio_tokens"`
	CachedTokensDetails CachedTokensDetails `json:"cached_tokens_details,omitempty"`
}

type IntOrInf ¶

type IntOrInf int

IntOrInf is a type that can be either an int or "inf".

const (
	// Inf is the maximum value for an IntOrInf.
	Inf IntOrInf = math.MaxInt
)

func (IntOrInf) IsInf ¶

func (m IntOrInf) IsInf() bool

IsInf returns true if the value is "inf".

func (IntOrInf) MarshalJSON ¶

func (m IntOrInf) MarshalJSON() ([]byte, error)

MarshalJSON marshals the IntOrInf to JSON.

func (*IntOrInf) UnmarshalJSON ¶

func (m *IntOrInf) UnmarshalJSON(data []byte) error

UnmarshalJSON unmarshals the IntOrInf from JSON.
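
The following is a minimal sketch of how a type like IntOrInf could map between Go integers and the JSON string "inf". It is a local reimplementation written for illustration (intOrInf, params, encode and decode are hypothetical names), and it assumes that "inf" travels as a JSON string and finite values as JSON numbers; the package's actual implementation may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// intOrInf is a local stand-in for the documented IntOrInf type.
type intOrInf int

// inf mirrors the documented Inf constant.
const inf intOrInf = math.MaxInt

// MarshalJSON writes "inf" for the sentinel value and a plain number otherwise.
func (m intOrInf) MarshalJSON() ([]byte, error) {
	if m == inf {
		return []byte(`"inf"`), nil
	}
	return json.Marshal(int(m))
}

// UnmarshalJSON accepts either the string "inf" or a JSON integer.
func (m *intOrInf) UnmarshalJSON(data []byte) error {
	if string(data) == `"inf"` {
		*m = inf
		return nil
	}
	var n int
	if err := json.Unmarshal(data, &n); err != nil {
		return err
	}
	*m = intOrInf(n)
	return nil
}

// params is a hypothetical container with the same tag as ResponseCreateParams.
type params struct {
	MaxOutputTokens intOrInf `json:"max_output_tokens,omitempty"`
}

func encode(p params) string {
	b, err := json.Marshal(p)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func decode(s string) (intOrInf, error) {
	var p params
	err := json.Unmarshal([]byte(s), &p)
	return p.MaxOutputTokens, err
}

func main() {
	fmt.Println(encode(params{MaxOutputTokens: 1024}))
	fmt.Println(encode(params{MaxOutputTokens: inf}))
	fmt.Println(encode(params{})) // the zero value is dropped by omitempty
}
```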

type ItemStatus ¶

type ItemStatus string

const (
	ItemStatusInProgress ItemStatus = "in_progress"
	ItemStatusCompleted  ItemStatus = "completed"
	ItemStatusIncomplete ItemStatus = "incomplete"
)

type Logger ¶ added in v0.2.0

type Logger interface {
	Errorf(format string, v ...any)
	Warnf(format string, v ...any)
}
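
Any type with these two methods satisfies Logger. The sketch below shows one possible implementation that collects log lines in memory, which can be handy in tests. The interface is repeated locally so the example compiles on its own, and bufferLogger and demo are hypothetical names; how a logger is handed to the client is not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// Logger repeats the documented interface so the sketch compiles on its own.
type Logger interface {
	Errorf(format string, v ...any)
	Warnf(format string, v ...any)
}

// bufferLogger is a hypothetical implementation that keeps log lines in memory.
type bufferLogger struct {
	lines []string
}

func (l *bufferLogger) Errorf(format string, v ...any) {
	l.lines = append(l.lines, "ERROR "+fmt.Sprintf(format, v...))
}

func (l *bufferLogger) Warnf(format string, v ...any) {
	l.lines = append(l.lines, "WARN "+fmt.Sprintf(format, v...))
}

// Compile-time check that bufferLogger satisfies Logger.
var _ Logger = (*bufferLogger)(nil)

// demo logs two lines through the interface and returns what was captured.
func demo() string {
	l := &bufferLogger{}
	var logger Logger = l
	logger.Warnf("retrying in %ds", 2)
	logger.Errorf("read failed: %s", "timeout")
	return strings.Join(l.lines, "\n")
}

func main() {
	fmt.Println(demo())
}
```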

type MessageContentPart ¶

type MessageContentPart struct {
	// The content type.
	Type MessageContentType `json:"type"`
	// The text content. Used when the type is text.
	Text string `json:"text,omitempty"`
	// Base64-encoded audio data. Used when the type is audio.
	Audio string `json:"audio,omitempty"`
	// The transcript of the audio. Used when the type is transcript.
	Transcript string `json:"transcript,omitempty"`
}

type MessageContentType ¶

type MessageContentType string

const (
	MessageContentTypeText       MessageContentType = "text"
	MessageContentTypeAudio      MessageContentType = "audio"
	MessageContentTypeTranscript MessageContentType = "transcript"
	MessageContentTypeInputText  MessageContentType = "input_text"
	MessageContentTypeInputAudio MessageContentType = "input_audio"
)

type MessageItem ¶

type MessageItem struct {
	// The unique ID of the item.
	ID string `json:"id,omitempty"`
	// The type of the item ("message", "function_call", "function_call_output").
	Type MessageItemType `json:"type"`
	// The final status of the item.
	Status ItemStatus `json:"status,omitempty"`
	// The role associated with the item.
	Role MessageRole `json:"role,omitempty"`
	// The content of the item.
	Content []MessageContentPart `json:"content,omitempty"`
	// The ID of the function call, if the item is a function call.
	CallID string `json:"call_id,omitempty"`
	// The name of the function, if the item is a function call.
	Name string `json:"name,omitempty"`
	// The arguments of the function, if the item is a function call.
	Arguments string `json:"arguments,omitempty"`
	// The output of the function, if the item is a function call output.
	Output string `json:"output,omitempty"`
}

type MessageItemType ¶

type MessageItemType string

const (
	MessageItemTypeMessage            MessageItemType = "message"
	MessageItemTypeFunctionCall       MessageItemType = "function_call"
	MessageItemTypeFunctionCallOutput MessageItemType = "function_call_output"
)

type MessageRole ¶

type MessageRole string

const (
	MessageRoleSystem    MessageRole = "system"
	MessageRoleAssistant MessageRole = "assistant"
	MessageRoleUser      MessageRole = "user"
)
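
To show how the item, content and role types fit together on the wire, here is a small sketch that builds a user text message and prints its JSON form. The structs are trimmed local copies of MessageItem and MessageContentPart with plain string fields, userText and encodeItem are hypothetical helpers, and pairing the "user" role with the "input_text" content type is an assumption based on the constants listed above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Trimmed local copies of the documented structs; enum types are plain strings here.
type contentPart struct {
	Type string `json:"type"`
	Text string `json:"text,omitempty"`
}

type messageItem struct {
	ID      string        `json:"id,omitempty"`
	Type    string        `json:"type"`
	Status  string        `json:"status,omitempty"`
	Role    string        `json:"role,omitempty"`
	Content []contentPart `json:"content,omitempty"`
}

// userText builds a message item holding a single text part from the user.
func userText(text string) messageItem {
	return messageItem{
		Type:    "message",
		Role:    "user",
		Content: []contentPart{{Type: "input_text", Text: text}},
	}
}

func encodeItem(item messageItem) string {
	b, err := json.Marshal(item)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// Empty fields such as id and status are left out because of omitempty.
	fmt.Println(encodeItem(userText("Hello")))
}
```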

type MessageType ¶ added in v0.2.0

type MessageType int

MessageType represents the type of a WebSocket message. See https://tools.ietf.org/html/rfc6455#section-5.6

const (
	// MessageText is for UTF-8 encoded text messages like JSON.
	MessageText MessageType = iota + 1
	// MessageBinary is for binary messages like protobufs.
	MessageBinary
)

MessageType constants.

type Modality ¶

type Modality string

const (
	ModalityText  Modality = "text"
	ModalityAudio Modality = "audio"
)

type NopLogger ¶ added in v0.2.0

type NopLogger struct{}

NopLogger is a logger that does nothing.

func (NopLogger) Errorf ¶ added in v0.2.0

func (l NopLogger) Errorf(_ string, _ ...any)

Errorf does nothing.

func (NopLogger) Warnf ¶ added in v0.2.0

func (l NopLogger) Warnf(_ string, _ ...any)

Warnf does nothing.

type OutputTokenDetails ¶ added in v0.3.0

type OutputTokenDetails struct {
	TextTokens  int `json:"text_tokens"`
	AudioTokens int `json:"audio_tokens"`
}

type PermanentError ¶ added in v0.2.0

type PermanentError struct {
	Err error
}

PermanentError signals that the operation should not be retried.

func (*PermanentError) Error ¶ added in v0.2.0

func (e *PermanentError) Error() string

func (*PermanentError) Is ¶ added in v0.2.0

func (e *PermanentError) Is(target error) bool

func (*PermanentError) Unwrap ¶ added in v0.2.0

func (e *PermanentError) Unwrap() error
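
Because PermanentError exposes Unwrap, callers can presumably detect it with the standard errors package even when it is wrapped. The sketch below uses a local stand-in type (permanentError) rather than the package's own, and shouldRetry, wrapPermanent and errTemporary are hypothetical names, so treat it as a pattern rather than the library's behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// permanentError is a local stand-in for the documented PermanentError.
type permanentError struct {
	Err error
}

func (e *permanentError) Error() string { return "permanent: " + e.Err.Error() }
func (e *permanentError) Unwrap() error { return e.Err }

// errTemporary is a made-up transient failure used by the example.
var errTemporary = errors.New("temporary glitch")

// wrapPermanent marks err as permanent and adds context the way a caller might.
func wrapPermanent(err error) error {
	return fmt.Errorf("read message: %w", &permanentError{Err: err})
}

// shouldRetry reports whether a read loop may try again after err.
// A nil error needs no retry, and a permanent error must not be retried.
func shouldRetry(err error) bool {
	var perm *permanentError
	return err != nil && !errors.As(err, &perm)
}

func main() {
	fmt.Println("temporary:", shouldRetry(errTemporary))
	fmt.Println("permanent:", shouldRetry(wrapPermanent(errTemporary)))
}
```

A read loop built on this check would exit and close the connection as soon as shouldRetry returns false.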

type RateLimit ¶

type RateLimit struct {
	// The name of the rate limit ("requests", "tokens", "input_tokens", "output_tokens").
	Name string `json:"name"`
	// The maximum allowed value for the rate limit.
	Limit int `json:"limit"`
	// The remaining value before the limit is reached.
	Remaining int `json:"remaining"`
	// Seconds until the rate limit resets.
	ResetSeconds float64 `json:"reset_seconds"`
}

type RateLimitsUpdatedEvent ¶

type RateLimitsUpdatedEvent struct {
	ServerEventBase
	// List of rate limit information.
	RateLimits []RateLimit `json:"rate_limits"`
}

RateLimitsUpdatedEvent is the event for rate limits updated. Emitted after every "response.done" event to indicate the updated rate limits. See https://platform.openai.com/docs/api-reference/realtime-server-events/rate_limits/updated
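
One possible use of this event is to decide how long to pause before sending more work. The sketch below decodes a rate_limits.updated payload into local stand-in structs and returns the longest reset time among exhausted limits. The backoff helper, its policy and the sample payload are hypothetical, not part of the package.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Local stand-ins mirroring the documented RateLimit fields.
type rateLimit struct {
	Name         string  `json:"name"`
	Limit        int     `json:"limit"`
	Remaining    int     `json:"remaining"`
	ResetSeconds float64 `json:"reset_seconds"`
}

type rateLimitsUpdated struct {
	RateLimits []rateLimit `json:"rate_limits"`
}

// samplePayload is a made-up event body for the example.
var samplePayload = []byte(`{"type":"rate_limits.updated","rate_limits":[` +
	`{"name":"requests","limit":100,"remaining":0,"reset_seconds":1.5},` +
	`{"name":"tokens","limit":20000,"remaining":15000,"reset_seconds":0.25}]}`)

// backoff returns the longest reset time among limits with nothing remaining,
// or zero if no limit is exhausted.
func backoff(payload []byte) (time.Duration, error) {
	var ev rateLimitsUpdated
	if err := json.Unmarshal(payload, &ev); err != nil {
		return 0, err
	}
	var wait time.Duration
	for _, rl := range ev.RateLimits {
		if rl.Remaining > 0 {
			continue
		}
		// ResetSeconds is fractional, so convert through float64.
		d := time.Duration(rl.ResetSeconds * float64(time.Second))
		if d > wait {
			wait = d
		}
	}
	return wait, nil
}

func main() {
	wait, err := backoff(samplePayload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("wait for", wait)
}
```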

type Response ¶

type Response struct {
	// The unique ID of the response.
	ID string `json:"id"`
	// The object type, must be "realtime.response".
	Object string `json:"object"`
	// The status of the response.
	Status ResponseStatus `json:"status"`
	// Additional details about the status.
	StatusDetails any `json:"status_details,omitempty"`
	// The list of output items generated by the response.
	Output []ResponseMessageItem `json:"output"`
	// Usage statistics for the response.
	Usage *Usage `json:"usage,omitempty"`
}

type ResponseAudioDeltaEvent ¶

type ResponseAudioDeltaEvent struct {
	ServerEventBase
	// The ID of the response.
	ResponseID string `json:"response_id"`
	// The ID of the item.
	ItemID string `json:"item_id"`
	// The index of the output item in the response.
	OutputIndex int `json:"output_index"`
	// The index of the content part in the item's content array.
	ContentIndex int `json:"content_index"`
	// Base64-encoded audio data delta.
	Delta string `json:"delta"`
}

ResponseAudioDeltaEvent is the event for response audio delta. Returned when the model-generated audio is updated. See https://platform.openai.com/docs/api-reference/realtime-server-events/response/audio/delta
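
Since Delta is documented as Base64-encoded audio, a consumer has to decode each chunk before playing or storing it. A minimal sketch, using only the standard library; appendAudio is a hypothetical helper and the two sample deltas are made-up bytes, not real audio.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// appendAudio decodes one Base64 delta and appends the raw bytes to buf.
// On a decode error the buffer is returned unchanged.
func appendAudio(buf []byte, delta string) ([]byte, error) {
	chunk, err := base64.StdEncoding.DecodeString(delta)
	if err != nil {
		return buf, err
	}
	return append(buf, chunk...), nil
}

func main() {
	// Two made-up deltas: bytes {1, 2} and {3, 4}.
	deltas := []string{"AQI=", "AwQ="}

	var audio []byte
	for _, d := range deltas {
		var err error
		audio, err = appendAudio(audio, d)
		if err != nil {
			fmt.Println("bad delta:", err)
			return
		}
	}
	fmt.Println(len(audio), "bytes buffered")
}
```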

type ResponseAudioDoneEvent ¶

type ResponseAudioDoneEvent struct {
	ServerEventBase
	// The ID of the response.
	ResponseID string `json:"response_id"`
	// The ID of the item.
	ItemID string `json:"item_id"`
	// The index of the output item in the response.
	OutputIndex int `json:"output_index"`
	// The index of the content part in the item's content array.
	ContentIndex int `json:"content_index"`
}

ResponseAudioDoneEvent is the event for response audio done. Returned when the model-generated audio is done. Also emitted when a Response is interrupted, incomplete, or cancelled. See https://platform.openai.com/docs/api-reference/realtime-server-events/response/audio/done

type ResponseAudioTranscriptDeltaEvent ¶

type ResponseAudioTranscriptDeltaEvent struct {
	ServerEventBase
	// The ID of the response.
	ResponseID string `json:"response_id"`
	// The ID of the item.
	ItemID string `json:"item_id"`
	// The index of the output item in the response.
	OutputIndex int `json:"output_index"`
	// The index of the content part in the item's content array.
	ContentIndex int `json:"content_index"`
	// The transcript delta.
	Delta string `json:"delta"`
}

ResponseAudioTranscriptDeltaEvent is the event for response audio transcript delta. Returned when the model-generated transcription of audio output is updated. See https://platform.openai.com/docs/api-reference/realtime-server-events/response/audio_transcript/delta

type ResponseAudioTranscriptDoneEvent ¶

type ResponseAudioTranscriptDoneEvent struct {
	ServerEventBase
	// The ID of the response.
	ResponseID string `json:"response_id"`
	// The ID of the item.
	ItemID string `json:"item_id"`
	// The index of the output item in the response.
	OutputIndex int `json:"output_index"`
	// The index of the content part in the item's content array.
	ContentIndex int `json:"content_index"`
	// The final transcript of the audio.
	Transcript string `json:"transcript"`
}

ResponseAudioTranscriptDoneEvent is the event for response audio transcript done. Returned when the model-generated transcription of audio output is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled. See https://platform.openai.com/docs/api-reference/realtime-server-events/response/audio_transcript/done

type ResponseCancelEvent ¶

type ResponseCancelEvent struct {
	EventBase
	// A specific response ID to cancel - if not provided, will cancel an in-progress response in the default conversation.
	ResponseID string `json:"response_id,omitempty"`
}

ResponseCancelEvent is the event for response cancel. Send this event to cancel an in-progress response. See https://platform.openai.com/docs/api-reference/realtime-client-events/response/cancel

func (ResponseCancelEvent) ClientEventType ¶

func (m ResponseCancelEvent) ClientEventType() ClientEventType

func (ResponseCancelEvent) MarshalJSON ¶

func (m ResponseCancelEvent) MarshalJSON() ([]byte, error)

type ResponseContentPartAddedEvent ¶

type ResponseContentPartAddedEvent struct {
	ServerEventBase
	ResponseID   string             `json:"response_id"`
	ItemID       string             `json:"item_id"`
	OutputIndex  int                `json:"output_index"`
	ContentIndex int                `json:"content_index"`
	Part         MessageContentPart `json:"part"`
}

ResponseContentPartAddedEvent is the event for response content part added. Returned when a new content part is added to an assistant message item during response generation. See https://platform.openai.com/docs/api-reference/realtime-server-events/response/content_part/added

type ResponseContentPartDoneEvent ¶

type ResponseContentPartDoneEvent struct {
	ServerEventBase
	// The ID of the response.
	ResponseID string `json:"response_id"`
	// The ID of the item to which the content part was added.
	ItemID string `json:"item_id"`
	// The index of the output item in the response.
	OutputIndex int `json:"output_index"`
	// The index of the content part in the item's content array.
	ContentIndex int `json:"content_index"`
	// The content part that was added.
	Part MessageContentPart `json:"part"`
}

ResponseContentPartDoneEvent is the event for response content part done. Returned when a content part is done streaming in an assistant message item. Also emitted when a Response is interrupted, incomplete, or cancelled. See https://platform.openai.com/docs/api-reference/realtime-server-events/response/content_part/done

type ResponseCreateEvent ¶

type ResponseCreateEvent struct {
	EventBase
	// Configuration for the response.
	Response ResponseCreateParams `json:"response"`
}

ResponseCreateEvent is the event for response create. Send this event to trigger a response generation. See https://platform.openai.com/docs/api-reference/realtime-client-events/response/create

func (ResponseCreateEvent) ClientEventType ¶

func (m ResponseCreateEvent) ClientEventType() ClientEventType

func (ResponseCreateEvent) MarshalJSON ¶

func (m ResponseCreateEvent) MarshalJSON() ([]byte, error)

type ResponseCreateParams ¶

type ResponseCreateParams struct {
	// The modalities for the response.
	Modalities []Modality `json:"modalities,omitempty"`
	// Instructions for the model.
	Instructions string `json:"instructions,omitempty"`
	// The voice the model uses to respond, for example alloy, echo, or shimmer. See the Voice constants for the full list.
	Voice Voice `json:"voice,omitempty"`
	// The format of output audio.
	OutputAudioFormat AudioFormat `json:"output_audio_format,omitempty"`
	// Tools (functions) available to the model.
	Tools []Tool `json:"tools,omitempty"`
	// How the model chooses tools.
	ToolChoice ToolChoiceInterface `json:"tool_choice,omitempty"`
	// Sampling temperature.
	Temperature *float32 `json:"temperature,omitempty"`
	// Maximum number of output tokens for a single assistant response, inclusive of tool calls. Provide an integer between 1 and 4096 to limit output tokens, or "inf" for the maximum available tokens for a given model. Defaults to "inf".
	MaxOutputTokens IntOrInf `json:"max_output_tokens,omitempty"`
}

type ResponseCreatedEvent ¶

type ResponseCreatedEvent struct {
	ServerEventBase
	// The response resource.
	Response Response `json:"response"`
}

ResponseCreatedEvent is the event for response created. Returned when a new Response is created. The first event of response creation, where the response is in an initial state of "in_progress". See https://platform.openai.com/docs/api-reference/realtime-server-events/response/created

type ResponseDoneEvent ¶

type ResponseDoneEvent struct {
	ServerEventBase
	// The response resource.
	Response Response `json:"response"`
}

ResponseDoneEvent is the event for response done. Returned when a Response is done streaming. Always emitted, no matter the final state. See https://platform.openai.com/docs/api-reference/realtime-server-events/response/done

type ResponseFunctionCallArgumentsDeltaEvent ¶

type ResponseFunctionCallArgumentsDeltaEvent struct {
	ServerEventBase
	// The ID of the response.
	ResponseID string `json:"response_id"`
	// The ID of the item.
	ItemID string `json:"item_id"`
	// The index of the output item in the response.
	OutputIndex int `json:"output_index"`
	// The ID of the function call.
	CallID string `json:"call_id"`
	// The arguments delta as a JSON string.
	Delta string `json:"delta"`
}

ResponseFunctionCallArgumentsDeltaEvent is the event for response function call arguments delta. Returned when the model-generated function call arguments are updated. See https://platform.openai.com/docs/api-reference/realtime-server-events/response/function_call_arguments/delta

type ResponseFunctionCallArgumentsDoneEvent ¶

type ResponseFunctionCallArgumentsDoneEvent struct {
	ServerEventBase
	// The ID of the response.
	ResponseID string `json:"response_id"`
	// The ID of the item.
	ItemID string `json:"item_id"`
	// The index of the output item in the response.
	OutputIndex int `json:"output_index"`
	// The ID of the function call.
	CallID string `json:"call_id"`
	// The final arguments as a JSON string.
	Arguments string `json:"arguments"`
	// The name of the function. Not shown in API reference but present in the actual event.
	Name string `json:"name"`
}

ResponseFunctionCallArgumentsDoneEvent is the event for response function call arguments done. Returned when the model-generated function call arguments are done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled. See https://platform.openai.com/docs/api-reference/realtime-server-events/response/function_call_arguments/done
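
Arguments arrives as a JSON document encoded inside a string, so it needs a second decoding step. The sketch below shows that two-step decode with a local stand-in struct. The get_weather function, its city parameter, the decodeWeatherCall helper and the sample payload are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// functionCallDone is a local stand-in with the documented fields used here.
type functionCallDone struct {
	CallID    string `json:"call_id"`
	Name      string `json:"name"`
	Arguments string `json:"arguments"`
}

// weatherArgs describes the arguments of a hypothetical get_weather tool.
type weatherArgs struct {
	City string `json:"city"`
}

// samplePayload is a made-up event body; note the escaped JSON inside "arguments".
var samplePayload = []byte(`{"type":"response.function_call_arguments.done",` +
	`"call_id":"call_1","name":"get_weather","arguments":"{\"city\":\"Paris\"}"}`)

// decodeWeatherCall decodes the event, then decodes the arguments string.
func decodeWeatherCall(payload []byte) (callID, city string, err error) {
	var ev functionCallDone
	if err = json.Unmarshal(payload, &ev); err != nil {
		return "", "", err
	}
	if ev.Name != "get_weather" {
		return "", "", fmt.Errorf("unexpected function %q", ev.Name)
	}
	var args weatherArgs
	if err = json.Unmarshal([]byte(ev.Arguments), &args); err != nil {
		return "", "", err
	}
	return ev.CallID, args.City, nil
}

func main() {
	callID, city, err := decodeWeatherCall(samplePayload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(callID, city)
}
```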

type ResponseMessageItem ¶

type ResponseMessageItem struct {
	MessageItem
	// The object type, must be "realtime.item".
	Object string `json:"object,omitempty"`
}

type ResponseOutputItemAddedEvent ¶

type ResponseOutputItemAddedEvent struct {
	ServerEventBase
	// The ID of the response to which the item belongs.
	ResponseID string `json:"response_id"`
	// The index of the output item in the response.
	OutputIndex int `json:"output_index"`
	// The item that was added.
	Item ResponseMessageItem `json:"item"`
}

ResponseOutputItemAddedEvent is the event for response output item added. Returned when a new Item is created during response generation. See https://platform.openai.com/docs/api-reference/realtime-server-events/response/output_item/added

type ResponseOutputItemDoneEvent ¶

type ResponseOutputItemDoneEvent struct {
	ServerEventBase
	// The ID of the response to which the item belongs.
	ResponseID string `json:"response_id"`
	// The index of the output item in the response.
	OutputIndex int `json:"output_index"`
	// The completed item.
	Item ResponseMessageItem `json:"item"`
}

ResponseOutputItemDoneEvent is the event for response output item done. Returned when an Item is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled. See https://platform.openai.com/docs/api-reference/realtime-server-events/response/output_item/done

type ResponseStatus ¶

type ResponseStatus string

const (
	ResponseStatusInProgress ResponseStatus = "in_progress"
	ResponseStatusCompleted  ResponseStatus = "completed"
	ResponseStatusCancelled  ResponseStatus = "cancelled"
	ResponseStatusIncomplete ResponseStatus = "incomplete"
	ResponseStatusFailed     ResponseStatus = "failed"
)

type ResponseTextDeltaEvent ¶

type ResponseTextDeltaEvent struct {
	ServerEventBase
	ResponseID   string `json:"response_id"`
	ItemID       string `json:"item_id"`
	OutputIndex  int    `json:"output_index"`
	ContentIndex int    `json:"content_index"`
	Delta        string `json:"delta"`
}

ResponseTextDeltaEvent is the event for response text delta. Returned when the text value of a "text" content part is updated. See https://platform.openai.com/docs/api-reference/realtime-server-events/response/text/delta

type ResponseTextDoneEvent ¶

type ResponseTextDoneEvent struct {
	ServerEventBase
	ResponseID   string `json:"response_id"`
	ItemID       string `json:"item_id"`
	OutputIndex  int    `json:"output_index"`
	ContentIndex int    `json:"content_index"`
	Text         string `json:"text"`
}

ResponseTextDoneEvent is the event for response text done. Returned when the text value of a "text" content part is done streaming. Also emitted when a Response is interrupted, incomplete, or cancelled. See https://platform.openai.com/docs/api-reference/realtime-server-events/response/text/done
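
A consumer that wants to display text as it streams can accumulate Delta values per content part. The sketch below keys the buffers by item ID and content index; textAssembler and its methods are hypothetical names. It assumes that the concatenated deltas match the Text carried by the final done event, which the API reference suggests but this page does not state.

```go
package main

import (
	"fmt"
	"strings"
)

// partKey identifies one content part within one item.
type partKey struct {
	itemID       string
	contentIndex int
}

// textAssembler is a hypothetical helper that concatenates streamed deltas.
type textAssembler struct {
	parts map[partKey]*strings.Builder
}

func newTextAssembler() *textAssembler {
	return &textAssembler{parts: make(map[partKey]*strings.Builder)}
}

// add appends one delta to the buffer of the given content part.
func (a *textAssembler) add(itemID string, contentIndex int, delta string) {
	k := partKey{itemID, contentIndex}
	b, ok := a.parts[k]
	if !ok {
		b = &strings.Builder{}
		a.parts[k] = b
	}
	b.WriteString(delta)
}

// text returns everything received so far for the given content part.
func (a *textAssembler) text(itemID string, contentIndex int) string {
	if b, ok := a.parts[partKey{itemID, contentIndex}]; ok {
		return b.String()
	}
	return ""
}

func main() {
	a := newTextAssembler()
	for _, d := range []string{"Hel", "lo, ", "world"} {
		a.add("item_1", 0, d)
	}
	fmt.Println(a.text("item_1", 0))
}
```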

type ServerEvent ¶

type ServerEvent interface {
	ServerEventType() ServerEventType
}

ServerEvent is the interface for server events.

func UnmarshalServerEvent ¶

func UnmarshalServerEvent(data []byte) (ServerEvent, error)

UnmarshalServerEvent unmarshals the server event from the given JSON data.
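
A decoder of this kind typically reads the "type" field first and then decodes the same bytes into the matching concrete struct. The sketch below illustrates that pattern with two local stand-in event types. It is not the package's implementation, and decodeEvent, eventHead and the sample payloads are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// serverEvent mirrors the documented ServerEvent interface with a string type.
type serverEvent interface {
	ServerEventType() string
}

// eventHead plays the role of ServerEventBase.
type eventHead struct {
	EventID string `json:"event_id,omitempty"`
	Type    string `json:"type"`
}

func (h eventHead) ServerEventType() string { return h.Type }

type textDeltaEvent struct {
	eventHead
	Delta string `json:"delta"`
}

type sessionCreatedEvent struct {
	eventHead
}

// Made-up payloads for the example.
var (
	sampleDelta   = []byte(`{"event_id":"ev_1","type":"response.text.delta","delta":"Hi"}`)
	sampleUnknown = []byte(`{"type":"something.else"}`)
)

// decodeEvent peeks at the type field, then decodes into the concrete struct.
func decodeEvent(data []byte) (serverEvent, error) {
	var head eventHead
	if err := json.Unmarshal(data, &head); err != nil {
		return nil, err
	}
	switch head.Type {
	case "response.text.delta":
		var ev textDeltaEvent
		if err := json.Unmarshal(data, &ev); err != nil {
			return nil, err
		}
		return ev, nil
	case "session.created":
		var ev sessionCreatedEvent
		if err := json.Unmarshal(data, &ev); err != nil {
			return nil, err
		}
		return ev, nil
	default:
		return nil, fmt.Errorf("unknown server event type %q", head.Type)
	}
}

func main() {
	ev, err := decodeEvent(sampleDelta)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// A type switch recovers the concrete event, as callers would do.
	switch e := ev.(type) {
	case textDeltaEvent:
		fmt.Println("delta:", e.Delta)
	default:
		fmt.Println("other event:", ev.ServerEventType())
	}
}
```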

type ServerEventBase ¶

type ServerEventBase struct {
	// The unique ID of the server event.
	EventID string `json:"event_id,omitempty"`
	// The type of the server event.
	Type ServerEventType `json:"type"`
}

ServerEventBase is the base struct for all server events.

func (ServerEventBase) ServerEventType ¶

func (m ServerEventBase) ServerEventType() ServerEventType

type ServerEventHandler ¶

type ServerEventHandler func(ctx context.Context, event ServerEvent)

type ServerEventInterface ¶

type ServerEventInterface interface {
	ErrorEvent |
		SessionCreatedEvent |
		SessionUpdatedEvent |
		ConversationCreatedEvent |
		InputAudioBufferCommittedEvent |
		InputAudioBufferClearedEvent |
		InputAudioBufferSpeechStartedEvent |
		InputAudioBufferSpeechStoppedEvent |
		ConversationItemCreatedEvent |
		ConversationItemInputAudioTranscriptionCompletedEvent |
		ConversationItemInputAudioTranscriptionFailedEvent |
		ConversationItemTruncatedEvent |
		ConversationItemDeletedEvent |
		ResponseCreatedEvent |
		ResponseDoneEvent |
		ResponseOutputItemAddedEvent |
		ResponseOutputItemDoneEvent |
		ResponseContentPartAddedEvent |
		ResponseContentPartDoneEvent |
		ResponseTextDeltaEvent |
		ResponseTextDoneEvent |
		ResponseAudioTranscriptDeltaEvent |
		ResponseAudioTranscriptDoneEvent |
		ResponseAudioDeltaEvent |
		ResponseAudioDoneEvent |
		ResponseFunctionCallArgumentsDeltaEvent |
		ResponseFunctionCallArgumentsDoneEvent |
		RateLimitsUpdatedEvent
}

type ServerEventType ¶

type ServerEventType string

const (
	ServerEventTypeError                                            ServerEventType = "error"
	ServerEventTypeSessionCreated                                   ServerEventType = "session.created"
	ServerEventTypeSessionUpdated                                   ServerEventType = "session.updated"
	ServerEventTypeConversationCreated                              ServerEventType = "conversation.created"
	ServerEventTypeInputAudioBufferCommitted                        ServerEventType = "input_audio_buffer.committed"
	ServerEventTypeInputAudioBufferCleared                          ServerEventType = "input_audio_buffer.cleared"
	ServerEventTypeInputAudioBufferSpeechStarted                    ServerEventType = "input_audio_buffer.speech_started"
	ServerEventTypeInputAudioBufferSpeechStopped                    ServerEventType = "input_audio_buffer.speech_stopped"
	ServerEventTypeConversationItemCreated                          ServerEventType = "conversation.item.created"
	ServerEventTypeConversationItemInputAudioTranscriptionCompleted ServerEventType = "conversation.item.input_audio_transcription.completed"
	ServerEventTypeConversationItemInputAudioTranscriptionFailed    ServerEventType = "conversation.item.input_audio_transcription.failed"
	ServerEventTypeConversationItemTruncated                        ServerEventType = "conversation.item.truncated"
	ServerEventTypeConversationItemDeleted                          ServerEventType = "conversation.item.deleted"
	ServerEventTypeResponseCreated                                  ServerEventType = "response.created"
	ServerEventTypeResponseDone                                     ServerEventType = "response.done"
	ServerEventTypeResponseOutputItemAdded                          ServerEventType = "response.output_item.added"
	ServerEventTypeResponseOutputItemDone                           ServerEventType = "response.output_item.done"
	ServerEventTypeResponseContentPartAdded                         ServerEventType = "response.content_part.added"
	ServerEventTypeResponseContentPartDone                          ServerEventType = "response.content_part.done"
	ServerEventTypeResponseTextDelta                                ServerEventType = "response.text.delta"
	ServerEventTypeResponseTextDone                                 ServerEventType = "response.text.done"
	ServerEventTypeResponseAudioTranscriptDelta                     ServerEventType = "response.audio_transcript.delta"
	ServerEventTypeResponseAudioTranscriptDone                      ServerEventType = "response.audio_transcript.done"
	ServerEventTypeResponseAudioDelta                               ServerEventType = "response.audio.delta"
	ServerEventTypeResponseAudioDone                                ServerEventType = "response.audio.done"
	ServerEventTypeResponseFunctionCallArgumentsDelta               ServerEventType = "response.function_call_arguments.delta"
	ServerEventTypeResponseFunctionCallArgumentsDone                ServerEventType = "response.function_call_arguments.done"
	ServerEventTypeRateLimitsUpdated                                ServerEventType = "rate_limits.updated"
)

type ServerSession ¶

type ServerSession struct {
	// The unique ID of the session.
	ID string `json:"id"`
	// The object type, must be "realtime.session".
	Object string `json:"object"`
	// The default model used for this session.
	Model string `json:"model"`
	// The set of modalities the model can respond with.
	Modalities []Modality `json:"modalities,omitempty"`
	// The default system instructions.
	Instructions string `json:"instructions,omitempty"`
	// The voice the model uses to respond, for example alloy, echo, or shimmer. See the Voice constants for the full list.
	Voice Voice `json:"voice,omitempty"`
	// The format of input audio.
	InputAudioFormat AudioFormat `json:"input_audio_format,omitempty"`
	// The format of output audio.
	OutputAudioFormat AudioFormat `json:"output_audio_format,omitempty"`
	// Configuration for input audio transcription.
	InputAudioTranscription *InputAudioTranscription `json:"input_audio_transcription,omitempty"`
	// Configuration for turn detection.
	TurnDetection *ServerTurnDetection `json:"turn_detection,omitempty"`
	// Tools (functions) available to the model.
	Tools []Tool `json:"tools,omitempty"`
	// How the model chooses tools.
	ToolChoice ServerToolChoice `json:"tool_choice,omitempty"`
	// Sampling temperature.
	Temperature *float32 `json:"temperature,omitempty"`
	// Maximum number of output tokens.
	MaxOutputTokens IntOrInf `json:"max_response_output_tokens,omitempty"`
}

type ServerToolChoice ¶

type ServerToolChoice struct {
	String   ToolChoiceString
	Function ToolChoice
}

ServerToolChoice holds the tool choice reported by the server, which is either a string value (String) or a specific function (Function).

func (ServerToolChoice) Get ¶

func (m ServerToolChoice) Get() ToolChoiceInterface

Get returns the ToolChoiceInterface based on the type of tool choice.

func (*ServerToolChoice) IsFunction ¶

func (m *ServerToolChoice) IsFunction() bool

IsFunction returns true if the tool choice is a function call.

func (*ServerToolChoice) UnmarshalJSON ¶

func (m *ServerToolChoice) UnmarshalJSON(data []byte) error

UnmarshalJSON is a custom unmarshaler for ServerToolChoice.
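
A custom unmarshaler is needed because tool_choice may be either a JSON string or a JSON object. The sketch below shows one way to handle both shapes with local stand-in types. The object layout follows the ToolChoice and ToolFunction structs documented on this page; the branching logic, the isFunction rule and the describe helper are assumptions, not the package's code.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Local stand-ins following the documented ToolChoice and ToolFunction layout.
type toolFunction struct {
	Name string `json:"name"`
}

type toolChoice struct {
	Type     string       `json:"type"`
	Function toolFunction `json:"function,omitempty"`
}

type serverToolChoice struct {
	String   string
	Function toolChoice
}

// UnmarshalJSON fills String for a JSON string and Function for anything else.
func (c *serverToolChoice) UnmarshalJSON(data []byte) error {
	trimmed := bytes.TrimSpace(data)
	if len(trimmed) > 0 && trimmed[0] == '"' {
		return json.Unmarshal(trimmed, &c.String)
	}
	return json.Unmarshal(trimmed, &c.Function)
}

// isFunction is an assumed rule: the object form carries type "function".
func (c *serverToolChoice) isFunction() bool {
	return c.Function.Type == "function"
}

// session is a hypothetical container with the documented tool_choice tag.
type session struct {
	ToolChoice serverToolChoice `json:"tool_choice"`
}

// describe decodes a session payload and summarizes its tool choice.
func describe(payload string) (string, error) {
	var s session
	if err := json.Unmarshal([]byte(payload), &s); err != nil {
		return "", err
	}
	if s.ToolChoice.isFunction() {
		return "function:" + s.ToolChoice.Function.Function.Name, nil
	}
	return "string:" + s.ToolChoice.String, nil
}

func main() {
	for _, p := range []string{
		`{"tool_choice":"auto"}`,
		`{"tool_choice":{"type":"function","function":{"name":"get_weather"}}}`,
	} {
		got, err := describe(p)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(got)
	}
}
```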

type ServerTurnDetection ¶

type ServerTurnDetection struct {
	// The type of turn detection ("server_vad" or "none").
	Type ServerTurnDetectionType `json:"type"`

	TurnDetectionParams
}

type ServerTurnDetectionType ¶

type ServerTurnDetectionType string

const (
	ServerTurnDetectionTypeNone      ServerTurnDetectionType = "none"
	ServerTurnDetectionTypeServerVad ServerTurnDetectionType = "server_vad"
)

type SessionCreatedEvent ¶

type SessionCreatedEvent struct {
	ServerEventBase
	// The session resource.
	Session ServerSession `json:"session"`
}

SessionCreatedEvent is the event for session created. Returned when a session is created. Emitted automatically when a new connection is established. See https://platform.openai.com/docs/api-reference/realtime-server-events/session/created

type SessionUpdateEvent ¶

type SessionUpdateEvent struct {
	EventBase
	// Session configuration to update.
	Session ClientSession `json:"session"`
}

SessionUpdateEvent is the event for session update. Send this event to update the session’s default configuration. See https://platform.openai.com/docs/api-reference/realtime-client-events/session/update

func (SessionUpdateEvent) ClientEventType ¶

func (m SessionUpdateEvent) ClientEventType() ClientEventType

func (SessionUpdateEvent) MarshalJSON ¶

func (m SessionUpdateEvent) MarshalJSON() ([]byte, error)

type SessionUpdatedEvent ¶

type SessionUpdatedEvent struct {
	ServerEventBase
	// The updated session resource.
	Session ServerSession `json:"session"`
}

SessionUpdatedEvent is the event for session updated. Returned when a session is updated. See https://platform.openai.com/docs/api-reference/realtime-server-events/session/updated

type StdLogger ¶ added in v0.2.0

type StdLogger struct{}

StdLogger is a logger that writes through the standard library "log" package.

func (StdLogger) Errorf ¶ added in v0.2.0

func (l StdLogger) Errorf(format string, v ...any)

func (StdLogger) Warnf ¶ added in v0.2.0

func (l StdLogger) Warnf(format string, v ...any)

type Tool ¶

type Tool struct {
	Type        ToolType `json:"type"`
	Name        string   `json:"name"`
	Description string   `json:"description"`
	Parameters  any      `json:"parameters"`
}

type ToolChoice ¶

type ToolChoice struct {
	Type     ToolType     `json:"type"`
	Function ToolFunction `json:"function,omitempty"`
}

func (ToolChoice) ToolChoice ¶

func (t ToolChoice) ToolChoice()

type ToolChoiceInterface ¶

type ToolChoiceInterface interface {
	ToolChoice()
}

type ToolChoiceString ¶

type ToolChoiceString string

const (
	ToolChoiceAuto     ToolChoiceString = "auto"
	ToolChoiceNone     ToolChoiceString = "none"
	ToolChoiceRequired ToolChoiceString = "required"
)

func (ToolChoiceString) ToolChoice ¶

func (ToolChoiceString) ToolChoice()

type ToolFunction ¶

type ToolFunction struct {
	Name string `json:"name"`
}

type ToolType ¶

type ToolType string

const (
	ToolTypeFunction ToolType = "function"
)

type TurnDetectionParams ¶

type TurnDetectionParams struct {
	// Activation threshold for VAD.
	Threshold float64 `json:"threshold,omitempty"`
	// Audio included before speech starts (in milliseconds).
	PrefixPaddingMs int `json:"prefix_padding_ms,omitempty"`
	// Duration of silence to detect speech stop (in milliseconds).
	SilenceDurationMs int `json:"silence_duration_ms,omitempty"`
	// Whether or not to automatically generate a response when VAD is enabled. true by default.
	CreateResponse *bool `json:"create_response,omitempty"`
}
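
CreateResponse is a pointer so that an explicit false can be told apart from "not set": with omitempty, a plain bool set to false would be dropped from the JSON. The sketch below demonstrates this with a local copy of the struct; boolPtr and encodeParams are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// turnDetectionParams is a local copy of the documented struct.
type turnDetectionParams struct {
	Threshold         float64 `json:"threshold,omitempty"`
	PrefixPaddingMs   int     `json:"prefix_padding_ms,omitempty"`
	SilenceDurationMs int     `json:"silence_duration_ms,omitempty"`
	CreateResponse    *bool   `json:"create_response,omitempty"`
}

// boolPtr returns a pointer to b, which is the usual way to fill a *bool field.
func boolPtr(b bool) *bool { return &b }

func encodeParams(p turnDetectionParams) string {
	b, err := json.Marshal(p)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// Not set: the field is omitted and the server default applies.
	fmt.Println(encodeParams(turnDetectionParams{SilenceDurationMs: 500}))
	// Explicit false: the field is sent because the pointer is non-nil.
	fmt.Println(encodeParams(turnDetectionParams{
		SilenceDurationMs: 500,
		CreateResponse:    boolPtr(false),
	}))
}
```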

type TurnDetectionType ¶

type TurnDetectionType string

const (
	// TurnDetectionTypeNone means turn detection is disabled.
	// This can only be used in ServerSession, not in ClientSession.
	// If you want to disable turn detection, you should send SessionUpdateEvent with TurnDetection set to nil.
	TurnDetectionTypeNone TurnDetectionType = "none"
	// TurnDetectionTypeServerVad uses server-side VAD to detect turns.
	// This is the default value for a newly created session.
	TurnDetectionTypeServerVad TurnDetectionType = "server_vad"
)

type Usage ¶

type Usage struct {
	TotalTokens  int `json:"total_tokens"`
	InputTokens  int `json:"input_tokens"`
	OutputTokens int `json:"output_tokens"`
	// Input token details.
	InputTokenDetails InputTokenDetails `json:"input_token_details,omitempty"`
	// Output token details.
	OutputTokenDetails OutputTokenDetails `json:"output_token_details,omitempty"`
}

type Voice ¶

type Voice string

const (
	VoiceAlloy   Voice = "alloy"
	VoiceAsh     Voice = "ash"
	VoiceBallad  Voice = "ballad"
	VoiceCoral   Voice = "coral"
	VoiceEcho    Voice = "echo"
	VoiceSage    Voice = "sage"
	VoiceShimmer Voice = "shimmer"
	VoiceVerse   Voice = "verse"
)

type WebSocketConn ¶ added in v0.2.0

type WebSocketConn interface {
	// ReadMessage reads a message from the WebSocket connection.
	//
	// The ctx can be used to cancel the read operation. Its behavior depends on the underlying implementation.
	// If the read succeeds, the returned error should be nil, and the ctx's cancel/timeout shouldn't affect the
	// connection and future read operations.
	//
	// If the returned error is Permanent, future read operations on the same connection will not succeed:
	// the connection is broken and should be closed, or has already been closed.
	//
	// In general, once the ctx is canceled before read finishes, the read operation will be canceled and
	// the connection will be closed.
	//
	// There are some exceptions:
	// - If the underlying implementation is gorilla/websocket, canceling the ctx before its deadline does not
	//   cancel the read operation; it keeps reading until the ctx reaches its deadline or the connection is closed.
	ReadMessage(ctx context.Context) (messageType MessageType, p []byte, err error)

	// WriteMessage writes a message to the WebSocket connection.
	//
	// The ctx can be used to cancel the write operation. Its behavior depends on the underlying implementation.
	//
	// If the returned error is Permanent, future write operations on the same connection will not succeed:
	// the connection is broken and should be closed, or has already been closed.
	//
	// In general, once the ctx is canceled before write finishes, the write operation will be canceled and
	// the connection will be closed.
	WriteMessage(ctx context.Context, messageType MessageType, data []byte) error

	// Close closes the WebSocket connection.
	Close() error

	// Response returns the *http.Response of the WebSocket connection.
	// Commonly used to get response headers.
	Response() *http.Response

	// Ping sends a ping message to the WebSocket connection.
	Ping(ctx context.Context) error
}

WebSocketConn is a WebSocket connection abstraction.
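
Because WebSocketConn is an interface, tests can substitute an in-memory connection for a real socket. The sketch below repeats the interface locally so it compiles on its own and implements a hypothetical fakeConn that replays canned messages. It is not safe for concurrent use and does not model Permanent errors; both are simplifications.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
)

// MessageType and WebSocketConn repeat the documented declarations locally.
type MessageType int

const (
	MessageText MessageType = iota + 1
	MessageBinary
)

type WebSocketConn interface {
	ReadMessage(ctx context.Context) (MessageType, []byte, error)
	WriteMessage(ctx context.Context, messageType MessageType, data []byte) error
	Close() error
	Response() *http.Response
	Ping(ctx context.Context) error
}

var errClosed = errors.New("fake connection closed")

// fakeConn is a hypothetical test double: it replays incoming and records sent.
type fakeConn struct {
	incoming [][]byte
	sent     [][]byte
	closed   bool
}

func (c *fakeConn) ReadMessage(ctx context.Context) (MessageType, []byte, error) {
	if err := ctx.Err(); err != nil {
		return 0, nil, err
	}
	if c.closed || len(c.incoming) == 0 {
		return 0, nil, errClosed
	}
	msg := c.incoming[0]
	c.incoming = c.incoming[1:]
	return MessageText, msg, nil
}

func (c *fakeConn) WriteMessage(ctx context.Context, _ MessageType, data []byte) error {
	if err := ctx.Err(); err != nil {
		return err
	}
	if c.closed {
		return errClosed
	}
	c.sent = append(c.sent, data)
	return nil
}

func (c *fakeConn) Close() error                   { c.closed = true; return nil }
func (c *fakeConn) Response() *http.Response       { return nil }
func (c *fakeConn) Ping(ctx context.Context) error { return ctx.Err() }

// Compile-time check that fakeConn satisfies the interface.
var _ WebSocketConn = (*fakeConn)(nil)

// drain reads until the connection reports an error and returns the payloads.
func drain(conn WebSocketConn) []string {
	var out []string
	for {
		_, p, err := conn.ReadMessage(context.Background())
		if err != nil {
			return out
		}
		out = append(out, string(p))
	}
}

func main() {
	conn := &fakeConn{incoming: [][]byte{
		[]byte(`{"type":"session.created"}`),
		[]byte(`{"type":"response.done"}`),
	}}
	for _, msg := range drain(conn) {
		fmt.Println(msg)
	}
}
```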

type WebSocketDialer ¶ added in v0.2.0

type WebSocketDialer interface {
	// Dial establishes a new WebSocket connection to the given URL.
	// The ctx can be used to cancel the dial operation. Its effect depends on the underlying implementation.
	Dial(ctx context.Context, url string, header http.Header) (WebSocketConn, error)
}

WebSocketDialer is a WebSocket connection dialer abstraction.

func DefaultDialer ¶ added in v0.2.0

func DefaultDialer() WebSocketDialer

DefaultDialer returns a default WebSocketDialer.

Directories ¶

Path	Synopsis
contrib/ws-gorilla	(module)
examples	(module)
test
