puthuggingface

package
v8.19.0 Latest
Warning

This package is not in the latest version of its module.

Published: Jul 31, 2025 License: Apache-2.0 Imports: 14 Imported by: 0

Documentation

Overview

Create a Hugging Face inference endpoint.

Create an inference endpoint to perform an inference task with the `hugging_face` service. Supported tasks include: `text_embedding`, `completion`, and `chat_completion`.

To configure the endpoint, first visit the Hugging Face Inference Endpoints page and create a new endpoint. Select a model that supports the task you intend to use.

For Elastic's `text_embedding` task: The selected model must support the `Sentence Embeddings` task. On the new endpoint creation page, select the `Sentence Embeddings` task under the `Advanced Configuration` section. After the endpoint has initialized, copy the generated endpoint URL. Recommended models for the `text_embedding` task:

* `all-MiniLM-L6-v2`
* `all-MiniLM-L12-v2`
* `all-mpnet-base-v2`
* `e5-base-v2`
* `e5-small-v2`
* `multilingual-e5-base`
* `multilingual-e5-small`

For Elastic's `chat_completion` and `completion` tasks: The selected model must support the `Text Generation` task and expose the OpenAI API. Hugging Face supports both serverless and dedicated endpoints for `Text Generation`. When creating a dedicated endpoint, select the `Text Generation` task. After the endpoint is initialized (for dedicated) or ready (for serverless), ensure it supports the OpenAI API and that its URL includes the `/v1/chat/completions` path. Then copy the full endpoint URL for use. Recommended models for the `chat_completion` and `completion` tasks:

* `Mistral-7B-Instruct-v0.2`
* `QwQ-32B`
* `Phi-3-mini-128k-instruct`

For Elastic's `rerank` task: The selected model must support the `sentence-ranking` task and expose the OpenAI API. Hugging Face currently supports only dedicated (not serverless) endpoints for `Rerank`. After the endpoint is initialized, copy the full endpoint URL for use. Tested models for the `rerank` task:

* `bge-reranker-base`
* `jina-reranker-v1-turbo-en-GGUF`

Index

Constants

This section is empty.

Variables

var ErrBuildPath = errors.New("cannot build path, check for missing path parameters")

ErrBuildPath is returned when required path parameters are missing while building the request.

Functions

This section is empty.

Types

type NewPutHuggingFace

type NewPutHuggingFace func(tasktype, huggingfaceinferenceid string) *PutHuggingFace

NewPutHuggingFace is the constructor function type used by the library's API index.

func NewPutHuggingFaceFunc

func NewPutHuggingFaceFunc(tp elastictransport.Interface) NewPutHuggingFace

NewPutHuggingFaceFunc returns a new instance of PutHuggingFace with the provided transport. Used in the library's API index, this allows every API to be retrieved in one place.

type PutHuggingFace

type PutHuggingFace struct {
	// contains filtered or unexported fields
}

func New

Create a Hugging Face inference endpoint.

https://www.elastic.co/guide/en/elasticsearch/reference/current/infer-service-hugging-face.html

func (*PutHuggingFace) ChunkingSettings

func (r *PutHuggingFace) ChunkingSettings(chunkingsettings *types.InferenceChunkingSettings) *PutHuggingFace

ChunkingSettings The chunking configuration object. API name: chunking_settings

func (PutHuggingFace) Do

func (r PutHuggingFace) Do(providedCtx context.Context) (*Response, error)

Do runs the request through the transport, handles the response, and returns a puthuggingface.Response.

func (*PutHuggingFace) ErrorTrace

func (r *PutHuggingFace) ErrorTrace(errortrace bool) *PutHuggingFace

ErrorTrace When set to `true`, Elasticsearch will include the full stack trace of errors when they occur. API name: error_trace

func (*PutHuggingFace) FilterPath

func (r *PutHuggingFace) FilterPath(filterpaths ...string) *PutHuggingFace

FilterPath Comma-separated list of filters in dot notation which reduce the response returned by Elasticsearch. API name: filter_path

func (*PutHuggingFace) Header

func (r *PutHuggingFace) Header(key, value string) *PutHuggingFace

Header set a key, value pair in the PutHuggingFace headers map.

func (*PutHuggingFace) HttpRequest

func (r *PutHuggingFace) HttpRequest(ctx context.Context) (*http.Request, error)

HttpRequest returns the http.Request object built from the given parameters.

func (*PutHuggingFace) Human

func (r *PutHuggingFace) Human(human bool) *PutHuggingFace

Human When set to `true` will return statistics in a format suitable for humans. For example `"exists_time": "1h"` for humans and `"exists_time_in_millis": 3600000` for computers. When disabled, the human-readable values will be omitted. This makes sense for responses consumed only by machines. API name: human

func (PutHuggingFace) Perform

func (r PutHuggingFace) Perform(providedCtx context.Context) (*http.Response, error)

Perform runs the http.Request through the provided transport and returns an http.Response.

func (*PutHuggingFace) Pretty

func (r *PutHuggingFace) Pretty(pretty bool) *PutHuggingFace

Pretty If set to `true` the returned JSON will be "pretty-formatted". Use this option for debugging only. API name: pretty

func (*PutHuggingFace) Raw

func (r *PutHuggingFace) Raw(raw io.Reader) *PutHuggingFace

Raw takes a JSON payload as input, which is then passed to the http.Request. If specified, Raw takes precedence over the Request method.

func (*PutHuggingFace) Request

func (r *PutHuggingFace) Request(req *Request) *PutHuggingFace

Request allows setting the request property with the appropriate payload.

func (*PutHuggingFace) Service

func (r *PutHuggingFace) Service(service huggingfaceservicetype.HuggingFaceServiceType) *PutHuggingFace

Service The type of service supported for the specified task type. In this case, `hugging_face`. API name: service

func (*PutHuggingFace) ServiceSettings

func (r *PutHuggingFace) ServiceSettings(servicesettings *types.HuggingFaceServiceSettings) *PutHuggingFace

ServiceSettings Settings used to install the inference model. These settings are specific to the `hugging_face` service. API name: service_settings

func (*PutHuggingFace) TaskSettings added in v8.19.0

func (r *PutHuggingFace) TaskSettings(tasksettings *types.HuggingFaceTaskSettings) *PutHuggingFace

TaskSettings Settings to configure the inference task. These settings are specific to the task type you specified. API name: task_settings

func (*PutHuggingFace) Timeout added in v8.19.0

func (r *PutHuggingFace) Timeout(duration string) *PutHuggingFace

Timeout Specifies the amount of time to wait for the inference endpoint to be created. API name: timeout

type Request

type Request struct {

	// ChunkingSettings The chunking configuration object.
	ChunkingSettings *types.InferenceChunkingSettings `json:"chunking_settings,omitempty"`
	// Service The type of service supported for the specified task type. In this case,
	// `hugging_face`.
	Service huggingfaceservicetype.HuggingFaceServiceType `json:"service"`
	// ServiceSettings Settings used to install the inference model. These settings are specific to
	// the `hugging_face` service.
	ServiceSettings types.HuggingFaceServiceSettings `json:"service_settings"`
	// TaskSettings Settings to configure the inference task.
	// These settings are specific to the task type you specified.
	TaskSettings *types.HuggingFaceTaskSettings `json:"task_settings,omitempty"`
}

Request holds the request body struct for the package puthuggingface

https://github.com/elastic/elasticsearch-specification/blob/470b4b9aaaa25cae633ec690e54b725c6fc939c7/specification/inference/put_hugging_face/PutHuggingFaceRequest.ts#L31-L121

func NewRequest

func NewRequest() *Request

NewRequest returns a Request

func (*Request) FromJSON

func (r *Request) FromJSON(data string) (*Request, error)

FromJSON loads arbitrary JSON into the request structure.

type Response

type Response struct {

	// ChunkingSettings Chunking configuration object
	ChunkingSettings *types.InferenceChunkingSettings `json:"chunking_settings,omitempty"`
	// InferenceId The inference Id
	InferenceId string `json:"inference_id"`
	// Service The service type
	Service string `json:"service"`
	// ServiceSettings Settings specific to the service
	ServiceSettings json.RawMessage `json:"service_settings"`
	// TaskSettings Task settings specific to the service and task type
	TaskSettings json.RawMessage `json:"task_settings,omitempty"`
	// TaskType The task type
	TaskType tasktypehuggingface.TaskTypeHuggingFace `json:"task_type"`
}

Response holds the response body struct for the package puthuggingface

https://github.com/elastic/elasticsearch-specification/blob/470b4b9aaaa25cae633ec690e54b725c6fc939c7/specification/inference/put_hugging_face/PutHuggingFaceResponse.ts#L22-L24

func NewResponse

func NewResponse() *Response

NewResponse returns a Response
