tensorrt

package

v0.3.1 Latest Latest Go to latest Published: Mar 21, 2026 License: Apache-2.0 Imports: 6 Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/zerfoo/ztensor

Links

Open Source Insights

Documentation ¶

Overview ¶

Package tensorrt provides bindings for the NVIDIA TensorRT inference library via purego (dlopen/dlsym, no CGo). It loads libtrt_capi.so at runtime. Call Available() to check whether the library was found.

Index ¶

func Available() bool
func ShuffleSetFirstTranspose(layer *Layer, perm []int32)
func ShuffleSetReshapeDims(layer *Layer, dims []int32)
type ActivationType
type Builder
- func CreateBuilder(logger *Logger) (*Builder, error)
- func (b *Builder) BuildSerializedNetwork(network *NetworkDefinition, config *BuilderConfig) ([]byte, error)
- func (b *Builder) CreateBuilderConfig() (*BuilderConfig, error)
- func (b *Builder) CreateNetwork() (*NetworkDefinition, error)
- func (b *Builder) CreateOptimizationProfile() (*OptimizationProfile, error)
- func (b *Builder) Destroy()
type BuilderConfig
- func (c *BuilderConfig) Destroy()
- func (c *BuilderConfig) SetFlag(flag BuilderFlag)
- func (c *BuilderConfig) SetMemoryPoolLimit(bytes int)
type BuilderFlag
type DataType
type ElementWiseOp
type Engine
- func (e *Engine) CreateExecutionContext() (*ExecutionContext, error)
- func (e *Engine) Destroy()
- func (e *Engine) GetIOTensorName(index int) string
- func (e *Engine) NumIOTensors() int
type ExecutionContext
- func (c *ExecutionContext) Destroy()
- func (c *ExecutionContext) EnqueueV3(stream unsafe.Pointer) error
- func (c *ExecutionContext) SetInputShape(name string, dims []int32) error
- func (c *ExecutionContext) SetOptimizationProfile(index int) error
- func (c *ExecutionContext) SetTensorAddress(name string, data unsafe.Pointer) error
type Layer
- func (l *Layer) GetOutput(index int) *Tensor
- func (l *Layer) SetName(name string)
type Logger
- func CreateLogger(severity Severity) *Logger
- func (l *Logger) Destroy()
type MatrixOp
type NetworkDefinition
- func (n *NetworkDefinition) AddActivation(input *Tensor, actType ActivationType) *Layer
- func (n *NetworkDefinition) AddConstant(dims []int32, dtype DataType, weights unsafe.Pointer, count int64) *Layer
- func (n *NetworkDefinition) AddConvolutionNd(input *Tensor, nbOutputMaps int, kernelSize []int32, ...) *Layer
- func (n *NetworkDefinition) AddElementWise(input1, input2 *Tensor, op ElementWiseOp) *Layer
- func (n *NetworkDefinition) AddInput(name string, dtype DataType, dims []int32) *Tensor
- func (n *NetworkDefinition) AddMatrixMultiply(input0 *Tensor, op0 MatrixOp, input1 *Tensor, op1 MatrixOp) *Layer
- func (n *NetworkDefinition) AddReduce(input *Tensor, op ReduceOp, reduceAxes uint32, keepDims bool) *Layer
- func (n *NetworkDefinition) AddShuffle(input *Tensor) *Layer
- func (n *NetworkDefinition) AddSoftMax(input *Tensor, axis int) *Layer
- func (n *NetworkDefinition) Destroy()
- func (n *NetworkDefinition) MarkOutput(tensor *Tensor)
- func (n *NetworkDefinition) NumInputs() int
- func (n *NetworkDefinition) NumLayers() int
- func (n *NetworkDefinition) NumOutputs() int
type OptimizationProfile
- func (p *OptimizationProfile) AddToConfig(config *BuilderConfig) (int, error)
- func (p *OptimizationProfile) SetDimensions(inputName string, minDims, optDims, maxDims []int32) error
type ReduceOp
type Runtime
- func CreateRuntime(logger *Logger) (*Runtime, error)
- func (r *Runtime) DeserializeEngine(data []byte) (*Engine, error)
- func (r *Runtime) Destroy()
type Severity
type TabularConfig
type TabularEngine
- func CompileTabular[T tensor.Numeric](plan *graph.ExecutionPlan[T], cfg TabularConfig) (*TabularEngine, error)
- func (te *TabularEngine) Destroy()
- func (te *TabularEngine) Infer(inputData []float32, batchSize int) ([]float32, error)
type Tensor

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func Available ¶

func Available() bool

Available returns true if the TensorRT C shim library can be loaded. The result is cached after the first call.

func ShuffleSetFirstTranspose ¶

func ShuffleSetFirstTranspose(layer *Layer, perm []int32)

ShuffleSetFirstTranspose sets the first transpose permutation on a shuffle layer.

func ShuffleSetReshapeDims ¶

func ShuffleSetReshapeDims(layer *Layer, dims []int32)

ShuffleSetReshapeDims sets reshape dimensions on a shuffle layer.

Types ¶

type ActivationType ¶

type ActivationType int

ActivationType specifies the activation function.

const (
	ActivationReLU    ActivationType = 0
	ActivationSigmoid ActivationType = 1
	ActivationTanh    ActivationType = 2
)

type Builder ¶

type Builder struct {
	// contains filtered or unexported fields
}

Builder wraps a TensorRT IBuilder.

func CreateBuilder ¶

func CreateBuilder(logger *Logger) (*Builder, error)

CreateBuilder creates a new TensorRT builder.

func (*Builder) BuildSerializedNetwork ¶

func (b *Builder) BuildSerializedNetwork(network *NetworkDefinition, config *BuilderConfig) ([]byte, error)

BuildSerializedNetwork builds an optimized engine from the network and returns serialized bytes. The caller must use Runtime.DeserializeEngine to load it.

func (*Builder) CreateBuilderConfig ¶

func (b *Builder) CreateBuilderConfig() (*BuilderConfig, error)

CreateBuilderConfig creates a new builder configuration.

func (*Builder) CreateNetwork ¶

func (b *Builder) CreateNetwork() (*NetworkDefinition, error)

CreateNetwork creates a new network definition with explicit batch mode.

func (*Builder) CreateOptimizationProfile ¶

func (b *Builder) CreateOptimizationProfile() (*OptimizationProfile, error)

CreateOptimizationProfile creates a new optimization profile from the builder.

func (*Builder) Destroy ¶

func (b *Builder) Destroy()

Destroy releases the builder.

type BuilderConfig ¶

type BuilderConfig struct {
	// contains filtered or unexported fields
}

BuilderConfig wraps a TensorRT IBuilderConfig.

func (*BuilderConfig) Destroy ¶

func (c *BuilderConfig) Destroy()

Destroy releases the builder config.

func (*BuilderConfig) SetFlag ¶

func (c *BuilderConfig) SetFlag(flag BuilderFlag)

SetFlag enables a builder flag (e.g., FP16 precision).

func (*BuilderConfig) SetMemoryPoolLimit ¶

func (c *BuilderConfig) SetMemoryPoolLimit(bytes int)

SetMemoryPoolLimit sets the maximum workspace memory for engine building.

type BuilderFlag ¶

type BuilderFlag int

BuilderFlag controls engine build options.

const (
	FlagFP16 BuilderFlag = 0
	FlagINT8 BuilderFlag = 1
)

type DataType ¶

type DataType int

DataType specifies the element type for TensorRT tensors.

const (
	Float32 DataType = 0
	Float16 DataType = 1
	Int8    DataType = 2
	Int32   DataType = 3
)

type ElementWiseOp ¶

type ElementWiseOp int

ElementWiseOp specifies the elementwise operation.

const (
	ElementWiseSum  ElementWiseOp = 0
	ElementWiseProd ElementWiseOp = 1
	ElementWiseMax  ElementWiseOp = 2
	ElementWiseMin  ElementWiseOp = 3
	ElementWiseSub  ElementWiseOp = 4
	ElementWiseDiv  ElementWiseOp = 5
)

type Engine ¶

type Engine struct {
	// contains filtered or unexported fields
}

Engine wraps a TensorRT ICudaEngine.

func (*Engine) CreateExecutionContext ¶

func (e *Engine) CreateExecutionContext() (*ExecutionContext, error)

CreateExecutionContext creates an execution context for this engine.

func (*Engine) Destroy ¶

func (e *Engine) Destroy()

Destroy releases the engine.

func (*Engine) GetIOTensorName ¶

func (e *Engine) GetIOTensorName(index int) string

GetIOTensorName returns the name of the I/O tensor at the given index.

func (*Engine) NumIOTensors ¶

func (e *Engine) NumIOTensors() int

NumIOTensors returns the number of input/output tensors.

type ExecutionContext ¶

type ExecutionContext struct {
	// contains filtered or unexported fields
}

ExecutionContext wraps a TensorRT IExecutionContext.

func (*ExecutionContext) Destroy ¶

func (c *ExecutionContext) Destroy()

Destroy releases the execution context.

func (*ExecutionContext) EnqueueV3 ¶

func (c *ExecutionContext) EnqueueV3(stream unsafe.Pointer) error

EnqueueV3 enqueues inference on the given CUDA stream.

func (*ExecutionContext) SetInputShape ¶

func (c *ExecutionContext) SetInputShape(name string, dims []int32) error

SetInputShape sets the input shape for a named tensor on the execution context. Required for dynamic shapes before calling EnqueueV3.

func (*ExecutionContext) SetOptimizationProfile ¶

func (c *ExecutionContext) SetOptimizationProfile(index int) error

SetOptimizationProfile sets the active optimization profile on the context.

func (*ExecutionContext) SetTensorAddress ¶

func (c *ExecutionContext) SetTensorAddress(name string, data unsafe.Pointer) error

SetTensorAddress binds a device pointer to a named tensor.

type Layer ¶

type Layer struct {
	// contains filtered or unexported fields
}

Layer wraps a TensorRT ILayer pointer.

func (*Layer) GetOutput ¶

func (l *Layer) GetOutput(index int) *Tensor

GetOutput returns the output tensor at the given index.

func (*Layer) SetName ¶

func (l *Layer) SetName(name string)

SetName sets the layer name for debugging.

type Logger ¶

type Logger struct {
	// contains filtered or unexported fields
}

Logger wraps a TensorRT ILogger.

func CreateLogger ¶

func CreateLogger(severity Severity) *Logger

CreateLogger creates a new TensorRT logger with the given minimum severity.

func (*Logger) Destroy ¶

func (l *Logger) Destroy()

Destroy releases the logger.

type MatrixOp ¶

type MatrixOp int

MatrixOp specifies whether to transpose a matrix multiply operand.

const (
	MatrixOpNone      MatrixOp = 0
	MatrixOpTranspose MatrixOp = 1
)

type NetworkDefinition ¶

type NetworkDefinition struct {
	// contains filtered or unexported fields
}

NetworkDefinition wraps a TensorRT INetworkDefinition.

func (*NetworkDefinition) AddActivation ¶

func (n *NetworkDefinition) AddActivation(input *Tensor, actType ActivationType) *Layer

AddActivation adds an activation layer.

func (*NetworkDefinition) AddConstant ¶

func (n *NetworkDefinition) AddConstant(dims []int32, dtype DataType, weights unsafe.Pointer, count int64) *Layer

AddConstant adds a constant tensor layer.

func (*NetworkDefinition) AddConvolutionNd ¶

func (n *NetworkDefinition) AddConvolutionNd(input *Tensor, nbOutputMaps int,
	kernelSize []int32, kernelWeights unsafe.Pointer, kernelCount int64,
	biasWeights unsafe.Pointer, biasCount int64) *Layer

AddConvolutionNd adds an N-dimensional convolution layer.

func (*NetworkDefinition) AddElementWise ¶

func (n *NetworkDefinition) AddElementWise(input1, input2 *Tensor, op ElementWiseOp) *Layer

AddElementWise adds an elementwise operation layer.

func (*NetworkDefinition) AddInput ¶

func (n *NetworkDefinition) AddInput(name string, dtype DataType, dims []int32) *Tensor

AddInput adds a network input tensor.

func (*NetworkDefinition) AddMatrixMultiply ¶

func (n *NetworkDefinition) AddMatrixMultiply(input0 *Tensor, op0 MatrixOp,
	input1 *Tensor, op1 MatrixOp) *Layer

AddMatrixMultiply adds a matrix multiplication layer.

func (*NetworkDefinition) AddReduce ¶

func (n *NetworkDefinition) AddReduce(input *Tensor, op ReduceOp, reduceAxes uint32, keepDims bool) *Layer

AddReduce adds a reduce layer.

func (*NetworkDefinition) AddShuffle ¶

func (n *NetworkDefinition) AddShuffle(input *Tensor) *Layer

AddShuffle adds a shuffle (reshape/transpose) layer.

func (*NetworkDefinition) AddSoftMax ¶

func (n *NetworkDefinition) AddSoftMax(input *Tensor, axis int) *Layer

AddSoftMax adds a softmax layer with the given axis.

func (*NetworkDefinition) Destroy ¶

func (n *NetworkDefinition) Destroy()

Destroy releases the network definition.

func (*NetworkDefinition) MarkOutput ¶

func (n *NetworkDefinition) MarkOutput(tensor *Tensor)

MarkOutput marks a tensor as a network output.

func (*NetworkDefinition) NumInputs ¶

func (n *NetworkDefinition) NumInputs() int

NumInputs returns the number of network inputs.

func (*NetworkDefinition) NumLayers ¶

func (n *NetworkDefinition) NumLayers() int

NumLayers returns the number of network layers.

func (*NetworkDefinition) NumOutputs ¶

func (n *NetworkDefinition) NumOutputs() int

NumOutputs returns the number of network outputs.

type OptimizationProfile ¶

type OptimizationProfile struct {
	// contains filtered or unexported fields
}

OptimizationProfile wraps a TensorRT IOptimizationProfile.

func (*OptimizationProfile) AddToConfig ¶

func (p *OptimizationProfile) AddToConfig(config *BuilderConfig) (int, error)

AddToConfig adds this optimization profile to a builder config. Returns the profile index.

func (*OptimizationProfile) SetDimensions ¶

func (p *OptimizationProfile) SetDimensions(inputName string, minDims, optDims, maxDims []int32) error

SetDimensions sets the min/opt/max dimensions for a named input tensor.

type ReduceOp ¶

type ReduceOp int

ReduceOp specifies the reduction operation.

const (
	ReduceSum  ReduceOp = 0
	ReduceProd ReduceOp = 1
	ReduceMax  ReduceOp = 2
	ReduceMin  ReduceOp = 3
	ReduceAvg  ReduceOp = 4
)

type Runtime ¶

type Runtime struct {
	// contains filtered or unexported fields
}

Runtime wraps a TensorRT IRuntime for deserializing engines.

func CreateRuntime ¶

func CreateRuntime(logger *Logger) (*Runtime, error)

CreateRuntime creates a new TensorRT runtime.

func (*Runtime) DeserializeEngine ¶

func (r *Runtime) DeserializeEngine(data []byte) (*Engine, error)

DeserializeEngine loads a serialized engine.

func (*Runtime) Destroy ¶

func (r *Runtime) Destroy()

Destroy releases the runtime.

type Severity ¶

type Severity int

Severity controls the minimum log level for TensorRT's internal logger.

const (
	SeverityInternalError Severity = 0
	SeverityError         Severity = 1
	SeverityWarning       Severity = 2
	SeverityInfo          Severity = 3
	SeverityVerbose       Severity = 4
)

type TabularConfig ¶ added in v0.3.0

type TabularConfig struct {
	// FP16 enables FP16 precision mode. Default false (FP32).
	FP16 bool

	// MaxWorkspaceBytes sets the maximum TensorRT workspace memory.
	// Default: 64 MB.
	MaxWorkspaceBytes int

	// MaxBatchSize is the maximum batch dimension for dynamic shapes.
	// Default: 64.
	MaxBatchSize int

	// OptBatchSize is the optimal batch size for the optimization profile.
	// Default: 1.
	OptBatchSize int
}

TabularConfig controls compilation of a tabular model graph to TensorRT.

type TabularEngine ¶ added in v0.3.0

type TabularEngine struct {
	// contains filtered or unexported fields
}

TabularEngine wraps a compiled TensorRT engine for tabular model inference. It holds the serialized engine, runtime, execution context, and CUDA stream.

func CompileTabular ¶ added in v0.3.0

func CompileTabular[T tensor.Numeric](plan *graph.ExecutionPlan[T], cfg TabularConfig) (*TabularEngine, error)

CompileTabular compiles a graph.ExecutionPlan from a tabular model into a TensorRT engine. Tabular models are small feed-forward networks (MLP) with operations: MatMul, Add, ReLU, Sigmoid, Tanh, Softmax, ReduceSum.

The plan must have been compiled from a graph with a single input tensor of shape [batch, features] and a single output tensor.

func (*TabularEngine) Destroy ¶ added in v0.3.0

func (te *TabularEngine) Destroy()

Destroy releases all TensorRT and CUDA resources.

func (*TabularEngine) Infer ¶ added in v0.3.0

func (te *TabularEngine) Infer(inputData []float32, batchSize int) ([]float32, error)

Infer runs inference on the compiled TensorRT engine with the given input data. inputData shape must be [batch, features]. Returns output data as a flat float32 slice.

type Tensor ¶

type Tensor struct {
	// contains filtered or unexported fields
}

Tensor wraps a TensorRT ITensor pointer.

Source Files ¶

View all Source files

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL