Documentation
Overview ¶
Package tensorrt provides Go bindings for the NVIDIA TensorRT inference library via purego (dlopen/dlsym, no cgo). The C shim library libtrt_capi.so is loaded at runtime; call Available() to check whether it was found.
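A minimal sketch of the availability gate before using any other function in the package (the import path is an assumption; substitute your module's actual path):

```go
package main

import (
	"fmt"

	// Hypothetical import path; use your module's real path.
	"example.com/yourmodule/tensorrt"
)

func main() {
	// Available loads libtrt_capi.so on the first call and caches the
	// result, so it is cheap to call up front.
	if !tensorrt.Available() {
		fmt.Println("TensorRT shim not found; falling back to another backend")
		return
	}
	fmt.Println("TensorRT available")
}
```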
Index ¶
- func Available() bool
- func ShuffleSetFirstTranspose(layer *Layer, perm []int32)
- func ShuffleSetReshapeDims(layer *Layer, dims []int32)
- type ActivationType
- type Builder
- func (b *Builder) BuildSerializedNetwork(network *NetworkDefinition, config *BuilderConfig) ([]byte, error)
- func (b *Builder) CreateBuilderConfig() (*BuilderConfig, error)
- func (b *Builder) CreateNetwork() (*NetworkDefinition, error)
- func (b *Builder) CreateOptimizationProfile() (*OptimizationProfile, error)
- func (b *Builder) Destroy()
- type BuilderConfig
- func (c *BuilderConfig) Destroy()
- func (c *BuilderConfig) SetFlag(flag BuilderFlag)
- func (c *BuilderConfig) SetMemoryPoolLimit(bytes int)
- type BuilderFlag
- type DataType
- type ElementWiseOp
- type Engine
- func (e *Engine) CreateExecutionContext() (*ExecutionContext, error)
- type ExecutionContext
- func (c *ExecutionContext) Destroy()
- func (c *ExecutionContext) EnqueueV3(stream unsafe.Pointer) error
- func (c *ExecutionContext) SetInputShape(name string, dims []int32) error
- func (c *ExecutionContext) SetOptimizationProfile(index int) error
- func (c *ExecutionContext) SetTensorAddress(name string, data unsafe.Pointer) error
- type Layer
- type Logger
- type MatrixOp
- type NetworkDefinition
- func (n *NetworkDefinition) AddActivation(input *Tensor, actType ActivationType) *Layer
- func (n *NetworkDefinition) AddConstant(dims []int32, dtype DataType, weights unsafe.Pointer, count int64) *Layer
- func (n *NetworkDefinition) AddConvolutionNd(input *Tensor, nbOutputMaps int, kernelSize []int32, ...) *Layer
- func (n *NetworkDefinition) AddElementWise(input1, input2 *Tensor, op ElementWiseOp) *Layer
- func (n *NetworkDefinition) AddInput(name string, dtype DataType, dims []int32) *Tensor
- func (n *NetworkDefinition) AddMatrixMultiply(input0 *Tensor, op0 MatrixOp, input1 *Tensor, op1 MatrixOp) *Layer
- func (n *NetworkDefinition) AddReduce(input *Tensor, op ReduceOp, reduceAxes uint32, keepDims bool) *Layer
- func (n *NetworkDefinition) AddShuffle(input *Tensor) *Layer
- func (n *NetworkDefinition) AddSoftMax(input *Tensor, axis int) *Layer
- func (n *NetworkDefinition) Destroy()
- func (n *NetworkDefinition) MarkOutput(tensor *Tensor)
- func (n *NetworkDefinition) NumInputs() int
- func (n *NetworkDefinition) NumLayers() int
- func (n *NetworkDefinition) NumOutputs() int
- type OptimizationProfile
- func (p *OptimizationProfile) AddToConfig(config *BuilderConfig) (int, error)
- func (p *OptimizationProfile) SetDimensions(inputName string, minDims, optDims, maxDims []int32) error
- type ReduceOp
- type Runtime
- type Severity
- type TabularConfig
- type TabularEngine
- func CompileTabular[T tensor.Numeric](plan *graph.ExecutionPlan[T], cfg TabularConfig) (*TabularEngine, error)
- func (te *TabularEngine) Destroy()
- type Tensor
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func Available ¶
func Available() bool
Available returns true if the TensorRT C shim library can be loaded. The result is cached after the first call.
func ShuffleSetFirstTranspose ¶
func ShuffleSetFirstTranspose(layer *Layer, perm []int32)
ShuffleSetFirstTranspose sets the first transpose permutation on a shuffle layer.
func ShuffleSetReshapeDims ¶
func ShuffleSetReshapeDims(layer *Layer, dims []int32)
ShuffleSetReshapeDims sets reshape dimensions on a shuffle layer.
Types ¶
type ActivationType ¶
type ActivationType int
ActivationType specifies the activation function.
const (
	ActivationReLU    ActivationType = 0
	ActivationSigmoid ActivationType = 1
	ActivationTanh    ActivationType = 2
)
type Builder ¶
type Builder struct {
// contains filtered or unexported fields
}
Builder wraps a TensorRT IBuilder.
func CreateBuilder ¶
CreateBuilder creates a new TensorRT builder.
func (*Builder) BuildSerializedNetwork ¶
func (b *Builder) BuildSerializedNetwork(network *NetworkDefinition, config *BuilderConfig) ([]byte, error)
BuildSerializedNetwork builds an optimized engine from the network and returns serialized bytes. The caller must use Runtime.DeserializeEngine to load it.
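A sketch of the build-then-load flow described above, assuming the package imports as tensorrt. DeserializeEngine's exact signature is not shown on this page; the ([]byte) → (*Engine, error) form below is an assumption.

```go
// buildAndLoad builds a serialized engine from a finished network and
// loads it through a runtime.
func buildAndLoad(b *tensorrt.Builder, rt *tensorrt.Runtime,
	network *tensorrt.NetworkDefinition) (*tensorrt.Engine, error) {

	cfg, err := b.CreateBuilderConfig()
	if err != nil {
		return nil, err
	}
	defer cfg.Destroy()

	cfg.SetFlag(tensorrt.FlagFP16)   // optional: allow FP16 kernels
	cfg.SetMemoryPoolLimit(64 << 20) // 64 MB workspace

	data, err := b.BuildSerializedNetwork(network, cfg)
	if err != nil {
		return nil, err
	}
	// Assumed signature; see the note above.
	return rt.DeserializeEngine(data)
}
```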
func (*Builder) CreateBuilderConfig ¶
func (b *Builder) CreateBuilderConfig() (*BuilderConfig, error)
CreateBuilderConfig creates a new builder configuration.
func (*Builder) CreateNetwork ¶
func (b *Builder) CreateNetwork() (*NetworkDefinition, error)
CreateNetwork creates a new network definition with explicit batch mode.
func (*Builder) CreateOptimizationProfile ¶
func (b *Builder) CreateOptimizationProfile() (*OptimizationProfile, error)
CreateOptimizationProfile creates a new optimization profile from the builder.
type BuilderConfig ¶
type BuilderConfig struct {
// contains filtered or unexported fields
}
BuilderConfig wraps a TensorRT IBuilderConfig.
func (*BuilderConfig) Destroy ¶
func (c *BuilderConfig) Destroy()
Destroy releases the builder config.
func (*BuilderConfig) SetFlag ¶
func (c *BuilderConfig) SetFlag(flag BuilderFlag)
SetFlag enables a builder flag (e.g., FP16 precision).
func (*BuilderConfig) SetMemoryPoolLimit ¶
func (c *BuilderConfig) SetMemoryPoolLimit(bytes int)
SetMemoryPoolLimit sets the maximum workspace memory for engine building.
type BuilderFlag ¶
type BuilderFlag int
BuilderFlag controls engine build options.
const (
	FlagFP16 BuilderFlag = 0
	FlagINT8 BuilderFlag = 1
)
type ElementWiseOp ¶
type ElementWiseOp int
ElementWiseOp specifies the elementwise operation.
const (
	ElementWiseSum  ElementWiseOp = 0
	ElementWiseProd ElementWiseOp = 1
	ElementWiseMax  ElementWiseOp = 2
	ElementWiseMin  ElementWiseOp = 3
	ElementWiseSub  ElementWiseOp = 4
	ElementWiseDiv  ElementWiseOp = 5
)
```
type Engine ¶
type Engine struct {
// contains filtered or unexported fields
}
Engine wraps a TensorRT ICudaEngine.
func (*Engine) CreateExecutionContext ¶
func (e *Engine) CreateExecutionContext() (*ExecutionContext, error)
CreateExecutionContext creates an execution context for this engine.
func (*Engine) GetIOTensorName ¶
GetIOTensorName returns the name of the I/O tensor at the given index.
func (*Engine) NumIOTensors ¶
NumIOTensors returns the number of input/output tensors.
type ExecutionContext ¶
type ExecutionContext struct {
// contains filtered or unexported fields
}
ExecutionContext wraps a TensorRT IExecutionContext.
func (*ExecutionContext) Destroy ¶
func (c *ExecutionContext) Destroy()
Destroy releases the execution context.
func (*ExecutionContext) EnqueueV3 ¶
func (c *ExecutionContext) EnqueueV3(stream unsafe.Pointer) error
EnqueueV3 enqueues inference on the given CUDA stream.
func (*ExecutionContext) SetInputShape ¶
func (c *ExecutionContext) SetInputShape(name string, dims []int32) error
SetInputShape sets the input shape for a named tensor on the execution context. Required for dynamic shapes before calling EnqueueV3.
func (*ExecutionContext) SetOptimizationProfile ¶
func (c *ExecutionContext) SetOptimizationProfile(index int) error
SetOptimizationProfile sets the active optimization profile on the context.
func (*ExecutionContext) SetTensorAddress ¶
func (c *ExecutionContext) SetTensorAddress(name string, data unsafe.Pointer) error
SetTensorAddress binds a device pointer to a named tensor.
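The three calls above combine into a per-inference sequence. A hedged sketch, assuming a dynamic batch dimension and tensors named "input" and "output" (both names are placeholders for whatever the engine actually reports via GetIOTensorName):

```go
// infer runs one inference. in and out are device pointers (e.g. from
// cudaMalloc); stream is a CUDA stream handle.
func infer(ctx *tensorrt.ExecutionContext, in, out unsafe.Pointer,
	stream unsafe.Pointer, batch, features int32) error {

	// Dynamic shapes must be resolved before EnqueueV3.
	if err := ctx.SetInputShape("input", []int32{batch, features}); err != nil {
		return err
	}
	if err := ctx.SetTensorAddress("input", in); err != nil {
		return err
	}
	if err := ctx.SetTensorAddress("output", out); err != nil {
		return err
	}
	// Asynchronous: synchronize the stream before reading out.
	return ctx.EnqueueV3(stream)
}
```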
type Layer ¶
type Layer struct {
// contains filtered or unexported fields
}
Layer wraps a TensorRT ILayer pointer.
type Logger ¶
type Logger struct {
// contains filtered or unexported fields
}
Logger wraps a TensorRT ILogger.
func CreateLogger ¶
CreateLogger creates a new TensorRT logger with the given minimum severity.
type MatrixOp ¶
type MatrixOp int
MatrixOp specifies whether to transpose a matrix multiply operand.
type NetworkDefinition ¶
type NetworkDefinition struct {
// contains filtered or unexported fields
}
NetworkDefinition wraps a TensorRT INetworkDefinition.
func (*NetworkDefinition) AddActivation ¶
func (n *NetworkDefinition) AddActivation(input *Tensor, actType ActivationType) *Layer
AddActivation adds an activation layer.
func (*NetworkDefinition) AddConstant ¶
func (n *NetworkDefinition) AddConstant(dims []int32, dtype DataType, weights unsafe.Pointer, count int64) *Layer
AddConstant adds a constant tensor layer.
func (*NetworkDefinition) AddConvolutionNd ¶
func (n *NetworkDefinition) AddConvolutionNd(input *Tensor, nbOutputMaps int, kernelSize []int32, kernelWeights unsafe.Pointer, kernelCount int64, biasWeights unsafe.Pointer, biasCount int64) *Layer
AddConvolutionNd adds an N-dimensional convolution layer.
func (*NetworkDefinition) AddElementWise ¶
func (n *NetworkDefinition) AddElementWise(input1, input2 *Tensor, op ElementWiseOp) *Layer
AddElementWise adds an elementwise operation layer.
func (*NetworkDefinition) AddInput ¶
func (n *NetworkDefinition) AddInput(name string, dtype DataType, dims []int32) *Tensor
AddInput adds a network input tensor.
func (*NetworkDefinition) AddMatrixMultiply ¶
func (n *NetworkDefinition) AddMatrixMultiply(input0 *Tensor, op0 MatrixOp, input1 *Tensor, op1 MatrixOp) *Layer
AddMatrixMultiply adds a matrix multiplication layer.
func (*NetworkDefinition) AddReduce ¶
func (n *NetworkDefinition) AddReduce(input *Tensor, op ReduceOp, reduceAxes uint32, keepDims bool) *Layer
AddReduce adds a reduce layer.
func (*NetworkDefinition) AddShuffle ¶
func (n *NetworkDefinition) AddShuffle(input *Tensor) *Layer
AddShuffle adds a shuffle (reshape/transpose) layer.
func (*NetworkDefinition) AddSoftMax ¶
func (n *NetworkDefinition) AddSoftMax(input *Tensor, axis int) *Layer
AddSoftMax adds a softmax layer with the given axis.
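The Add* methods above compose into a small graph. A sketch of one dense layer followed by softmax, with two loud caveats: this page does not document how to read a layer's output tensor, so Output(0) below is a hypothetical accessor standing in for whatever the package provides; and the DataType and MatrixOp constants are not shown here, so dtype comes from the caller and MatrixOp(0) is assumed to mean "no transpose".

```go
// addDenseSoftmax adds input -> (x · W + b) -> softmax to the network.
// w and bias are host pointers to the weight data; Output(0) is a
// hypothetical layer-output accessor (see the note above).
func addDenseSoftmax(n *tensorrt.NetworkDefinition, dtype tensorrt.DataType,
	w, bias unsafe.Pointer, inFeatures, outFeatures int64) {

	// -1 marks a dynamic batch dimension.
	x := n.AddInput("input", dtype, []int32{-1, int32(inFeatures)})

	wc := n.AddConstant([]int32{int32(inFeatures), int32(outFeatures)},
		dtype, w, inFeatures*outFeatures)
	mm := n.AddMatrixMultiply(x, tensorrt.MatrixOp(0), wc.Output(0), tensorrt.MatrixOp(0))

	bc := n.AddConstant([]int32{1, int32(outFeatures)}, dtype, bias, outFeatures)
	sum := n.AddElementWise(mm.Output(0), bc.Output(0), tensorrt.ElementWiseSum)

	sm := n.AddSoftMax(sum.Output(0), 1) // softmax over the feature axis
	n.MarkOutput(sm.Output(0))
}
```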
func (*NetworkDefinition) Destroy ¶
func (n *NetworkDefinition) Destroy()
Destroy releases the network definition.
func (*NetworkDefinition) MarkOutput ¶
func (n *NetworkDefinition) MarkOutput(tensor *Tensor)
MarkOutput marks a tensor as a network output.
func (*NetworkDefinition) NumInputs ¶
func (n *NetworkDefinition) NumInputs() int
NumInputs returns the number of network inputs.
func (*NetworkDefinition) NumLayers ¶
func (n *NetworkDefinition) NumLayers() int
NumLayers returns the number of network layers.
func (*NetworkDefinition) NumOutputs ¶
func (n *NetworkDefinition) NumOutputs() int
NumOutputs returns the number of network outputs.
type OptimizationProfile ¶
type OptimizationProfile struct {
// contains filtered or unexported fields
}
OptimizationProfile wraps a TensorRT IOptimizationProfile.
func (*OptimizationProfile) AddToConfig ¶
func (p *OptimizationProfile) AddToConfig(config *BuilderConfig) (int, error)
AddToConfig adds this optimization profile to a builder config. Returns the profile index.
func (*OptimizationProfile) SetDimensions ¶
func (p *OptimizationProfile) SetDimensions(inputName string, minDims, optDims, maxDims []int32) error
SetDimensions sets the min/opt/max dimensions for a named input tensor.
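A sketch of wiring a dynamic-batch profile into a build, using only the signatures documented on this page (the tensor name and the 1/8/64 batch range are placeholders):

```go
// configureDynamicBatch creates a profile allowing batch sizes 1..64,
// optimized for 8, over a [batch, features] input, and registers it on
// the config. The returned index is later passed to
// ExecutionContext.SetOptimizationProfile.
func configureDynamicBatch(b *tensorrt.Builder, cfg *tensorrt.BuilderConfig,
	features int32) (int, error) {

	p, err := b.CreateOptimizationProfile()
	if err != nil {
		return 0, err
	}
	if err := p.SetDimensions("input",
		[]int32{1, features},  // min
		[]int32{8, features},  // opt
		[]int32{64, features}, // max
	); err != nil {
		return 0, err
	}
	return p.AddToConfig(cfg)
}
```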
type Runtime ¶
type Runtime struct {
// contains filtered or unexported fields
}
Runtime wraps a TensorRT IRuntime for deserializing engines.
func CreateRuntime ¶
CreateRuntime creates a new TensorRT runtime.
func (*Runtime) DeserializeEngine ¶
DeserializeEngine loads a serialized engine.
type Severity ¶
type Severity int
Severity controls the minimum log level for TensorRT's internal logger.
type TabularConfig ¶ added in v0.3.0
type TabularConfig struct {
// FP16 enables FP16 precision mode. Default false (FP32).
FP16 bool
// MaxWorkspaceBytes sets the maximum TensorRT workspace memory.
// Default: 64 MB.
MaxWorkspaceBytes int
// MaxBatchSize is the maximum batch dimension for dynamic shapes.
// Default: 64.
MaxBatchSize int
// OptBatchSize is the optimal batch size for the optimization profile.
// Default: 1.
OptBatchSize int
}
TabularConfig controls compilation of a tabular model graph to TensorRT.
type TabularEngine ¶ added in v0.3.0
type TabularEngine struct {
// contains filtered or unexported fields
}
TabularEngine wraps a compiled TensorRT engine for tabular model inference. It holds the serialized engine, runtime, execution context, and CUDA stream.
func CompileTabular ¶ added in v0.3.0
func CompileTabular[T tensor.Numeric](plan *graph.ExecutionPlan[T], cfg TabularConfig) (*TabularEngine, error)
CompileTabular compiles a graph.ExecutionPlan from a tabular model into a TensorRT engine. Tabular models are small feed-forward networks (MLP) with operations: MatMul, Add, ReLU, Sigmoid, Tanh, Softmax, ReduceSum.
The plan must have been compiled from a graph with a single input tensor of shape [batch, features] and a single output tensor.
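A minimal usage sketch, assuming a plan already obtained from the graph package; zero-valued TabularConfig fields fall back to the documented defaults (FP32, 64 MB workspace, max batch 64, opt batch 1):

```go
// compile builds a TensorRT engine for a tabular execution plan.
// The caller owns the returned engine and must call Destroy on it.
func compile[T tensor.Numeric](plan *graph.ExecutionPlan[T]) (*tensorrt.TabularEngine, error) {
	cfg := tensorrt.TabularConfig{
		FP16:         true, // opt in to FP16 kernels
		OptBatchSize: 8,    // tune the profile for batches of 8
	}
	return tensorrt.CompileTabular(plan, cfg)
}
```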
func (*TabularEngine) Destroy ¶ added in v0.3.0
func (te *TabularEngine) Destroy()
Destroy releases all TensorRT and CUDA resources.