Documentation ¶
Overview ¶
Package tensor provides a `Tensor` interface with 2 different implementations: `Local` and `Device`.
Tensors are multidimensional arrays (from scalar with 0 dimensions, to arbitrarily large dimensions), defined by their shape (a data type and its axes dimensions) and their actual content. As a special case, a Tensor can also be a tuple of multiple tensors.
This implementation uses `gomlx.types.Shape` to represent the shape, and (for now) only explicitly supports dense representations of the data. Both `Local` and `Device` are wrappers around the underlying XLA data representations that can be used from Go. They are mostly a means of interacting with the graph computation.
`Device` is a tensor stored wherever the computation runs (GPU, TPU or the host itself). `Local` is a tensor in the current local process (its data is still stored in C++ so that it can interact with XLA). To print or save a tensor, it needs to be converted to `Local`. When one feeds a tensor to a computational graph execution, it is converted to `Device`.
Device tensors are considered immutable, with one exception: decomposing a tuple destroys it (an implementation choice of the underlying C++ XLA implementation).
Transferring tensors between local and device storage has a cost and should be avoided where possible. For example, while training the weights of an ML model, one generally does not need to transfer those weights to local -- only at the end of training, to save the model weights. Because tensors are immutable, the transfer is cached, so if `Tensor.Local()` or `Tensor.Device()` is called multiple times, the price is paid only once.
In addition, tensors (both `Local` and `Device`) can be tuples of other tensors -- a recursive definition that allows for nested values. This seems to be an XLA mechanism to return several values in one -- the documentation there is not very clear.
Tensors are not yet concurrency safe. You have to handle race conditions to prevent simultaneous changes. TODO: add the necessary locking mechanisms.
**Advanced**: On storage and mutability:
The current design is somewhat inefficient for large tensors: there are potentially 3 copies (or more) of the values floating around: the "Device" data, stored in the accelerator; the "Local" data, stored on the program's C++ heap; and the Go values. Avoiding these extra copies is still a work in progress. In the short term there are the following "unstable" (the API may change) options:
- Call `Finalize()` or `FinalizeAll()` on tensors that are no longer needed. This frees the resources immediately, without waiting for the garbage collector. Notice that Local/Device tensors hold links to each other for caching purposes, so if left to the garbage collector they are only freed once both are no longer used.
- For `Local` tensors, the functions `Local.ValueOf()` and `Local.Data()` return pointers to the underlying C++ data. One can mutate them directly (the values, not the slice lengths). For instance, when loading values from disk, one can avoid an extra copy of the data in Go by loading directly into the C++ data. If you mutate them, also call `ClearCache()` before feeding the modified Local tensor to another graph run, since a corresponding Device tensor will be out of date.
Index ¶
- Variables
- func AnyValueOf(local *Local) (result any)
- func Data[T shapes.Number](t *Local) []T
- func ToScalar[T shapes.Number](t Tensor) (result T)
- func ValueOf[T shapes.MultiDimensionSlice](local *Local) (result T)
- type Device
- func (c Device) AddDevice(device *Device) *Device
- func (c Device) AddLocal(local *Local) *Local
- func (device *Device) ClearCache()
- func (c Device) ClearDevice(device *Device)
- func (c Device) ClearLocal()
- func (device *Device) DType() shapes.DType
- func (c Device) Device(hasClient HasClient, deviceNum int) *Device
- func (device *Device) Empty() bool
- func (device *Device) Error() error
- func (device *Device) Finalize()
- func (device *Device) FinalizeAll()
- func (device *Device) IsTuple() bool
- func (c Device) Local() *Local
- func (device *Device) Ok() bool
- func (device *Device) Rank() int
- func (device *Device) Shape() shapes.Shape
- func (device *Device) ShapedBuffer() *xla.OnDeviceBuffer
- func (device *Device) SplitTuple() []*Device
- func (device *Device) SplitTupleError() ([]*Device, error)
- func (device *Device) String() string
- func (device *Device) Value() any
- type HasClient
- type Local
- func FromAnyValue(value any) (local *Local)
- func FromDataAndDimensions[T shapes.Supported](data []T, dimensions ...int) (local *Local)
- func FromShape(shape shapes.Shape) (local *Local)
- func FromValue[S shapes.MultiDimensionSlice](value S) *Local
- func FromValueAndDimensions[T shapes.Supported](value T, dimensions ...int) (local *Local)
- func GobDeserialize(decoder *gob.Decoder) (local *Local, err error)
- func LocalWithError(err error) *Local
- func MakeLocalTuple(tensors ...*Local) *Local
- func MakeLocalTupleAny(values ...any) *Local
- func Zeros(shape shapes.Shape) *Local
- func (c Local) AddDevice(device *Device) *Device
- func (c Local) AddLocal(local *Local) *Local
- func (local *Local) Bytes() []byte
- func (local *Local) ClearCache()
- func (c Local) ClearDevice(device *Device)
- func (c Local) ClearLocal()
- func (local *Local) DType() shapes.DType
- func (local *Local) Data() any
- func (c Local) Device(hasClient HasClient, deviceNum int) *Device
- func (local *Local) Empty() bool
- func (local *Local) Error() error
- func (local *Local) Finalize()
- func (local *Local) FinalizeAll()
- func (local *Local) GoStr() string
- func (local *Local) GobSerialize(encoder *gob.Encoder) (err error)
- func (local *Local) IsTuple() bool
- func (local *Local) Literal() *xla.Literal
- func (c Local) Local() *Local
- func (local *Local) Ok() bool
- func (local *Local) Rank() int
- func (local *Local) Scalar() any
- func (local *Local) Shape() shapes.Shape
- func (local *Local) SplitTuple() (tensors []*Local, err error)
- func (local *Local) String() string
- func (local *Local) StringN(n int) string
- func (local *Local) Value() any
- type Tensor
Constants ¶
This section is empty.
Variables ¶
var MaxStringSize = 500
MaxStringSize is the size of the largest Local tensor whose contents are actually returned when String() is requested.
Functions ¶
func AnyValueOf ¶
AnyValueOf constructs a multi-dimension-slice from the Local tensor, and returns it as type `any` (`interface{}`). Works the same way as `ValueOf` without the generics interface.
If local holds an error, returns the error.
The slices themselves shouldn't be modified -- the underlying storage is not owned by Go, and tensor objects are supposed to be immutable. See discussion on data storage and exceptions to mutability on the package description if you really need to mutate its values.
func Data ¶
Data returns the flattened data for the given Local. Returns nil if the DType of the tensor is not compatible with the requested number type.
func ToScalar ¶
ToScalar returns the scalar stored in the Tensor. If the Tensor is a Device tensor, it is transferred locally -- presumably it is a small value. Returns 0 if the DType given is incompatible, or if the Local is not a scalar.
Notice that the Tensor DType doesn't need to correspond exactly to the type parameter T. If they differ, the value is converted automatically. E.g.: `ToScalar[float64](<some types.Int64 tensor>)` also works and returns the `types.Int64` value converted to float64.
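The automatic conversion described for ToScalar can be sketched with plain Go generics. This is only an illustration under stated assumptions: `number` is a hypothetical constraint standing in for `shapes.Number`, and `convertScalar` is a hypothetical helper, not the package's implementation.

```go
package main

import "fmt"

// number is a hypothetical constraint standing in for shapes.Number.
type number interface {
	~int | ~int32 | ~int64 | ~float32 | ~float64
}

// convertScalar converts a scalar stored as `any` to the requested type T.
// It returns the zero value of T if the stored type is not supported.
func convertScalar[T number](v any) T {
	switch x := v.(type) {
	case int:
		return T(x)
	case int32:
		return T(x)
	case int64:
		return T(x)
	case float32:
		return T(x)
	case float64:
		return T(x)
	}
	var zero T
	return zero
}

func main() {
	stored := any(int64(7)) // plays the role of an Int64 scalar tensor
	fmt.Println(convertScalar[float64](stored)) // prints 7
}
```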
func ValueOf ¶
func ValueOf[T shapes.MultiDimensionSlice](local *Local) (result T)
ValueOf constructs a multi-dimension-slice from the Local. Returns nil (or the zero value for the type T) if the Local has an error or is not compatible (shape or DType).
ValueOf converts types where possible. Example: let's say t is a [5]int local tensor. One can call `ValueOf[[]float64](t)` and get back a Go `[]float64` of length 5.
The slices themselves shouldn't be modified -- the underlying storage is not owned by Go, and tensor objects are supposed to be immutable. See discussion on data storage and exceptions to mutability on the package description if you really need to mutate its values.
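The element-type conversion in the `ValueOf[[]float64](t)` example can be approximated as follows. The sketch works on ordinary Go slices only; `number` and `convertSlice` are hypothetical names and say nothing about how the package does it internally.

```go
package main

import "fmt"

// number is a hypothetical constraint standing in for shapes.Number.
type number interface {
	~int | ~int32 | ~int64 | ~float32 | ~float64
}

// convertSlice returns a new slice with every element of src converted to type To.
func convertSlice[From, To number](src []From) []To {
	dst := make([]To, len(src))
	for i, v := range src {
		dst[i] = To(v)
	}
	return dst
}

func main() {
	ints := []int{1, 2, 3, 4, 5} // plays the role of the [5]int tensor
	floats := convertSlice[int, float64](ints)
	fmt.Println(len(floats), floats) // prints 5 [1 2 3 4 5]
}
```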
Types ¶
type Device ¶
type Device struct {
// contains filtered or unexported fields
}
Device represents a tensor (or tuple) stored on device. The object doesn't offer much functionality, other than being reused as input to a graph execution or being converted to a `Local` copy.
To create it, either create a Local tensor first and then convert, or use the output of the execution of a computation graph -- they return Device tensors.
It implements the Tensor interface.
func DeviceWithError ¶
DeviceWithError creates a device tensor with the given error.
func FromShapedBuffer ¶
func FromShapedBuffer(buffer *xla.OnDeviceBuffer) (deviceT *Device)
FromShapedBuffer creates a Device tensor from XLA's OnDeviceBuffer structure. Internal implementation, most users don't need to use this.
func (*Device) ClearCache ¶
func (device *Device) ClearCache()
ClearCache disconnects the device tensor from any corresponding local data. See discussion on storage and mutability on the package documentation.
func (Device) ClearDevice ¶
func (c Device) ClearDevice(device *Device)
ClearDevice removes the given device tensor from the cache, and leaves the device tensor passed without a cache.
func (Device) ClearLocal ¶
func (c Device) ClearLocal()
ClearLocal clears the local cache, and leaves the local cached tensor without a cache.
func (Device) Device ¶
Device either uses a value already cached on the device, or transfers the local data to the shapedBuffer store of values (OnDeviceBuffer), and returns a tensor.Device reference -- the value is cached for future calls. This is used, for instance, to transfer parameters when executing a graph.
func (*Device) Empty ¶
Empty returns whether Device is holding no data or is in an error state. It's similar to a "nil" state for Device.
func (*Device) Finalize ¶
func (device *Device) Finalize()
Finalize releases the memory associated with the shapedBuffer. It becomes empty. It mutates the tensor, but it's handy in case one is dealing with large data. See discussion on storage and mutability on the package documentation.
func (*Device) FinalizeAll ¶
func (device *Device) FinalizeAll()
FinalizeAll releases the memory associated with all copies of the tensor (local and on device), and marks them as empty.
func (Device) Local ¶
func (c Device) Local() *Local
Local will transfer data from the Device storage to a Local tensor.
func (*Device) ShapedBuffer ¶
func (device *Device) ShapedBuffer() *xla.OnDeviceBuffer
ShapedBuffer returns the underlying XLA structure. Internal usage only.
func (*Device) SplitTuple ¶
SplitTuple splits a device tensor into its elements. On error it returns nil; errors in individual tuple elements are reported in the element tensors themselves.
This makes the current device tensor invalid.
func (*Device) SplitTupleError ¶
SplitTupleError splits a device tensor into its elements. In case of error, it returns the error.
This makes the current device tensor invalid.
type HasClient ¶
HasClient accepts anything that can return an xla.Client. That includes xla.Client itself and graph.Manager.
type Local ¶
type Local struct {
// contains filtered or unexported fields
}
Local represents a multidimensional array of one of the supported types (see shapes.Number). It can be from a scalar to an arbitrary rank (number of dimensions). See Shape() to get information about its dimensions and the underlying DType. Finally, it can also hold a tuple of tensors (recursive definition).
It implements the generic Tensor interface, and provides some specialized functionality that assumes the data is local -- all derived from the Data() method, which returns a pointer to the underlying data directly.
A Local tensor needs to be made Device before being fed (as input) to a computation graph. Conversely, the outputs of computation graphs are Device tensors that need to be converted to Local to introspect/manipulate the values in Go.
Local is not thread safe; if you use it concurrently, you have to protect access to it.
func FromAnyValue ¶
FromAnyValue is a non-generic version of FromValue. The input is expected to be either a scalar or a slice of slices with homogeneous dimensions. If the input happens to already be a Local, it is returned.
func FromDataAndDimensions ¶
FromDataAndDimensions creates a local tensor with the given dimensions, filled with the given flat values. The DType is inferred from the values.
func FromShape ¶
FromShape creates a Local tensor with the given shape, with the data uninitialized.
func FromValue ¶
func FromValue[S shapes.MultiDimensionSlice](value S) *Local
FromValue returns a Local tensor constructed from the given multi-dimension slice (or scalar).
func FromValueAndDimensions ¶
FromValueAndDimensions creates a local tensor with the given dimensions, filled with the given value replicated everywhere. The DType is inferred from the value.
func GobDeserialize ¶ added in v0.2.0
GobDeserialize reads a tensor from the decoder. Returns a new tensor.Local or an error.
func LocalWithError ¶
LocalWithError creates a local tensor with the given error.
func MakeLocalTuple ¶
MakeLocalTuple composes local tensors into a Tuple. The individual tensors are destroyed in the process, as the tuple takes ownership of its parts.
func MakeLocalTupleAny ¶
MakeLocalTupleAny composes values into a local tuple tensor. Each value can be a *Local tensor or anything that can be converted to one. Similar to MakeLocalTuple, but more permissive.
func (*Local) Bytes ¶
Bytes returns the same memory as Data, but as a raw slice of bytes, with the proper size in bytes.
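What "the same memory, but as a raw slice of bytes" means can be illustrated with Go's `unsafe.Slice` on an ordinary Go slice. This is a sketch of the general idea only: `bytesOf` is a hypothetical helper, and the package's data lives in C++ memory rather than in a Go slice.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesOf returns a byte view over the memory of data, without copying.
// Writes through the returned slice are visible in data and vice versa.
func bytesOf(data []float32) []byte {
	if len(data) == 0 {
		return nil
	}
	size := len(data) * int(unsafe.Sizeof(data[0]))
	return unsafe.Slice((*byte)(unsafe.Pointer(&data[0])), size)
}

func main() {
	data := []float32{1, 2}
	raw := bytesOf(data)
	fmt.Println(len(raw)) // prints 8: two float32 values of 4 bytes each

	// Zeroing the first four bytes zeroes the first float32, showing the memory is shared.
	for i := range raw[:4] {
		raw[i] = 0
	}
	fmt.Println(data[0]) // prints 0
}
```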
func (*Local) ClearCache ¶
func (local *Local) ClearCache()
ClearCache disconnects the local tensor from any corresponding shapedBuffer data. See discussion on storage and mutability on the package documentation.
func (Local) ClearDevice ¶
func (c Local) ClearDevice(device *Device)
ClearDevice removes the given device tensor from the cache, and leaves the device tensor passed without a cache.
func (Local) ClearLocal ¶
func (c Local) ClearLocal()
ClearLocal clears the local cache, and leaves the local cached tensor without a cache.
func (*Local) Data ¶
Data returns a slice with the consecutive data of the corresponding DType type. Consider the generic function Data[T]() if you know the type upfront.
It returns nil if the Local tensor is in an invalid state, or if it is a tuple.
func (Local) Device ¶
Device either uses a value already cached on the device, or transfers the local data to the shapedBuffer store of values (OnDeviceBuffer), and returns a tensor.Device reference -- the value is cached for future calls. This is used, for instance, to transfer parameters when executing a graph.
func (*Local) Empty ¶
Empty returns whether Local is holding no data. It's similar to a "nil" state for Local.
func (*Local) Finalize ¶
func (local *Local) Finalize()
Finalize releases the memory associated with the local tensor. It becomes Empty() = true. It mutates the tensor, but it's handy in case one is dealing with large data. See discussion on storage and mutability on the package documentation.
func (*Local) FinalizeAll ¶
func (local *Local) FinalizeAll()
FinalizeAll releases the memory associated with all copies of the tensor (local and on device), and marks them as empty.
func (*Local) GoStr ¶
GoStr converts the tensor to a string, using a Go-syntax representation that can be copied and pasted back into code.
func (*Local) GobSerialize ¶ added in v0.2.0
GobSerialize serializes the Local tensor in binary format.
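GobSerialize and GobDeserialize take a `*gob.Encoder` and a `*gob.Decoder` from the standard `encoding/gob` package. The following sketch shows how such an encoder/decoder pair is typically set up around a buffer. The `record` struct is a hypothetical payload used only for illustration; it is not the format the package actually writes.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// record is a hypothetical payload: axis dimensions plus flat values.
type record struct {
	Dims []int
	Data []float32
}

// roundTrip encodes in with gob into a buffer and decodes it back.
func roundTrip(in record) (record, error) {
	var buf bytes.Buffer
	encoder := gob.NewEncoder(&buf) // the kind of value a serializer would receive
	if err := encoder.Encode(in); err != nil {
		return record{}, err
	}
	decoder := gob.NewDecoder(&buf) // the kind of value a deserializer would receive
	var out record
	if err := decoder.Decode(&out); err != nil {
		return record{}, err
	}
	return out, nil
}

func main() {
	out, err := roundTrip(record{Dims: []int{2, 3}, Data: []float32{1, 2, 3, 4, 5, 6}})
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Println(out.Dims, out.Data)
}
```

In real use the buffer would usually be replaced by a file or network connection, with the same encoder or decoder shared by all values written to or read from that stream.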
func (*Local) Literal ¶
Literal returns the internal storage of the Local value. Internal only, used only by new Op implementations.
func (Local) Local ¶
func (c Local) Local() *Local
Local will transfer data from the Device storage to a Local tensor.
func (*Local) Scalar ¶
Scalar returns the scalar value contained in the Local. It will return a zero value if the shape is not scalar.
func (*Local) SplitTuple ¶
SplitTuple splits the Tuple tensor into its components. This unfortunately destroys the current Local, emptying it.
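The ownership transfer described for MakeLocalTuple and SplitTuple (parts are emptied when composed, the tuple is emptied when split) can be modelled in a few lines. This is a simplified sketch with hypothetical names (`value`, `makeTuple`, `splitTuple`, `empty`); it does not reflect how the underlying XLA structures are managed.

```go
package main

import "fmt"

// value is a hypothetical tensor-like handle: either flat data or a tuple of parts.
type value struct {
	data  []float32
	parts []*value
}

// empty reports whether the handle no longer holds anything.
func (v *value) empty() bool { return v.data == nil && v.parts == nil }

// makeTuple moves the contents of each part into a new tuple and empties the originals.
func makeTuple(parts ...*value) *value {
	tuple := &value{parts: make([]*value, len(parts))}
	for i, p := range parts {
		moved := *p // take over the contents
		tuple.parts[i] = &moved
		*p = value{} // the original handle is left empty
	}
	return tuple
}

// splitTuple hands the parts back to the caller and empties the tuple.
func (v *value) splitTuple() []*value {
	parts := v.parts
	v.parts = nil
	return parts
}

func main() {
	a := &value{data: []float32{1}}
	b := &value{data: []float32{2, 3}}

	tuple := makeTuple(a, b)
	fmt.Println(a.empty(), b.empty()) // prints true true

	parts := tuple.splitTuple()
	fmt.Println(tuple.empty(), len(parts)) // prints true 2
}
```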
type Tensor ¶
type Tensor interface {
// Local version of the tensor. If the underlying tensor is Local already, it's a no-op. Otherwise, the tensor
// contents are transferred locally. It uses a cache system, so if tensor was already local no transfer happens.
Local() *Local
// Device version of the tensor. If the underlying tensor is on the given Device already, it's a no-op. Otherwise,
// the tensor contents are transferred to the device. It uses a cache system, so if tensor was already on the
// device no transfer happens.
Device(client HasClient, deviceNum int) *Device
// Shape of the tensor.
Shape() shapes.Shape
// DType of the tensor's shape.
DType() shapes.DType
// Rank of the tensor's shape.
Rank() int
// String returns a printable version of the tensor. This may lead to a transfer from a Device tensor
// via Local().
String() string
// Value returns a multidimensional slice (except if shape is a scalar) containing the values.
// If the underlying tensor is on device (e.g: GPU), it's transferred locally with Local().
Value() any
// Error returns the message that caused an error state.
Error() error
// Ok returns whether the tensor is in a valid state.
Ok() bool
// FinalizeAll immediately frees the data of all versions of the Tensor -- Local or on Device, and makes the
// tensor invalid.
FinalizeAll()
}
Tensor represents a multidimensional array (from a scalar with 0 dimensions, to arbitrarily large dimensions), defined by its shape (a data type and its axes dimensions) and its actual content. As a special case, a Tensor can also be a tuple of multiple tensors.
Tensor can be implemented by a tensor.Local or tensor.Device, which reflects whether the data is stored on the local CPU or on the device actually running the computation: an accelerator like a GPU, or the CPU as well.
Local and Device tensors can be converted to each other -- there is a transferring cost to that. There is a cache system to prevent duplicate transfers, but it assumes immutability -- call ClearCache after mutating a Local tensor.
More details in the `tensor` package documentation.