tensor

package
v0.2.0
Warning

This package is not in the latest version of its module.

Published: May 18, 2023 License: Apache-2.0 Imports: 7 Imported by: 0

Documentation

Overview

Package tensor provides a `Tensor` interface with 2 different implementations: `Local` and `Device`.

Tensors are multidimensional arrays (from scalar with 0 dimensions, to arbitrarily large dimensions), defined by their shape (a data type and its axes dimensions) and their actual content. As a special case, a Tensor can also be a tuple of multiple tensors.

This implementation uses `gomlx.types.Shape` to represent the shape, and (for now) only explicitly supports dense representations of the data. There are two types of `Tensor` implementation: `Local` and `Device`. Both are wrappers around XLA's underlying data representations that can be used from Go. They are mostly a means of interacting with the graph computation.

`Device` is a tensor stored wherever the computation runs (GPU, TPU or the host itself). `Local` is a tensor in the current local process (its data is still stored in C++ so that it can interact with XLA). To print or save a tensor, it needs to be converted to `Local`. When a tensor is fed to the execution of a computation graph, it is converted to `Device`.

Device tensors are considered immutable, with one exception: decomposing tuples destroys them (an implementation choice of the underlying C++ XLA implementation).

Transferring tensors to/from local/device areas has a cost, and should be avoided. For example, while training weights of an ML model, one generally does not need to transfer those weights to local -- just at the end of training to save the model weights. Because the tensors are immutable, the transferring is cached, so if `Tensor.Local()` or `Tensor.Device()` is called multiple times, the price is paid only once.
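The caching behavior described above can be pictured with a small stand-alone model. This is only a sketch under assumptions: `cachedValue`, `copyOf` and the `transfers` counter are hypothetical, plain Go slices stand in for local and device storage, and nothing here is the package's real implementation.

```go
package main

import "fmt"

// cachedValue is a hypothetical model of one immutable value that may live
// in two places. Each copy is created at most once and then re-used.
type cachedValue struct {
	local     []float32 // simulated copy in the local process, nil if absent
	device    []float32 // simulated copy on the accelerator, nil if absent
	transfers int       // number of simulated transfers performed so far
}

// copyOf simulates a transfer by duplicating the values.
func copyOf(src []float32) []float32 {
	dst := make([]float32, len(src))
	copy(dst, src)
	return dst
}

// Device returns the device copy, transferring only if it is not cached yet.
func (c *cachedValue) Device() []float32 {
	if c.device == nil {
		c.device = copyOf(c.local)
		c.transfers++
	}
	return c.device
}

// Local returns the local copy, transferring only if it is not cached yet.
func (c *cachedValue) Local() []float32 {
	if c.local == nil {
		c.local = copyOf(c.device)
		c.transfers++
	}
	return c.local
}

func main() {
	weights := &cachedValue{local: []float32{0.1, 0.2, 0.3}}
	for step := 0; step < 3; step++ {
		_ = weights.Device() // only the first call pays for a transfer
	}
	_ = weights.Local() // already local: no transfer
	fmt.Println("transfers:", weights.transfers)
}
```

The model only works because the values never change after creation; once mutation is allowed, a cached copy can silently go stale.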

In addition, tensors (both `Local` and `Device`) can be tuples of other tensors -- a recursive definition that allows for nested values. This seems to be an XLA mechanism to return several values in one -- documentation there is not very clear.

Tensors are not yet concurrency safe. You have to handle race conditions to prevent simultaneous changes. TODO: add the necessary locking mechanisms.

**Advanced**: On storage and mutability:

This is somewhat inefficient for large tensors: there are potentially 3 copies (or more) of values floating around: the "Device" data, stored in the accelerator; the "Local" data stored on the program C++ heap; the Go values. This is still a work in progress on ways to avoid extra copies floating around. In the short term there are the following "unstable" (the API may change) options:

  • Call `Finalize()` or `FinalizeAll()` on tensors that are no longer needed. This frees the resources immediately, without waiting for the garbage collector. Notice that Local and Device tensors hold links to each other for caching purposes, so if left to the garbage collector they are only freed once both are no longer used.
  • For `Local` tensors, the functions `Local.ValueOf()` and `Local.Data()` return pointers to the underlying C++ data. One can mutate them directly (the values, not the slice lengths). For instance, when loading values from disk, one can avoid an extra copy of the data in Go by loading directly into the C++ data. If you mutate them, also call `ClearCache()` before feeding the modified Local tensor to another graph run, since a corresponding Device tensor will otherwise be out of date.
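The reason for clearing the cache after a mutation can be shown with a stand-alone model. This is a sketch, not the package's code: `localTensor`, `onDevice`, `clearCache` and `mutateAndRead` are hypothetical names, and a Go slice plays the role of the device copy.

```go
package main

import "fmt"

// localTensor is a hypothetical model of a local tensor with a cached
// device copy. It is not the package's Local type.
type localTensor struct {
	values []float32 // mutable local storage
	device []float32 // cached device copy, nil if none
}

// onDevice returns the cached device copy, creating it on first use.
func (l *localTensor) onDevice() []float32 {
	if l.device == nil {
		l.device = make([]float32, len(l.values))
		copy(l.device, l.values)
	}
	return l.device
}

// clearCache drops the cached device copy so that the next call to
// onDevice transfers the current values again.
func (l *localTensor) clearCache() { l.device = nil }

// mutateAndRead overwrites the first local value after a device copy
// exists, and reports what the device copy holds before and after
// clearing the cache.
func mutateAndRead(newValue float32) (stale, fresh float32) {
	l := &localTensor{values: []float32{1, 2, 3}}
	_ = l.onDevice() // device copy is created and cached here
	l.values[0] = newValue
	stale = l.onDevice()[0] // the cache still holds the old value
	l.clearCache()
	fresh = l.onDevice()[0] // transferred again, now up to date
	return stale, fresh
}

func main() {
	stale, fresh := mutateAndRead(99)
	fmt.Println(stale, fresh)
}
```

Without the call to `clearCache`, the second read would keep returning the value from before the mutation, which is the failure mode the bullet above warns about.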


Constants

This section is empty.

Variables

var MaxStringSize = 500

MaxStringSize is the size of the largest Local tensor whose contents are actually returned when String() is requested.

Functions

func AnyValueOf

func AnyValueOf(local *Local) (result any)

AnyValueOf constructs a multi-dimension-slice from the Local tensor, and returns it as type `any` (`interface{}`). Works the same way as `ValueOf` without the generics interface.

If local holds an error, returns the error.

The slices themselves shouldn't be modified -- the underlying storage is not owned by Go, and tensor objects are supposed to be immutable. See discussion on data storage and exceptions to mutability on the package description if you really need to mutate its values.

func Data

func Data[T shapes.Number](t *Local) []T

Data returns the flattened data for the given Local. Returns nil if the DType of the tensor is not compatible with the requested number type.

func ToScalar

func ToScalar[T shapes.Number](t Tensor) (result T)

ToScalar returns the scalar stored in the Tensor. If the Tensor is a Device tensor, it is first transferred locally (presumably it is a small value). Returns 0 if the DType given is incompatible, or if the Local is not a scalar.

Notice that the Tensor DType doesn't need to correspond exactly to the type parameter T. If they differ, the value is converted automatically. E.g.: `ToScalar[float64](<some types.Int64 tensor>)` also works and returns the `types.Int64` value converted to float64.
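The automatic conversion can be sketched with plain Go generics. This is an illustration under assumptions, not the package's code: the `number` constraint is a hypothetical stand-in for `shapes.Number`, and `toScalar` takes an `any` instead of a Tensor.

```go
package main

import "fmt"

// number is a hypothetical constraint standing in for shapes.Number.
type number interface {
	~int | ~int32 | ~int64 | ~float32 | ~float64
}

// toScalar converts a dynamically typed scalar to T. It returns the zero
// value of T when the stored value is not one of the supported numbers.
func toScalar[T number](stored any) T {
	switch v := stored.(type) {
	case int:
		return T(v)
	case int32:
		return T(v)
	case int64:
		return T(v)
	case float32:
		return T(v)
	case float64:
		return T(v)
	}
	var zero T
	return zero
}

func main() {
	// The stored value is an int64, the requested type is float64.
	fmt.Println(toScalar[float64](int64(7)))

	// An unsupported stored value yields the zero value.
	fmt.Println(toScalar[int]("not a number"))
}
```

As with any Go numeric conversion, converting a floating-point value to an integer type truncates toward zero, so the requested type should be chosen with that in mind.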

func ValueOf

func ValueOf[T shapes.MultiDimensionSlice](local *Local) (result T)

ValueOf constructs a multi-dimension-slice from the Local. Returns nil (or the zero value for the type T) if the Local has an error or is not compatible (shape or DType).

ValueOf will do conversions of type, if possible. Example: let's say t is a [5]int local tensor. One can call `ValueOf[[]float64](t)` and get a Go `[]float64` back of size 5.

The slices themselves shouldn't be modified -- the underlying storage is not owned by Go, and tensor objects are supposed to be immutable. See discussion on data storage and exceptions to mutability on the package description if you really need to mutate its values.
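The warning about not modifying the returned slices is easier to see with a stand-alone sketch of how a nested slice can be a view over flat storage. `toMatrix` is a hypothetical helper written for this illustration; whether the real `ValueOf` aliases or copies in a given case (for instance when it converts types) is not something this sketch claims.

```go
package main

import "fmt"

// toMatrix views flat row-major data as a rows x cols slice of slices.
// The rows alias the flat storage: no values are copied.
// It returns nil and false when the dimensions do not match the data.
func toMatrix(flat []float64, rows, cols int) ([][]float64, bool) {
	if rows < 0 || cols < 0 || rows*cols != len(flat) {
		return nil, false
	}
	matrix := make([][]float64, rows)
	for r := range matrix {
		matrix[r] = flat[r*cols : (r+1)*cols]
	}
	return matrix, true
}

func main() {
	flat := []float64{1, 2, 3, 4, 5, 6}
	matrix, ok := toMatrix(flat, 2, 3)
	fmt.Println(matrix, ok)

	// Writing through the nested view changes the shared storage, which is
	// why such slices should normally be treated as read-only.
	matrix[1][0] = -4
	fmt.Println(flat[3])
}
```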

Types

type Device

type Device struct {
	// contains filtered or unexported fields
}

Device represents a tensor (or tuple) stored on device. The object doesn't offer much functionality beyond being re-used as input to a graph execution or being converted to a `Local` copy.

To create it, either create a Local tensor first and then convert, or use the output of the execution of a computation graph -- they return Device tensors.

It implements the Tensor interface.

func DeviceWithError

func DeviceWithError(err error) *Device

DeviceWithError creates a device tensor with the given error.

func FromShapedBuffer

func FromShapedBuffer(buffer *xla.OnDeviceBuffer) (deviceT *Device)

FromShapedBuffer creates a Device tensor from XLA's OnDeviceBuffer structure. Internal implementation, most users don't need to use this.

func (Device) AddDevice

func (c Device) AddDevice(device *Device) *Device

AddDevice to the internal cache, and returns itself for convenience.

func (Device) AddLocal

func (c Device) AddLocal(local *Local) *Local

AddLocal to cache and returns the local tensor for convenience.

func (*Device) ClearCache

func (device *Device) ClearCache()

ClearCache disconnects the device tensor from any corresponding local data. See discussion on storage and mutability on the package documentation.

func (Device) ClearDevice

func (c Device) ClearDevice(device *Device)

ClearDevice from cache, and leaves the device tensor passed without a cache.

func (Device) ClearLocal

func (c Device) ClearLocal()

ClearLocal cache, and leaves the local cached tensor without a cache.

func (*Device) DType

func (device *Device) DType() shapes.DType

DType returns the DType of the tensor's shape.

func (Device) Device

func (c Device) Device(hasClient HasClient, deviceNum int) *Device

Device either uses a cached value on device already or it transfers local data to the shapedBuffer store of values (OnDeviceBuffer) and returns a tensor.Device reference -- value is cached for future calls. This is used for instance to transfer parameters when executing a graph.

func (*Device) Empty

func (device *Device) Empty() bool

Empty returns whether the Device tensor is holding no data or is in an error state. It's similar to a "nil" state for Device.

func (*Device) Error

func (device *Device) Error() error

Error returns the message that caused an error state.

func (*Device) Finalize

func (device *Device) Finalize()

Finalize releases the memory associated with the shapedBuffer. It becomes empty. It mutates the tensor, but it's handy in case one is dealing with large data. See discussion on storage and mutability on the package documentation.

func (*Device) FinalizeAll

func (device *Device) FinalizeAll()

FinalizeAll releases the memory associated with all copies of the tensor (local and on device), and marks them as empty.

func (*Device) IsTuple

func (device *Device) IsTuple() bool

IsTuple returns whether the Device tensor is a tuple.

func (Device) Local

func (c Device) Local() *Local

Local will transfer data from the Device storage to a Local tensor.

func (*Device) Ok

func (device *Device) Ok() bool

Ok returns whether the shapedBuffer is not empty and has no error.

func (*Device) Rank

func (device *Device) Rank() int

Rank returns the rank of the tensor's shape.

func (*Device) Shape

func (device *Device) Shape() shapes.Shape

Shape returns the shape of the Device.

func (*Device) ShapedBuffer

func (device *Device) ShapedBuffer() *xla.OnDeviceBuffer

ShapedBuffer returns the underlying XLA structure. Internal usage only.

func (*Device) SplitTuple

func (device *Device) SplitTuple() []*Device

SplitTuple splits a device tensor into its elements. If splitting fails it returns nil; errors in individual tuple elements are reported in the returned tensors themselves.

This makes the current device tensor invalid.

func (*Device) SplitTupleError

func (device *Device) SplitTupleError() ([]*Device, error)

SplitTupleError splits a device tensor into its elements. In case of error, it returns the error.

This makes the current device tensor invalid.
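The "splitting invalidates the tuple" behavior amounts to an ownership transfer. The following stand-alone sketch models that idea only; `tupleTensor` and `split` are hypothetical names and plain Go slices stand in for the tuple elements.

```go
package main

import (
	"errors"
	"fmt"
)

// tupleTensor is a hypothetical model of a tuple that owns its elements.
type tupleTensor struct {
	elements [][]float32
}

// split hands ownership of the elements to the caller and leaves the
// tuple empty, so a second split reports an error.
func (t *tupleTensor) split() ([][]float32, error) {
	if t.elements == nil {
		return nil, errors.New("tuple is empty or was already split")
	}
	elements := t.elements
	t.elements = nil
	return elements, nil
}

func main() {
	tuple := &tupleTensor{elements: [][]float32{{1, 2}, {3}}}

	parts, err := tuple.split()
	fmt.Println(len(parts), err)

	// The tuple is now invalid: splitting again fails.
	_, err = tuple.split()
	fmt.Println(err)
}
```

In practice this means a tuple result should be split once and the individual elements kept, rather than holding on to the original tuple.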

func (*Device) String

func (device *Device) String() string

String converts to string, by converting (transferring) the tensor to local and then using Local.String().

func (*Device) Value

func (device *Device) Value() any

Value returns a multidimensional slice (except if shape is a scalar) containing the values. If there isn't yet a cached local copy of tensor, it first copies the tensor from the device to a local tensor.

type HasClient

type HasClient interface {
	Client() *xla.Client
}

HasClient accepts anything that can return an xla.Client. That includes xla.Client itself and graph.Manager.

type Local

type Local struct {
	// contains filtered or unexported fields
}

Local represents a multidimensional array of one of the supported types (see shapes.Number). It can be from a scalar to an arbitrary rank (number of dimensions). See Shape() to get information about its dimensions and the underlying DType. Finally, it can also hold a tuple of tensors (recursive definition).

It implements the generic Tensor interface, and provides some specialized functionality that assumes the data is local -- all derived from the Data() method, which returns a pointer to the underlying data directly.

A Local tensor needs to be made Device before being fed (as input) to a computation graph. Conversely, the output of computation graphs are Device that need to be converted to Local to introspect/manipulate the values in Go.

Local is not thread safe. If you use it concurrently, you have to protect access to it.

func FromAnyValue

func FromAnyValue(value any) (local *Local)

FromAnyValue is a non-generic version of FromValue. The input is expected to be either a scalar or a slice of slices with homogeneous dimensions. If the input happens to already be a Local, it is returned.

func FromDataAndDimensions

func FromDataAndDimensions[T shapes.Supported](data []T, dimensions ...int) (local *Local)

FromDataAndDimensions creates a local tensor with the given dimensions, filled with the given flat values. The DType is inferred from the values.

func FromShape

func FromShape(shape shapes.Shape) (local *Local)

FromShape creates a Local tensor with the given shape, with the data uninitialized.

func FromValue

func FromValue[S shapes.MultiDimensionSlice](value S) *Local

FromValue returns a Local tensor constructed from the given multi-dimension slice (or scalar).

func FromValueAndDimensions

func FromValueAndDimensions[T shapes.Supported](value T, dimensions ...int) (local *Local)

FromValueAndDimensions creates a local tensor with the given dimensions, filled with the given value replicated everywhere. The DType is inferred from the value.
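The relationship between dimensions and flat storage used by these constructors can be sketched without the package. The helpers `numElements`, `filled` and `checkData` below are hypothetical; in particular, whether the real constructors validate the data length in this way is an assumption.

```go
package main

import "fmt"

// numElements returns the product of the dimensions (1 when there are
// none, i.e. a scalar). Dimensions are assumed to be non-negative.
func numElements(dimensions ...int) int {
	n := 1
	for _, d := range dimensions {
		n *= d
	}
	return n
}

// filled returns flat storage for the given dimensions with value
// replicated in every position.
func filled[T any](value T, dimensions ...int) []T {
	data := make([]T, numElements(dimensions...))
	for i := range data {
		data[i] = value
	}
	return data
}

// checkData reports an error when the flat data does not have exactly the
// number of elements that the dimensions require.
func checkData[T any](data []T, dimensions ...int) error {
	if want := numElements(dimensions...); len(data) != want {
		return fmt.Errorf("got %d values, dimensions %v need %d", len(data), dimensions, want)
	}
	return nil
}

func main() {
	ones := filled(float32(1), 2, 3)
	fmt.Println(len(ones), ones)

	fmt.Println(checkData([]int{1, 2, 3, 4}, 2, 2))
	fmt.Println(checkData([]int{1, 2, 3}, 2, 2))
}
```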

func GobDeserialize added in v0.2.0

func GobDeserialize(decoder *gob.Decoder) (local *Local, err error)

GobDeserialize deserializes a Tensor from the decoder. Returns a new tensor.Local or an error.

func LocalWithError

func LocalWithError(err error) *Local

LocalWithError creates a local tensor with the given error.

func MakeLocalTuple

func MakeLocalTuple(tensors ...*Local) *Local

MakeLocalTuple composes local tensors into a Tuple. The individual tensors are destroyed in the process, as the tuple takes ownership of its parts.

func MakeLocalTupleAny

func MakeLocalTupleAny(values ...any) *Local

MakeLocalTupleAny composes values into a local tuple tensor. Each value can be a *Local tensor or anything that can be converted to one. Similar to MakeLocalTuple, but more permissive.

func Zeros

func Zeros(shape shapes.Shape) *Local

Zeros returns a zero initialized Local of the given shape (including DType).

func (Local) AddDevice

func (c Local) AddDevice(device *Device) *Device

AddDevice to the internal cache, and returns itself for convenience.

func (Local) AddLocal

func (c Local) AddLocal(local *Local) *Local

AddLocal to cache and returns the local tensor for convenience.

func (*Local) Bytes

func (local *Local) Bytes() []byte

Bytes returns the same memory as Data, but as a raw slice of bytes, with the proper size in bytes.

func (*Local) ClearCache

func (local *Local) ClearCache()

ClearCache disconnects the local tensor from any corresponding shapedBuffer data. See discussion on storage and mutability on the package documentation.

func (Local) ClearDevice

func (c Local) ClearDevice(device *Device)

ClearDevice from cache, and leaves the device tensor passed without a cache.

func (Local) ClearLocal

func (c Local) ClearLocal()

ClearLocal cache, and leaves the local cached tensor without a cache.

func (*Local) DType

func (local *Local) DType() shapes.DType

DType returns the DType of the tensor's shape.

func (*Local) Data

func (local *Local) Data() any

Data returns a slice with the consecutive data of the corresponding DType type. Consider the generic function Data[T]() if you know the type upfront.

It returns nil if the Local tensor is in an invalid state, or if it is a tuple.

func (Local) Device

func (c Local) Device(hasClient HasClient, deviceNum int) *Device

Device either uses a cached value on device already or it transfers local data to the shapedBuffer store of values (OnDeviceBuffer) and returns a tensor.Device reference -- value is cached for future calls. This is used for instance to transfer parameters when executing a graph.

func (*Local) Empty

func (local *Local) Empty() bool

Empty returns whether Local is holding no data. It's similar to a "nil" state for Local.

func (*Local) Error

func (local *Local) Error() error

Error returns the message that caused an error state.

func (*Local) Finalize

func (local *Local) Finalize()

Finalize releases the memory associated with the local tensor. It becomes Empty() = true. It mutates the tensor, but it's handy in case one is dealing with large data. See discussion on storage and mutability on the package documentation.

func (*Local) FinalizeAll

func (local *Local) FinalizeAll()

FinalizeAll releases the memory associated with all copies of the tensor (local and on device), and marks them as empty.

func (*Local) GoStr

func (local *Local) GoStr() string

GoStr converts to string, using a Go-syntax representation that can be copied and pasted back into code.

func (*Local) GobSerialize added in v0.2.0

func (local *Local) GobSerialize(encoder *gob.Encoder) (err error)

GobSerialize serializes the Local tensor in binary format.
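GobSerialize and GobDeserialize take a `*gob.Encoder` and a `*gob.Decoder` rather than a writer and a reader, which suggests that several values can share one stream. The sketch below shows that pattern with the standard `encoding/gob` package only; the `payload` type and the `serialize`, `deserialize` and `roundTrip` helpers are hypothetical and say nothing about the package's actual wire format.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// payload is a hypothetical stand-in for a tensor: dimensions plus flat values.
type payload struct {
	Dims []int
	Data []float32
}

// serialize writes one payload to an existing gob stream.
func serialize(encoder *gob.Encoder, p payload) error {
	return encoder.Encode(p)
}

// deserialize reads the next payload from an existing gob stream.
func deserialize(decoder *gob.Decoder) (payload, error) {
	var p payload
	err := decoder.Decode(&p)
	return p, err
}

// roundTrip writes all payloads to one in-memory stream with a single
// encoder, then reads them back in the same order with a single decoder.
func roundTrip(inputs ...payload) ([]payload, error) {
	var buf bytes.Buffer
	encoder := gob.NewEncoder(&buf)
	for _, p := range inputs {
		if err := serialize(encoder, p); err != nil {
			return nil, err
		}
	}
	decoder := gob.NewDecoder(&buf)
	outputs := make([]payload, 0, len(inputs))
	for range inputs {
		p, err := deserialize(decoder)
		if err != nil {
			return nil, err
		}
		outputs = append(outputs, p)
	}
	return outputs, nil
}

func main() {
	restored, err := roundTrip(
		payload{Dims: []int{2}, Data: []float32{1, 2}},
		payload{Dims: []int{1}, Data: []float32{3}},
	)
	fmt.Println(restored, err)
}
```

Values must be read back in the order they were written, since a gob stream carries no names for its top-level values.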

func (*Local) IsTuple

func (local *Local) IsTuple() bool

IsTuple returns whether Local is a tuple.

func (*Local) Literal

func (local *Local) Literal() *xla.Literal

Literal returns the internal storage of the Local value. Internal only, used only by new Op implementations.

func (Local) Local

func (c Local) Local() *Local

Local will transfer data from the Device storage to a Local tensor.

func (*Local) Ok

func (local *Local) Ok() bool

Ok returns whether local is neither empty nor in an error state.

func (*Local) Rank

func (local *Local) Rank() int

Rank returns the rank of the tensor's shape.

func (*Local) Scalar

func (local *Local) Scalar() any

Scalar returns the scalar value contained in the Local. It will return a zero value if the shape is not scalar.

func (*Local) Shape

func (local *Local) Shape() shapes.Shape

Shape of Local, includes DType.

func (*Local) SplitTuple

func (local *Local) SplitTuple() (tensors []*Local, err error)

SplitTuple splits the Tuple tensor into its components. This unfortunately destroys the current Local, emptying it.

func (*Local) String

func (local *Local) String() string

String converts to string, if not too large.

func (*Local) StringN

func (local *Local) StringN(n int) string

StringN converts to string, displaying at most n elements. TODO: nice pretty-print version, even for large tensors.
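Limiting output to n elements can be sketched as follows. This is a stand-alone illustration: the function `stringN` here is a hypothetical helper over a flat `[]int`, and the bracket and ellipsis formatting is an assumption, not the format the package produces.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// stringN renders at most n elements of data, followed by "..." when
// elements were left out.
func stringN(data []int, n int) string {
	var sb strings.Builder
	sb.WriteByte('[')
	for i, v := range data {
		if i > 0 {
			sb.WriteByte(' ')
		}
		if i >= n {
			sb.WriteString("...")
			break
		}
		sb.WriteString(strconv.Itoa(v))
	}
	sb.WriteByte(']')
	return sb.String()
}

func main() {
	fmt.Println(stringN([]int{1, 2, 3, 4, 5}, 2))
	fmt.Println(stringN([]int{1, 2, 3}, 10))
}
```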

func (*Local) Value

func (local *Local) Value() any

Value returns a multidimensional slice (except if shape is a scalar) containing the values, cast to type any. Same as AnyValueOf(t).

type Tensor

type Tensor interface {
	// Local version of the tensor. If the underlying tensor is Local already, it's a no-op. Otherwise, the tensor
	// contents are transferred locally. It uses a cache system, so if tensor was already local no transfer happens.
	Local() *Local

	// Device version of the tensor. If the underlying tensor is on the given Device already, it's a no-op. Otherwise,
	// the tensor contents are transferred to the device. It uses a cache system, so if tensor was already on the
	// device no transfer happens.
	Device(client HasClient, deviceNum int) *Device

	// Shape of the tensor.
	Shape() shapes.Shape

	// DType of the tensor's shape.
	DType() shapes.DType

	// Rank of the tensor's shape.
	Rank() int

	// String returns a printable version of the tensor. This may lead to a transfer from a Device tensor
	// with the Local().
	String() string

	// Value returns a multidimensional slice (except if shape is a scalar) containing the values.
	// If the underlying tensor is on device (e.g: GPU), it's transferred locally with Local().
	Value() any

	// Error returns the message that caused an error state.
	Error() error

	// Ok returns whether the tensor is in a valid state.
	Ok() bool

	// FinalizeAll immediately frees the data of all versions of the Tensor -- Local or on Device, and makes the
	// tensor invalid.
	FinalizeAll()
}

Tensor represents multidimensional arrays (from scalar with 0 dimensions, to arbitrarily large dimensions), defined by their shape (a data type and its axes dimensions) and their actual content. As a special case, a Tensor can also be a tuple of multiple tensors.

Tensor can be implemented by a tensor.Local or tensor.Device, which reflects whether the data is stored on the local CPU or on the device actually running the computation: an accelerator like a GPU, or the CPU itself.

Local and Device tensors can be converted to each other -- there is a transferring cost to that. There is a cache system to prevent duplicate transfers, but it assumes immutability -- call ClearCache after mutating a Local tensor.

More details in the `tensor` package documentation.

Directories

Path Synopsis
Package image provides several functions to transform images back and forth from tensors.
