Documentation ¶
Overview ¶
Package cifar provides tools to download and manipulate the Cifar-10 and Cifar-100 datasets. For information about the datasets, see https://www.cs.toronto.edu/~kriz/cifar.html
Index ¶
- Constants
- Variables
- func ConvertToGoImage(images *tensor.Local, exampleNum int) *image.NRGBA
- func DownloadCifar10(baseDir string) error
- func DownloadCifar100(baseDir string) error
- func LoadCifar10(baseDir string, dtype shapes.DType) (images, labels *tensor.Local, err error)
- func LoadCifar100(baseDir string, dtype shapes.DType) (images, labels *tensor.Local, err error)
- func ResetCache()
- type DataSource
- type Dataset
- func NewDataset(name, baseDir string, source DataSource, dtype shapes.DType, partition Partition, batchSize int, eval bool) (*Dataset, error)
- func (ds *Dataset) GatherImagesGraph(ctx *context.Context, batchIndices *Node) (batchImages *Node)
- func (ds *Dataset) Name() string
- func (ds *Dataset) Reset()
- func (ds *Dataset) Shapes() (inputs []shapes.Shape, labels shapes.Shape)
- func (ds *Dataset) Yield() (spec any, inputs, labels []tensor.Tensor, err error)
- type ImagesAndLabels
- type Partition
Constants ¶
const (
	C10Url     = "https://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz"
	C10TarName = "cifar-10-binary.tar.gz"
	C10SubDir  = "cifar-10-batches-bin"

	C100Url     = "https://www.cs.toronto.edu/~kriz/cifar-100-binary.tar.gz"
	C100TarName = "cifar-100-binary.tar.gz"
	C100SubDir  = "cifar-100-binary"

	// NumExamples is the total number of examples, including training and testing.
	// The value is the same for both, Cifar-10 and Cifar-100.
	NumExamples = 60000

	// NumTrainExamples is the number of examples reserved for training, the starting ones.
	// The value is the same for both, Cifar-10 and Cifar-100.
	NumTrainExamples = 50000

	// NumTestExamples is the number of examples reserved for testing, the last ones.
	// The value is the same for both, Cifar-10 and Cifar-100.
	NumTestExamples = 10000
)
const (
	Width  int = 32
	Height int = 32
	Depth  int = 3
)
Width, Height and Depth are the dimensions of the images, the same for Cifar-10 and Cifar-100.
const C10ExamplesPerFile = 10000
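To illustrate how the image dimensions relate to the downloaded binary files, here is a minimal sketch, not the package's own implementation. It assumes the record layout described on the Cifar page linked in the overview (one label byte followed by 3072 pixel bytes, stored one color channel at a time). The names `decodeRecord` and `recordSize` are hypothetical, and scaling pixel values to [0, 1] is a choice made only for this sketch.

```go
package main

import "fmt"

const (
	width, height, depth = 32, 32, 3
	// recordSize is 1 label byte plus the pixel bytes (assumed Cifar-10 binary layout).
	recordSize = 1 + width*height*depth
)

// decodeRecord is a hypothetical helper: it splits one Cifar-10 binary record
// into its label and an image laid out as [height][width][depth].
// Pixel values are scaled to [0, 1], which is a choice of this sketch.
func decodeRecord(rec []byte) (label int, img [height][width][depth]float32, err error) {
	if len(rec) != recordSize {
		return 0, img, fmt.Errorf("record has %d bytes, want %d", len(rec), recordSize)
	}
	label = int(rec[0])
	pixels := rec[1:]
	// The file stores all red values, then all green, then all blue,
	// each plane in row-major order.
	for c := 0; c < depth; c++ {
		for y := 0; y < height; y++ {
			for x := 0; x < width; x++ {
				img[y][x][c] = float32(pixels[c*height*width+y*width+x]) / 255
			}
		}
	}
	return label, img, nil
}

func main() {
	rec := make([]byte, recordSize)
	rec[0] = 3                  // label index 3
	rec[1+1*height*width] = 255 // green value of the top-left pixel
	label, img, err := decodeRecord(rec)
	if err != nil {
		panic(err)
	}
	fmt.Println(label, img[0][0][1], recordSize)
}
```

Under the same assumption, a Cifar-10 file with C10ExamplesPerFile records would hold 10000 × 3073 bytes.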
Variables ¶
var (
	C10Labels = [10]string{"airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship", "truck"}

	C100CoarseLabels = [20]string{
		"aquatic_mammals", "fish", "flowers", "food_containers", "fruit_and_vegetables",
		"household_electrical_devices", "household_furniture", "insects", "large_carnivores",
		"large_man-made_outdoor_things", "large_natural_outdoor_scenes", "large_omnivores_and_herbivores",
		"medium_mammals", "non-insect_invertebrates", "people", "reptiles", "small_mammals", "trees",
		"vehicles_1", "vehicles_2",
	}

	C100FineLabels = [100]string{
		"apple", "aquarium_fish", "baby", "bear", "beaver", "bed", "bee", "beetle", "bicycle", "bottle",
		"bowl", "boy", "bridge", "bus", "butterfly", "camel", "can", "castle", "caterpillar", "cattle",
		"chair", "chimpanzee", "clock", "cloud", "cockroach", "couch", "crab", "crocodile", "cup", "dinosaur",
		"dolphin", "elephant", "flatfish", "forest", "fox", "girl", "hamster", "house", "kangaroo", "keyboard",
		"lamp", "lawn_mower", "leopard", "lion", "lizard", "lobster", "man", "maple_tree", "motorcycle", "mountain",
		"mouse", "mushroom", "oak_tree", "orange", "orchid", "otter", "palm_tree", "pear", "pickup_truck", "pine_tree",
		"plain", "plate", "poppy", "porcupine", "possum", "rabbit", "raccoon", "ray", "road", "rocket",
		"rose", "sea", "seal", "shark", "shrew", "skunk", "skyscraper", "snail", "snake", "spider",
		"squirrel", "streetcar", "sunflower", "sweet_pepper", "table", "tank", "telephone", "television", "tiger", "tractor",
		"train", "trout", "tulip", "turtle", "wardrobe", "whale", "willow_tree", "wolf", "woman", "worm",
	}
)
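As a small illustration of how these arrays are typically used, the sketch below maps an integer label to its name with a bounds check. It assumes that a label value i stored in the labels tensor corresponds to C10Labels[i]; `c10Labels` is a local copy of the listing above and `labelName` is a hypothetical helper, not part of the package.

```go
package main

import "fmt"

// c10Labels mirrors C10Labels from the package listing.
var c10Labels = [10]string{"airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship", "truck"}

// labelName is a hypothetical helper: it returns the class name for a label,
// assuming the label is an index into c10Labels.
func labelName(label int64) (string, error) {
	if label < 0 || label >= int64(len(c10Labels)) {
		return "", fmt.Errorf("label %d out of range [0, %d)", label, len(c10Labels))
	}
	return c10Labels[label], nil
}

func main() {
	for _, label := range []int64{0, 3, 9} {
		name, err := labelName(label)
		if err != nil {
			panic(err)
		}
		fmt.Println(label, name)
	}
}
```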
Functions ¶
func DownloadCifar10 ¶
func DownloadCifar10(baseDir string) error
func DownloadCifar100 ¶
func DownloadCifar100(baseDir string) error
func LoadCifar10 ¶
func LoadCifar10(baseDir string, dtype shapes.DType) (images, labels *tensor.Local, err error)
LoadCifar10 loads Cifar-10 into two tensors: images, with the given dtype and shaped [NumExamples=60000, Height=32, Width=32, Depth=3], and labels, of dtype Int64 and shaped [NumExamples=60000, 1]. The first 50,000 examples are for training and the last 10,000 are for testing. Only the Float32 and Float64 dtypes are currently supported.
func LoadCifar100 ¶
func LoadCifar100(baseDir string, dtype shapes.DType) (images, labels *tensor.Local, err error)
LoadCifar100 loads Cifar-100 into two tensors: images, with the given dtype and shaped [NumExamples=60000, Height=32, Width=32, Depth=3], and labels, of dtype Int64 and shaped [NumExamples=60000, 1]. The first 50,000 examples are for training and the last 10,000 are for testing. Only the Float32 and Float64 dtypes are currently supported.
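Both loaders return training and testing examples in a single pair of tensors, so callers split them by position. The following is a minimal sketch of that split on plain Go slices rather than on the package's tensor types. `splitTrainTest` is a hypothetical helper, and the demo uses tiny sizes instead of the real 50,000/10,000 split to stay cheap to run.

```go
package main

import "fmt"

// splitTrainTest is a hypothetical helper: it slices flat image data and labels
// into a leading training part and a trailing testing part, without copying.
// The per-image size is inferred from the lengths of the two slices.
func splitTrainTest(images []float32, labels []int64, numTrain int) (trainX, testX []float32, trainY, testY []int64, err error) {
	n := len(labels)
	if n == 0 || len(images)%n != 0 || numTrain < 0 || numTrain > n {
		return nil, nil, nil, nil, fmt.Errorf("inconsistent sizes: %d image values, %d labels, numTrain=%d", len(images), n, numTrain)
	}
	imageSize := len(images) / n
	cut := numTrain * imageSize
	return images[:cut], images[cut:], labels[:numTrain], labels[numTrain:], nil
}

func main() {
	// 6 examples of 2 values each, standing in for 60000 examples of 32*32*3 values.
	images := make([]float32, 6*2)
	labels := []int64{0, 1, 2, 3, 4, 5}
	trainX, testX, trainY, testY, err := splitTrainTest(images, labels, 5)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(trainX), len(testX), trainY, testY)
}
```

With the real data, numTrain would be NumTrainExamples and the remaining NumTestExamples entries would form the test part.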
func ResetCache ¶
func ResetCache()
Types ¶
type DataSource ¶
type DataSource int
DataSource refers to Cifar-10 (C10) or Cifar-100 (C100).
const (
	C10 DataSource = iota
	C100
)
type Dataset ¶
type Dataset struct {
// contains filtered or unexported fields
}
Dataset provides a stream of data for the Train/Test partitions. It implements `gomlx/ml/train.Dataset`.
It yields indices and labels. The indices have to be converted to images using Dataset.GatherImagesGraph; the images themselves are kept in a tensor on the device, which is presumably faster.
If eval is set, it will yield batches sequentially (unshuffled) until the end of the partition (Train/Test), and then yield io.EOF.
If eval is false, it will yield batches of random samples (with replacement) indefinitely.
See example usage in the demo/ subpackage.
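The two yielding modes described above can be sketched with a small index generator. This is an illustration using only the standard library, not the package's code: `indexSampler` and its methods are hypothetical names, and letting the last eval batch be smaller than batchSize is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"io"
	"math/rand"
)

// indexSampler is a hypothetical stand-in for the index-yielding part of a dataset.
type indexSampler struct {
	start, end int // the partition covers example indices [start, end)
	batchSize  int
	eval       bool
	next       int // next sequential index, used only when eval is true
	rng        *rand.Rand
}

func newIndexSampler(start, end, batchSize int, eval bool, seed int64) *indexSampler {
	return &indexSampler{
		start: start, end: end, batchSize: batchSize, eval: eval,
		next: start, rng: rand.New(rand.NewSource(seed)),
	}
}

// yield returns the next batch of indices. In eval mode it walks the partition
// in order and returns io.EOF once it is exhausted; otherwise it samples
// with replacement and never ends.
func (s *indexSampler) yield() ([]int, error) {
	if !s.eval {
		batch := make([]int, s.batchSize)
		for i := range batch {
			batch[i] = s.start + s.rng.Intn(s.end-s.start)
		}
		return batch, nil
	}
	if s.next >= s.end {
		return nil, io.EOF
	}
	stop := s.next + s.batchSize
	if stop > s.end {
		stop = s.end // the last batch may be smaller (assumption of this sketch)
	}
	batch := make([]int, 0, stop-s.next)
	for i := s.next; i < stop; i++ {
		batch = append(batch, i)
	}
	s.next = stop
	return batch, nil
}

// reset restarts an eval pass; it does not affect random sampling.
func (s *indexSampler) reset() { s.next = s.start }

func main() {
	evalSampler := newIndexSampler(50000, 50010, 4, true, 1)
	for {
		batch, err := evalSampler.yield()
		if err == io.EOF {
			break
		}
		fmt.Println(batch)
	}
}
```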
func NewDataset ¶
func NewDataset(name, baseDir string, source DataSource, dtype shapes.DType, partition Partition, batchSize int, eval bool) (*Dataset, error)
NewDataset returns a Dataset for the given source and partition. It implements train.Dataset and hence can be used by train.Trainer methods.
It automatically downloads the data from the web and loads it into memory if it hasn't been loaded yet. The result is cached, so multiple Datasets can be created at no extra cost in time or memory.
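The caching behavior can be pictured as a load-once map guarded by a mutex. The sketch below is only an illustration of that idea: keying the cache by source and dtype is an assumption, and `cacheKey`, `loadedData`, `loadCached` and `resetCache` are hypothetical names rather than the package's internals.

```go
package main

import (
	"fmt"
	"sync"
)

// cacheKey identifies one loaded copy of the data. Keying by source and dtype
// is an assumption of this sketch.
type cacheKey struct {
	source string
	dtype  string
}

// loadedData stands in for the images and labels held in memory.
type loadedData struct {
	images []float32
	labels []int64
}

var (
	cacheMu sync.Mutex
	cache   = map[cacheKey]*loadedData{}
)

// loadCached returns the cached data for key, calling load only on a miss.
func loadCached(key cacheKey, load func() (*loadedData, error)) (*loadedData, error) {
	cacheMu.Lock()
	defer cacheMu.Unlock()
	if data, ok := cache[key]; ok {
		return data, nil
	}
	data, err := load()
	if err != nil {
		return nil, err
	}
	cache[key] = data
	return data, nil
}

// resetCache drops everything, so the next call to loadCached loads again.
func resetCache() {
	cacheMu.Lock()
	defer cacheMu.Unlock()
	cache = map[cacheKey]*loadedData{}
}

func main() {
	loads := 0
	loader := func() (*loadedData, error) {
		loads++
		return &loadedData{images: make([]float32, 4), labels: make([]int64, 2)}, nil
	}
	key := cacheKey{source: "C10", dtype: "Float32"}
	first, _ := loadCached(key, loader)
	second, _ := loadCached(key, loader)
	fmt.Println(first == second, loads)
	resetCache()
	_, _ = loadCached(key, loader)
	fmt.Println(loads)
}
```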
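func (*Dataset) GatherImagesGraph ¶
func (ds *Dataset) GatherImagesGraph(ctx *context.Context, batchIndices *Node) (batchImages *Node)
GatherImagesGraph converts a batch of indices to a batch of images (shape=[batchSize, 32, 32, 3]). Because every Dataset holds all of the Cifar (10 or 100) data, either a train or a test Dataset can gather images for indices from either partition.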
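Conceptually, gathering selects whole images by index from the full image data. The sketch below shows that operation on a flat Go slice on the host. It is not the graph operation the package builds, and `gatherImages` is a hypothetical helper.

```go
package main

import "fmt"

// gatherImages is a hypothetical helper: it copies the images selected by
// indices out of flat image data, where each image occupies imageSize values.
func gatherImages(images []float32, imageSize int, indices []int) ([]float32, error) {
	if imageSize <= 0 || len(images)%imageSize != 0 {
		return nil, fmt.Errorf("image data of length %d is not a multiple of image size %d", len(images), imageSize)
	}
	numImages := len(images) / imageSize
	batch := make([]float32, 0, len(indices)*imageSize)
	for _, idx := range indices {
		if idx < 0 || idx >= numImages {
			return nil, fmt.Errorf("index %d out of range [0, %d)", idx, numImages)
		}
		batch = append(batch, images[idx*imageSize:(idx+1)*imageSize]...)
	}
	return batch, nil
}

func main() {
	// Four tiny "images" of 2 values each; image i holds the value i.
	images := []float32{0, 0, 1, 1, 2, 2, 3, 3}
	batch, err := gatherImages(images, 2, []int{3, 0, 3})
	if err != nil {
		panic(err)
	}
	fmt.Println(batch)
}
```

Indices may repeat and may come from any partition, since the lookup runs against the full image data.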
func (*Dataset) Reset ¶
func (ds *Dataset) Reset()
Reset implements train.Dataset and, for an evaluation dataset, restarts it.
type ImagesAndLabels ¶
type ImagesAndLabels struct {
// contains filtered or unexported fields
}