Datasets

Debug datasets

class torchelie.datasets.ColoredColumns(*size, transform=None)

A dataset of procedurally generated images made of randomly colored columns.

Parameters:
  size (int) – size of the images
  transform (transforms or None) – the image transforms to apply to the generated pictures

class torchelie.datasets.ColoredRows(*size, transform=None)

A dataset of procedurally generated images made of randomly colored rows.

Parameters:
  size (int) – size of the images
  transform (transforms or None) – the image transforms to apply to the generated pictures
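To make the two debug datasets concrete, the sketch below generates one image of each kind using only the standard library. It is not torchelie's implementation: the helper names `colored_columns` and `colored_rows`, the nested-list image format, and the choice of one color per pixel column or row are all assumptions made for illustration.

```python
import random


def colored_columns(height, width, rng=None):
    """Return a height x width image (nested lists of RGB tuples) in which
    every pixel column has its own random color. Hypothetical helper."""
    rng = rng or random.Random()
    palette = [tuple(rng.random() for _ in range(3)) for _ in range(width)]
    # Every row is a copy of the same palette, so colors vary only along x.
    return [list(palette) for _ in range(height)]


def colored_rows(height, width, rng=None):
    """Same idea, but colors vary only along y. Hypothetical helper."""
    rng = rng or random.Random()
    palette = [tuple(rng.random() for _ in range(3)) for _ in range(height)]
    return [[palette[r]] * width for r in range(height)]


columns = colored_columns(4, 3, random.Random(0))
rows = colored_rows(4, 3, random.Random(0))
```

Images like these are handy for debugging because a model or a transform that mixes up the two spatial axes produces a visibly wrong result.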

Dataset wrappers

class torchelie.datasets.HorizontalConcatDataset(datasets)

Concatenates multiple datasets. Whereas torchvision’s ConcatDataset only concatenates samples, torchelie’s version also relabels classes. A vertical concat like torchvision’s adds more examples per class; a horizontal concat merges datasets so that the result has more classes.

Parameters:
  datasets (list of Dataset) – the datasets to concatenate
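The relabeling described above can be sketched in plain Python. This is an illustration, not torchelie's code: the class name `HorizontalConcatSketch` and the explicit `num_classes` argument are assumptions (the real class takes only `datasets`), and each dataset is assumed to yield `(sample, integer_label)` pairs.

```python
class HorizontalConcatSketch:
    """Concatenate datasets and shift labels so classes do not collide."""

    def __init__(self, datasets, num_classes):
        self.datasets = datasets
        # offsets[i] = number of classes in all datasets before dataset i
        self.offsets = [0]
        for n in num_classes[:-1]:
            self.offsets.append(self.offsets[-1] + n)

    def __len__(self):
        return sum(len(ds) for ds in self.datasets)

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        for ds, offset in zip(self.datasets, self.offsets):
            if i < len(ds):
                x, y = ds[i]
                return x, y + offset  # relabel into the merged class space
            i -= len(ds)
        raise IndexError(i)  # unreachable given the bounds check above


pets = [("cat0", 0), ("dog0", 1)]   # 2 classes
birds = [("bird0", 0)]              # 1 class
merged = HorizontalConcatSketch([pets, birds], num_classes=[2, 1])
```

Here `birds` keeps its sample but its label 0 becomes 2, so the merged dataset has three distinct classes.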

class torchelie.datasets.PairedDataset(dataset1, dataset2)

A dataset that returns all possible pairs of samples from two datasets.

Parameters:
  dataset1 (Dataset) – a dataset
  dataset2 (Dataset) – another dataset
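One way to enumerate every pair is to treat the flat index as a row-major index into a `len(dataset1) x len(dataset2)` grid. The sketch below assumes that layout and returns a plain 2-tuple; the ordering and return format of the actual class may differ, and `PairedSketch` is a hypothetical name.

```python
class PairedSketch:
    """Expose the Cartesian product of two map-style datasets."""

    def __init__(self, dataset1, dataset2):
        self.dataset1 = dataset1
        self.dataset2 = dataset2

    def __len__(self):
        return len(self.dataset1) * len(self.dataset2)

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        # Row-major decomposition: i = i1 * len(dataset2) + i2
        i1, i2 = divmod(i, len(self.dataset2))
        return self.dataset1[i1], self.dataset2[i2]


pairs = PairedSketch(["a", "b"], [1, 2, 3])
```

Note that the length grows as the product of the two dataset sizes, so pairing two large datasets yields a very large index space even though no pair is materialized up front.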

class torchelie.datasets.MixUpDataset(dataset, alpha=0.4)

Linearly mixes pairs of samples, and their labels, drawn from a dataset according to the MixUp algorithm.

https://arxiv.org/abs/1905.02249

Parameters:
  dataset (Dataset) – the dataset
  alpha (float) – the alpha that parameterizes the Beta distribution from which the blending factor is sampled

class torchelie.datasets.NoexceptDataset(ds)

Wraps a dataset and absorbs the exceptions it raises. Useful, for instance, with a large downloaded dataset that contains corrupted samples.

Parameters:
  ds (Dataset) – a dataset
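The documentation does not say what is returned in place of a failing sample. The sketch below assumes one plausible policy, falling back to the next index that loads successfully; this policy, and the names `NoexceptSketch` and `Flaky`, are assumptions rather than torchelie's behavior.

```python
class NoexceptSketch:
    """Wrap a dataset and skip over samples whose access raises."""

    def __init__(self, ds):
        self.ds = ds

    def __len__(self):
        return len(self.ds)

    def __getitem__(self, i):
        n = len(self.ds)
        # Try index i, then i+1, ... wrapping around, at most n attempts.
        for step in range(n):
            try:
                return self.ds[(i + step) % n]
            except Exception:
                continue
        raise RuntimeError("every sample raised an exception")


class Flaky:
    """Toy dataset whose sample 1 is 'corrupted'."""

    def __len__(self):
        return 3

    def __getitem__(self, i):
        if i == 1:
            raise ValueError("corrupted sample")
        return i * 10


safe = NoexceptSketch(Flaky())
```

With this policy, `safe[1]` silently returns the sample stored at index 2, which keeps a training loop running at the cost of slightly over-representing the neighbors of corrupted samples.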

class torchelie.datasets.WithIndexDataset(ds)

Wraps a dataset and also returns the index of the accessed element. The original dataset’s attributes remain transparently accessible.

Parameters:
  ds (Dataset) – a dataset
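Transparent attribute access of this kind is commonly done in Python by delegating `__getattr__` to the wrapped object. The sketch below shows that pattern; the return layout `(sample, index)` and the names `WithIndexSketch` and `Toy` are assumptions, and the real class may order or flatten its outputs differently.

```python
class WithIndexSketch:
    """Return (sample, index) and forward unknown attributes to the wrapped dataset."""

    def __init__(self, ds):
        self.ds = ds

    def __len__(self):
        return len(self.ds)

    def __getitem__(self, i):
        return self.ds[i], i

    def __getattr__(self, name):
        # __getattr__ runs only when normal lookup fails. Guard "ds" itself
        # to avoid infinite recursion if the wrapper is not yet initialized.
        if name == "ds":
            raise AttributeError(name)
        return getattr(self.ds, name)


class Toy:
    """Toy dataset with a torchvision-like `classes` attribute."""

    classes = ["cat", "dog"]

    def __len__(self):
        return 2

    def __getitem__(self, i):
        return ("img%d" % i, i)


indexed = WithIndexSketch(Toy())
```

Knowing the index of each sample is useful, for example, to track per-sample losses across epochs.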

class torchelie.datasets.CachedDataset(ds, transform=None, device='cpu')

Wraps a dataset and lazily caches the elements returned by the underlying dataset.

Parameters:
  ds (Dataset) – a dataset
  transform (Callable) – transform to apply to cached elements
  device – the device on which the cache is allocated
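Lazy caching means an element is fetched from the underlying dataset only the first time it is requested. The sketch below illustrates the idea with a dictionary; it assumes the transform is applied after the cache lookup (so a random transform still varies between accesses) and leaves out the `device` handling. `CachedSketch` and `Counting` are hypothetical names.

```python
class CachedSketch:
    """Cache raw elements on first access; apply transform on every access."""

    def __init__(self, ds, transform=None):
        self.ds = ds
        self.transform = transform
        self.cache = {}

    def __len__(self):
        return len(self.ds)

    def __getitem__(self, i):
        if i not in self.cache:
            self.cache[i] = self.ds[i]  # the only call into the wrapped dataset
        item = self.cache[i]
        return self.transform(item) if self.transform is not None else item


class Counting:
    """Toy dataset that counts how often it is actually read."""

    def __init__(self):
        self.reads = 0

    def __len__(self):
        return 4

    def __getitem__(self, i):
        self.reads += 1
        return i


source = Counting()
cached = CachedSketch(source, transform=lambda v: v * 2)
first, second = cached[3], cached[3]
```

The second access is served from the cache, so `source.reads` stays at 1 while both results have the transform applied.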

class torchelie.datasets.Subset(ds, ratio, remap_unused_classes=False)

Creates a subset containing a random fraction of a dataset.

Parameters:
  ds (Dataset) – the dataset to sample from. Must have a `.samples` member like torchvision’s datasets.
  ratio (float) – a value between 0 and 1, the subsampling ratio.
  remap_unused_classes (boolean) – if True, classes not represented in the subset will not be considered. Remaining classes will be numbered from 0 to N.
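The effect of `remap_unused_classes` is easiest to see on a small list of `(sample, label)` pairs, the layout of torchvision's `.samples`. The function below is a sketch under that assumption; `subset_samples` and its `rng` argument are hypothetical, and the real class may round the subset size or pick samples differently.

```python
import random


def subset_samples(samples, ratio, remap_unused_classes=False, rng=None):
    """Keep a random fraction of (sample, label) pairs, optionally
    renumbering the labels that survive so they are contiguous from 0."""
    rng = rng or random.Random()
    k = int(len(samples) * ratio)
    kept = rng.sample(samples, k)
    if remap_unused_classes:
        present = sorted({label for _, label in kept})
        remap = {old: new for new, old in enumerate(present)}
        kept = [(x, remap[label]) for x, label in kept]
    return kept


samples = [("img%d" % i, i % 5) for i in range(20)]
kept = subset_samples(samples, 0.25, remap_unused_classes=True,
                      rng=random.Random(0))
labels = sorted({label for _, label in kept})
```

Without remapping, a small subset can leave gaps in the label range, which typically forces the classifier head to keep outputs for classes it never sees.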

Functions

torchelie.datasets.mixup(x1, x2, y1, y2, num_classes, mixer=None, alpha=0.4)

Mixes samples x1 and x2, with respective labels y1 and y2, according to MixUp:

\(\lambda \sim \text{Beta}(\alpha, \alpha)\)

\(x = \lambda x_1 + (1-\lambda) x_2\)

\(y = \lambda y_1 + (1 - \lambda) y_2\)

Parameters:
  x1 (tensor) – sample 1
  x2 (tensor) – sample 2
  y1 (tensor) – label 1
  y2 (tensor) – label 2
  num_classes (int) – number of classes
  mixer (Distribution, optional) – a distribution to sample lambda from. If unspecified, the distribution will be a Beta(alpha, alpha)
  alpha (float) – if mixer is unspecified, used to parameterize the Beta distribution
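The three formulas above can be traced with a dependency-free sketch that works on plain Python lists instead of tensors. It assumes the labels are integer class indices that are one-hot encoded before mixing (which would explain the `num_classes` argument); that assumption, the name `mixup_sketch`, and the `rng` argument are not taken from torchelie.

```python
import random


def one_hot(y, num_classes):
    """One-hot encode an integer class index as a list of floats."""
    return [1.0 if c == y else 0.0 for c in range(num_classes)]


def mixup_sketch(x1, x2, y1, y2, num_classes, alpha=0.4, rng=None):
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)          # lambda ~ Beta(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    t1, t2 = one_hot(y1, num_classes), one_hot(y2, num_classes)
    y = [lam * a + (1 - lam) * b for a, b in zip(t1, t2)]
    return x, y, lam


x, y, lam = mixup_sketch([0.0, 0.0], [1.0, 1.0], y1=0, y2=2,
                         num_classes=3, rng=random.Random(0))
```

The mixed label is a soft target that still sums to 1, with weight `lam` on class 0 and `1 - lam` on class 2. With a small `alpha` such as 0.4, the Beta distribution concentrates near 0 and 1, so most mixed samples stay close to one of the two originals.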