Datasets
Debug datasets
class torchelie.datasets.ColoredColumns(*size, transform=None)
A dataset of procedurally generated images made of randomly colored columns.
Parameters:
- *size (int) – size of images
- transform (transforms or None) – the image transforms to apply to the generated pictures
class torchelie.datasets.ColoredRows(*size, transform=None)
A dataset of procedurally generated images made of randomly colored rows.
Parameters:
- *size (int) – size of images
- transform (transforms or None) – the image transforms to apply to the generated pictures
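To make "procedurally generated" concrete, here is a minimal sketch of how such debug images could be built. It is not torchelie's implementation: the function names, the `(height, width)` reading of `*size`, and the use of nested lists of RGB tuples in place of image objects are all assumptions made to keep the example dependency-free.

```python
import random


def colored_columns(height, width, rng=random):
    """Hypothetical helper: every column gets one random RGB color."""
    column_colors = [
        tuple(rng.randrange(256) for _ in range(3)) for _ in range(width)
    ]
    # Every row repeats the same per-column colors, giving vertical stripes.
    return [list(column_colors) for _ in range(height)]


def colored_rows(height, width, rng=random):
    """Hypothetical helper: every row gets one random RGB color."""
    row_colors = [
        tuple(rng.randrange(256) for _ in range(3)) for _ in range(height)
    ]
    # Each row is a single color repeated across its width.
    return [[color] * width for color in row_colors]


cols = colored_columns(4, 6, rng=random.Random(0))
rows = colored_rows(4, 6, rng=random.Random(0))
```

Images like these are handy for debugging because a model's output can be checked by eye: stripes should stay vertical or horizontal after a correct pipeline.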
Dataset wrappers
class torchelie.datasets.HorizontalConcatDataset(datasets)
Concatenates multiple datasets. However, while torchvision’s ConcatDataset just concatenates samples, torchelie’s also relabels classes. While a vertical concat like torchvision’s is useful to add more examples per class, a horizontal concat merges datasets to add more classes.
Parameters: datasets (list of Dataset) – the datasets to concatenate
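The relabeling idea can be sketched as follows. This is an illustration under assumptions, not torchelie's code: the class name `HorizontalConcatSketch` and the explicit `num_classes` argument are hypothetical, and plain lists of `(sample, label)` pairs stand in for real datasets.

```python
from bisect import bisect_right


class HorizontalConcatSketch:
    """Illustrative stand-in for a horizontal concat with relabeling."""

    def __init__(self, datasets, num_classes):
        # datasets: sequences yielding (sample, label) pairs.
        # num_classes: class count of each dataset (assumed known up front).
        self.datasets = datasets
        self.cum_lengths = []
        self.class_offsets = []
        total_len, total_classes = 0, 0
        for ds, n in zip(datasets, num_classes):
            self.class_offsets.append(total_classes)
            total_len += len(ds)
            total_classes += n
            self.cum_lengths.append(total_len)
        self.num_classes = total_classes

    def __len__(self):
        return self.cum_lengths[-1]

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        # Find which dataset holds global index i.
        d = bisect_right(self.cum_lengths, i)
        start = self.cum_lengths[d - 1] if d > 0 else 0
        sample, label = self.datasets[d][i - start]
        # Shift the label so classes of different datasets never collide.
        return sample, label + self.class_offsets[d]


cats_dogs = [("cat.png", 0), ("dog.png", 1)]
cars_boats = [("car.png", 0), ("boat.png", 1), ("car2.png", 0)]
merged = HorizontalConcatSketch([cats_dogs, cars_boats], num_classes=[2, 2])
labels = [merged[i][1] for i in range(len(merged))]
```

Here the second dataset's labels 0 and 1 become 2 and 3, so the merged dataset exposes four classes instead of two.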
class torchelie.datasets.PairedDataset(dataset1, dataset2)
A dataset that returns all possible pairs of samples from two datasets.
Parameters:
- dataset1 (Dataset) – a dataset
- dataset2 (Dataset) – another dataset
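"All possible pairs" means the paired dataset has `len(dataset1) * len(dataset2)` elements. A minimal sketch of one way to index them is shown below; the class name `PairedSketch`, the pair ordering, and the returned tuple layout are assumptions rather than torchelie's documented behavior.

```python
class PairedSketch:
    """Illustrative stand-in: enumerates the Cartesian product of two datasets."""

    def __init__(self, dataset1, dataset2):
        self.dataset1 = dataset1
        self.dataset2 = dataset2

    def __len__(self):
        return len(self.dataset1) * len(self.dataset2)

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        # Quotient selects the element of dataset1, remainder that of dataset2.
        i1, i2 = divmod(i, len(self.dataset2))
        return self.dataset1[i1], self.dataset2[i2]


pairs = PairedSketch(["a", "b"], [1, 2, 3])
all_pairs = [pairs[i] for i in range(len(pairs))]
```

Because pairs are computed from the index on demand, nothing beyond the two source datasets needs to be stored.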
class torchelie.datasets.MixUpDataset(dataset, alpha=0.4)
Linearly mixes two samples and their labels from a dataset according to the MixUp algorithm
https://arxiv.org/abs/1905.02249
Parameters:
- dataset (Dataset) – the dataset
- alpha (float) – the alpha that parameterizes the beta distribution from which the blending factor is sampled
class torchelie.datasets.NoexceptDataset(ds)
Wraps a dataset and absorbs the exceptions it raises. Useful, for instance, with a large downloaded dataset that contains corrupted samples.
Parameters: ds (Dataset) – a dataset
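The sketch below shows one way such a wrapper could behave. The documentation does not say what is returned in place of a failing sample, so the fallback used here (try the following indices, wrapping around) is an assumption, as are the names `NoexceptSketch` and `FlakyDataset`.

```python
class NoexceptSketch:
    """Illustrative stand-in: skips samples whose loading raises."""

    def __init__(self, ds):
        self.ds = ds

    def __len__(self):
        return len(self.ds)

    def __getitem__(self, i):
        # Try each index at most once, starting at i and wrapping around.
        for offset in range(len(self.ds)):
            j = (i + offset) % len(self.ds)
            try:
                return self.ds[j]
            except Exception:
                continue
        raise RuntimeError("every sample in the dataset raised an exception")


class FlakyDataset:
    """Hypothetical dataset whose sample 1 is corrupted."""

    def __len__(self):
        return 3

    def __getitem__(self, i):
        if i == 1:
            raise ValueError("corrupted sample")
        return i * 10


safe = NoexceptSketch(FlakyDataset())
values = [safe[i] for i in range(len(safe))]
```

With this fallback, a request for the corrupted sample 1 returns sample 2 instead, so a training loop keeps running at the cost of occasionally repeating a neighbor.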
class torchelie.datasets.WithIndexDataset(ds)
Wraps a dataset and also returns the index of the accessed element. The original dataset’s attributes remain transparently accessible.
Parameters: ds (Dataset) – a dataset
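A common way to get both behaviors is to return the index next to the item and forward unknown attribute lookups to the wrapped dataset. The following is a sketch under assumptions: the name `WithIndexSketch` and the `(item, index)` return layout are hypothetical and may differ from what torchelie returns.

```python
class WithIndexSketch:
    """Illustrative stand-in: adds the index and forwards attributes."""

    def __init__(self, ds):
        self.ds = ds

    def __len__(self):
        return len(self.ds)

    def __getitem__(self, i):
        # Assumed layout: the wrapped item first, then its index.
        return self.ds[i], i

    def __getattr__(self, name):
        # Called only when normal lookup fails; forward to the wrapped dataset.
        if name == "ds":
            # Avoid infinite recursion if `ds` has not been set yet.
            raise AttributeError(name)
        return getattr(self.ds, name)


class TinyDataset:
    """Hypothetical dataset exposing a `classes` attribute."""

    classes = ["cat", "dog"]

    def __len__(self):
        return 2

    def __getitem__(self, i):
        return ("img%d" % i, i)


indexed = WithIndexSketch(TinyDataset())
item, index = indexed[1]
```

Knowing the index is useful for per-sample bookkeeping, such as tracking which samples produce the highest loss.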
class torchelie.datasets.CachedDataset(ds, transform=None, device='cpu')
Wraps a dataset and lazily caches the elements returned by the underlying dataset.
Parameters:
- ds (Dataset) – a dataset
- transform (Callable) – transform to apply on cached elements
- device – the device on which the cache is allocated
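Lazy caching means an element is fetched from the underlying dataset only the first time it is requested. A minimal sketch follows, assuming a dictionary cache and a transform applied after the cache lookup; the `device` argument is left out because this example uses plain Python values, and the names `CachedSketch` and `CountingDataset` are hypothetical.

```python
class CachedSketch:
    """Illustrative stand-in: caches items on first access."""

    def __init__(self, ds, transform=None):
        self.ds = ds
        self.transform = transform
        self.cache = {}

    def __len__(self):
        return len(self.ds)

    def __getitem__(self, i):
        if i not in self.cache:
            # First access: read from the (possibly slow) underlying dataset.
            self.cache[i] = self.ds[i]
        item = self.cache[i]
        # The transform runs on the cached element at every access.
        return self.transform(item) if self.transform is not None else item


class CountingDataset:
    """Hypothetical dataset that counts how often it is read."""

    def __init__(self):
        self.reads = 0

    def __len__(self):
        return 3

    def __getitem__(self, i):
        self.reads += 1
        return i * i


source = CountingDataset()
cached = CachedSketch(source, transform=lambda x: x + 1)
first = cached[2]
second = cached[2]
```

Applying the transform after the cache keeps random augmentations fresh on every epoch while the expensive loading happens only once.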
class torchelie.datasets.Subset(ds, ratio, remap_unused_classes=False)
Creates a subset made of a random fraction of a dataset.
Parameters:
- ds (Dataset) – the dataset to sample from. Must have a .samples member like torchvision’s datasets.
- ratio (float) – a value between 0 and 1, the subsampling ratio.
- remap_unused_classes (boolean) – if True, classes not represented in the subset will not be considered. Remaining classes will be numbered from 0 to N.
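The sampling and class remapping can be sketched on a plain `.samples`-style list of `(path, label)` pairs. This is an illustration only: the function `subset_samples` and its `seed` parameter are hypothetical, and the consecutive numbering starting at 0 is an assumed reading of the description above.

```python
import random


def subset_samples(samples, ratio, remap_unused_classes=False, seed=None):
    """Hypothetical helper: keep a random fraction of (path, label) pairs."""
    rng = random.Random(seed)
    n = int(len(samples) * ratio)
    kept = rng.sample(samples, n)
    if remap_unused_classes:
        # Renumber the classes that survived so that labels are consecutive.
        present = sorted({label for _, label in kept})
        new_id = {old: new for new, old in enumerate(present)}
        kept = [(path, new_id[label]) for path, label in kept]
    return kept


samples = [
    ("a.png", 0), ("b.png", 0),
    ("c.png", 3), ("d.png", 3),
    ("e.png", 7), ("f.png", 7),
]
half = subset_samples(samples, 0.5, remap_unused_classes=True, seed=0)
remapped_labels = {label for _, label in half}
```

Remapping matters when a small ratio drops whole classes: without it, a classifier head would keep output units for classes that never appear in the subset.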
Functions
torchelie.datasets.mixup(x1, x2, y1, y2, num_classes, mixer=None, alpha=0.4)
Mixes samples x1 and x2 with respective labels y1 and y2 according to MixUp:
\(\lambda \sim \text{Beta}(\alpha, \alpha)\)
\(x = \lambda x_1 + (1-\lambda) x_2\)
\(y = \lambda y_1 + (1 - \lambda) y_2\)
Parameters:
- x1 (tensor) – sample 1
- x2 (tensor) – sample 2
- y1 (tensor) – label 1
- y2 (tensor) – label 2
- num_classes (int) – number of classes
- mixer (Distribution, optional) – a distribution to sample lambda from. If unspecified, the distribution will be a Beta(alpha, alpha)
- alpha (float) – if mixer is unspecified, used to parameterize the Beta distribution
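The three formulas above can be worked through in a few lines of plain Python. This sketch is not torchelie's function: lists stand in for tensors, labels are assumed to be integer class indices that get one-hot encoded (which is one plausible reason for the `num_classes` parameter), the `mixer` argument is omitted, and the `rng` parameter and the name `mixup_sketch` are hypothetical.

```python
import random


def one_hot(label, num_classes):
    """Encode an integer class index as a list of 0.0/1.0 values."""
    return [1.0 if c == label else 0.0 for c in range(num_classes)]


def mixup_sketch(x1, x2, y1, y2, num_classes, alpha=0.4, rng=random):
    # lambda ~ Beta(alpha, alpha)
    lam = rng.betavariate(alpha, alpha)
    # x = lambda * x1 + (1 - lambda) * x2, element by element
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    # y = lambda * y1 + (1 - lambda) * y2, on one-hot encoded labels
    y = [
        lam * a + (1 - lam) * b
        for a, b in zip(one_hot(y1, num_classes), one_hot(y2, num_classes))
    ]
    return x, y, lam


x, y, lam = mixup_sketch(
    [0.0, 1.0], [1.0, 0.0], 0, 2, num_classes=3, rng=random.Random(0)
)
```

The mixed label is a soft distribution: class 0 receives weight λ, class 2 receives 1 − λ, and the weights sum to 1.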