Models

GAN from GauGAN

class torchelie.models.VggImg2ImgGeneratorDebug(in_noise, out_ch, side_ch=1)

A VGG-based image decoder that decodes a latent / noise vector into an image, conditioned on another image, in the spirit of Pix2Pix or GauGAN. The architecture is closest to GauGAN: it is not an encoder-decoder and it uses SPADE for conditioning.

Parameters:
  • in_noise (int) – dimension of the input latent
  • out_ch (int) – number of channels of the output image, 3 for an RGB image
  • side_ch (int) – number of channels of the conditioning image, 3 for an RGB image
forward(x, y)

Generate images

Parameters:
  • x (2D tensor) – input latent vectors
  • y (4D tensor) – conditioning images
Returns:

the generated images as a 4D tensor
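
A minimal usage sketch based on the signature above; the batch size, latent dimension and 32x32 conditioning resolution are illustrative assumptions:

    import torch
    import torchelie.models as tmodels

    G = tmodels.VggImg2ImgGeneratorDebug(in_noise=32, out_ch=3, side_ch=1)

    z = torch.randn(8, 32)             # 2D batch of latent vectors
    cond = torch.randn(8, 1, 32, 32)   # 4D batch of conditioning images
    imgs = G(z, cond)                  # 4D batch of generated RGB images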

class torchelie.models.VggClassCondGeneratorDebug(in_noise, out_ch, num_classes)

A VGG-based image decoder that decodes a latent / noise vector into an image, conditioned on a class label (through conditional batchnorm).

Parameters:
  • in_noise (int) – dimension of the input latent
  • out_ch (int) – number of channels of the output image, 3 for an RGB image
  • num_classes (int) – number of classes the generator can be conditioned on
forward(x, y)

Generate images

Parameters:
  • x (2D tensor) – latent vectors
  • y (1D tensor) – class labels
Returns:

generated images as a 4D tensor
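
A usage sketch for the class-conditional variant; the latent dimension and the 10-class setup are illustrative assumptions:

    import torch
    import torchelie.models as tmodels

    G = tmodels.VggClassCondGeneratorDebug(in_noise=32, out_ch=3, num_classes=10)

    z = torch.randn(8, 32)                # 2D batch of latent vectors
    labels = torch.randint(0, 10, (8,))   # 1D batch of class labels
    imgs = G(z, labels)                   # 4D batch of generated RGB images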

GAN from Pix2Pix

torchelie.models.patch_discr(arch, in_ch=3, out_ch=1, norm=None)

Construct a PatchGAN discriminator

Parameters:
  • arch (list of ints) – a list of filter counts. For instance, [64, 128, 256] generates a PatchGAN with 3 conv layers with 64, 128 and 256 kernels respectively.
  • in_ch (int) – number of input channels, 3 for RGB images
  • out_ch (int) – number of output channels, 1 for fake / real discriminator
  • norm (fn) – a normalization layer ctor
Returns:

the specified PatchGAN as a CondSeq
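
A sketch of how the architecture list is used; the 70x70 input resolution is an arbitrary choice for illustration:

    import torch
    import torchelie.models as tmodels

    # Three conv layers with 64, 128 and 256 filters, as in the example above.
    D = tmodels.patch_discr([64, 128, 256], in_ch=3, out_ch=1)

    x = torch.randn(4, 3, 70, 70)
    patch_scores = D(x)   # one real / fake score per spatial patch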

torchelie.models.proj_patch_discr(arch, num_classes, in_ch=3, out_ch=1, norm=None)

Construct a PatchGAN discriminator with projection

Parameters:
  • arch (list of ints) – a list of filter counts. For instance, [64, 128, 256] generates a PatchGAN with 3 conv layers with 64, 128 and 256 kernels respectively.
  • num_classes (int) – number of classes to discriminate
  • in_ch (int) – number of input channels, 3 for RGB images
  • out_ch (int) – number of output channels, 1 for fake / real discriminator
  • norm (fn) – a normalization layer ctor
Returns:

the specified PatchGAN as a CondSeq
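
A hedged sketch of the conditional variant, assuming the returned module takes the class labels as a second forward argument, as ProjectionDiscr does below:

    import torch
    import torchelie.models as tmodels

    D = tmodels.proj_patch_discr([64, 128, 256], num_classes=10, in_ch=3)

    x = torch.randn(4, 3, 70, 70)
    y = torch.randint(0, 10, (4,))
    scores = D(x, y)   # assumption: labels are passed alongside the images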

torchelie.models.Patch286(in_ch=3, out_ch=1, norm=...)

Patch Discriminator from pix2pix

Parameters:
  • in_ch (int) – input channels, 3 for pictures
  • out_ch (int) – output channels, 1 for binary real / fake classification
  • norm (function) – the normalization layer to use
torchelie.models.Patch70(in_ch=3, out_ch=1, norm=...)

Patch Discriminator from pix2pix

Parameters:
  • in_ch (int) – input channels, 3 for pictures
  • out_ch (int) – output channels, 1 for binary real / fake classification
  • norm (function) – the normalization layer to use
torchelie.models.Patch32(in_ch=3, out_ch=1, norm=...)

Patch Discriminator from pix2pix

Parameters:
  • in_ch (int) – input channels, 3 for pictures
  • out_ch (int) – output channels, 1 for binary real / fake classification
  • norm (function) – the normalization layer to use
torchelie.models.ProjPatch32(in_ch=3, out_ch=1, norm=..., num_classes=10)

Patch Discriminator from pix2pix, with projection for conditional GANs

Parameters:
  • in_ch (int) – input channels, 3 for pictures
  • out_ch (int) – output channels, 1 for binary real / fake classification
  • norm (function) – the normalization layer to use
  • num_classes (int) – how many classes to discriminate
torchelie.models.Patch16(in_ch=3, out_ch=1, norm=...)

Patch Discriminator from pix2pix

Parameters:
  • in_ch (int) – input channels, 3 for pictures
  • out_ch (int) – output channels, 1 for binary real / fake classification
  • norm (function) – the normalization layer to use
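
A usage sketch for the predefined discriminators above; the input resolution and the 10-class conditional setup are illustrative assumptions:

    import torch
    import torchelie.models as tmodels

    D = tmodels.Patch70()             # defaults: RGB in, 1 real / fake channel out
    x = torch.randn(4, 3, 256, 256)
    scores = D(x)                     # per-patch real / fake scores

    # Conditional variant; labels assumed to be passed as the second argument.
    Dc = tmodels.ProjPatch32(num_classes=10)
    y = torch.randint(0, 10, (4,))
    cond_scores = Dc(x, y)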


Other GANs

class torchelie.models.AutoGAN(arch, n_skip_max=2, in_noise=256, out_ch=3, batchnorm_in_output=False)

Generator discovered in AutoGAN: Neural Architecture Search for Generative Adversarial Networks.

Parameters:
  • arch (list) – architecture specification: a list of output channels, one per block. Each block doubles the resolution of the generated image. Example: [512, 256, 128, 64, 32].
  • n_skip_max (int) – maximum number of previous blocks that can feed skip connections.
  • in_noise (int) – dimension of the input noise vector
  • out_ch (int) – number of channels of the output image
  • batchnorm_in_output (bool) – whether to place a batchnorm just before projecting to RGB. I have found results better with False, but the official AutoGAN implementation uses it.
forward(z)

Forward pass

Parameters:z (tensor) – A batch of noise vectors
Returns:generated batch of images
torchelie.models.autogan_32(in_noise, out_ch=3)
torchelie.models.autogan_64(in_noise, out_ch=3)
torchelie.models.autogan_128(in_noise, out_ch=3)
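
A usage sketch; the noise dimension and batch size are illustrative assumptions:

    import torch
    import torchelie.models as tmodels

    # Convenience constructor for 32x32 images (e.g. CIFAR-sized).
    G = tmodels.autogan_32(in_noise=128)
    z = torch.randn(16, 128)
    imgs = G(z)   # 4D batch of generated 3-channel images

    # The architecture can also be spelled out directly: each entry is a
    # block's output channels, and each block doubles the resolution.
    G2 = tmodels.AutoGAN([256, 128, 64], in_noise=128, out_ch=3)
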
torchelie.models.snres_discr(arch, in_ch=3, out_ch=1)

Make a resnet discriminator with spectral norm, using SNResidualDiscrBlock.

Parameters:
  • arch (list) – a list of ints to specify output channels of the blocks, and ‘D’ to downsample. Example: [32, ‘D’, 64, ‘D’]
  • in_ch (int) – number of input channels
  • out_ch (int) – number of output channels
Returns:

the specified discriminator
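
A sketch of the architecture specification: integers set a block's output channels and 'D' inserts a downsampling step. The 32x32 input resolution is chosen for illustration:

    import torch
    import torchelie.models as tmodels

    D = tmodels.snres_discr([32, 'D', 64, 'D', 128], in_ch=3, out_ch=1)

    x = torch.randn(8, 3, 32, 32)
    scores = D(x)   # real / fake scores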

torchelie.models.snres_projdiscr(arch, num_classes, in_ch=3)

Make a resnet discriminator with spectral norm and projection, using SNResidualDiscrBlock.

Parameters:
  • arch (list) – a list of ints to specify output channels of the blocks, and ‘D’ to downsample. Example: [32, ‘D’, 64, ‘D’]
  • in_ch (int) – number of input channels
  • num_classes (int) – number of classes in the dataset
Returns:

the specified projection discriminator
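
The projection variant is built the same way; as with ProjectionDiscr below, the class labels are assumed here to be passed as a second forward argument:

    import torch
    import torchelie.models as tmodels

    D = tmodels.snres_projdiscr([32, 'D', 64, 'D'], num_classes=10, in_ch=3)

    x = torch.randn(8, 3, 32, 32)
    y = torch.randint(0, 10, (8,))
    scores = D(x, y)   # assumption: (images, labels) call signature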

torchelie.models.snres_discr_4l(in_ch=3, out_ch=1)

Make a 4-layer resnet discriminator with spectral norm, using SNResidualDiscrBlock.

Parameters:
  • in_ch (int) – number of input channels
  • out_ch (int) – number of output channels
Returns:

the specified discriminator

torchelie.models.snres_projdiscr_4l(num_classes, in_ch=3)

Make a 4-layer resnet discriminator with spectral norm and projection, using SNResidualDiscrBlock.

Parameters:
  • in_ch (int) – number of input channels
  • num_classes (int) – number of classes in the dataset
Returns:

the specified projection discriminator

torchelie.models.snres_discr_5l(in_ch=3, out_ch=1)

Make a 5-layer resnet discriminator with spectral norm, using SNResidualDiscrBlock.

Parameters:
  • in_ch (int) – number of input channels
  • out_ch (int) – number of output channels
Returns:

the specified discriminator

torchelie.models.snres_projdiscr_5l(num_classes, in_ch=3)

Make a 5-layer resnet discriminator with spectral norm and projection, using SNResidualDiscrBlock.

Parameters:
  • in_ch (int) – number of input channels
  • num_classes (int) – number of classes in the dataset
Returns:

the specified projection discriminator

Convolutional

torchelie.models.VggBNBone(arch, in_ch=3, leak=0, block=<function Conv2dBNReLU>, debug=False)

Construct a VGG net

How to specify a VGG architecture:

It’s a list of blocks specifications. Blocks are either:

  • ‘M’ for maxpool of kernel size 2 and stride 2
  • ‘A’ for average pool of kernel size 2 and stride 2
  • ‘U’ for nearest neighbors upsampling (scale factor 2)
  • an integer ch for a block with ch output channels
Parameters:
  • arch (list) – architecture specification
  • in_ch (int) – number of input channels
  • leak (float) – negative slope of the leaky ReLUs
  • block (fn) – block ctor
Returns:

A VGG instance
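
A sketch of the specification format; the 32x32 input is an arbitrary illustration:

    import torch
    import torchelie.models as tmodels

    # 64-channel block, maxpool, 128-channel block, maxpool, 256-channel block.
    bone = tmodels.VggBNBone([64, 'M', 128, 'M', 256], in_ch=3)

    feats = bone(torch.randn(4, 3, 32, 32))   # feature maps at 1/4 resolution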

torchelie.models.ResNetBone(arch, head, block, in_ch=3, debug=False)

Construct a ResNet.

How to specify an architecture:

It’s a list of block specifications. Each element is a string of the form “output channels:stride”. For instance “64:2” is a block with input stride 2 and 64 output channels.

Parameters:
  • arch (list) – the architecture specification
  • head (fn) – the module ctor to build for the first conv
  • block (fn) – the residual block to use ctor
  • in_ch (int) – number of input channels, 3 for RGB images
  • debug (bool) – whether to insert debug layers between blocks
Returns:

A Resnet instance
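
A hedged sketch of the specification format. The head and block constructors shown (torchelie.nn.Conv2dBNReLU and torchelie.nn.ResBlock) are assumptions about the library's block names and may need to be swapped for the constructors actually available:

    import torch
    import torchelie.nn as tnn
    import torchelie.models as tmodels

    # "64:1" = 64 output channels, stride 1; "128:2" = 128 channels, stride 2.
    bone = tmodels.ResNetBone(
        ['64:1', '128:2', '256:2'],
        head=tnn.Conv2dBNReLU,   # assumed name for the stem conv block
        block=tnn.ResBlock,      # assumed name for the residual block
        in_ch=3)

    feats = bone(torch.randn(4, 3, 32, 32))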

torchelie.models.VectorCondResNetBone(arch, head, hidden, in_ch=3, debug=False)

A resnet with a vector side condition.

Parameters:
  • arch (list) – the architecture specification
  • head (fn) – the module ctor to build for the first conv
  • hidden (int) – the hidden size of condition projection
  • in_ch (int) – number of input channels, 3 for RGB images
  • debug (bool) – whether to insert debug layers between blocks
Returns:

A Resnet instance

torchelie.models.ClassCondResNetBone(*args, **kwargs)

A resnet with a class side condition.

Parameters:
  • arch (list) – the architecture specification
  • head (fn) – the module ctor to build for the first conv
  • hidden (int) – the hidden size of the side label embedding
  • num_classes (int) – the number of possible labels in the side condition
  • in_ch (int) – number of input channels, 3 for RGB images
  • debug (bool) – whether to insert debug layers between blocks
Returns:

A Resnet instance

class torchelie.models.UNetBone(arch, in_ch=3, out_ch=1)

Configurable UNet model.

Note: Not all input sizes are valid. Make sure that the model can decode an image of the same size first.

Parameters:
  • arch (list) – an architecture specification made of: an int for a conv block with that many output channels, 'U' for upsampling + conv, 'D' for downsampling (maxpooling)
  • in_ch (int) – number of input channels
  • out_ch (int) – number of output channels
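
A sketch of the specification format; as noted above not all input sizes are valid, so the 64x64 input here is only an assumption:

    import torch
    import torchelie.models as tmodels

    # conv(32), downsample, conv(64), downsample, conv(128),
    # then upsample + conv back through 64 and 32 channels.
    net = tmodels.UNetBone([32, 'D', 64, 'D', 128, 'U', 64, 'U', 32],
                           in_ch=3, out_ch=1)

    out = net(torch.randn(1, 3, 64, 64))
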
class torchelie.models.Attention56Bone(in_ch=3)

Attention56 bone

Parameters:in_ch (int) – number of channels in the images

Image classifiers

torchelie.models.VggDebug(num_classes, in_ch=1, debug=False)

A not so small VGG classifier for testing purposes

Parameters:
  • num_classes (int) – number of output classes
  • in_ch (int) – number of input channels, 3 for RGB images
  • debug (bool) – whether to add debug layers
Returns:

a VGG instance
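
A usage sketch; the single input channel matches the default, and the 32x32 input size is an assumption:

    import torch
    import torchelie.models as tmodels

    clf = tmodels.VggDebug(num_classes=10, in_ch=1)   # e.g. grayscale digits
    logits = clf(torch.randn(8, 1, 32, 32))           # class scores, one per class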

torchelie.models.ResNetDebug(num_classes, in_ch=3, debug=False)

A not so big predefined resnet classifier for debugging purposes.

Parameters:
  • num_classes (int) – the number of output classes
  • in_ch (int) – number of input channels, 3 for RGB images
  • debug (bool) – whether to print additional debug info
Returns:

a resnet instance

torchelie.models.PreactResNetDebug(num_classes, in_ch=3, debug=False)

A not so big predefined preactivation resnet classifier for debugging purposes.

Parameters:
  • num_classes (int) – the number of output classes
  • in_ch (int) – number of input channels, 3 for RGB images
  • debug (bool) – whether to print additional debug info
Returns:

a resnet instance

torchelie.models.VectorCondResNetDebug(vector_size, in_ch=3, debug=False)

A not so big predefined vector-conditioned resnet classifier for debugging purposes.

Parameters:
  • vector_size (int) – size of the conditioning vector
  • in_ch (int) – number of input channels, 3 for RGB images
  • debug (bool) – whether to print additional debug info
Returns:

a resnet instance

torchelie.models.ClassCondResNetDebug(num_classes, num_cond_classes, in_ch=3, debug=False)

A not so big predefined class-conditional resnet classifier for debugging purposes.

Parameters:
  • num_cond_classes (int) – the number of possible labels in the side condition
  • num_classes (int) – the number of output classes
  • in_ch (int) – number of input channels, 3 for RGB images
  • debug (bool) – whether to print additional debug info
Returns:

a resnet instance

torchelie.models.attention56(num_classes, in_ch=3)

Build an Attention56 network

Parameters:
  • num_classes (int) – number of classes
  • in_ch (int) – number of channels in the images

Image Segmenter

torchelie.models.UNet(in_ch=3, out_ch=1)

Instantiate the UNet network specified in U-Net: Convolutional Networks for Biomedical Image Segmentation (Ronneberger et al., 2015)

Valid input sizes include: 572x572 and 132x132.

Parameters:
  • in_ch (int) – number of input channels
  • out_ch (int) – number of output channels
Returns:

An instantiated UNet
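
A usage sketch using one of the listed valid input sizes:

    import torch
    import torchelie.models as tmodels

    net = tmodels.UNet(in_ch=3, out_ch=1)
    x = torch.randn(1, 3, 572, 572)   # 572x572 is listed as a valid size
    mask = net(x)                     # single-channel output map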

class torchelie.models.Hourglass(noise_dim=32, down_channels=[128, 128, 128, 128, 128], skip_channels=4, down_kernel=[3, 3, 3, 3, 3], up_kernel=[3, 3, 3, 3, 3], upsampling='bilinear', pad=..., relu=...)

Hourglass model from Deep Image Prior.
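
A hedged sketch of Deep Image Prior style usage, assuming the model maps a 4D noise tensor with noise_dim channels to an image of the same spatial size:

    import torch
    import torchelie.models as tmodels

    net = tmodels.Hourglass(noise_dim=32)
    noise = torch.randn(1, 32, 256, 256)   # fixed noise input (assumed shape)
    img = net(noise)                       # image predicted from the noise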

Classification heads

class torchelie.models.Classifier(feat_extractor, feature_size, num_classes)

A classification head added on top of a feature extraction model.

Parameters:
  • feat_extractor (nn.Module) – a feature extraction model
  • feature_size (int) – the number of features in the last layer of the feature extractor
  • num_classes (int) – the number of output classes
forward(*xs)

Forward pass

Parameters:*xs – arguments for feat_extractor
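
A sketch of composing a bone with a classification head; the feature_size value is assumed here to match the bone's final channel count:

    import torch
    import torchelie.models as tmodels

    bone = tmodels.VggBNBone([64, 'M', 128, 'M', 256], in_ch=3)
    clf = tmodels.Classifier(bone, feature_size=256, num_classes=10)

    logits = clf(torch.randn(8, 3, 32, 32))   # arguments are forwarded to the bone
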
class torchelie.models.Classifier1(feat_extractor, feature_size, num_classes)

A one-layer classification head added on top of a feature extraction model.

Parameters:
  • feat_extractor (nn.Module) – a feature extraction model
  • feature_size (int) – the number of features in the last layer of the feature extractor
  • num_classes (int) – the number of output classes
forward(*xs)

Forward pass

Parameters:*xs – arguments for feat_extractor
class torchelie.models.ProjectionDiscr(feat_extractor, feature_size, num_classes)

A classification head for conditional GAN discriminators, implementing the projection discriminator from https://arxiv.org/abs/1802.05637

Parameters:
  • feat_extractor (nn.Module) – a feature extraction model
  • feature_size (int) – the number of features in the last layer of the feature extractor
  • num_classes (int) – the number of output classes
forward(x, y)

Forward pass

Parameters:
  • x – argument for feat_extractor
  • y – class label
torchelie.models.PerceptualNet(layers)

Make a VGG16 with appropriately named layers that records intermediate activations.

Parameters:layers (list of str) – the names of the layers for which to save the activations.
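
A hedged sketch assuming VGG16-style layer names such as 'conv1_1' and 'conv3_2'; the exact naming scheme and how the recorded activations are exposed are assumptions here:

    import torch
    import torchelie.models as tmodels

    # Ask for activations at two (assumed) layer names.
    net = tmodels.PerceptualNet(['conv1_1', 'conv3_2'])
    out = net(torch.randn(1, 3, 224, 224))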

PixelCNN

class torchelie.models.PixelCNN(hid, sz, channels=3, n_layer=3)

A PixelCNN model with 6 blocks

Parameters:
  • hid (int) – the number of hidden channels in the blocks
  • sz ((int, int)) – the size of the images to learn. Must be square
  • channels (int) – number of channels in the data. 3 for RGB images
forward(x)

A forward pass for training

sample(temp, N)

Sample a batch of images

Parameters:
  • temp (float) – the sampling temperature
  • N (int) – number of images to generate in the batch
Returns:

A batch of images
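
A usage sketch; the hidden size, image size, input range and sampling temperature are illustrative assumptions:

    import torch
    import torchelie.models as tmodels

    model = tmodels.PixelCNN(hid=64, sz=(32, 32), channels=3)

    # Training-style forward pass on a batch of images (assumed to be in [0, 1]).
    x = torch.rand(8, 3, 32, 32)
    out = model(x)

    # Autoregressive sampling of 4 images at temperature 1.0.
    samples = model.sample(1.0, 4)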