Models¶
GAN from GauGAN¶
-
class
torchelie.models.
VggImg2ImgGeneratorDebug
(in_noise, out_ch, side_ch=1)¶ A VGG-based image decoder that decodes a latent / noise vector into an image, conditioned on another image, like Pix2Pix or GauGAN. This architecture is really close to GauGAN, as it is not an encoder-decoder architecture and uses SPADE.
Parameters: - in_noise (int) – dimension of the input latent
- out_ch (int) – number of channels of the output image, 3 for RGB image
- side_ch (int) – number of channels of the input image, 3 for RGB image
-
forward
(x, y)¶ Generate an image
Parameters: - x (2D tensor) – input latent vectors
- y (4D tensor) – input images
Returns: the generated images as a 4D tensor
-
class
torchelie.models.
VggClassCondGeneratorDebug
(in_noise, out_ch, num_classes)¶ A vgg based image decoder that decodes a latent / noise vector into an image, conditioned on a class label (through conditional batchnorm).
Parameters: - in_noise (int) – dimension of the input latent
- out_ch (int) – number of channels of the output image, 3 for RGB image
- num_classes (int) – number of classes for the conditioning label
-
forward
(x, y)¶ Generate images
Parameters: - x (2D tensor) – latent vectors
- y (1D tensor) – class labels
Returns: generated images as a 4D tensor
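The class conditioning through conditional batchnorm mentioned above can be sketched in plain Python. This is only an illustration of the general technique on a single channel, not torchelie's implementation: the function name and the per-class gammas / betas tables are hypothetical.

```python
import math

def conditional_batchnorm(values, labels, gammas, betas, eps=1e-5):
    """Normalize one channel over the batch, then scale/shift per class label.

    values: one activation per sample; labels: one class index per sample.
    gammas / betas: hypothetical per-class scale and shift tables.
    """
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    normed = [(v - mean) / math.sqrt(var + eps) for v in values]
    # Each sample picks the scale and shift that belong to its own label.
    return [gammas[y] * n + betas[y] for n, y in zip(normed, labels)]

out = conditional_batchnorm(
    values=[0.0, 2.0], labels=[0, 1], gammas=[1.0, 2.0], betas=[0.0, 10.0]
)
print(out)  # roughly [-1.0, 12.0]
```

The batch statistics are shared by all samples; only the affine parameters depend on the label.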
GAN from Pix2Pix¶
-
torchelie.models.
patch_discr
(arch, in_ch=3, out_ch=1, norm=None)¶ Construct a PatchGAN discriminator
Parameters: - arch (list of ints) – a list of filter counts, one per conv layer. For instance, [64, 128, 256] generates a PatchGAN with 3 conv layers of 64, 128 and 256 kernels respectively.
- in_ch (int) – number of input channels, 3 for RGB images
- out_ch (int) – number of output channels, 1 for fake / real discriminator
- norm (fn) – a normalization layer ctor
Returns: the specified patchGAN as CondSeq
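As a rough illustration of how such an arch list maps to layers, the sketch below derives the (input channels, output channels) pair of each convolution. The helper name is hypothetical, and the final projection to out_ch is an assumption about the layout, not something this page states.

```python
def conv_channel_plan(arch, in_ch=3, out_ch=1):
    """Return (in_channels, out_channels) for each conv layer of the sketch."""
    widths = [in_ch] + list(arch)
    # One conv per entry of `arch`, each consuming the previous layer's output.
    plan = list(zip(widths[:-1], widths[1:]))
    # Assumption: a last conv projects the features to `out_ch` score maps.
    plan.append((widths[-1], out_ch))
    return plan

plan = conv_channel_plan([64, 128, 256])
print(plan)  # [(3, 64), (64, 128), (128, 256), (256, 1)]
```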
-
torchelie.models.
proj_patch_discr
(arch, num_classes, in_ch=3, out_ch=1, norm=None)¶ Construct a PatchGAN discriminator with projection
Parameters: - arch (list of ints) – a list of filter counts, one per conv layer. For instance, [64, 128, 256] generates a PatchGAN with 3 conv layers of 64, 128 and 256 kernels respectively.
- num_classes (int) – number of classes to discriminate
- in_ch (int) – number of input channels, 3 for RGB images
- out_ch (int) – number of output channels, 1 for fake / real discriminator
- norm (fn) – a normalization layer ctor
Returns: the specified patchGAN as CondSeq
-
torchelie.models.
Patch286
(in_ch=3, out_ch=1, norm=<sphinx.ext.autodoc.importer._MockObject object>)¶ Patch Discriminator from pix2pix
Parameters: - in_ch (int) – input channels, 3 for pictures
- out_ch (int) – output channels, 1 for binary real / fake classification
- norm (function) – the normalization layer to use
-
torchelie.models.
Patch70
(in_ch=3, out_ch=1, norm=<sphinx.ext.autodoc.importer._MockObject object>)¶ Patch Discriminator from pix2pix
Parameters: - in_ch (int) – input channels, 3 for pictures
- out_ch (int) – output channels, 1 for binary real / fake classification
- norm (function) – the normalization layer to use
-
torchelie.models.
Patch32
(in_ch=3, out_ch=1, norm=<sphinx.ext.autodoc.importer._MockObject object>)¶ Patch Discriminator from pix2pix
Parameters: - in_ch (int) – input channels, 3 for pictures
- out_ch (int) – output channels, 1 for binary real / fake classification
- norm (function) – the normalization layer to use
-
torchelie.models.
ProjPatch32
(in_ch=3, out_ch=1, norm=<sphinx.ext.autodoc.importer._MockObject object>, num_classes=10)¶ Patch Discriminator from pix2pix, with projection for conditional GANs
Parameters: - in_ch (int) – input channels, 3 for pictures
- out_ch (int) – output channels, 1 for binary real / fake classification
- norm (function) – the normalization layer to use
- num_classes (int) – how many classes to discriminate
-
torchelie.models.
Patch16
(in_ch=3, out_ch=1, norm=<sphinx.ext.autodoc.importer._MockObject object>)¶ Patch Discriminator from pix2pix
Parameters: - in_ch (int) – input channels, 3 for pictures
- out_ch (int) – output channels, 1 for binary real / fake classification
- norm (function) – the normalization layer to use
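The numbers in names such as Patch16, Patch70 and Patch286 match the patch (receptive field) sizes discussed in pix2pix. The sketch below computes the receptive field of a conv stack, assuming the pix2pix layout of 4x4 kernels with n stride-2 convolutions followed by two stride-1 convolutions (the last one being the output projection). Whether torchelie builds its discriminators exactly this way is an assumption, and both helper names are hypothetical.

```python
def receptive_field(layers):
    """Receptive field of stacked convs given as (kernel, stride) pairs."""
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the field by (k - 1) steps
        jump *= stride             # spacing between adjacent outputs, in input pixels
    return rf

def pix2pix_layers(n_strided):
    """Assumed layout: n stride-2 convs, then two stride-1 convs, all 4x4."""
    return [(4, 2)] * n_strided + [(4, 1)] * 2

for n in (1, 3, 5):
    print(n, receptive_field(pix2pix_layers(n)))
# 1 16
# 3 70
# 5 286
```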
Other GANs¶
-
class
torchelie.models.
AutoGAN
(arch, n_skip_max=2, in_noise=256, out_ch=3, batchnorm_in_output=False)¶ Generator discovered in AutoGAN: Neural Architecture Search for Generative Adversarial Networks.
Parameters: - arch (list) – architecture specification: a list of output channel for each block. Each block doubles the resolution of the generated image. Example: [512, 256, 128, 64, 32].
- n_skip_max (int) – the maximum number of preceding blocks that skip connections can reach back to.
- in_noise (int) – dimension of the input noise vector
- out_ch (int) – number of channels on the image
- batchnorm_in_output (bool) – whether to add a batchnorm just before projecting to RGB. I have found False to work better, although the official AutoGAN repo uses it.
-
forward
(z)¶ Forward pass
Parameters: z (tensor) – a batch of noise vectors
Returns: the generated batch of images
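Since each block doubles the resolution, the output size follows from the length of arch. The sketch below makes that arithmetic explicit and lists which earlier blocks may feed each block under n_skip_max. The starting resolution base_res=4 and both helper names are assumptions for illustration, not values taken from this page.

```python
def output_resolution(arch, base_res=4):
    """Side length of the generated image if every block doubles the resolution."""
    return base_res * 2 ** len(arch)

def skip_sources(n_blocks, n_skip_max=2):
    """For each block index, the earlier blocks it may receive skips from."""
    return [list(range(max(0, i - n_skip_max), i)) for i in range(n_blocks)]

arch = [512, 256, 128, 64, 32]
print(output_resolution(arch))   # 128
print(skip_sources(len(arch)))   # [[], [0], [0, 1], [1, 2], [2, 3]]
```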
-
torchelie.models.
autogan_32
(in_noise, out_ch=3)¶
-
torchelie.models.
autogan_64
(in_noise, out_ch=3)¶
-
torchelie.models.
autogan_128
(in_noise, out_ch=3)¶
-
torchelie.models.
snres_discr
(arch, in_ch=3, out_ch=1)¶ Make a resnet discriminator with spectral norm, using SNResidualDiscrBlock.
Parameters: - arch (list) – a list of ints to specify output channels of the blocks, and ‘D’ to downsample. Example: [32, ‘D’, 64, ‘D’]
- in_ch (int) – number of input channels
- out_ch (int) – number of output channels
Returns: an instance
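To make the arch format concrete, here is a small sketch that walks a specification such as [32, 'D', 64, 'D'] and records the channel flow and the cumulative downsampling. It assumes each 'D' halves the resolution; the helper name and the tuple layout are hypothetical.

```python
def plan_blocks(arch, in_ch=3):
    """Return ([(kind, in_channels, out_channels), ...], total downsampling)."""
    plan, ch, scale = [], in_ch, 1
    for spec in arch:
        if spec == 'D':
            scale *= 2  # assumption: each 'D' halves height and width
            plan.append(('downsample', ch, ch))
        else:
            plan.append(('block', ch, int(spec)))
            ch = int(spec)
    return plan, scale

blocks, factor = plan_blocks([32, 'D', 64, 'D'])
for entry in blocks:
    print(entry)
print('downsampling factor:', factor)  # downsampling factor: 4
```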
-
torchelie.models.
snres_projdiscr
(arch, num_classes, in_ch=3)¶ Make a resnet discriminator with spectral norm and projection, using SNResidualDiscrBlock.
Parameters: - arch (list) – a list of ints to specify output channels of the blocks, and ‘D’ to downsample. Example: [32, ‘D’, 64, ‘D’]
- in_ch (int) – number of input channels
- num_classes (int) – number of classes in the dataset
Returns: an instance
-
torchelie.models.
snres_discr_4l
(in_ch=3, out_ch=1)¶ Make a 4-layer resnet discriminator with spectral norm, using SNResidualDiscrBlock.
Parameters: - in_ch (int) – number of input channels
- out_ch (int) – number of output channels
Returns: an instance
-
torchelie.models.
snres_projdiscr_4l
(num_classes, in_ch=3)¶ Make a 4-layer resnet discriminator with spectral norm and projection, using SNResidualDiscrBlock.
Parameters: - in_ch (int) – number of input channels
- num_classes (int) – number of classes in the dataset
Returns: an instance
-
torchelie.models.
snres_discr_5l
(in_ch=3, out_ch=1)¶ Make a 5-layer resnet discriminator with spectral norm, using SNResidualDiscrBlock.
Parameters: - in_ch (int) – number of input channels
- out_ch (int) – number of output channels
Returns: an instance
-
torchelie.models.
snres_projdiscr_5l
(num_classes, in_ch=3)¶ Make a 5-layer resnet discriminator with spectral norm and projection, using SNResidualDiscrBlock.
Parameters: - in_ch (int) – number of input channels
- num_classes (int) – number of classes in the dataset
Returns: an instance
Convolutional¶
-
torchelie.models.
VggBNBone
(arch, in_ch=3, leak=0, block=<function Conv2dBNReLU>, debug=False)¶ Construct a VGG net
How to specify a VGG architecture:
It’s a list of block specifications. Blocks are either:
- ‘M’ for maxpool of kernel size 2 and stride 2
- ‘A’ for average pool of kernel size 2 and stride 2
- ‘U’ for nearest neighbors upsampling (scale factor 2)
- an integer ch for a block with ch output channels
Parameters: - arch (list) – architecture specification
- in_ch (int) – number of input channels
- leak (float) – leak in relus
- block (fn) – block ctor
Returns: A VGG instance
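The specification format above can be traced with a few lines of plain Python. This sketch follows the channel count and spatial size through an arch list, assuming the conv blocks preserve the spatial size; the helper name and the size argument are hypothetical.

```python
def trace_vgg_arch(arch, in_ch=3, size=32):
    """Return (spec, channels, spatial size) after each element of `arch`."""
    ch, trace = in_ch, []
    for spec in arch:
        if spec in ('M', 'A'):        # max / average pool, kernel 2, stride 2
            size //= 2
        elif spec == 'U':             # nearest neighbors upsampling, factor 2
            size *= 2
        elif isinstance(spec, int):   # block with `spec` output channels
            ch = spec
        else:
            raise ValueError(f'unknown block specification: {spec!r}')
        trace.append((spec, ch, size))
    return trace

for step in trace_vgg_arch([64, 'M', 128, 'A', 64, 'U']):
    print(step)
```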
-
torchelie.models.
ResNetBone
(arch, head, block, in_ch=3, debug=False)¶ A resnet
How to specify an architecture:
It’s a list of block specifications. Each element is a string of the form “output channels:stride”. For instance “64:2” is a block with input stride 2 and 64 output channels.
Parameters: - arch (list) – the architecture specification
- head (fn) – the module ctor to build for the first conv
- block (fn) – the residual block ctor to use
- in_ch (int) – number of input channels, 3 for RGB images
- debug (bool) – whether to insert debug layers between layers
Returns: A Resnet instance
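A minimal sketch of reading the “output channels:stride” strings described above. The helper names are hypothetical, and the total stride computed here ignores whatever stride the head itself applies.

```python
def parse_resnet_arch(arch):
    """Turn ["64:2", ...] into [(output_channels, stride), ...]."""
    blocks = []
    for spec in arch:
        channels, stride = spec.split(':')
        blocks.append((int(channels), int(stride)))
    return blocks

def total_stride(arch):
    """Product of the block strides (the head's own stride is not included)."""
    result = 1
    for _, stride in parse_resnet_arch(arch):
        result *= stride
    return result

arch = ['64:2', '64:1', '128:2']
print(parse_resnet_arch(arch))  # [(64, 2), (64, 1), (128, 2)]
print(total_stride(arch))       # 4
```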
-
torchelie.models.
VectorCondResNetBone
(arch, head, hidden, in_ch=3, debug=False)¶ A resnet with vector side condition.
Parameters: - arch (list) – the architecture specification
- head (fn) – the module ctor to build for the first conv
- hidden (int) – the hidden size of condition projection
- in_ch (int) – number of input channels, 3 for RGB images
- debug (bool) – whether to insert debug layers between layers
Returns: A Resnet instance
-
torchelie.models.
ClassCondResNetBone
(*args, **kwargs)¶ A resnet with class side condition.
Parameters: - arch (list) – the architecture specification
- head (fn) – the module ctor to build for the first conv
- hidden (int) – the hidden size of the side label embedding
- num_classes (int) – the number of possible labels in the side condition
- in_ch (int) – number of input channels, 3 for RGB images
- debug (bool) – whether to insert debug layers between layers
Returns: A Resnet instance
-
class
torchelie.models.
UNetBone
(arch, in_ch=3, out_ch=1)¶ Configurable UNet model.
Note: Not all input sizes are valid. Make sure that the model can decode an image of the same size first.
Parameters: - arch (list) – an architecture specification made of:
  - an int, for a convolution with the specified number of output channels
  - ‘U’ for upsampling + conv
  - ‘D’ for downsampling (max pooling)
- in_ch (int) – number of input channels
- out_ch (int) – number of output channels
-
class
torchelie.models.
Attention56Bone
(in_ch=3)¶ Attention56 bone
Parameters: in_ch (int) – number of channels in the images
Image classifiers¶
-
torchelie.models.
VggDebug
(num_classes, in_ch=1, debug=False)¶ A not so small Vgg net classifier for testing purposes
Parameters: - num_classes (int) – number of output classes
- in_ch (int) – number of input channels, 3 for RGB images
- debug (bool) – whether to add debug layers
Returns: a VGG instance
-
torchelie.models.
ResNetDebug
(num_classes, in_ch=3, debug=False)¶ A not so big predefined resnet classifier for debugging purposes.
Parameters: - num_classes (int) – the number of output classes
- in_ch (int) – number of input channels, 3 for RGB images
- debug (bool) – whether to print additional debug info
Returns: a resnet instance
-
torchelie.models.
PreactResNetDebug
(num_classes, in_ch=3, debug=False)¶ A not so big predefined preactivation resnet classifier for debugging purposes.
Parameters: - num_classes (int) – the number of output classes
- in_ch (int) – number of input channels, 3 for RGB images
- debug (bool) – whether to print additional debug info
Returns: a resnet instance
-
torchelie.models.
VectorCondResNetDebug
(vector_size, in_ch=3, debug=False)¶ A not so big predefined resnet classifier for debugging purposes.
Parameters: - vector_size (int) – size of the conditioning vector
- in_ch (int) – number of input channels, 3 for RGB images
- debug (bool) – whether to print additional debug info
Returns: a resnet instance
-
torchelie.models.
ClassCondResNetDebug
(num_classes, num_cond_classes, in_ch=3, debug=False)¶ A not so big predefined resnet classifier for debugging purposes.
Parameters: - num_cond_classes (int) – the number of possible labels in the side condition
- num_classes (int) – the number of output classes
- in_ch (int) – number of input channels, 3 for RGB images
- debug (bool) – whether to print additional debug info
Returns: a resnet instance
-
torchelie.models.
attention56
(num_classes, in_ch=3)¶ Build an attention56 network
Parameters: - num_classes (int) – number of classes
- in_ch (int) – number of channels in the images
Image Segmenter¶
-
torchelie.models.
UNet
(in_ch=3, out_ch=1)¶ Instantiate the UNet network specified in _U-Net: Convolutional Networks for Biomedical Image Segmentation_ (Ronneberger, 2015)
Valid input sizes include 572x572 and 132x132.
Parameters: - in_ch (int) – number of input channels
- out_ch (int) – number of output channels
Returns: An instantiated UNet
-
class
torchelie.models.
Hourglass
(noise_dim=32, down_channels=[128, 128, 128, 128, 128], skip_channels=4, down_kernel=[3, 3, 3, 3, 3], up_kernel=[3, 3, 3, 3, 3], upsampling='bilinear', pad=<sphinx.ext.autodoc.importer._MockObject object>, relu=<sphinx.ext.autodoc.importer._MockObject object>)¶ Hourglass model from Deep Image Prior.
Classification heads¶
-
class
torchelie.models.
Classifier
(feat_extractor, feature_size, num_classes)¶ A classification head added on top of a feature extraction model.
Parameters: - feat_extractor (nn.Module) – a feature extraction model
- feature_size (int) – the number of features in the last layer of the feature extractor
- num_classes (int) – the number of output classes
-
forward
(*xs)¶ Forward pass
Parameters: *xs – arguments for feat_extractor
-
class
torchelie.models.
Classifier1
(feat_extractor, feature_size, num_classes)¶ A one layer classification head added on top of a feature extraction model.
Parameters: - feat_extractor (nn.Module) – a feature extraction model
- feature_size (int) – the number of features in the last layer of the feature extractor
- num_classes (int) – the number of output classes
-
forward
(*xs)¶ Forward pass
Parameters: *xs – arguments for feat_extractor
-
class
torchelie.models.
ProjectionDiscr
(feat_extractor, feature_size, num_classes)¶ A classification head for conditional GAN discriminators, using a projection discriminator from https://arxiv.org/abs/1802.05637
Parameters: - feat_extractor (nn.Module) – a feature extraction model
- feature_size (int) – the number of features in the last layer of the feature extractor
- num_classes (int) – the number of output classes
-
forward
(x, y)¶ Forward pass
Parameters: - x – argument for feat_extractor
- y – class label
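The projection discriminator of the referenced paper scores a sample as an unconditional linear term plus the inner product between the features and a learned embedding of the class label. The sketch below computes that score for a single sample in plain Python. The function and argument names are hypothetical and this is not torchelie's code.

```python
def projection_logit(features, label, weight, bias, embeddings):
    """Score = linear(features) + <embedding[label], features>."""
    unconditional = sum(w * f for w, f in zip(weight, features)) + bias
    conditional = sum(e * f for e, f in zip(embeddings[label], features))
    return unconditional + conditional

features = [1.0, 2.0]                    # stands in for feat_extractor(x)
weight, bias = [0.5, -1.0], 0.25         # the unconditional linear head
embeddings = [[1.0, 0.0], [0.0, 1.0]]    # one embedding vector per class

print(projection_logit(features, 0, weight, bias, embeddings))  # -0.25
print(projection_logit(features, 1, weight, bias, embeddings))  # 0.75
```

The same features yield different scores depending on the label, which is how the class condition enters the discriminator.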
-
torchelie.models.
PerceptualNet
(layers)¶ Make a VGG16 with appropriately named layers that records intermediate activations.
Parameters: layers (list of str) – the names of the layers for which to save the activations.
PixelCNN¶
-
class
torchelie.models.
PixelCNN
(hid, sz, channels=3, n_layer=3)¶ A PixelCNN model with 6 blocks
Parameters: - hid (int) – the number of hidden channels in the blocks
- sz ((int, int)) – the size of the images to learn. Must be square
- channels (int) – number of channels in the data. 3 for RGB images
-
forward
(x)¶ A forward pass for training
-
sample
(temp, N)¶ Sample a batch of images
Parameters: - temp (float) – the sampling temperature
- N (int) – number of images to generate in the batch
Returns: A batch of images
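A sampling temperature is commonly applied by dividing the logits by temp before the softmax, so that low temperatures favor the most likely value and high temperatures flatten the distribution. The sketch below shows this for a single pixel value in plain Python. That PixelCNN.sample applies temp in exactly this way is an assumption, and both function names are hypothetical.

```python
import math
import random

def softmax_with_temperature(logits, temp):
    """Softmax of logits / temp, computed in a numerically stable way."""
    scaled = [l / temp for l in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_value(logits, temp, rng=random):
    """Draw one index according to the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temp)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

logits = [1.0, 2.0, 3.0]
print(softmax_with_temperature(logits, 1.0))
print(softmax_with_temperature(logits, 0.1))  # mass concentrates on index 2
print(sample_value(logits, 0.5, random.Random(0)))
```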