torchelie.callbacks¶
Loggers¶
-
class
torchelie.callbacks.
StdoutLogger
(log_every=10, prefix='')¶ Log metrics to stdout. It logs scalars, scalar tensors and strings.
- Parameters
log_every (int) – batch logging freq. -1 logs on epoch ends only.
prefix (str) – prefix for all metric names
-
class
torchelie.callbacks.
VisdomLogger
(visdom_env='main', log_every=10, prefix='', post_epoch_ends=True)¶ Log metrics to Visdom. It logs scalars and scalar tensors as plots, 3D and 4D tensors as images, and strings as HTML.
- Parameters
visdom_env (str) – name of the target visdom env
log_every (int) – batch logging freq. -1 logs on epoch ends only.
prefix (str) – prefix for all metric names
Logging¶
-
class
torchelie.callbacks.
TopkAccAvg
(k: int = 5, post_each_batch: bool = True, avg_type: str = 'window')¶ Log the average top-k accuracy to the metrics. The true classes are expected in state['batch'][1], and the predicted logits are expected in state['preds'].
- Parameters
k (int) – top k
post_each_batch (bool) – whether to post on each batch or on epoch end. Default: True.
avg_type (str) – type of averaging. One of ‘running’ for a running average over each epoch, ‘moving’ for an exponential moving average, ‘window’ for an average over the last 100 values.
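The quantity being averaged can be illustrated with a small pure-Python sketch. It is not torchelie's implementation: the helper `topk_accuracy` is hypothetical, and plain lists stand in for the tensors found in state['preds'] and state['batch'][1].

```python
from typing import Sequence


def topk_accuracy(logits: Sequence[Sequence[float]],
                  targets: Sequence[int], k: int = 5) -> float:
    """Fraction of samples whose true class is among the k highest logits."""
    hits = 0
    for row, target in zip(logits, targets):
        # Indices of the k largest logits for this sample.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += int(target in topk)
    return hits / len(targets)


# Stand-ins for state['preds'] (logits) and state['batch'][1] (true classes).
preds = [[0.1, 2.0, 0.3], [1.5, 0.2, 0.9], [0.0, 0.1, 3.0]]
labels = [1, 2, 0]
top1 = topk_accuracy(preds, labels, k=1)
top2 = topk_accuracy(preds, labels, k=2)
```

Raising k can only keep or increase the score, since the top-k set grows with k.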
-
class
torchelie.callbacks.
AccAvg
(post_each_batch=True, avg_type='window')¶ Log the average accuracy to the metrics. The true classes are expected in state['batch'][1], and the predicted logits are expected in state['preds'].
- Parameters
post_each_batch (bool) – whether to post on each batch or on epoch end. Default: True.
avg_type (str) – type of averaging. One of ‘running’ for a running average over each epoch, ‘moving’ for an exponential moving average, ‘window’ for an average over the last 100 values.
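The three averaging modes behave differently on the same stream of values. The sketch below is only an illustration: the class names, the `log`/`get` methods and the decay `beta` are assumptions, not the library's internals.

```python
from collections import deque


class RunningMean:
    """'running': plain mean of every value logged since the last reset."""

    def __init__(self):
        self.total, self.count = 0.0, 0

    def log(self, x):
        self.total += x
        self.count += 1

    def get(self):
        return self.total / self.count


class ExpMovingMean:
    """'moving': exponential moving average; beta is an assumed decay."""

    def __init__(self, beta=0.9):
        self.beta, self.value = beta, None

    def log(self, x):
        if self.value is None:
            self.value = x  # initialise with the first observation
        else:
            self.value = self.beta * self.value + (1 - self.beta) * x

    def get(self):
        return self.value


class WindowMean:
    """'window': mean of the last k values (the docs mention 100)."""

    def __init__(self, k=100):
        self.values = deque(maxlen=k)  # old values fall off automatically

    def log(self, x):
        self.values.append(x)

    def get(self):
        return sum(self.values) / len(self.values)


running, moving, window = RunningMean(), ExpMovingMean(beta=0.5), WindowMean(k=2)
for acc in [1.0, 0.0, 1.0, 1.0]:  # per-batch accuracies
    for avg in (running, moving, window):
        avg.log(acc)
```

The running mean weighs every batch of the epoch equally, while the moving and window means react faster to recent batches.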
-
class
torchelie.callbacks.
EpochMetricAvg
(name, post_each_batch=True)¶ Log to the metrics the average over an epoch of a value in the current state.
- Parameters
name (str) – the name of the value to log
post_each_batch (bool) – whether to post on each batch (True, default), or only on epoch ends (False). Note that posting on each batch necessarily yields an approximate average.
-
class
torchelie.callbacks.
WindowedMetricAvg
(name, post_each_batch=True, k=100)¶ Log to the metrics the windowed average of a value in the current state.
- Parameters
name (str) – the name of the value to log
post_each_batch (bool) – whether to post on each batch (True, default), or only on epoch ends (False)
-
class
torchelie.callbacks.
Log
(from_k, to, post_each_batch=True)¶ Move a value from the state to the metrics.
- Parameters
from_k (str) – path in the state of the value to log (as accepted by torchelie.utils.dict_by_key())
to (str) – metrics name
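Moving a value from the state to the metrics can be pictured as a path lookup followed by an assignment. The sketch below rests on assumptions: `lookup` is a hypothetical stand-in for torchelie.utils.dict_by_key(), the dot-separated path syntax is inferred from defaults such as 'batch.0' elsewhere on this page, and the 'metrics' key is an assumed location for metrics in the state.

```python
from typing import Any


def lookup(state: Any, path: str) -> Any:
    """Follow a dot-separated path through nested dicts and sequences."""
    value = state
    for part in path.split('.'):
        if isinstance(value, (list, tuple)):
            value = value[int(part)]  # numeric components index sequences
        else:
            value = value[part]       # other components are dict keys
    return value


# A toy state: a (inputs, targets) batch, a loss value, and a metrics dict.
state = {'batch': (['img0', 'img1'], [3, 7]), 'loss': 0.25, 'metrics': {}}

# Roughly what Log('loss', 'train_loss') is described as doing.
state['metrics']['train_loss'] = lookup(state, 'loss')
targets = lookup(state, 'batch.1')
```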
-
class
torchelie.callbacks.
Counter
¶ Count iterations and epochs. It must be the first callback, since other callbacks may depend on it.
It writes in the state the current epoch (as ‘epoch’), the current iteration within the epoch (‘epoch_batch’) and the overall iteration number (‘iters’).
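How the three counters relate can be shown with a toy callback. This is a sketch under assumptions: the hook names `on_epoch_start` and `on_batch_start` and the zero-based numbering are illustrative choices, not confirmed details of the library.

```python
class CounterSketch:
    """Illustrative counter; hook names and zero-based counts are assumptions."""

    def __init__(self):
        self.epoch = -1
        self.iters = -1
        self.epoch_batch = -1

    def on_epoch_start(self, state):
        self.epoch += 1
        self.epoch_batch = -1  # restart the per-epoch batch index
        state['epoch'] = self.epoch

    def on_batch_start(self, state):
        self.iters += 1        # never reset: counts batches over all epochs
        self.epoch_batch += 1
        state['iters'] = self.iters
        state['epoch_batch'] = self.epoch_batch


state = {}
counter = CounterSketch()
for _ in range(2):          # two epochs
    counter.on_epoch_start(state)
    for _ in range(3):      # three batches per epoch
        counter.on_batch_start(state)
```

Because later callbacks read these keys, the counter has to update the state before they run, which is why it comes first.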
-
class
torchelie.callbacks.
Throughput
¶ For debugging and benchmarking purposes: compute the throughput (samples/s) of both the forward pass and the full iteration.
It creates 4 new metrics:
iter_time: number of seconds between two consecutive batch_start events.
iter_throughput: imgs/s for the whole iteration, computed as batch_size / iter_time.
forward_time: number of seconds between the batch_start and batch_end events.
forward_throughput: imgs/s for the forward pass, computed as batch_size / forward_time.
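The arithmetic behind these metrics is simple enough to sketch. The helper names `timed` and `throughput` are hypothetical, and the timings in the usage part are made-up numbers chosen for illustration.

```python
import time


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def throughput(batch_size: int, elapsed: float) -> float:
    """Samples processed per second."""
    return batch_size / elapsed


# Illustrative numbers: a batch of 64 images, 0.125 s spent in the forward
# pass and 0.5 s between two consecutive batch starts.
batch_size = 64
forward_throughput = throughput(batch_size, 0.125)
iter_throughput = throughput(batch_size, 0.5)

# Measuring a real call works the same way.
_, elapsed = timed(sum, range(1000))
```

A large gap between the two throughputs suggests that time is spent outside the forward pass, for example in data loading or the backward pass.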
Visualization¶
-
class
torchelie.callbacks.
ConfusionMatrix
(labels, normalize=False)¶ Generate an HTML confusion matrix.
- Parameters
labels (list of str) – class names
normalize (bool) – whether to show absolute or relative frequencies
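The difference between absolute and relative frequencies can be sketched as follows. The function is hypothetical, and normalizing each row (each true class) to sum to 1 is an assumption about what "relative" means here.

```python
def confusion_matrix(targets, preds, n_classes, normalize=False):
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(targets, preds):
        cm[t][p] += 1
    if normalize:
        # Assumed convention: divide each row by its number of true samples.
        cm = [[c / max(sum(row), 1) for c in row] for row in cm]
    return cm


labels = ['cat', 'dog']
targets = [0, 0, 1, 1]
preds = [0, 1, 1, 1]
counts = confusion_matrix(targets, preds, len(labels))
freqs = confusion_matrix(targets, preds, len(labels), normalize=True)
```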
-
class
torchelie.callbacks.
MetricsTable
(post_each_batch=True, epoch_ends=True)¶ Generate an HTML table with all the current metrics, to be displayed in Visdom.
- Parameters
post_each_batch (bool) – whether to post on each batch or on epoch end. Default: True.
-
class
torchelie.callbacks.
ClassificationInspector
(nb_show, classes, post_each_batch=True)¶ For image classification tasks, create an HTML report with the best classified samples, the worst classified samples, and the samples closest to the decision boundary.
- Parameters
nb_show (int) – how many samples to show
classes (list of str) – class names
post_each_batch (bool) – whether to post on each batch or only on epoch end.
-
class
torchelie.callbacks.
ImageGradientVis
¶ Log gradients backpropagated to the input as a means of feature visualization. Works only for image data.
-
class
torchelie.callbacks.
GANMetrics
(real_key: str = 'batch.0', fake_key: str = 'fake', device='cpu', metrics: List[str] = ['fid', 'kid', 'precision', 'recall'])¶ Compute the Kernel Inception Distance and the Fréchet Inception Distance (https://arxiv.org/abs/1706.08500) on batches.
- Parameters
real_key (str) – name of the batch of real images in the state
fake_key (str) – name of the batch of generated images in the state
device – a torch device on which to compute the InceptionV3 activations.
metrics (List[str]) – which metrics to use, among ‘kid’, ‘fid’, ‘precision’, ‘recall’ and ‘ids’.
Warning
The FID is heavily biased but stable. This means the scale of the FID will heavily vary according to your batch size. If you want to compare your FID with papers, do it on 50k samples to match the scale. However, you can use the FID on much smaller batches to track your training and relative improvements.
tl;dr: FID values can only be compared for a similar number of samples. Papers use 50k, but for your own experiments any number works as long as it stays the same.
See page 8 Fig. 1 a for more details.
Warning
The KID is not biased but slightly unstable with fewer than 500 samples. With more than 500 samples, the KID can be compared between papers. With fewer than that, you can expect a variance of about +/- 1e-3.
See page 8 Fig. 1 a for more details.
Note
When doing distributed training, the computation is run on all gathered batches.
Note
IDS is the Paired Inception Discriminative Score from CoModGAN.
-
class
torchelie.callbacks.
SegmentationInspector
(nb_show, classes, post_each_batch=True)¶ For binary image segmentation tasks, create an HTML report with the best segmented samples, the worst segmented samples, and the samples closest to the decision boundary.
- Parameters
nb_show (int) – how many samples to show
classes (list of str) – class names
post_each_batch (bool) – whether to post on each batch or only on epoch end.
Training¶
-
class
torchelie.callbacks.
Optimizer
(opt, accumulation=1, clip_grad_norm=None, centralize_grad=False, log_lr=False, log_mom=False, scaler=None, grad_multiplier=1)¶ Apply an optimizer’s step() and zero_grad().
- Parameters
opt (Optimizer) – the optimizer to use
accumulation (int) – number of batches to accumulate gradients over
clip_grad_norm (float or None) – maximal norm of gradients to clip, before applying opt.step()
log_lr (bool) – whether to log the current learning rates in the metrics
log_mom (bool) – whether to log the current momentum / beta1 in the metrics
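Gradient accumulation amounts to stepping the optimizer once every few batches. The following is a minimal sketch of that schedule, not the callback's actual code: `AccumulatingStepper`, its `on_batch_end` hook and `FakeOptimizer` are hypothetical names, and the fake optimizer only mimics the step()/zero_grad() interface of torch optimizers.

```python
class FakeOptimizer:
    """Stand-in exposing the step()/zero_grad() interface of an optimizer."""

    def __init__(self):
        self.steps = 0
        self.zeroed = 0

    def step(self):
        self.steps += 1

    def zero_grad(self):
        self.zeroed += 1


class AccumulatingStepper:
    """Call opt.step() and opt.zero_grad() once every `accumulation` batches."""

    def __init__(self, opt, accumulation=1):
        self.opt = opt
        self.accumulation = accumulation
        self.seen = 0

    def on_batch_end(self):
        self.seen += 1
        if self.seen % self.accumulation == 0:
            # Gradients from the last `accumulation` batches have been summed
            # by backward(); apply them, then clear them for the next group.
            self.opt.step()
            self.opt.zero_grad()


opt = FakeOptimizer()
stepper = AccumulatingStepper(opt, accumulation=4)
for _ in range(10):  # ten batches
    stepper.on_batch_end()
```

With accumulation=4, the effective batch size is four times the loader's batch size, at the cost of four forward and backward passes per update.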
-
class
torchelie.callbacks.
LRSched
(sched, metric='loss', step_each_batch=False)¶ Call lr_sched.step().
- Parameters
sched (Scheduler) – the scheduler to run
metric (str or None) – if step() takes a value as an argument, that value should be in the state, and named here. Otherwise, use None.
step_each_batch (bool) – whether to call step() on each batch or on each epoch.
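The role of metric can be sketched with a stand-in scheduler. The helper `step_scheduler` and the `RecordingScheduler` class are hypothetical. In PyTorch, ReduceLROnPlateau is an example of a scheduler whose step() takes a value, while StepLR takes none.

```python
class RecordingScheduler:
    """Stand-in scheduler that records the arguments passed to step()."""

    def __init__(self):
        self.calls = []

    def step(self, *args):
        self.calls.append(args)


def step_scheduler(sched, state, metric='loss'):
    """Call sched.step(), passing state[metric] when a metric is named."""
    if metric is None:
        sched.step()
    else:
        sched.step(state[metric])


state = {'loss': 0.42}

plateau_like = RecordingScheduler()
step_scheduler(plateau_like, state, metric='loss')  # step(0.42)

plain = RecordingScheduler()
step_scheduler(plain, state, metric=None)           # step()
```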
Model¶
-
class
torchelie.callbacks.
Polyak
(original: torch.nn.modules.module.Module, copy: torch.nn.modules.module.Module, beta: float = 0.999)¶ Polyak averaging (Exponential moving average).
- Parameters
original (nn.Module) – source module
copy (nn.Module) – averaged model
beta (float) – decay value
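The update rule behind an exponential moving average of weights can be written in a few lines. This sketch uses plain lists of floats in place of module parameters, and `polyak_update` is a hypothetical helper, not part of the library.

```python
def polyak_update(averaged, source, beta=0.999):
    """Return beta * averaged + (1 - beta) * source, element by element."""
    return [beta * a + (1 - beta) * s for a, s in zip(averaged, source)]


# Toy "parameters": the averaged copy slowly drifts towards the original.
copy = [0.0, 1.0]
original = [1.0, 1.0]
for _ in range(3):  # three training iterations with fixed source weights
    copy = polyak_update(copy, original, beta=0.9)
```

A beta close to 1 makes the averaged model change slowly, which smooths out the noise of individual optimizer steps.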
-
class
torchelie.callbacks.
Checkpoint
(filename_base, objects, max_saves=10, key_best=None)¶ Save objects to disk on epoch end.
- Parameters
filename_base (str) – a format string that is the filename. The format string can have keyword parameters that will be indexed in the state.
objects – what to save. It must have a state_dict() member.
max_saves (int) – maximum number of checkpoints to save. Older checkpoints will be removed.
key_best (func) – key to determine the best test. The value of the key parameter should be a function that takes a single argument and returns a key to use for sorting purposes.
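Two of these parameters lend themselves to a short sketch: filling filename_base from the state and discarding old checkpoints beyond max_saves. The helpers `checkpoint_name` and `SaveRotation` are hypothetical, the use of str.format() is an assumption about how the format string is filled, and the sketch only tracks paths without touching the disk.

```python
from collections import deque


def checkpoint_name(filename_base: str, state: dict) -> str:
    """Fill the keyword fields of the format string from the state."""
    return filename_base.format(**state)


class SaveRotation:
    """Keep at most max_saves paths; report which older ones to delete."""

    def __init__(self, max_saves=10):
        self.max_saves = max_saves
        self.saved = deque()

    def register(self, path):
        """Record a new checkpoint and return the paths to remove."""
        self.saved.append(path)
        removed = []
        while len(self.saved) > self.max_saves:
            removed.append(self.saved.popleft())  # oldest first
        return removed


state = {'epoch': 3, 'iters': 1200}
name = checkpoint_name('model_{epoch}_{iters}.pth', state)

rotation = SaveRotation(max_saves=2)
removed = [rotation.register(p) for p in ['ckpt_0', 'ckpt_1', 'ckpt_2']]
```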
Misc¶
-
class
torchelie.callbacks.
CallRecipe
(loop, run_every=100, prefix='test', init_fun=None)¶ Call another recipe.
- Parameters
loop (Recipe) – the recipe to call
run_every (int) – how often to call that recipe
prefix (str) – prefix of the metrics of the recipe
init_fun (Callable or None) – a function to call before running the recipe