torch-fidelity: High-fidelity performance metrics for generative models in PyTorch

torch-fidelity provides precise, efficient, and extensible implementations of popular metrics for generative model evaluation, including the following (a usage sketch appears after the list):

  • Inception Score (ISC)

  • Fréchet Inception Distance (FID)

  • Kernel Inception Distance (KID)

  • Perceptual Path Length (PPL)

  • Precision and Recall (PRC)
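
For instance, ISC, FID, and KID can be computed together through the Python API in a single call. The sketch below is a minimal example under stated assumptions: both directory paths are placeholders, and PPL is omitted because it requires a generative model rather than two image sets.

import torch_fidelity

# One call computes several metrics; Inception features are extracted once
# and shared between them. Both paths are placeholder image folders.
metrics = torch_fidelity.calculate_metrics(
    input1='path/to/generated-images',
    input2='path/to/real-images',
    cuda=True,  # set to False to run on CPU
    isc=True,
    fid=True,
    kid=True,
)
print(metrics)  # a dict mapping metric names to values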

Numerical Precision: Unlike many other reimplementations, the values produced by torch-fidelity match the reference implementations up to floating-point machine precision. This makes torch-fidelity suitable for reporting metrics in papers, in place of scattered and slow reference implementations.

Efficiency: Feature sharing between different metrics saves recomputation time, and an additional caching layer avoids recomputing features and statistics whenever possible. This efficiency makes it practical to run torch-fidelity inside the training loop, for example at the end of every epoch, as sketched below.
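
A rough sketch of such in-training evaluation follows; train_one_epoch, save_samples, model, and num_epochs are hypothetical stand-ins for user code, while 'cifar10-train' refers to a dataset input registered by the library.

import torch_fidelity

num_epochs = 10  # hypothetical training length

for epoch in range(num_epochs):
    train_one_epoch(model)           # hypothetical user training step
    save_samples(model, 'samples/')  # hypothetical dump of generated images

    # Features and statistics of the fixed real set are cached after the
    # first call, so later epochs only pay for the fresh samples.
    metrics = torch_fidelity.calculate_metrics(
        input1='samples/',
        input2='cifar10-train',
        cuda=True,
        fid=True,
    )
    print(f'epoch {epoch}: FID = {metrics["frechet_inception_distance"]:.2f}')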

Extensibility: Going beyond 2D image generation is easy thanks to the high modularity of the code and the abstraction of the metrics from input data, models, and feature extractors. For example, one can swap the InceptionV3 feature extractor out for one that accepts 3D scan volumes, such as those used in MRI; a sketch follows.
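
A hypothetical sketch of such a swap follows. VolumetricFeatureExtractor and its stand-in Conv3d backbone are invented for illustration, and the exact abstract interface of FeatureExtractorBase, as well as the register_feature_extractor hook and the feature_extractor keyword, should be verified against the installed version.

import torch
from torch_fidelity import FeatureExtractorBase, register_feature_extractor

class VolumetricFeatureExtractor(FeatureExtractorBase):
    # Hypothetical extractor consuming 3D scan volumes of shape (N, 1, D, H, W).
    def __init__(self, name, features_list, **kwargs):
        super().__init__(name, features_list)
        self.backbone = torch.nn.Conv3d(1, 64, kernel_size=3)  # stand-in 3D encoder

    @staticmethod
    def get_provided_features_list():
        return ('volume_features',)

    def forward(self, x):
        feats = self.backbone(x).mean(dim=(2, 3, 4))  # global-pool to (N, 64)
        return (feats,)

register_feature_extractor('volumetric-sketch', VolumetricFeatureExtractor)
# metrics = torch_fidelity.calculate_metrics(..., feature_extractor='volumetric-sketch')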

TL;DR: fast and reliable GAN evaluation in PyTorch

Citation

Citing torch-fidelity is recommended in works that rely on it, as this reinforces the evaluation protocol. To ensure reproducibility, use the following BibTeX:

@misc{obukhov2020torchfidelity,
  author={Anton Obukhov and Maximilian Seitzer and Po-Wei Wu and Semen Zhydenko and Jonathan Kyl and Elvis Yu-Jing Lin},
  year={2020},
  title={High-fidelity performance metrics for generative models in PyTorch},
  url={https://github.com/toshas/torch-fidelity},
  publisher={Zenodo},
  version={v0.3.0},
  doi={10.5281/zenodo.4957738},
  note={Version: 0.3.0, DOI: 10.5281/zenodo.4957738}
}