Research

This is a list of papers I've worked on, organized by topic. I've tried to give a brief summary for each paper. Some of the papers have short explanatory videos! I'm working on adding more. [NOTE: this page is somewhat out of date]

GANs

Conditional image synthesis with auxiliary classifier GANs
A Odena, C Olah, J Shlens
ICML 2017
This was arguably the first paper in which GANs were made to work on the ImageNet dataset. We gave a way to use label information to improve image synthesis performance.
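
A minimal NumPy sketch of the AC-GAN objective, with illustrative names: the discriminator has both a real/fake "source" head and an auxiliary classification head, and both networks are pushed to make generated images classifiable as their sampled label.

```python
import numpy as np

def log_softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def acgan_losses(src_real, src_fake, cls_real, cls_fake, labels):
    """Sketch of the AC-GAN objective (names are illustrative).

    src_real/src_fake: D's probability that real/generated images are real.
    cls_real/cls_fake: D's class logits for the same batches.
    labels: integer class labels; generated images reuse their sampled labels.
    """
    eps = 1e-8
    idx = np.arange(len(labels))
    # Auxiliary classification loss L_C, computed on both real and fake images.
    l_class = -(log_softmax(cls_real)[idx, labels].mean()
                + log_softmax(cls_fake)[idx, labels].mean())
    # Discriminator: standard real/fake loss plus the classification loss.
    d_loss = -np.log(src_real + eps).mean() - np.log(1 - src_fake + eps).mean() + l_class
    # Generator: fool the source head while keeping fakes classifiable.
    g_loss = -np.log(src_fake + eps).mean() + l_class
    return d_loss, g_loss
```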

Self-attention generative adversarial networks
H Zhang, I Goodfellow, D Metaxas, A Odena
ICML 2019 (Long Talk)
Slides
We propose several tweaks to the GAN training procedure that dramatically improve image synthesis performance. BigGAN is based on this work. Source code is available here.
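
As a rough illustration of the central building block, here is a NumPy sketch of self-attention over a spatial feature map. The 1x1 convolutions are written as plain matrices applied per location, and the paper's final 1x1 output convolution is omitted for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_2d(x, Wf, Wg, Wh, gamma):
    """Sketch of SAGAN-style self-attention on a feature map.

    x: (C, H, W) feature map. Wf, Wg: (C', C) key/query maps; Wh: (C, C).
    gamma: learned scalar, initialized to 0 so the block starts as identity.
    """
    C, H, W = x.shape
    flat = x.reshape(C, H * W)          # treat each spatial location as a token
    f, g, h = Wf @ flat, Wg @ flat, Wh @ flat
    attn = softmax(f.T @ g, axis=0)     # every location attends to every other
    o = (h @ attn).reshape(C, H, W)
    return x + gamma * o                # gated residual connection
```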

Is Generator Conditioning Causally Related to GAN Performance?
A Odena, J Buckman, C Olsson, T B Brown, C Olah, C Raffel, I Goodfellow
ICML 2018
We show that the conditioning of the input-output Jacobian of GAN generators is predictive of many GAN training pathologies. We then give evidence that the relationship is causal by conducting an intervention that clips the range of the Jacobian's singular values.
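
A sketch of that clipping intervention ("Jacobian clamping"), with illustrative constants: the per-example ratio Q measures how much the generator stretches a small latent perturbation, and leaving the target band is penalized.

```python
import numpy as np

def jacobian_clamping_penalty(G, z, eps=0.01, lmin=1.0, lmax=20.0):
    """Sketch of Jacobian clamping (constants are illustrative).

    G maps a (batch, dz) latent array to flattened samples (batch, dx).
    Perturb each latent slightly and penalize the ratio
    Q = ||G(z') - G(z)|| / ||z' - z|| whenever it leaves [lmin, lmax],
    which softly clips the range of the Jacobian's singular values.
    """
    delta = np.random.randn(*z.shape)
    delta = eps * delta / np.linalg.norm(delta, axis=1, keepdims=True)
    q = (np.linalg.norm(G(z + delta) - G(z), axis=1)
         / np.linalg.norm(delta, axis=1))
    return np.mean(np.maximum(q - lmax, 0) ** 2
                   + np.maximum(lmin - q, 0) ** 2)
```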

Discriminator Rejection Sampling
S Azadi, C Olsson, T Darrell, I Goodfellow, A Odena
ICLR 2019
We show that GAN discriminators can be used after training is finished to perform rejection sampling on GAN generators.
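
In outline, the scheme looks like the sketch below (simplified relative to the paper, which adds a shift term to keep acceptance rates practical).

```python
import numpy as np

def drs_sample(generator, d_logit, n_samples, burn_in=1000):
    """Sketch of discriminator rejection sampling.

    For an ideal discriminator, p_data(x) / p_g(x) = exp(d_logit(x)), so a
    trained discriminator supports rejection sampling on G's outputs.
    """
    # Estimate the maximum logit over a burn-in phase to bound the ratio.
    m = max(d_logit(generator()) for _ in range(burn_in))
    accepted = []
    while len(accepted) < n_samples:
        x = generator()
        logit = d_logit(x)
        m = max(m, logit)                         # keep tracking the max
        if np.random.rand() < np.exp(logit - m):  # accept w.p. ratio / bound
            accepted.append(x)
    return accepted
```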

Skill Rating for Generative Models
C Olsson, S Bhupatiraju, T B Brown, A Odena, I Goodfellow
Preprint
We show how to use chess-style tournament ranking to evaluate GANs and other generative models.
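
For concreteness, here is a plain chess-style (Elo) rating update; the paper's tournament setup is richer than this single formula.

```python
def elo_update(r_a, r_b, score_a, k=32):
    """One Elo update. A 'match' here pits a generator against a judge
    (e.g. a discriminator); score_a is the generator's observed win rate.
    """
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    r_a_new = r_a + k * (score_a - expected_a)
    r_b_new = r_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return r_a_new, r_b_new

# Example: an upset win by the lower-rated player moves both ratings.
print(elo_update(1400, 1600, score_a=1.0))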

Open Questions about Generative Adversarial Networks
A Odena
Distill (Commentary)
I give a set of Open Problems that I think machine learning researchers working on GANs ought to think about.

Top-K Training of GANs: Improving Generators by Making Critics Less Critical
S Sinha, A Goyal, C Raffel, A Odena
Preprint
We introduce a simple modification to the GAN training algorithm that materially improves results with no increase in computational cost: when updating the generator parameters, we simply zero out the gradient contributions from the elements of the batch that the critic scores as "least realistic".
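
The whole change fits in a few lines; here is a PyTorch-flavored sketch (names illustrative):

```python
import torch

def topk_generator_loss(critic_scores, k):
    """Sketch of the top-k generator update.

    critic_scores: the critic's outputs on a batch of generated images.
    Taking the loss over only the top-k scores zeroes the gradient
    contribution of the samples the critic rated least realistic.
    """
    topk_scores, _ = torch.topk(critic_scores, k)
    return -topk_scores.mean()  # non-saturating generator loss on top-k only
```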

Improved consistency regularization for GANs
Z Zhao, S Singh, H Lee, Z Zhang, A Odena, H Zhang
Preprint
We make several improvements to the approach from the Consistency Regularization for GANs paper.

Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models
G Daras, A Odena, H Zhang, A Dimakis
CVPR 2020
We design special sparse-attention mechanisms for images. We also show how to invert GANs with attention layers, which is important, because all GANs now have attention layers!
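
As a toy illustration of what two-dimensional locality means for attention (this is not the paper's exact sparsity pattern), a local mask might be built like so:

```python
import numpy as np

def local_attention_mask(h, w, radius):
    """Illustrative 2D-local sparse attention mask: position (i, j) may only
    attend to positions within a Chebyshev radius, so the sparsity pattern
    respects two-dimensional image locality rather than raster order.
    """
    n = h * w
    mask = np.zeros((n, n), dtype=bool)
    for a in range(n):
        ai, aj = divmod(a, w)
        for b in range(n):
            bi, bj = divmod(b, w)
            if max(abs(ai - bi), abs(aj - bj)) <= radius:
                mask[a, b] = True
    return mask
```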

Semi-Supervised Learning

Semi-supervised learning with generative adversarial networks
A Odena
Workshop on Data-Efficient Machine Learning (ICML 2016)
I invented (concurrently with this paper) a technique for using GANs to do semi-supervised learning.
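
The core idea can be sketched as a discriminator with num_classes + 1 outputs, where the extra class means "generated" (the unlabeled-data term is omitted here for brevity):

```python
import numpy as np

def softmax(x):
    z = x - x.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def sgan_discriminator_loss(logits_real, labels, logits_fake, num_classes):
    """Sketch of the semi-supervised GAN discriminator loss.

    logits_real/logits_fake: (batch, num_classes + 1) discriminator outputs.
    labels: integer class labels for the real (labeled) batch.
    """
    eps = 1e-8
    idx = np.arange(len(labels))
    p_real = softmax(logits_real)
    p_fake = softmax(logits_fake)
    # Labeled real data should be assigned its true class.
    loss_labeled = -np.log(p_real[idx, labels] + eps).mean()
    # Generated data should be assigned the extra 'fake' class.
    loss_fake = -np.log(p_fake[:, num_classes] + eps).mean()
    return loss_labeled + loss_fake
```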

Realistic Evaluation of Deep Semi-Supervised Learning Algorithms
A Oliver*, A Odena*, C Raffel*, E D Cubuk, I Goodfellow
NeurIPS 2018 (Spotlight)
We argue that existing methods for evaluating semi-supervised learning techniques are flawed and propose a new framework for doing these evaluations. Source code is available here.

Machine Learning and Computer Systems

TensorFuzz: Debugging neural networks with coverage-guided fuzzing
A Odena, C Olsson, D Anderson, I Goodfellow
ICML 2019 (Long Talk)
Slides
We apply the notions of coverage-guided-fuzzing and property-based-testing to neural networks. We show that approximate-nearest-neighbors algorithms can give useful coverage metrics in this context. Source code is available here.
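
In outline, the fuzzing loop looks like the sketch below (names illustrative; brute-force nearest neighbors shown in place of an approximate index).

```python
import numpy as np

def fuzz(model_activations, check_property, seed_inputs, n_iters=10000, tau=1.0):
    """Sketch of coverage-guided fuzzing for neural networks.

    model_activations: maps an input to an activation vector ('coverage').
    check_property: returns False if the input triggers a violation (e.g. NaNs).
    An input counts as new coverage if its activations are far from everything
    seen so far; approximate nearest neighbors makes this check fast.
    """
    corpus = list(seed_inputs)
    coverage = [model_activations(x) for x in corpus]
    for _ in range(n_iters):
        parent = corpus[np.random.randint(len(corpus))]
        child = parent + 0.01 * np.random.randn(*parent.shape)  # mutate
        if not check_property(child):
            return child                    # found a violating input
        act = model_activations(child)
        dists = [np.linalg.norm(act - c) for c in coverage]
        if min(dists) > tau:                # new coverage: keep the input
            corpus.append(child)
            coverage.append(act)
    return None
```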

Learning to Represent Programs with Property Signatures
A Odena, C Sutton
ICLR 2020
Video
We introduce the notion of property signatures, a representation for programs and program specifications meant for consumption by machine learning algorithms.
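
A toy sketch, with made-up example properties for list-to-list programs:

```python
def property_signature(io_examples, properties):
    """Sketch of a property signature: each property maps an (input, output)
    example to a boolean; over a set of examples it is summarized as
    'AllTrue', 'AllFalse', or 'Mixed'. The signature (one summary per
    property) is a fixed-size representation of a program or specification.
    """
    signature = []
    for prop in properties:
        results = {prop(i, o) for i, o in io_examples}
        signature.append("AllTrue" if results == {True}
                         else "AllFalse" if results == {False}
                         else "Mixed")
    return signature

# Illustrative properties for programs from lists to lists.
properties = [
    lambda i, o: len(o) <= len(i),        # output no longer than input
    lambda i, o: all(x in i for x in o),  # output elements drawn from input
]
print(property_signature([([3, 1, 2], [1, 2, 3])], properties))
```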

Faster Asynchronous SGD
A Odena
Workshop on Optimization Methods for the Next Generation of Machine Learning (ICML 2016)
I speed up asynchronous SGD by quantifying gradient update staleness in terms of moving averages of gradient statistics.
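
As a hedged illustration of the idea (the weighting rule below is mine for exposition, not the paper's exact formula): a delayed gradient is compared against a moving average of recent gradients, and updates that disagree with it are damped.

```python
import numpy as np

def staleness_weighted_update(w, grad, grad_ema, lr=0.1, beta=0.9):
    """Illustrative staleness-aware async-SGD step: instead of counting how
    many steps old a delayed gradient is, compare it with a moving average
    of recent gradients and downweight updates that contradict it.
    """
    grad_ema = beta * grad_ema + (1 - beta) * grad
    cos = np.dot(grad, grad_ema) / (
        np.linalg.norm(grad) * np.linalg.norm(grad_ema) + 1e-8)
    scale = max(cos, 0.0)   # stale, contradictory gradients get damped
    w = w - lr * scale * grad
    return w, grad_ema
```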

Miscellaneous Machine Learning

Deconvolution and checkerboard artifacts
A Odena, V Dumoulin, C Olah
Distill
We show that the ubiquitous "deconvolution" operation used in image upsampling produces strange checkerboard artifacts. We then propose a simple fix.
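
The fix is to resize first and then apply an ordinary convolution, rather than using a strided transposed convolution; this avoids the uneven kernel overlap that causes the artifacts. A single-channel NumPy sketch:

```python
import numpy as np

def resize_conv_upsample(x, kernel):
    """Sketch of resize-then-convolve upsampling.

    x: (H, W) feature map; kernel: (k, k) filter with k odd, 'same' padding.
    Nearest-neighbor resize followed by a plain convolution gives every
    output pixel the same number of kernel contributions.
    """
    up = x.repeat(2, axis=0).repeat(2, axis=1)  # nearest-neighbor resize
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(up, pad, mode="edge")
    out = np.zeros_like(up)
    H, W = up.shape
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(padded[i:i + k, j:j + k] * kernel)
    return out
```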

Changing Model Behavior at Test-Time Using Reinforcement Learning
A Odena, D Lawson, C Olah
ICLR 2017 (Workshop Track)
I show how to change the test-time resource usage of neural networks on a per-input basis using reinforcement learning.