Effective Data Augmentation With Diffusion Models


Brandon Trabucco ¹ , Kyle Doherty ² , Max Gurinas ³ , Ruslan Salakhutdinov ¹

¹ Carnegie Mellon University , ² MPG Ranch , ³ University of Chicago Laboratory Schools

Paper: arXiv | Code: GitHub

Abstract

Data augmentation is one of the most prevalent tools in deep learning, underpinning many recent advances in classification, generative modeling, and representation learning. The standard approach to data augmentation combines simple transformations like rotations and flips to generate new images from existing ones. However, these new images lack diversity along key semantic axes present in the data. Current augmentations cannot alter high-level semantic attributes, such as the animal species present in a scene, to enhance the diversity of data. We address the lack of diversity in data augmentation with image-to-image transformations parameterized by pre-trained text-to-image diffusion models. Our method edits images to change their semantics using an off-the-shelf diffusion model, and generalizes to novel visual concepts from a few labelled examples. We evaluate our approach on few-shot image classification tasks and on a real-world weed recognition task, and observe an improvement in accuracy in the tested domains.

Our augmentation adapts to the images in your datasets by learning pseudo-prompts <y> for each class.
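As a rough illustration of what a per-class pseudo-prompt could look like, the sketch below maps each class name to a placeholder token and drops it into a text prompt. The token format, the `build_prompt` helper, and the "a photo of a {}" template are assumptions made for this example and are not taken from the released code; learning the embedding behind each token is a separate fine-tuning step that is not shown here.

```python
def pseudo_token(class_name: str) -> str:
    """Return a hypothetical placeholder token such as <fire_lily> for a class."""
    return "<" + class_name.strip().lower().replace(" ", "_") + ">"


def build_prompt(class_name: str, template: str = "a photo of a {}") -> str:
    """Insert the class's placeholder token into a prompt template (assumed format)."""
    return template.format(pseudo_token(class_name))


# One prompt per class; each token would get its own learned embedding.
classes = ["fire lily", "leafy spurge"]
prompts = {name: build_prompt(name) for name in classes}
```

Keeping one token per class means a concept the diffusion model has never seen can still be referenced in a prompt once its embedding has been fitted to the few labelled examples.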

We generate augmentations using the structural layout of real images as a guide.
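One common way to use a real image as a guide is to add noise to it up to an intermediate diffusion timestep and denoise from there, so that coarse layout survives while details are regenerated. The page does not spell out the exact mechanism, so the sketch below is only a toy, pixel-space illustration of that idea under stated assumptions: a linear beta schedule, a hypothetical `strength` parameter, and no actual denoising model.

```python
import math
import random


def linear_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) under an assumed linear beta schedule."""
    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def start_step(strength, num_steps):
    """Map a strength in [0, 1] to the timestep index where denoising would begin."""
    return min(num_steps - 1, max(0, int(strength * num_steps) - 1))


def noise_to_step(pixels, t, alpha_bar, rng):
    """Forward-diffuse pixel values (floats in [-1, 1]) to timestep t."""
    signal = math.sqrt(alpha_bar[t])       # fraction of the real image kept
    sigma = math.sqrt(1.0 - alpha_bar[t])  # scale of the added Gaussian noise
    return [signal * p + sigma * rng.gauss(0.0, 1.0) for p in pixels]


alpha_bar = linear_alpha_bar()
rng = random.Random(0)
image = [0.5] * 16  # stand-in for a flattened real image
noisy = noise_to_step(image, start_step(0.5, 1000), alpha_bar, rng)
```

A lower strength keeps more of the real image's signal at the starting point, so the result stays closer to the original layout; a higher strength leaves more for the generative model to fill in.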

Generations from DA-Fusion preserve the layout of trees, but produce different structural elements.

We see strong performance across seven few-shot classification tasks.

@misc{https://doi.org/10.48550/arxiv.2302.07944,
  doi = {10.48550/ARXIV.2302.07944},
  url = {https://arxiv.org/abs/2302.07944},
  author = {Trabucco, Brandon and Doherty, Kyle and Gurinas, Max and Salakhutdinov, Ruslan},
  keywords = {Computer Vision and Pattern Recognition (cs.CV), Artificial Intelligence (cs.AI), FOS: Computer and information sciences},
  title = {Effective Data Augmentation With Diffusion Models},
  publisher = {arXiv},
  year = {2023},
  copyright = {arXiv.org perpetual, non-exclusive license}
}