A Study on Generative Adversarial Networks Exacerbating Social Data Bias

abstract: Generative Adversarial Networks are designed, in theory, to replicate the distribution of the data they are trained on. Under real-world limitations, such as finite network capacity and training set size, they inevitably suffer an as-yet-unavoidable technical failure: mode collapse. GAN-generated data is not nearly as diverse as the real-world data the network is trained on; this work shows that the effect is especially drastic when the training data is highly non-uniform. Specifically, GANs learn to exacerbate the social biases that exist in the training set along sensitive axes such as gender and race. In an age when many datasets are curated from web and social media data (which are almost never balanced), this has dangerous implications for downstream tasks that use GAN-generated synthetic data, such as data augmentation for classification. This thesis presents an empirical demonstration of this phenomenon and illustrates its real-world ramifications. It starts by showing that when asked to sample images from an illustrative dataset of engineering faculty headshots from 47 U.S. universities, unfortunately skewed toward white males, a DCGAN's generator "imagines" faces with light skin colors and masculine features. In addition, this work verifies that the generated distribution diverges more from the real-world distribution when the training data is non-uniform than when it is uniform. This work also shows that a conditional variant of GAN is not immune to exacerbating sensitive social biases. Finally, this work contributes a preliminary case study on Snapchat's explosively popular GAN-enabled "My Twin" selfie lens, which consistently lightens the skin tone of women of color in an attempt to make faces more feminine. The results and discussion of the study are meant to caution machine learning practitioners who may unwittingly increase the biases in their applications.
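To make the divergence claim concrete, here is a minimal sketch of comparing a sensitive-attribute distribution in training data against GAN output. It is illustrative only: the two-class attribute, the proportions, and the choice of KL divergence as the measure are assumptions for the example, not details taken from the thesis.

# Illustrative sketch (hypothetical numbers, not from the thesis):
# compare the distribution of a sensitive attribute in real training
# data against GAN-generated samples via KL divergence.
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) for two discrete distributions given as probability arrays.
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

# Hypothetical attribute proportions over two classes [majority, minority].
real_skewed = [0.80, 0.20]   # non-uniform training set
gan_skewed  = [0.93, 0.07]   # mode collapse amplifies the majority mode

real_uniform = [0.50, 0.50]  # balanced training set
gan_uniform  = [0.55, 0.45]  # generated distribution stays closer to real

print(kl_divergence(gan_skewed, real_skewed))    # ~0.067, larger divergence
print(kl_divergence(gan_uniform, real_uniform))  # ~0.005, smaller divergence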


Bibliographic Details
Other Authors: Jain, Niharika (Author)
Format: Dissertation
Language: English
Published: 2020
Subjects:
Online Access:http://hdl.handle.net/2286/R.I.57433
id ndltd-asu.edu-item-57433
record_format oai_dc
spelling ndltd-asu.edu-item-57433 2020-06-02T03:01:32Z
contributors Jain, Niharika (Author); Kambhampati, Subbarao (Advisor); Liu, Huan (Committee member); Manikonda, Lydia (Committee member); Arizona State University (Publisher)
language eng
physical 53 pages
type Masters Thesis, Computer Science, 2020
url http://hdl.handle.net/2286/R.I.57433
rights http://rightsstatements.org/vocab/InC/1.0/
collection NDLTD
language English
format Dissertation
sources NDLTD
topic Artificial intelligence
Computer science
Ethics
bias
data augmentation
generative adversarial network
machine learning
society
author2 Jain, Niharika (Author)
author_facet Jain, Niharika (Author)
title A Study on Generative Adversarial Networks Exacerbating Social Data Bias
title_sort study on generative adversarial networks exacerbating social data bias
publishDate 2020
url http://hdl.handle.net/2286/R.I.57433
_version_ 1719315880357134336