Machine Learning for Dissimulating Reality

In the last decade, advances in statistical modeling and computer science have boosted the production of machine-generated content in different fields: from language to image generation, the quality of the generated outputs is remarkably high, sometimes better than that of content produced by a human being. M...

Bibliographic Details
Main Author:	Andrea Giussani
Format:	Article
Language:	English
Published:	MDPI AG 2021-04-01
Series:	Proceedings
Subjects:	machine learning; natural language processing; supervised learning; classification task
Online Access:	https://www.mdpi.com/2504-3900/77/1/17

id	doaj-0e4626c0783146199a2aae94a8714a4d
record_format	Article
spelling	doaj-0e4626c0783146199a2aae94a8714a4d | 2021-04-27T23:01:10Z | eng | MDPI AG | Proceedings | ISSN 2504-3900 | 2021-04-01 | vol. 77, no. 1, p. 17 | doi:10.3390/proceedings2021077017 | Machine Learning for Dissimulating Reality | Andrea Giussani | Department of Decision Sciences and Bocconi Institute for Data Science and Analytics, Bocconi University, 20136 Milan, Italy
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Andrea Giussani
title	Machine Learning for Dissimulating Reality
publisher	MDPI AG
series	Proceedings
issn	2504-3900
publishDate	2021-04-01
description	In the last decade, advances in statistical modeling and computer science have boosted the production of machine-generated content in different fields: from language to image generation, the quality of the generated outputs is remarkably high, sometimes better than that of content produced by a human being. Modern technological advances such as OpenAI’s GPT-2 (and recently GPT-3) permit automated systems to dramatically alter reality with synthetic outputs so that humans are not able to distinguish the real copy from its counterfeits. An example is given by an article entirely written by GPT-2, but many other examples exist. In the field of computer vision, Nvidia’s Generative Adversarial Network, commonly known as StyleGAN (Karras et al. 2018), has become the de facto reference point for the production of a huge number of fake human face portraits; additionally, recent algorithms were developed to create both musical scores and mathematical formulas. This presentation aims to introduce participants to the state-of-the-art results in this field: we will cover both GANs and language modeling with recent applications. The novelty here is that we apply a transformer-based machine learning technique, namely RoBERTa (Liu et al. 2019), to distinguishing human-produced from machine-produced text in the context of fake news detection. RoBERTa is a recent algorithm that is based on the well-known Bidirectional Encoder Representations from Transformers algorithm, known as BERT (Devlin et al. 2018); this is a bi-directional transformer used for natural language processing developed by Google and pre-trained over a huge amount of unlabeled textual data to learn embeddings. We will then use these representations as an input of our classifier to detect real vs. machine-produced text. The application is demonstrated in the presentation.
topic	machine learning; natural language processing; supervised learning; classification task
url	https://www.mdpi.com/2504-3900/77/1/17
work_keys_str_mv	AT andreagiussani machinelearningfordissimulatingreality
_version_	1721505501121871872
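
The abstract describes feeding text representations into a classifier that separates human-written from machine-produced text. The record gives no implementation details, so the following is only a minimal sketch of that general shape, under loud assumptions: the `embed` function is a stand-in (normalised token counts) and NOT RoBERTa, the classifier is a plain logistic regression, and the toy sentences and all function names are invented for illustration.

```python
import math


def build_vocab(texts):
    """Map every token seen in the training texts to a feature index."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def embed(text, vocab):
    """Stand-in for a pretrained encoder: L2-normalised token counts.

    A real pipeline would return RoBERTa representations here instead.
    Tokens outside the vocabulary are ignored.
    """
    vec = [0.0] * len(vocab)
    for token in text.lower().split():
        if token in vocab:
            vec[vocab[token]] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(features, labels, epochs=500, lr=0.5):
    """Fit logistic regression by plain batch gradient descent."""
    dim, n = len(features[0]), len(features)
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            grad_b += err
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_proba(text, vocab, weights, bias):
    """Probability that `text` is machine-produced (label 1)."""
    x = embed(text, vocab)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Invented toy data: label 0 = human-written, 1 = machine-produced.
TRAIN = [
    ("council approved new budget after long debate", 0),
    ("farmers reported strong harvest despite dry summer", 0),
    ("unicorns valley valley discovered speaking speaking perfect english", 1),
    ("report said said results results were were remarkable", 1),
]
texts = [t for t, _ in TRAIN]
labels = [y for _, y in TRAIN]
vocab = build_vocab(texts)
weights, bias = train([embed(t, vocab) for t in texts], labels)
```

Replacing `embed` with a function that returns encoder representations would leave the classifier code unchanged, which is the point of the two-stage design the abstract outlines.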

Similar Items