Deep Learning Based Robust Text Classification Method via Virtual Adversarial Training

Existing methods for generating adversarial texts usually change the original meaning of the text significantly and can even produce unreadable text. Such poorly readable adversarial texts can successfully fool a machine classifier, but they do not deceive human observers very well. I...


Bibliographic Details
Main Authors: Wei Zhang, Qian Chen, Yunfang Chen
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Adversarial training; model interpretation; text classification; deep learning
Online Access: https://ieeexplore.ieee.org/document/9040544/
id doaj-e0e97d8d32c6427dbaea88fb808db95e
record_format Article
spelling doaj-e0e97d8d32c6427dbaea88fb808db95e
2021-03-30T01:31:16Z
eng
IEEE
IEEE Access
2169-3536
2020-01-01
8
61174
61182
10.1109/ACCESS.2020.2981616
9040544
Deep Learning Based Robust Text Classification Method via Virtual Adversarial Training
Wei Zhang 0 https://orcid.org/0000-0002-1658-0236
Qian Chen 1
Yunfang Chen 2
School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China
School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China
School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China
Existing methods for generating adversarial texts usually change the original meaning of the text significantly and can even produce unreadable text. Such poorly readable adversarial texts can successfully fool a machine classifier, but they do not deceive human observers very well. In this paper, we propose a novel method that generates readable adversarial texts with perturbations that can also confuse human observers. Based on the continuous bag-of-words (CBOW) model, the proposed method searches for appropriate perturbations by controlling the perturbation direction vectors. Meanwhile, we apply adversarial training to regularize the classification model and extend it to semi-supervised tasks with virtual adversarial training. Experiments show that the generated adversarial examples are interpretable and confusing to humans, and that virtual adversarial training effectively improves the robustness of the model.
https://ieeexplore.ieee.org/document/9040544/
Adversarial training
model interpretation
text classification
deep learning
collection DOAJ
language English
format Article
sources DOAJ
author Wei Zhang
Qian Chen
Yunfang Chen
title Deep Learning Based Robust Text Classification Method via Virtual Adversarial Training
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description Existing methods for generating adversarial texts usually change the original meaning of the text significantly and can even produce unreadable text. Such poorly readable adversarial texts can successfully fool a machine classifier, but they do not deceive human observers very well. In this paper, we propose a novel method that generates readable adversarial texts with perturbations that can also confuse human observers. Based on the continuous bag-of-words (CBOW) model, the proposed method searches for appropriate perturbations by controlling the perturbation direction vectors. Meanwhile, we apply adversarial training to regularize the classification model and extend it to semi-supervised tasks with virtual adversarial training. Experiments show that the generated adversarial examples are interpretable and confusing to humans, and that virtual adversarial training effectively improves the robustness of the model.
topic Adversarial training
model interpretation
text classification
deep learning
url https://ieeexplore.ieee.org/document/9040544/