Robust Large Margin Approaches for Machine Learning in Adversarial Settings

Bibliographic Details
Main Author: Torkamani, MohamadAli
Other Authors: Lowd, Daniel
Language: en_US
Published: University of Oregon 2016
Subjects:
Online Access:http://hdl.handle.net/1794/20677
id ndltd-uoregon.edu-oai-scholarsbank.uoregon.edu-1794-20677
record_format oai_dc
collection NDLTD
language en_US
sources NDLTD
topic Adversarial machine learning
Convex optimization
Customized regularization
Dropout
Robust machine learning
Robust optimization
spellingShingle Adversarial machine learning
Convex optimization
Customized regularization
Dropout
Robust machine learning
Robust optimization
Torkamani, MohamadAli
Robust Large Margin Approaches for Machine Learning in Adversarial Settings
description Machine learning algorithms are designed to learn from data and to use data to make predictions and perform analyses. Many agencies now use machine learning algorithms to provide services and to perform tasks that used to be done by humans, including making high-stakes decisions. Making the right decision depends heavily on the correctness of the input data, which creates a tempting incentive for adversaries to deceive machine learning algorithms by manipulating the data fed to them. Yet traditional machine learning algorithms are not designed to remain safe when confronted with unexpected inputs. In this dissertation, we address the problem of adversarial machine learning; that is, our goal is to build safe machine learning algorithms that are robust in the presence of noisy or adversarially manipulated data. Many of the complex questions a machine learning system must answer have complex answers: outputs with internal structure and exponentially many possible values. Adversarial machine learning becomes even more challenging when the output to be predicted is itself structured, so a significant focus of this dissertation is adversarial machine learning for structured prediction.

First, we develop a new algorithm that reliably performs collective classification: it jointly assigns labels to the nodes of a graph and is robust to malicious changes an adversary can make to the attributes of individual nodes. The learning method is highly efficient and is formulated as a convex quadratic program. Empirical evaluations confirm that this technique not only secures the prediction algorithm in the presence of an adversary but also generalizes better to future inputs, even when no adversary is present.

While our robust collective classification method is efficient, it does not apply to generic structured prediction problems. Next, we investigate parameter learning for robust structured prediction models. This method constructs regularization functions based on the adversary's limited ability to alter the feature space of the structured prediction algorithm. The proposed regularization techniques secure the algorithm against adversarial data changes at little additional computational cost. We prove that, for large-margin structured prediction, robustness to adversarial manipulation of the data is equivalent to a particular form of regularization, and vice versa, which confirms previous results for simpler problems.

In practice, an ordinary adversary often lacks either the computational power to design the optimal attack or sufficient information about the learner's model to do so. Instead, it applies many random changes to the input in the hope of finding one that succeeds. This implies that minimizing the expected loss under such noise yields robustness against these weaker adversaries. Dropout training resembles exactly this kind of noise injection. Dropout was initially proposed as a regularization technique for neural networks, and the procedure is simple: at each training iteration, randomly selected features are set to zero. We derive a regularization method for large-margin parameter learning based on dropout. Our method computes the expected loss under all possible dropout patterns, which yields a simple objective function that is efficient to optimize. We then extend dropout regularization to non-linear kernels in several directions: we define dropout for the input space, the feature space, and individual input dimensions, and we introduce methods for approximate marginalization over the feature space, even when it is infinite-dimensional. Empirical evaluations show that our techniques consistently outperform the baselines across different datasets.
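A minimal sketch of the dropout-marginalization idea discussed above, not the dissertation's actual derivation: for a linear large-margin model with independent feature dropout at keep probability p (and the usual 1/p rescaling), the margin has an exact mean and variance, and the variance is the data-dependent quantity that a dropout-marginalized objective effectively penalizes. The names w, x, y, and keep_prob below are illustrative assumptions.

import numpy as np

# Hedged sketch: closed-form moments of the margin under feature dropout,
# checked against a Monte Carlo simulation of the dropout masks.

def dropout_margin_moments(w, x, y, keep_prob):
    """Mean and variance of y * w . (m * x / keep_prob), m_i ~ Bernoulli(keep_prob)."""
    mean = y * np.dot(w, x)  # inverted-dropout scaling keeps the margin unbiased
    var = ((1.0 - keep_prob) / keep_prob) * np.sum((w * x) ** 2)
    return mean, var

rng = np.random.default_rng(1)
w = rng.normal(size=8)
x = rng.normal(size=8)
y, keep_prob = -1, 0.5

mean, var = dropout_margin_moments(w, x, y, keep_prob)

# Monte Carlo check of the closed-form moments.
masks = rng.binomial(1, keep_prob, size=(200_000, 8))
margins = y * (masks * x / keep_prob) @ w
print(mean, margins.mean())  # should agree closely
print(var, margins.var())    # should agree closely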
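A second illustrative sketch, again not the dissertation's formulation: the robustness/regularization equivalence mentioned in the abstract can be checked numerically for an unstructured linear classifier, where the worst-case hinge loss under an L2-bounded feature perturbation equals the nominal hinge loss with a norm penalty on the weights added inside the hinge. The names w, x, y, and eps are assumptions for this example.

import numpy as np

# Hedged sketch: worst-case hinge loss under an L2-bounded adversarial
# perturbation of the features, compared with its closed form.

def hinge(w, x, y):
    """Standard hinge loss for a binary label y in {-1, +1}."""
    return max(0.0, 1.0 - y * np.dot(w, x))

def worst_case_hinge(w, x, y, eps):
    """Closed form of max over ||delta||_2 <= eps of hinge(w, x + delta, y)."""
    return max(0.0, 1.0 - y * np.dot(w, x) + eps * np.linalg.norm(w))

def adversarial_perturbation(w, x, y, eps):
    """The maximizing perturbation pushes the score directly against the label."""
    return -eps * y * w / np.linalg.norm(w)

rng = np.random.default_rng(0)
w = rng.normal(size=5)
x = rng.normal(size=5)
y, eps = 1, 0.3

delta = adversarial_perturbation(w, x, y, eps)
print(hinge(w, x + delta, y))          # loss at the optimal attack
print(worst_case_hinge(w, x, y, eps))  # matches the closed form above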
author2 Lowd, Daniel
author_facet Lowd, Daniel
Torkamani, MohamadAli
author Torkamani, MohamadAli
author_sort Torkamani, MohamadAli
title Robust Large Margin Approaches for Machine Learning in Adversarial Settings
title_short Robust Large Margin Approaches for Machine Learning in Adversarial Settings
title_full Robust Large Margin Approaches for Machine Learning in Adversarial Settings
title_fullStr Robust Large Margin Approaches for Machine Learning in Adversarial Settings
title_full_unstemmed Robust Large Margin Approaches for Machine Learning in Adversarial Settings
title_sort robust large margin approaches for machine learning in adversarial settings
publisher University of Oregon
publishDate 2016
url http://hdl.handle.net/1794/20677
work_keys_str_mv AT torkamanimohamadali robustlargemarginapproachesformachinelearninginadversarialsettings
_version_ 1718804355338993664
spelling ndltd-uoregon.edu-oai-scholarsbank.uoregon.edu-1794-20677 2018-12-20T05:48:31Z Robust Large Margin Approaches for Machine Learning in Adversarial Settings Torkamani, MohamadAli Lowd, Daniel Adversarial machine learning Convex optimization Customized regularization Dropout Robust machine learning Robust optimization 2016-11-21T16:55:07Z 2016-11-21T16:55:07Z 2016-11-21 Electronic Thesis or Dissertation http://hdl.handle.net/1794/20677 en_US All Rights Reserved. University of Oregon