MIDAS: Model-Independent Training Data Selection Under Cost Constraints

In general, as the amount of training data is increased, a deep learning model gains a higher training accuracy. To assign labels to training data for use in supervised learning, human resources are required, which incur temporal and economic costs. Therefore, if a sufficient amount of training data...

Bibliographic Details
Published in: IEEE Access
Main Authors: Gyoungdon Joo, Chulyun Kim
Format: Article
Language: English
Published: IEEE 2018-01-01
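Subjects: Training data selection; cost constraints; model-independent; deep learning; active learning; machine learning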
Online Access: https://ieeexplore.ieee.org/document/8540354/
author Gyoungdon Joo
Chulyun Kim
collection DOAJ
container_title IEEE Access
description In general, as the amount of training data is increased, a deep learning model gains a higher training accuracy. To assign labels to training data for use in supervised learning, human resources are required, which incur temporal and economic costs. Therefore, if a sufficient amount of training data cannot be constructed owing to existing cost constraints, it becomes necessary to select the training data that can maximize the accuracy of the deep learning model with only a limited amount of training data. However, although conventional studies on such training data selections take into consideration the training data labeling cost, the selection cost required in the training data selection process is not taken into consideration, which is a problem. Therefore, with the consideration of the selection cost constraint in addition to the data labeling cost constraint, we introduce a training data selection problem and propose novel algorithms to solve it. The advantage of the proposed algorithms is that they can be applied to any network model or data model of deep learning. The performance was verified through experiments using various network models and data.
format Article
id doaj-art-35242510b6bd4fcc8c0f0a7eaa4fdf8e
institution Directory of Open Access Journals
issn 2169-3536
language English
publishDate 2018-01-01
publisher IEEE
record_format Article
spelling doaj-art-35242510b6bd4fcc8c0f0a7eaa4fdf8e
2025-08-19T21:13:54Z
eng
IEEE
IEEE Access
2169-3536
2018-01-01
vol. 6
pp. 74462-74474
doi:10.1109/ACCESS.2018.2882269
8540354
MIDAS: Model-Independent Training Data Selection Under Cost Constraints
Gyoungdon Joo (0)
Chulyun Kim (1)
https://orcid.org/0000-0002-6471-5334
(0) National Association of Cognitive Science Industries, Institute of Cognitive Intelligence, Seoul, South Korea
(1) Sookmyung Women’s University, Seoul, South Korea
https://ieeexplore.ieee.org/document/8540354/
Training data selection
cost constraints
model-independent
deep learning
active learning
machine learning
title MIDAS: Model-Independent Training Data Selection Under Cost Constraints
topic Training data selection
cost constraints
model-independent
deep learning
active learning
machine learning
url https://ieeexplore.ieee.org/document/8540354/