A Metamorphic Testing Approach for Assessing Question Answering Systems

Question Answering (QA) enables the machine to understand and answer questions posed in natural language, which has emerged as a powerful tool in various domains. However, QA is a challenging task and there is an increasing concern about its quality. In this paper, we propose to apply the technique...

Bibliographic Details
Main Authors: Kaiyi Tu, Mingyue Jiang, Zuohua Ding
Format: Article
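Language: English
Published: MDPI AG 2021-03-01
Series: Mathematics
Subjects: textual question answering; visual question answering; metamorphic testing; metamorphic relations; quality assessment
Online Access: https://www.mdpi.com/2227-7390/9/7/726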
id doaj-513c9e8cd2a54023afe35034d9046ba2
record_format Article
spelling doaj-513c9e8cd2a54023afe35034d9046ba2
2021-03-28T23:00:10Z
eng
MDPI AG
Mathematics, ISSN 2227-7390
2021-03-01, volume 9, issue 7, article 726
DOI 10.3390/math9070726
A Metamorphic Testing Approach for Assessing Question Answering Systems
Kaiyi Tu, Mingyue Jiang, Zuohua Ding (all: School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China)
https://www.mdpi.com/2227-7390/9/7/726
textual question answering; visual question answering; metamorphic testing; metamorphic relations; quality assessment
collection DOAJ
language English
format Article
sources DOAJ
author Kaiyi Tu
Mingyue Jiang
Zuohua Ding
title A Metamorphic Testing Approach for Assessing Question Answering Systems
publisher MDPI AG
series Mathematics
issn 2227-7390
publishDate 2021-03-01
description Question Answering (QA) enables the machine to understand and answer questions posed in natural language, which has emerged as a powerful tool in various domains. However, QA is a challenging task and there is an increasing concern about its quality. In this paper, we propose to apply the technique of metamorphic testing (MT) to evaluate QA systems from the users’ perspectives, in order to help the users to better understand the capabilities of these systems and then to select appropriate QA systems for their specific needs. Two typical categories of QA systems, namely, the textual QA (TQA) and visual QA (VQA), are studied, and a total number of 17 metamorphic relations (MRs) are identified for them. These MRs respectively focus on some characteristics of different aspects of QA. We further apply MT to four QA systems (including two APIs from the AllenNLP platform, one API from the Transformers platform, and one API from CloudCV) by using all of the MRs. Our experimental results demonstrate the capabilities of the four subject QA systems from various aspects, revealing their strengths and weaknesses. These results further suggest that MT can be an effective method for assessing QA systems.
topic textual question answering
visual question answering
metamorphic testing
metamorphic relations
quality assessment
url https://www.mdpi.com/2227-7390/9/7/726