Building High-Quality Datasets for Information Retrieval Evaluation at a Reduced Cost
Information Retrieval is no longer exclusively about document ranking. New tasks are continuously proposed in this and sibling fields. With this proliferation of tasks, it becomes crucial to have a cheap way of constructing test collections to evaluate the new developments. Building test collecti...
Main Authors: | David Otero, Daniel Valcarce, Javier Parapar, Álvaro Barreiro |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2019-08-01 |
Series: | Proceedings |
Subjects: | information retrieval; evaluation; datasets; cost |
Online Access: | https://www.mdpi.com/2504-3900/21/1/33 |
id |
doaj-038130998ecf4debbf707f0e9150ceb1 |
---|---|
record_format |
Article |
spelling |
doaj-038130998ecf4debbf707f0e9150ceb1; 2020-11-24T21:38:51Z; eng; MDPI AG; Proceedings; 2504-3900; 2019-08-01; 21; 1; 33; 10.3390/proceedings2019021033; proceedings2019021033; Building High-Quality Datasets for Information Retrieval Evaluation at a Reduced Cost; David Otero, Daniel Valcarce, Javier Parapar, Álvaro Barreiro (all: Information Retrieval Lab, Centro de Investigación en Tecnoloxías da Información e as Comunicacións (CITIC), Universidade da Coruña, 15071 A Coruña, Spain); Information Retrieval is no longer exclusively about document ranking. New tasks are continuously proposed in this and sibling fields. With this proliferation of tasks, it becomes crucial to have a cheap way of constructing test collections to evaluate the new developments. Building test collections is time- and resource-consuming: it takes time to obtain the documents and to define the user needs, and it requires the assessors to judge a lot of documents. To reduce the latter, pooling strategies aim to decrease the assessment effort by presenting to the assessors a sample of documents in the corpus with the maximum number of relevant documents in it. In this paper, we propose the preliminary design of different techniques to easily and cheaply build high-quality test collections without the need for participant systems.; https://www.mdpi.com/2504-3900/21/1/33; information retrieval; evaluation; datasets; cost |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
David Otero Daniel Valcarce Javier Parapar Álvaro Barreiro |
author_facet |
David Otero Daniel Valcarce Javier Parapar Álvaro Barreiro |
author_sort |
David Otero |
title |
Building High-Quality Datasets for Information Retrieval Evaluation at a Reduced Cost |
title_sort |
building high-quality datasets for information retrieval evaluation at a reduced cost |
publisher |
MDPI AG |
series |
Proceedings |
issn |
2504-3900 |
publishDate |
2019-08-01 |
description |
Information Retrieval is no longer exclusively about document ranking. New tasks are continuously proposed in this and sibling fields. With this proliferation of tasks, it becomes crucial to have a cheap way of constructing test collections to evaluate the new developments. Building test collections is time- and resource-consuming: it takes time to obtain the documents and to define the user needs, and it requires the assessors to judge a lot of documents. To reduce the latter, pooling strategies aim to decrease the assessment effort by presenting to the assessors a sample of documents in the corpus with the maximum number of relevant documents in it. In this paper, we propose the preliminary design of different techniques to easily and cheaply build high-quality test collections without the need for participant systems. |
topic |
information retrieval; evaluation; datasets; cost |
url |
https://www.mdpi.com/2504-3900/21/1/33 |