Pseudo Relevance Feedback with Deep Language Models and Dense Retrievers: Successes and Pitfalls

Pseudo Relevance Feedback (PRF) is known to improve the effectiveness of bag-of-words retrievers. At the same time, deep language models have been shown to outperform traditional bag-of-words rerankers. However, it is unclear how to integrate PRF directly with emergent deep language models. This art...

Bibliographic Details
Main Authors:	Koopman, B. (Author), Li, H. (Author), Mourad, A. (Author), Zhuang, S. (Author), Zuccon, G. (Author)
Format:	Article
Language:	English
Published:	Association for Computing Machinery 2023
Subjects:	Bag of words; BERT; Computational linguistics; Dense retriever; dense retrievers; Empirical evaluations; Feedback approach; Feedback signal; Information retrieval; Language model; Pre-trained language model for information retrieval; pre-trained language models for information retrieval; Pseudo relevance feedback; Pseudo-relevance feedbacks; Relevance feedback method
Online Access:	View Fulltext in Publisher (https://doi.org/10.1145/3570724); View in Scopus


LEADER	03104nam a2200373Ia 4500
001	10.1145-3570724
008	230529s2023 CNT 000 0 und d
020			\|a 10468188 (ISSN)
245	1	0	\|a Pseudo Relevance Feedback with Deep Language Models and Dense Retrievers: Successes and Pitfalls
260		0	\|b Association for Computing Machinery \|c 2023
856			\|z View Fulltext in Publisher \|u https://doi.org/10.1145/3570724
856			\|z View in Scopus \|u https://www.scopus.com/inward/record.uri?eid=2-s2.0-85159646060&doi=10.1145%2f3570724&partnerID=40&md5=b7c87e3f4df8c2a8a8493579a299aa7c
520	3		\|a Pseudo Relevance Feedback (PRF) is known to improve the effectiveness of bag-of-words retrievers. At the same time, deep language models have been shown to outperform traditional bag-of-words rerankers. However, it is unclear how to integrate PRF directly with emergent deep language models. This article addresses this gap by investigating methods for integrating PRF signals with rerankers and dense retrievers based on deep language models. We consider text-based, vector-based and hybrid PRF approaches and investigate different ways of combining and scoring relevance signals. An extensive empirical evaluation was conducted across four different datasets and two task settings (retrieval and ranking). Text-based PRF results show that the use of PRF had a mixed effect on deep rerankers across different datasets. We found that the best effectiveness was achieved when (i) directly concatenating each PRF passage with the query, searching with the new set of queries, and then aggregating the scores; (ii) using Borda to aggregate scores from PRF runs. Vector-based PRF results show that the use of PRF enhanced the effectiveness of deep rerankers and dense retrievers over several evaluation metrics. We found that higher effectiveness was achieved when (i) the query retains either the majority or the same weight within the PRF mechanism, and (ii) a shallower PRF signal (i.e., a smaller number of top-ranked passages) was employed, rather than a deeper signal. Our vector-based PRF method is computationally efficient; thus, this represents a general PRF method others can use with deep rerankers and dense retrievers. © 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
650	0	4	\|a Bag of words
650	0	4	\|a BERT
650	0	4	\|a Computational linguistics
650	0	4	\|a Dense retriever
650	0	4	\|a dense retrievers
650	0	4	\|a Empirical evaluations
650	0	4	\|a Feedback approach
650	0	4	\|a Feedback signal
650	0	4	\|a Information retrieval
650	0	4	\|a Language model
650	0	4	\|a Pre-trained language model for information retrieval
650	0	4	\|a pre-trained language models for information retrieval
650	0	4	\|a Pseudo relevance feedback
650	0	4	\|a Pseudo-relevance feedbacks
650	0	4	\|a Relevance feedback method
700	1	0	\|a Koopman, B. \|e author
700	1	0	\|a Li, H. \|e author
700	1	0	\|a Mourad, A. \|e author
700	1	0	\|a Zhuang, S. \|e author
700	1	0	\|a Zuccon, G. \|e author
773			\|t ACM Transactions on Information Systems