This repository contains the code for the paper "Combining Query Performance Predictors: A Reproducibility Study" (published at ECIR 2025).
- As implementations of the original pre-retrieval QPP methods were not provided by Hauff et al. (ECIR 2009), we have implemented them here, specifically AvgIDF, MaxIDF, SumSCQ, AvgSCQ, MaxSCQ, SumVAR, AvgVAR, MaxVAR, AvP, and AvNP. The implementations are available in the PreRetQPP directory.
  - MaxIDF / AvgIDF
  - SumSCQ / AvSCQ / MaxSCQ
  - Ambiguity-based (AvP, AvNP)
  - Ranking Sensitivity-based (SumVAR, AvVAR, MaxVAR)
- For the post-retrieval QPP methods, we utilized code provided by the original authors when available. For methods lacking a readily available codebase, we implemented them independently. Below are the links to all 10 methods used in this project:
  - WIG (Weighted Information Gain) (unofficial implementation)
  - NQC (Normalized Query Commitment) (unofficial implementation)
  - Clarity (unofficial implementation)
  - UEF (Utility Estimation Framework) (unofficial implementation)
  - NeuralQPP
    - available in the NeuralQPP directory
  - Deep-QPP (from the author's repository)
  - qppBERT-PL (from the author's repository)
  - BERT-QPP (from the author's repository)
These repositories serve as resources for implementations of various QPP methods, some of which may require adaptation to integrate into the project’s specific framework.
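To give a rough sense of what the simpler predictors listed above compute, here is a minimal sketch of AvgIDF/MaxIDF, the SCQ family, and NQC. It is not the code in the PreRetQPP directory or in the linked repositories. The function names, the toy statistics, the plain `log(N / df)` form of IDF, and the choice to average over in-collection terms only are assumptions made for illustration; SCQ and NQC follow their commonly cited definitions.

```python
import math


def idf_predictors(query_terms, doc_freq, num_docs):
    """Return (AvgIDF, MaxIDF) over the query terms found in the collection."""
    idfs = [
        math.log(num_docs / doc_freq[t])
        for t in query_terms
        if doc_freq.get(t, 0) > 0
    ]
    if not idfs:
        return 0.0, 0.0
    return sum(idfs) / len(idfs), max(idfs)


def scq_predictors(query_terms, doc_freq, coll_freq, num_docs):
    """Return (SumSCQ, AvgSCQ, MaxSCQ), with SCQ(t) = (1 + ln cf) * ln(1 + N / df)."""
    scqs = [
        (1.0 + math.log(coll_freq[t])) * math.log(1.0 + num_docs / doc_freq[t])
        for t in query_terms
        if doc_freq.get(t, 0) > 0 and coll_freq.get(t, 0) > 0
    ]
    if not scqs:
        return 0.0, 0.0, 0.0
    total = sum(scqs)
    # Assumption: the average is taken over terms that occur in the collection.
    return total, total / len(scqs), max(scqs)


def nqc(retrieval_scores, corpus_score, k=100):
    """Standard deviation of the top-k retrieval scores, normalized by the corpus score."""
    top = sorted(retrieval_scores, reverse=True)[:k]
    if not top or corpus_score == 0:
        return 0.0
    mean = sum(top) / len(top)
    variance = sum((s - mean) ** 2 for s in top) / len(top)
    return math.sqrt(variance) / abs(corpus_score)


# Toy statistics, invented purely for illustration.
NUM_DOCS = 1000
DOC_FREQ = {"query": 100, "performance": 10}
COLL_FREQ = {"query": 250, "performance": 12}
QUERY = ["query", "performance"]

print(idf_predictors(QUERY, DOC_FREQ, NUM_DOCS))
print(scq_predictors(QUERY, DOC_FREQ, COLL_FREQ, NUM_DOCS))
print(nqc([12.4, 11.9, 9.3, 8.8, 8.1], corpus_score=-6.0, k=5))
```

In practice the document and collection frequencies would come from the index of the target collection, and the NQC cut-off `k` is a tunable parameter rather than a fixed value.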
Pre-computed QPP and AP values can be found in the data folder.
To run different (penalized) regression methods and sampling strategies on three collections — TREC Robust, TREC DL 2019 & 2020, and ClueWeb09B — use the following instructions.
To run the leave-one-out approaches used by Hauff et al.:

    python3 leave-one-out.py --k 1000 --input data --qpp_type pre --dataset trec678 --ols_type ols

To run lars-traps with leave-one-out:

    python3 lars-traps.py --k 1000 --input data --qpp_type pre --dataset trec678

To run bolasso with leave-one-out:

    python3 bolasso.py --k 1000 --input data --qpp_type pre --dataset trec678

To run lars-traps with half-split:

    python3 lars-traps-split-half.py --k 1000 --input data --qpp_type pre --dataset trec678rb

To run bolasso with half-split:

    python3 bolasso-split-half.py --k 1000 --input data --qpp_type post --dataset trec678rb

To compute sMARE:

    python3 smare <path_to_csv_file>

To compute correlation metrics:

    python3 compute-correlation.py <path_csv_file> <retrieval_depth (1000 / 100)>

To fit a linear regression with an individual predictor (half-split):

    python3 indv-predictor-regr-half-split.py --k 1000 --input data --qpp_type pre --dataset trec678rb --ols_type ols

To run multiple regression with half-split:

    python3 multiple-regression-half-split.py --k 1000 --input data --qpp_type pre --dataset trec678rb --ols_type ols

@InProceedings{10.1007/978-3-031-88717-8_9,
author="Saha, Sourav
and Datta, Suchana
and Roy, Dwaipayan
and Mitra, Mandar
and Greene, Derek",
title="Combining Query Performance Predictors: A Reproducibility Study",
booktitle="Advances in Information Retrieval",
year="2025",
publisher="Springer Nature Switzerland",
address="Cham",
pages="112--129",
isbn="978-3-031-88717-8"
}
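
For readers who want to see the evaluation protocol in miniature, the sketch below combines two predictors with ordinary least squares under leave-one-out, then scores the predicted AP values with sMARE and Kendall's tau. It is not the repository's `leave-one-out.py`, sMARE script, or `compute-correlation.py`. The function names, the toy data, the pure-Python normal-equation solver, and the choice of Kendall's tau (tau-a) as the correlation measure are all assumptions; sMARE follows its usual definition as the mean absolute rank difference scaled by the number of queries.

```python
def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(x) for x in X]  # prepend an intercept column
    p = len(rows[0])
    # Augmented matrix [A^T A | A^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; OLS is not unique")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


def predict(coef, x):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


def leave_one_out_predictions(X, y):
    """Predict each query's AP from a model trained on all other queries."""
    preds = []
    for i in range(len(X)):
        coef = fit_ols(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        preds.append(predict(coef, X[i]))
    return preds


def ranks(values):
    """1-based ranks; the highest value gets rank 1, ties broken by position."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def smare(true_ap, predicted_ap):
    """Scaled mean absolute rank error between true and predicted AP."""
    n = len(true_ap)
    diffs = zip(ranks(true_ap), ranks(predicted_ap))
    return sum(abs(a - b) for a, b in diffs) / (n * n)


def kendall_tau(a, b):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(a)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (a[i] - a[j]) * (b[i] - b[j])
            score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


# Toy data, invented for illustration: two predictor values per query and its AP.
QPP_VALUES = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 7.0], [6.0, 5.0]]
AP_VALUES = [0.20, 0.18, 0.35, 0.33, 0.52, 0.47]

loo_predictions = leave_one_out_predictions(QPP_VALUES, AP_VALUES)
print("sMARE:", smare(AP_VALUES, loo_predictions))
print("Kendall's tau:", kendall_tau(AP_VALUES, loo_predictions))
```

Lower sMARE and higher tau both indicate that the combined predictor ranks queries closer to their true AP ordering. A half-split protocol would replace the leave-one-out loop with repeated random train/test partitions.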