Andrea Pasin

When Reducing Representations Improves Performance

Andrea Pasin, Guglielmo Faggioli, Nicola Ferro, Raffaele Perego, and Nicola Tonellotto

Conference Paper In Advances in Information Retrieval - 48th European Conference on Information Retrieval, ECIR 2026.

Abstract

Neural models have transformed Information Retrieval (IR) by enabling semantic search, representing queries and documents as dense embeddings in latent spaces. However, recent works indicate that the contribution of single dimensions in these representations to ranking quality is uneven: some dimensions are essential, while others may even degrade performance. Dimension IMportance Estimators (DIMEs) are heuristics to guide the search for the subsets of dimensions that induce an optimal subspace where retrieval is more effective. To explore these subspaces, DIMEs rely on two simplifying assumptions: the linearity of subspaces and the independence of dimensions. In this paper, we move a step forward by relaxing the independence assumption and employing genetic algorithms to select the optimal set of dimensions. We show that selecting optimal dimensions for individual queries can achieve up to 0.981 nDCG@10 and 0.831 AP using state-of-the-art dense retrieval models on the considered datasets. Additionally, we identify subsets of dimensions that improve ranking quality across multiple queries simultaneously. Finally, we show that a dataset-specific subset of dimensions enables dense retrieval models to generalize across other datasets without loss of performance.

QuantumCLEF 2025: Overview of the Second Quantum Computing Challenge for Information Retrieval and Recommender Systems at CLEF

Andrea Pasin, Maurizio Ferrari Dacrema, Washington Cunha, Marcos André Gonçalves, Paolo Cremonesi, and Nicola Ferro

Conference Paper In Working Notes of the 16th International Conference of the CLEF Association, CLEF 2025, Madrid, Spain, September 9-12, 2025, Proceedings

Abstract

The emerging field of Quantum Computing (QC) is attracting considerable research interest due to its potential. It is in fact believed that QC could revolutionize the way we approach complex problems by significantly reducing the time required to solve them. Although QC is still in its early stages of development, certain problems can already be addressed using quantum computers, offering a glimpse into its capabilities. The goal of the QuantumCLEF lab is to raise awareness of QC and to design, develop, and evaluate new QC algorithms aimed at solving challenges typically encountered in the implementation of Information Retrieval (IR) and Recommender Systems (RS). Furthermore, the lab provides a valuable opportunity to engage with QC technologies, which are often difficult to access. In this work, we present an overview of the second edition of QuantumCLEF, a lab focused on applying Quantum Annealing (QA), a specific QC paradigm, to three tasks: Feature Selection for IR and RS systems, Instance Selection for IR systems, and Clustering for IR systems. A total of 44 teams registered for the lab, with 5 teams successfully submitting their runs in accordance with the lab guidelines. Given the novelty of the topics, participants were provided with extensive examples and comprehensive materials to help them understand how QA works and how to program quantum annealers.

Overview of QuantumCLEF 2025: The Second Quantum Computing Challenge for Information Retrieval and Recommender Systems at CLEF

Andrea Pasin, Maurizio Ferrari Dacrema, Washington Cunha, Marcos André Gonçalves, Paolo Cremonesi, and Nicola Ferro

Conference Paper In Experimental IR Meets Multilinguality, Multimodality, and Interaction - 16th International Conference of the CLEF Association, CLEF 2025, Madrid, Spain, September 9-12, 2025, Proceedings

Abstract

Quantum Computing (QC) is an emerging research field that is attracting significant interest from the scientific community due to its potential to solve complex problems more efficiently than traditional computers by leveraging the principles of quantum physics. Even though real quantum computers exist, at the moment we are still in the early stages of development of these innovative technologies, and many of their capabilities and limitations are yet to be discovered. In this work, we present an overview of the second edition of QuantumCLEF, a lab that focuses on the application of Quantum Annealing (QA), a specific QC paradigm, for different tasks related to IR and RS. The main objective of the QuantumCLEF lab is to investigate QC, raise awareness, and develop and evaluate new QC algorithms for different applications. This lab represents a great chance for researchers and industry practitioners to understand more about this new field by having access to real quantum computers, which are still not easily accessible nowadays. This edition consisted of three different tasks: Feature Selection for IR and RS systems, Instance Selection for IR systems, and Clustering for IR systems. There have been a total of 44 teams that registered for this lab, and eventually, 5 teams managed to successfully submit their runs following the lab guidelines. Participants have been provided with examples, tutorials, and comprehensive materials due to the novelty of the QC field, allowing them to understand how QA works and how to program quantum annealers.

The KIMERA Infrastructure: Shifting from Evaluation-as-a-Service to Evaluation-in-the-Cloud

Andrea Pasin and Nicola Ferro

Conference Paper In Proceedings of the 15th Italian Information Retrieval Workshop, IIR 2025, Cagliari, Italy, September 3-5, 2025, Proceedings

Abstract

Experimental evaluation plays a key role in Information Retrieval (IR), and Evaluation-as-a-Service (EaaS) was proposed as a viable approach for efficiently running experiments without distributing experimental collections. We now introduce Kubernetes Infrastructure for Managed Evaluation and Resource Access (KIMERA), a cloud-based platform implemented with Kubernetes that advances EaaS toward Evaluation-in-the-Cloud (EitC), enabling researchers to develop and run IR systems directly through a web interface. KIMERA ensures scalability, reproducibility, and fairness across experiments, and it can integrate easy access to external services such as Large Language Models and Quantum Computing via APIs. It supports detailed resource tracking for a comprehensive evaluation of effectiveness and efficiency.

KIMERA: From Evaluation-as-a-Service to Evaluation-in-the-Cloud

Andrea Pasin and Nicola Ferro

Conference Paper In Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2025, Padua, Italy, July 13-18, 2025

Abstract

Experimental evaluation steers the development of Information Retrieval (IR) systems, and large-scale evaluation campaigns provide the field with a common infrastructure to conduct comparable evaluation exercises. Over the years, tools and platforms have been developed to manage and automate these activities, enhance the reproducibility of conducted experiments and facilitate data sharing. In this context, Evaluation-as-a-Service (EaaS) emerged as an approach to avoid distributing experimental collections, which may contain copyrighted or sensitive data, and instead execute containerised code on that data on remote servers. We propose Kubernetes Infrastructure for Managed Evaluation and Resource Access (KIMERA) as the next step from EaaS into Evaluation-in-the-Cloud (EitC), allowing researchers to directly code and execute their systems through their browsers, requiring only an internet connection. Moreover, recent advancements, such as Large Language Models, or new computing paradigms, such as quantum computers, require external third party services and computational resources. In this respect, KIMERA streamlines and simplifies access to such services on-demand via their APIs. More in detail, KIMERA relies on state-of-the-art containerization and orchestration tools, such as Docker and Kubernetes, to provide a robust, scalable, secure, and fault-tolerant IR evaluation platform. KIMERA monitors and stores all the participants' submissions, accurately keeping track of the resource usage, allowing for evaluating both the efficiency and the effectiveness of the deployed methods. Moreover, all participants can be assigned workspaces sharing the same resources (i.e., CPU and RAM), thus enhancing reproducibility and comparability among systems. Finally, KIMERA has been designed with modularity and extensibility in mind, allowing it to be easily adapted to new use cases and usage scenarios. 
KIMERA has been developed and adopted in the context of the QuantumCLEF lab, to allow for mixed experiments, comparing approaches running on traditional hardware and on real quantum annealers provided by external companies. KIMERA has also been used as a learning resource to provide Quantum Computing tutorials for IR at major conferences, such as ECIR and SIGIR. The source code of KIMERA is openly available at https://github.com/MjPaxter/KIMERA.

QuantumCLEF 2025 - The Second Edition of the Quantum Computing Lab at CLEF

Andrea Pasin, Maurizio Ferrari Dacrema, Paolo Cremonesi, Washington Cunha, Marcos André Gonçalves, and Nicola Ferro

Conference Paper In Advances in Information Retrieval - 47th European Conference on Information Retrieval, ECIR 2025, Lucca, Italy, April 6-10, 2025, Proceedings, Part V

Abstract

Over the last few years, Quantum Computing (QC) has captured the attention of numerous researchers from different fields since QC resources have become more applicable in solving practical problems. In the current landscape, Information Retrieval (IR) and Recommender Systems (RS) need to perform computationally intensive operations on massive and heterogeneous datasets. Therefore, it could be possible to use QC technologies such as Quantum Annealing (QA) to boost systems’ performance. The objective of this work is to present the second edition of the QuantumCLEF lab, which is composed of three tasks that aim at discovering and evaluating QA approaches compared to their traditional counterpart while also establishing collaborations among researchers from different fields to harness their knowledge and skills to solve the considered challenges and promote the usage of QA. This lab will allow participants to use real quantum computers provided by CINECA, one of the most important computing centers worldwide.

SEUPD@CLEF: Team Axolotl on Rumor Verification using Evidence from Authorities

Andrea Pasin and Nicola Ferro

Conference Paper In Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024), Grenoble, France, September 9th to 12th.

Abstract

Nowadays, Search Engines (SEs) are technologies that the majority of people employ daily to satisfy information needs. Even though SEs and their underlying algorithms have been improved for several years, many challenges remain unsolved. In this paper, we propose a possible approach to address Task 5 proposed in the CheckThat! Lab at CLEF 2024. The task involves the identification of relevant tweets from a set of authorities that can be used to verify a given rumor expressed in another tweet (i.e., to determine if the rumor can be trusted or not). It is also necessary to report whether the retrieved tweets support or oppose the considered rumor. We also show the results achieved by our system according to some of its possible configurations, analyzing the results and discussing which parameters impacted performance the most, both in terms of efficiency and effectiveness. We observe that the usage of Large Language Models (LLMs) can boost effectiveness but results in a severe loss in terms of efficiency compared to less complex models. We finally show that our proposed system manages to achieve better results in terms of effectiveness compared to the ones achieved by the baseline provided by the Lab organizers on the English dataset available for this task.

Overview of QuantumCLEF 2024: The Quantum Computing Challenge for Information Retrieval and Recommender Systems at CLEF

Andrea Pasin, Maurizio Ferrari Dacrema, Paolo Cremonesi, and Nicola Ferro

Conference Paper In Proceedings of Experimental IR Meets Multilinguality, Multimodality, and Interaction - 15th International Conference of the CLEF Association, CLEF.

Abstract

Quantum Computing (QC) is an innovative research field that has gathered the interest of many researchers in the last few years. In fact, it is believed that QC could potentially revolutionize the way we solve very complex problems by dramatically decreasing the time required to solve them. Even though QC is still in its early stages of development, it is already possible to tackle some problems by means of quantum computers and to start catching a glimpse of its potential. Therefore, the aim of the QuantumCLEF lab is to raise awareness about QC and to develop and evaluate new QC algorithms to solve challenges that can be encountered when implementing Information Retrieval (IR) and Recommender Systems (RS) systems. Furthermore, this lab represents a good opportunity to engage with QC technologies which are typically not easily accessible. In this work, we present an overview of the first edition of QuantumCLEF, a lab that focuses on the application of Quantum Annealing (QA), a specific QC paradigm, to solve two tasks: Feature Selection for IR and RS systems, and Clustering for IR systems. There have been a total of 26 teams who registered for this lab and eventually 7 teams managed to successfully submit their runs following the lab guidelines. Due to the novelty of the topics, participants have been provided with many examples and comprehensive materials that allowed them to understand how QA works and how to program quantum annealers.

QuantumCLEF 2024: Overview of the Quantum Computing Challenge for Information Retrieval and Recommender Systems at CLEF

Andrea Pasin, Maurizio Ferrari Dacrema, Paolo Cremonesi, and Nicola Ferro

Conference Paper In Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2024), Grenoble, France, September 9th to 12th.

Abstract

The emerging field of Quantum Computing (QC) in computational science is attracting significant research interest due to its potential for groundbreaking applications. In fact, it is believed that QC could potentially revolutionize the way we solve very complex problems by significantly decreasing the time required to solve them. Even though QC is still in its early stages of development, it is already possible to tackle some problems using quantum computers and, thus, begin to see its potential. Therefore, the aim of the QuantumCLEF lab is to raise awareness about QC and to develop and evaluate new QC algorithms to solve challenges that are usually faced when implementing Information Retrieval (IR) and Recommender Systems (RS) systems. Furthermore, this lab represents a good opportunity to engage with QC technologies, which are typically not easily accessible due to their early development stage. In this work, we present an overview of the first edition of QuantumCLEF, a lab that focuses on the application of Quantum Annealing (QA), a specific QC paradigm, to solve two tasks: Feature Selection for IR and RS systems, and Clustering for IR systems. There were a total of 26 teams who registered for this lab, and eventually, 7 teams successfully submitted their runs following the lab guidelines. Due to the novelty of the topics, participants were provided with many examples and comprehensive materials to help them understand how QA works and how to program quantum annealers.

A Quantum Annealing-Based Instance Selection Approach for Transformer Fine-Tuning

Andrea Pasin, Washington Cunha, Marcos André Gonçalves, Nicola Ferro

Conference Paper In Proceedings of the 14th Italian Information Retrieval Workshop, Udine, Italy, September 5-6, 2024

Abstract

Currently, Deep Learning (DL) is widely used to solve very complex tasks. However, the training of DL models requires huge datasets and long training times. We introduce a novel quantum Instance Selection (IS) approach that reduces training dataset sizes by up to 28% while maintaining effectiveness, enhancing training efficiency and scalability. Our method leverages Quantum Annealing (QA), a specific Quantum Computing paradigm, that can address optimization problems. This is the first attempt to tackle the IS problem using QA, and we propose a new Quadratic Unconstrained Binary Optimization (QUBO) formulation for it. Extensive experiments with several Automatic Text Classification (ATC) datasets show our solution’s feasibility and competitiveness with current state-of-the-art IS solutions.

Using and Evaluating Quantum Computing for Information Retrieval and Recommender Systems

Maurizio Ferrari Dacrema, Andrea Pasin, Paolo Cremonesi, and Nicola Ferro

Conference Paper In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 3017-3020).

Abstract

The field of Quantum Computing (QC) has gained significant popularity in recent years, due to its potential to provide benefits in terms of efficiency and effectiveness when employed to solve certain computationally intensive tasks. In both Information Retrieval (IR) and Recommender Systems (RS) we are required to build methods that apply complex processing on large and heterogeneous datasets; it is therefore natural to wonder whether QC could also be applied to boost their performance. The tutorial aims first to provide an introduction to QC for an audience that is not familiar with the technology, then to show how to apply the QC paradigm of Quantum Annealing (QA) to solve practical problems that are currently faced by IR and RS systems. During the tutorial, participants will be provided with the fundamentals required to understand QC and to apply it in practice by using a real D-Wave quantum annealer through APIs.

A Quantum Annealing Instance Selection Approach for Efficient and Effective Transformer Fine-Tuning

Andrea Pasin, Washington Cunha, Marcos André Gonçalves, Nicola Ferro

Conference Paper In Proceedings of The 10th ACM SIGIR/The 14th International Conference on the Theory of Information Retrieval.

Abstract

Deep Learning approaches have become pervasive in recent years. In fact, they allow for solving tasks that were thought to be too complex a few decades ago, sometimes with superhuman effectiveness. However, these models require huge datasets to be properly trained and to provide good generalization. This translates into high training and fine-tuning time, even several days for the most complex models and large datasets. In this work, we present a novel quantum Instance Selection (IS) approach that significantly reduces the size of the training datasets (by up to 28%) while maintaining the model's effectiveness, thus promoting (training) speedups and scalability. Our solution is innovative in the sense that it exploits a different computing paradigm, Quantum Annealing (QA), a specific Quantum Computing paradigm that can be used to tackle practical optimization problems. To the best of our knowledge, there have been no prior attempts to tackle the IS problem using QA. Furthermore, we propose a new Quadratic Unconstrained Binary Optimization (QUBO) formulation specific to the IS problem, which is a contribution in itself. Through an extensive set of experiments with several Automatic Text Classification (ATC) benchmarks, we empirically demonstrate both the feasibility of our quantum solution and its competitiveness with the current state-of-the-art IS solutions.

PNRRorienta: A Web Application for Managing Schools, Courses, and Students Involved in the PNRR Orientation Initiative

Andrea Pasin, Lorenza Da Re, Andrea Gerosa, Lidia Pezzuoli, Silvia Preciso, Nicola Ferro

Conference Paper In Proceedings of the 32nd Symposium on Advanced Database Systems Villasimius, Italy, June 23rd to 26th, 2024.

Abstract

The National Recovery and Resilience Plan (PNRR) allocates funds to universities for participating in an initiative for delivering orientation courses to students in the last three years of secondary education. The University of Padua is among the institutions joining this initiative. The main objective of these courses is to help students understand the significance of higher education and its value to society. These courses also provide students with an opportunity to explore different educational offerings. Additionally, students can gain practical experience in active and laboratory-based disciplinary teaching, consolidate their knowledge, and develop reflective and transversal skills. Finally, students can also get an overview of various employment sectors and potential job prospects. However, this initiative requires a big effort to plan all the lectures and manage the large number of students, institutes, courses, and professors involved. Therefore, in this work we present a Web application, called PNRRorienta, that we have designed and developed to manage and simplify all the tasks related to this initiative. The University of Padua started using this application in September 2023 and, as of February 2024, it handles more than 70 different secondary education institutes in the Veneto Region, almost 200 courses offered to students, more than 1,300 lectures, more than 400 professors, and almost 10,000 students actively enrolled.

Quantum Computing for Information Retrieval and Recommender Systems

Maurizio Ferrari Dacrema, Andrea Pasin, Paolo Cremonesi, and Nicola Ferro

Conference Paper In Nazli, G., Tonellotto, N., He, Y., Lipani, A., McDonald, G., Macdonald, C., and Ounis, I., editors, Advances in Information Retrieval. Proc. 46th European Conference on IR Research (ECIR 2024) - Part II. Lecture Notes in Computer Science (LNCS) 14609, Springer, Heidelberg, Germany

Abstract

Quantum Computing (QC) is a research field that has been in the limelight in recent years. In fact, many researchers and practitioners believe that it can provide benefits in terms of efficiency and effectiveness when employed to solve certain computationally intensive tasks. In Information Retrieval (IR) and Recommender Systems (RS) we are required to process very large and heterogeneous datasets by means of complex operations; it is therefore natural to wonder whether QC could also be applied to boost their performance. The goal of this tutorial is to show how QC works to an audience that is not familiar with the technology, as well as how to apply the QC paradigm of Quantum Annealing (QA) to solve practical problems that are currently faced by IR and RS systems. During the tutorial, participants will be provided with the fundamentals required to understand QC and to apply it in practice by using a real D-Wave quantum annealer through APIs.

QuantumCLEF - Quantum Computing at CLEF

Andrea Pasin, Maurizio Ferrari Dacrema, Paolo Cremonesi, and Nicola Ferro

Conference Paper In Nazli, G., Tonellotto, N., He, Y., Lipani, A., McDonald, G., Macdonald, C., and Ounis, I., editors, Advances in Information Retrieval. Proc. 46th European Conference on IR Research (ECIR 2024) - Part II. Lecture Notes in Computer Science (LNCS) 14609, Springer, Heidelberg, Germany

Abstract

Over the last few years, Quantum Computing (QC) has captured the attention of numerous researchers pertaining to different fields since, due to technological advancements, QC resources have become more available and also applicable in solving practical problems. In the current landscape, Information Retrieval (IR) and Recommender Systems (RS) need to perform computationally intensive operations on massive and heterogeneous datasets. Therefore, it could be possible to use QC and especially Quantum Annealing (QA) technologies to boost systems' performance both in terms of efficiency and effectiveness. The objective of this work is to present the first edition of the QuantumCLEF lab, which is composed of two tasks that aim at:

evaluating QA approaches compared to their traditional counterpart;
identifying new problem formulations to discover novel methods that leverage the capabilities of QA for improved solutions;
establishing collaborations among researchers from different fields to harness their knowledge and skills to solve the considered challenges and promote the usage of QA.

qCLEF: a Proposal to Evaluate Quantum Annealing for Information Retrieval and Recommender Systems

Andrea Pasin, Maurizio Ferrari Dacrema, Paolo Cremonesi, and Nicola Ferro

Conference Paper In 14th International Conference of the CLEF Association, CLEF 2023, Thessaloniki, Greece, September 18-21, 2023, Proceedings, Sep 2023, Pages 97-108

Abstract

Quantum Computing (QC) has been a focus of research for many researchers over the last few years. As a result of technological development, QC resources are also becoming available and usable to solve practical problems in the Information Retrieval (IR) and Recommender Systems (RS) fields. Nowadays IR and RS need to perform complex operations on very large datasets. In this scenario, it could be possible to increase the performance of these systems both in terms of efficiency and effectiveness by employing QC and, especially, Quantum Annealing (QA). The goal of this work is to design a Lab composed of different Shared Tasks that aims to:

compare the performance of QA approaches with respect to their counterparts using traditional hardware;
identify new ways of formulating problems so that they can be solved with quantum annealers;
allow researchers from different fields (e.g., Information Retrieval, Operations Research...) to work together and learn more about QA technologies.

This Lab uses the QC resources provided by CINECA, one of the most important computing centers worldwide, thanks to an agreement already in place. In addition, we also show a possible implementation of the required infrastructure, which uses Docker containers and the Kubernetes orchestrator to ensure scalability and fault tolerance, and which can be deployed on the cloud.

SEUPD@CLEF: Team INTSEG on argument retrieval for controversial questions

Sepide Bahrami, Gnana Prakash Goli, Andrea Pasin, Neemol Rajkumari, Mohammad Muzammil Sohail, Paria Tahan, Nicola Ferro

Conference Paper Working Notes Papers of the CLEF, 2022

Abstract

Search Engines play important roles in helping users to rapidly retrieve relevant information. The technology underlying Search Engines has been improved in the last years, both in terms of hardware capabilities and in terms of software. However, they are still affected by many issues due to the continuously growing amount of data and the various forms in which it comes. In this paper we discuss our solution to the Information Retrieval problem proposed by the CLEF 2022 Touché Task 1. We first describe in general the considered problem and subsequently present our Information Retrieval System implemented through Apache Lucene illustrating the various phases and methods applied to fulfil the objectives of the task. Eventually, we provide the obtained experimental results and possible explanations for them. In particular, we investigate the reasons for which some methods performed worse than others and describe possible ways to improve the system in the future.

Information Management Systems

Department of Information Engineering

University of Padua
