
Retrieval-augmented generation (RAG) systems present new challenges for evaluation. The Eval4RAG Workshop at ECIR 2025 aims to help the community grapple with these challenges and reconcile the variety of proposed evaluation protocols. Ultimately, we aim to push towards a common mindset or conceptual framework for the evaluation of RAG systems that considers the diverse viewpoints of the community.

Tentative Program

Time | Program | Speaker or Participants
14:30 – 14:40 | Opening with Overview of Existing RAG Shared Tasks | TBD
14:40 – 15:15 | Keynote Talk – A Journey Through Domain-Specific RAG and Agent Evaluation | Fabio Petroni
15:18 – 15:25 | Automated Evaluation of RAG in Romanian (Authors: Claudiu Creanga, Teodor Marchitan and Liviu Dinu) | TBD
15:25 – 15:32 | MARE: Automatic Modality-Agnostic Report Evaluation (Authors: Alexander Martin, Kate Sanders, William Walden, Eugene Yang, Reno Kriz, Francis Ferraro and Benjamin Van Durme) | TBD
15:32 – 15:39 | Controlled Retrieval-Augmented Context Evaluation (Authors: Jia-Huei Ju, Suzan Verberne, Maarten de Rijke and Andrew Yates) | TBD
15:39 – 15:46 | Open-ended error analysis in retrieval-augmented generation (Authors: Nadezhda Chirkova) | TBD
15:46 – 15:53 | Challenges in RAG Evaluation for Text Classification in Evidence Synthesis (Authors: Sagar Uprety, Ailbhe Finnerty and James Thomas) | TBD
15:53 – 16:00 | Automated Evaluations of RAG Systems in Customer Support in Automotive Applications (Authors: Luis Wagner, Gayane Sedrakyan and Jos Van Hillegersberg) | TBD
16:00 – 16:30 | Coffee Break – Organizers Collecting/Organizing Discussion Ideas |
16:30 – 16:45 | Discussion Idea Presentation and Group Assignment | TBD
16:45 – 17:30 | Breakout Discussion | Attendees
17:30 – 17:50 | Group Report Back | Attendees
17:50 – 18:00 | Closing | TBD

Keynote: A Journey Through Domain-Specific RAG and Agent Evaluation

Speaker: Fabio Petroni

Abstract

In this talk, I will share the evaluation journey at Samaya over the past two and a half years—starting from factoid questions we handcrafted internally, to evaluating complex, real-world predictions about the future. I’ll highlight the evolution of our methodologies and the growing challenges of assessing systems operating in specialized domains, with an eye toward actionable insights and open questions for the community.

Bio

Fabio Petroni is the Co-Founder and CTO of Samaya AI, specializing in the intersection of AI and knowledge. He holds a Ph.D. in Engineering of Computer Science from Sapienza University of Rome and has conducted research in leading industrial labs, including the FAIR team at Meta AI and the R&D department at Thomson Reuters. Fabio is known for his work on knowledge-intensive NLP, with awards such as first place in the NeurIPS Efficient Open-Domain Question Answering competition (2020) and the Google Best Paper Award at AKBC (2020). His research contributions include several high-impact publications, such as Language Models as Knowledge Bases? and the original RAG paper.

Presentation Topics

To this end, we call for oral presentations at the workshop to spark discussion and share perspectives on RAG evaluation. The call covers, but is not limited to:

Interested presenters should submit a one-page extended abstract in ACM two-column conference format (with unlimited references) as the presentation proposal. The extended abstract should explain the relevance of the presentation to the workshop, especially when presenting published or accepted work. Both published and unpublished work are welcome as long as the presentation is relevant to the workshop. Each accepted extended abstract will be given a short dedicated time slot in which the authors can present their perspective. We encourage authors of relevant papers accepted at the main conference to submit an extended abstract and present the work again at the workshop with a focus on evaluation.

Submissions will undergo a lightweight single-blind review by the program committee and workshop organizers, i.e., the authors' identities are visible to the reviewers, but the reviewers' identities are not visible to the authors.

Steering and Program Committee

Organizers