Retrieval-augmented generation (RAG) systems have presented new challenges in terms of evaluation. The Eval4RAG Workshop at ECIR 2025 aims to help the community grapple with these challenges and reconcile the variety of proposed evaluation protocols. We ultimately aim to push towards a common mindset or conceptual framework for the evaluation of RAG systems that considers the diverse viewpoints of the community.
Presentation Topics
To this end, we call for oral presentations at the workshop to spark discussion and share perspectives on RAG evaluation. The call covers, but is not limited to:
- Desired evaluation aspects or qualities for RAG systems;
- Unification of RAG task structure, e.g., TREC topic structure and Cranfield paradigm;
- Bridging gaps between IR and other RAG-related fields, e.g., summarization, question answering, etc.;
- Proposed or published relevant evaluation methods or datasets for RAG;
- Automation of evaluation methods;
- Reproducibility for RAG evaluation;
- Reasons to abandon evaluation for RAG, i.e., we don’t need evaluation;
- Other spicy RAG evaluation topics.
Interested presenters should submit a one-page extended abstract in ACM two-column conference format as the presentation proposal. The extended abstract should explain the relevance of the presentation to the workshop, especially when presenting published or accepted work. Authors of accepted extended abstracts will be given a short dedicated time slot to present their perspective at the workshop. We encourage authors of relevant accepted papers at the main conference to submit an extended abstract to present the work again at the workshop with a focus on evaluation.
Submissions will undergo a lightweight single-blind review by the program committee and workshop organizers, i.e., the authors’ identities are visible to reviewers but not the other way around.
- Submission portal: EasyChair
- Abstract Submission Deadline: February 21, 2025 (AoE)
- Notification of Acceptance: March 7, 2025
- Workshop Date: April 10, 2025
Steering and Program Committee
- Ian Soboroff, National Institute of Standards and Technology, USA
- Jimmy Lin, University of Waterloo, Canada
- Charlie Clarke, University of Waterloo, Canada
- Mark Smucker, University of Waterloo, Canada
- Jaap Kamps, University of Amsterdam, Netherlands
- Tetsuya Sakai, Waseda University, Japan
- Dina Demner-Fushman, National Library of Medicine, USA
- Dawn Lawrie, Johns Hopkins University, USA
- James Mayfield, Johns Hopkins University, USA
- Doug Oard, University of Maryland, USA
Organizers
- Eugene Yang, Human Language Technology Center of Excellence, Johns Hopkins University, USA
- Ronak Pradeep, University of Waterloo, Canada
- Dake Zhang, University of Waterloo, Canada
- Sean MacAvaney, University of Glasgow, UK
- Maria Maistro, University of Copenhagen, Denmark
- Mohammad Aliannejadi, University of Amsterdam, Netherlands