EVENTA Grand Challenge - ACM Multimedia 2025
Dublin, Ireland, Oct. 27-31, 2025

The EVENTA 2025 Grand Challenge focuses on image-caption retrieval and generation. We aim to integrate contextual details and event-related information to create comprehensive, narrative-driven captions that go beyond simple visual analysis. Such captions offer deeper insights, including the names and attributes of objects, the timing, context, and outcomes of events, and other crucial details: information that cannot be gleaned from merely observing the image.


🚀 Top-ranked teams will be invited to submit a paper to ACM Multimedia and present at the conference, subject to peer review.

To participate in the EVENTA 2025 Grand Challenge, please first register by submitting the form.

Track 1: Event-Enriched Image Retrieval and Captioning

This track aims to generate captions that provide richer, more comprehensive information about an image. These captions go beyond simple visual descriptions by offering deeper insights, including the names and attributes of objects, the timing, context, and outcomes of events, and other crucial details that cannot be gleaned from merely observing the image. Given an image, participants are required to search for relevant articles in a provided external database and extract the information needed to enrich the image caption. This retrieval-augmented generation track facilitates the creation of more coherent and detailed narratives, capturing not only the visible elements but also the underlying context and significance of the scene, ultimately offering a more complete understanding of what the image represents. A minimal pipeline is sketched below. More details here.
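To make the task concrete, here is a minimal sketch of a Track 1 pipeline, assuming a CLIP-style dual encoder (sentence-transformers' clip-ViT-B-32) and a toy in-memory article store. The names `articles`, `retrieve_articles`, and `enrich_caption` are illustrative assumptions, not part of any official baseline; real systems would chunk long articles (CLIP truncates text at 77 tokens) and fuse retrieved context with a trained captioner or an LLM.

```python
# Sketch: image -> retrieve relevant articles -> enrich the caption.
from PIL import Image
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("clip-ViT-B-32")  # joint image/text embedding space

# Toy stand-in for the provided external article database.
articles = {
    "a1": "The 2024 summit in Dublin concluded with a joint climate pledge...",
    "a2": "Floodwaters receded in the city centre after three days of rain...",
}
article_ids = list(articles)
article_emb = model.encode([articles[i] for i in article_ids],
                           convert_to_tensor=True)  # note: CLIP truncates long text

def retrieve_articles(image_path: str, top_k: int = 3) -> list[str]:
    """Embed the query image and return the ids of the top-k articles."""
    img_emb = model.encode(Image.open(image_path), convert_to_tensor=True)
    scores = util.cos_sim(img_emb, article_emb)[0]
    ranked = scores.argsort(descending=True)[:top_k]
    return [article_ids[int(i)] for i in ranked]

def enrich_caption(visual_caption: str, retrieved_ids: list[str]) -> str:
    """Placeholder fusion step: concatenates retrieved context; a real system
    would generate the enriched caption with an LLM or trained captioner."""
    context = " ".join(articles[i][:200] for i in retrieved_ids)
    return f"{visual_caption} Context: {context}"
```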

Track 2: Event-Based Image Retrieval

Given a realistic caption, participants are required to retrieve the corresponding images from a provided database. Text-to-image retrieval is a fundamental task in computer vision and natural language processing that requires learning a joint representation space in which visual and textual modalities can be meaningfully compared. Image retrieval with textual queries is widely used in search engines, medical imaging, e-commerce, and digital asset management. However, challenges remain, such as handling abstract or ambiguous queries, improving retrieval efficiency for large-scale datasets, and ensuring robustness to linguistic variations and biases in training data. This track targets retrieval for realistic, event-centric captions drawn from real-life events; see the sketch below. More details here.
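A minimal sketch of Track 2's text-to-image retrieval, again assuming a CLIP-style joint embedding space. The directory `database/images` and the example query are hypothetical; competitive solutions would typically add re-ranking, query expansion, or event-aware entity matching on top of this baseline.

```python
# Sketch: caption -> rank database images by cosine similarity in CLIP space.
from pathlib import Path
from PIL import Image
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("clip-ViT-B-32")

# Pre-compute embeddings for the image database once (hypothetical path).
image_paths = sorted(Path("database/images").glob("*.jpg"))
image_emb = model.encode([Image.open(p) for p in image_paths],
                         convert_to_tensor=True, batch_size=32)

def retrieve_images(caption: str, top_k: int = 10) -> list[Path]:
    """Embed the caption and return the paths of the top-k matching images."""
    text_emb = model.encode(caption, convert_to_tensor=True)
    scores = util.cos_sim(text_emb, image_emb)[0]
    ranked = scores.argsort(descending=True)[:top_k]
    return [image_paths[int(i)] for i in ranked]

print(retrieve_images("Rescue workers evacuate residents after severe flooding"))
```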

Challenge Rules
  • Participants are not allowed to annotate the test sets.
  • External datasets are permitted. However, participants may only use publicly accessible datasets and pre-trained models. The use of private datasets or pre-trained models is strictly prohibited.
  • Open-source tools are allowed. However, commercial tools, libraries, APIs, etc. are strictly prohibited.
  • Final scores are calculated based on performance on the Private Test set.
  • Participants must make their source code publicly available on GitHub to ensure reproducibility.
  • Participants must submit a detailed paper through the official challenge platform before the deadline to validate their solutions.
  • Late submissions will not be accepted.
  • Only registered teams that submit papers are eligible to win, but all participants’ scores will be recognized.
Dataset

The EVENTA 2025 Grand Challenge uses the OpenEvents V1 dataset.

Paper Submission

We accept papers of up to 6 pages of content in the ACM MM format, plus up to 2 additional pages for references only. Paper submissions must conform to the “double-blind” review policy. Submission policies adhere to the ACM MM 2025 submission policies.

We recommend that participants cite the Challenge Overview paper written by the organizers. This paper contains all the necessary information on the challenge definition and the dataset, so participants do not need to describe the challenge or the dataset in detail. Instead, they can focus on presenting the motivation for their approach, explaining their method, presenting and analyzing their results, and giving an outlook on future work.


Submit your paper here.
Important Dates
  • Challenge opened: Apr. 01, 2025
  • Team registration opened: Apr. 01, 2025
  • Training set released: Apr. 18, 2025
  • Public-test set released: May 05, 2025
  • Team registration deadline: Jun. 09, 2025
  • Private-test set released: Jun. 10, 2025
  • Challenge closed: Jun. 24, 2025
  • Paper submission deadline: Jul. 01, 2025
  • Acceptance notification: Jul. 24, 2025
  • Camera-ready deadline: Aug. 26, 2025
  • Challenge date: Oct. 27-31, 2025
Accepted Papers
ACM Multimedia 2025 Proceedings:
  1. Cerebro Team, "ENRIC: EveNt-AwaRe Captioning with Image Retrieval via UnCertainty-Guided Re-ranking and Semantic Ensemble Reasoning". [PDF]
  2. SodaBread Team, "ReCap: Event-Aware Image Captioning with Article Retrieval and Semantic Gaussian Normalization". [PDF]
  3. NoResources Team, "EVENT-Retriever: Event-Aware Multimodal Image Retrieval for Realistic Captions". [PDF]
Non-archival Papers:
  1. Re:zero Slavery Team, "Beyond Vision: Contextually Enriched Image Captioning with Multi-Modal Retrieval". [PDF]
  2. ITxTK9 Team, "ZSE-Cap: A Zero-Shot Ensemble for Image Retrieval and Prompt-Guided Captioning". [PDF]
  3. HCMUS-NoName Team, "Hierarchical Multi-Modal Retrieval for Knowledge-Grounded News Image Captioning". [PDF]
  4. 23Trinitrotoluen Team, "A Hybrid Dense-Sparse Multi-Stage Re-ranking Framework for Event-Based Image Retrieval". [PDF]
  5. LastSong Team, "Hierarchical Article-to-Image: Leveraging Multi-Granularity Text Representations for Article Ranking and Text-Visual Similarity for Image Retrieval". [PDF]
  6. Sharingan Retrievers Team, "Leveraging Lightweight Entity Extraction for Scalable Event-Based Image Retrieval". [PDF]
  7. ZJH-FDU Team, "A Pretrained Model-Based Pipeline for Event-Driven News-to-Image Retrieval". [PDF]
Challenge Schedule
TBA
Organizers
Trung-Nghia Le
University of Science, Vietnam
Minh-Triet Tran
University of Science, Vietnam
Tam Nguyen
University of Dayton, US
Thien-Phuc Tran
University of Science, Vietnam
Minh-Quang Nguyen
University of Science, Vietnam
Trong-Le Do
University of Science, Vietnam
Duy-Nam Ly
University of Science, Vietnam
Viet-Tham Huynh
University of Science, Vietnam
Khanh-Duy Le
University of Science, Vietnam
Mai-Khiem Tran
University of Science, Vietnam

Contact: ltnghia@fit.hcmus.edu.vn