To participate in the EVENTA 2025 Grand Challenge, please first register by submitting the form.
Given a realistic caption, participants are required to retrieve the corresponding images from a provided database. Text-to-image retrieval is a fundamental task in computer vision and natural language processing: it requires learning a joint representation space in which visual and textual modalities can be meaningfully compared. Image retrieval with textual queries is widely used in search engines, medical imaging, e-commerce, and digital asset management. However, challenges remain, such as handling abstract or ambiguous queries, improving retrieval efficiency on large-scale datasets, and ensuring robustness to linguistic variation and to biases in training data. This track focuses on retrieving images for realistic information drawn from real-life events.
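As a rough illustration of what comparing modalities in a joint representation space can look like, the sketch below ranks images by cosine similarity to a caption embedding. The challenge does not prescribe any method; the function names `cosine` and `top_k_images` and the toy two-dimensional vectors are hypothetical, and in practice the embeddings would come from whatever caption and image encoders a team chooses.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_images(query_vec, image_vecs, k=10):
    """Return up to k image IDs, most similar to the query embedding first.

    image_vecs maps image ID -> embedding (assumed to live in the same
    space as query_vec; how that space is learned is left open here).
    """
    scored = sorted(
        image_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [image_id for image_id, _ in scored[:k]]


# Toy example with made-up embeddings.
toy_images = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
print(top_k_images([1.0, 0.0], toy_images, k=2))  # prints ['a', 'c']
```

A full sort is acceptable for a small example; for a large image database an approximate nearest-neighbour index would likely be a better fit.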

Participants must submit a CSV file named using the following format: TeamName_EVENTA2025_Track2.csv. This file must be compressed into a ZIP archive named submission.zip before uploading to CodaLab.
The CSV file should include predictions for all descriptions in the query set. It must contain 11 columns, separated by semicolons (;), with the following structure:
- Column 1: Query text ID
- Columns 2–11: Top-10 retrieved image IDs, listed in descending order of relevance (from top-1 to top-10). If an image cannot be retrieved, use # as a placeholder.
CSV Row Format Template:
<query_id>;<image_id_1>;<image_id_2>;...;<image_id_10>
<query_id>;<image_id_1>;#;...;#
There is no requirement to sort the rows by query ID—this will be handled automatically during evaluation.
We provide a submission example:
12312;56712;56723;56734;56745;56756;56767;56778;56789;56790;56701
12334;56712;#;#;#;#;#;#;#;#;#
12345;56712;56723;56734;56745;56756;#;#;#;#;#
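The submission format described above could be produced with a short script along the following lines. This is a minimal sketch, not an official tool: the helper names `build_row` and `write_submission` and the `predictions` dictionary are hypothetical. It also assumes that no header row is expected and that lines end with a plain newline, neither of which is stated on this page.

```python
import csv
import zipfile
from pathlib import Path

PLACEHOLDER = "#"  # filler for image slots that could not be retrieved
TOP_K = 10         # columns 2-11 hold the top-10 image IDs


def build_row(query_id, image_ids):
    """Return one 11-field row: the query ID plus exactly 10 image slots."""
    ranked = [str(i) for i in image_ids[:TOP_K]]
    padding = [PLACEHOLDER] * (TOP_K - len(ranked))
    return [str(query_id)] + ranked + padding


def write_submission(predictions, team_name, out_dir="."):
    """Write <team_name>_EVENTA2025_Track2.csv and wrap it in submission.zip.

    predictions maps query ID -> image IDs ordered from most to least
    relevant. Returns the path of the ZIP archive.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    csv_path = out_dir / f"{team_name}_EVENTA2025_Track2.csv"
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        # Assumption: semicolon-delimited, no header row, "\n" line endings.
        writer = csv.writer(f, delimiter=";", lineterminator="\n")
        for query_id, image_ids in predictions.items():
            writer.writerow(build_row(query_id, image_ids))
    zip_path = out_dir / "submission.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # Store the CSV at the archive root under its own file name.
        zf.write(csv_path, arcname=csv_path.name)
    return zip_path
```

Calling `write_submission({12334: [56712]}, "TeamName")` would write the row `12334;56712;#;#;#;#;#;#;#;#;#` and place the CSV inside submission.zip. Rows are written in dictionary order, which is fine because sorting by query ID is not required.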
Participants are also required to submit a detailed paper through the official challenge platform to validate their solutions.
The platform will be made available in the coming days. We kindly ask for your patience in the meantime.