WAT2024 English-Hausa Multi-Modal Translation Task
The Workshop on Asian Translation 2024 (WAT2024) added a new multimodal task of English-to-Hausa translation, the first multimodal translation task for any African language. The task relies on our “Hausa Visual Genome,” a multimodal dataset of text and images suitable for English-Hausa machine translation and multimodal research.
Timeline
- XXX: Translations need to be submitted to the organizers
- XXX: System description paper submission deadline
- XXX: Review feedback for system descriptions
- XXX: Camera-ready deadline
- XXX: WAT2024 takes place
Task Description
The setup of the WAT2024 task is as follows:
- Inputs:
  - an image,
  - a rectangular region in that image,
  - a short English caption of the rectangular region.
- Output:
  - the caption translated into Hausa.
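For concreteness, one example can be viewed as a tuple (image, region, English caption), with the Hausa caption as the target. The sketch below illustrates this structure in Python; the field names, and the use of Pillow to crop the region, are our assumptions rather than the official data format.

```python
from dataclasses import dataclass
from PIL import Image  # pip install pillow

@dataclass
class HVGExample:
    """One Hausa Visual Genome example (field names are an assumption)."""
    image_path: str
    x: int        # left edge of the rectangular region, in pixels
    y: int        # top edge of the rectangular region, in pixels
    width: int
    height: int
    english: str  # short English caption of the region (source)
    hausa: str    # Hausa translation of the caption (target)

    def region(self) -> Image.Image:
        """Crop the captioned rectangle out of the full image."""
        img = Image.open(self.image_path)
        return img.crop((self.x, self.y, self.x + self.width, self.y + self.height))
```

A text-only system would use only the `english` field; a multi-modal system would additionally condition on the full image and/or the cropped `region()`.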
Types of Submissions Expected
The following types of submissions are expected:
- Text-only translation
- Hausa-only image captioning
- Multi-modal translation (uses both the image and the text)
Training Data
The Hausa Visual Genome consists of:
- 29k training examples
- 1k development set
- 1.6k evaluation set
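As a quick sanity check of the download, the splits can be loaded and counted. The sketch below assumes a hypothetical tab-separated layout and hypothetical file names; adjust both to the actual distribution.

```python
import csv

def load_split(path: str) -> list[list[str]]:
    """Read one split, assuming tab-separated fields per line
    (e.g., image id, region coordinates, English text, Hausa text)."""
    with open(path, encoding="utf-8") as f:
        return list(csv.reader(f, delimiter="\t"))

# Hypothetical file names; use whatever names the actual download provides.
train = load_split("hausa-visual-genome-train.txt")
dev = load_split("hausa-visual-genome-dev.txt")
test = load_split("hausa-visual-genome-test.txt")

# Expected sizes per the task description: ~29k / ~1k / ~1.6k examples.
print(len(train), len(dev), len(test))
```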
Evaluation
The WAT2024 Multi-Modal Task will be evaluated on:
- the 1.6k evaluation set of the Hausa Visual Genome
- the 1.4k challenge set of the Hausa Visual Genome
Means of evaluation:
- automatic metrics: BLEU, chrF3, and others
- manual evaluation, subject to the availability of Hausa speakers
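For the automatic metrics, participants can reproduce comparable BLEU and chrF3 scores locally with the sacreBLEU library. A minimal sketch follows; the hypothesis and reference strings are placeholders, not task data.

```python
from sacrebleu.metrics import BLEU, CHRF  # pip install sacrebleu

# Placeholder system outputs and references; in practice these come from
# files with one translated segment per line.
hypotheses = ["wani mutum yana tsaye", "babbar mota ja"]
references = [["wani mutum na tsaye", "babbar mota ja"]]  # one reference stream

bleu = BLEU()
chrf3 = CHRF(beta=3)  # beta=3 weights recall three times as much as precision

print(bleu.corpus_score(hypotheses, references))
print(chrf3.corpus_score(hypotheses, references))
```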
Participants in the task need to indicate which track their translations belong to:
- Text-only / Image-only / Multi-modal
- Domain-Aware / Domain-Unaware (i.e., whether or not the full English Visual Genome was used in training)
- Constrained / Non-Constrained
  - Constrained submissions may use only:
    - the 29k training segments from the Hausa Visual Genome
    - the (English-only) Visual Genome (submitting a domain-aware run)
  - Non-constrained submissions may use other data but need to specify what data was used.
Download Link
Submission Requirement
The system description should be a short report (4 to 6 pages) submitted to WAT2024 describing the method(s) used.
Each participating team can submit at most two systems for each task type (Text-only translation, Hausa-only image captioning, or multi-modal translation using text and image). Please submit through the submission link available on the WAT2024 website and select the task for submission.
Paper and References
Please refer to the papers below:
- [paper]: https://aclanthology.org/2022.lrec-1.694.pdf
- [arXiv]: https://arxiv.org/abs/2205.01133
Organizers
- Shantipriya Parida (Silo AI, Finland)
- Ondřej Bojar (Charles University, Czech Republic)
- Idris Abdulmumin (University of Pretoria, South Africa)
- Shamsuddeen Hassan Muhammad (Bayero University Kano, Nigeria)
Contact
email: wat-multimodal-task@ufal.mff.cuni.cz
License
The data is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International.
Acknowledgment
This shared task is supported by the following grant at Charles University (Czech Republic):
- Czech Science Foundation (Grantová agentura České republiky), project code 19-26934X: Neural Representations in Multi-modal and Multi-lingual Modelling