Welcome to IroSvA (Irony Detection in Spanish Variants), the first shared task fully dedicated to identifying the presence of irony in short messages (tweets and news comments) written in Spanish. This task will be organised within the Iberian Languages Evaluation Forum (IberLEF 2019), which will be co-located with the SEPLN Conference. The conference will be held in Bilbao, Spain, in September 2019. You can join the official mailing group of the task, where we will share news and important information about the task. You can contact us via email at firstname.lastname@example.org
Irony is a peculiar case of figurative language frequently used in real-life communication. As human beings, we appeal to irony to express implicitly a meaning opposite to the literal sense of the utterance. Thus, understanding irony requires a more complex set of cognitive and linguistic abilities than understanding literal meaning. Due to its nature, irony has important implications for sentiment analysis and other related tasks. Detecting irony automatically in textual messages is therefore important for improving the performance of sentiment analysis [3,5,6], and it is still an open research problem. Recently, automatic irony detection has gained importance in the research community, with special attention paid to Social Media content in English. For Spanish, however, corpora are scarce, which limits the amount of research done for this language.
This year we propose a new task (IroSvA), which aims at investigating whether a short message written in Spanish is ironic or not with respect to a given context. In particular, we aim at studying the way irony changes across distinct Spanish variants. Concretely, we focus on Spanish from Spain, Mexico and Cuba. The task will be structured into three subtasks, each one for predicting whether messages are ironic or not in one of the three Spanish variants. The main difference from previous tasks on irony detection (SemEval 2018 Task 3 and IronITA 2018) is that messages are not considered as isolated texts but together with a given context (e.g. a news item or a topic).
This year we encourage the participation of NLP researchers, industrial teams and students in three subtasks:
Subtask A: Irony detection in Spanish tweets from Spain
Subtask B: Irony detection in Spanish tweets from Mexico
Subtask C: Irony detection in Spanish news comments from Cuba
The three subtasks share the same goal: participants should determine whether a message is ironic or not according to a specified context (by assigning a binary value, 1 or 0). The main differences between them are the textual genre (tweets for subtasks A and B and short news comments for subtask C) and the Spanish variant.
The following statements show examples of ironic and non-ironic news comments written in the Cuban variant of Spanish.
Given the CONTEXT: ETECSA informa sobre nuevos servicios de telefonía móvil para clientes prepago.
Participating teams will be provided with training and test datasets for each Spanish variant. Standard evaluation metrics (precision, recall, and F1) will be used for assessing the performance of the participating systems. The three measures will be calculated per class label and macro-averaged. The submissions will be ranked according to F1-AVG, which implies that all class labels have equal weight in the final score.
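As an illustration of the ranking measure, the following minimal Python sketch computes precision, recall and F1 per class label and then macro-averages F1 over the two labels. It is not the organisers' official scorer; the function names (`per_class_prf`, `macro_f1`) and the toy labels are assumptions made for this example.

```python
from typing import Sequence, Tuple


def per_class_prf(gold: Sequence[int], pred: Sequence[int], label: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for one class label."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_f1(gold: Sequence[int], pred: Sequence[int], labels: Sequence[int] = (0, 1)) -> float:
    """Unweighted mean of the per-class F1 scores, so every label counts equally."""
    return sum(per_class_prf(gold, pred, label)[2] for label in labels) / len(labels)


# Toy data (hypothetical): 1 = ironic, 0 = not ironic.
gold = [1, 0, 1, 1, 0, 0]
pred = [1, 0, 0, 1, 0, 1]
print(macro_f1(gold, pred))
```

Because the average is unweighted, a system that ignores the minority class is penalised even if its overall accuracy is high.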
Participating teams may submit only one run for each subtask. We will make no distinction between constrained and unconstrained systems, but the participants will be asked to report what additional resources and corpora they have used for each submitted run.
10th February 2019, 23:00 UTC: Call for Participation and Website of the task.
30th March 2019, 23:00 UTC: Training set available.
16th April 2019, 23:00 UTC: Testing set available.
6th May 2019, 23:00 UTC: Submission of runs.
13th May 2019, 23:00 UTC: 10th Jun 2019, 23:00 UTC: Notification of results.
30th May 2019, 23:00 UTC: Submission of Working Notes by participants.
10th June 2019, 23:00 UTC: Reviews to participants (peer-reviews).
20th June 2019, 23:00 UTC: Camera-ready submissions due.
The corpus consists of 9,000 short messages about different topics written in Spanish (3,000 from Cuba,
3,000 from Mexico and 3,000 from Spain) and annotated for irony. Approximately 80% of the corpus will be
used for training purposes, while the remaining 20% will be used for testing.
The corpus is password protected. To obtain the password, send an email to: email@example.com
Your system must generate a corresponding .CSV file for each subtask. The file must contain two columns separated by a comma: the message id and the prediction label (0: not ironic; 1: ironic). For example:
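A minimal Python sketch of producing such a file, assuming predictions are held as (message id, label) pairs. The ids shown (`msg-001`, `msg-002`) and the function names are hypothetical; the real ids come from the test set.

```python
import csv
import io
from typing import Iterable, Tuple


def format_predictions(predictions: Iterable[Tuple[str, int]]) -> str:
    """Render predictions as comma-separated lines: message id, then label."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for message_id, label in predictions:
        # Only the two labels defined by the task are accepted.
        if label not in (0, 1):
            raise ValueError(f"label must be 0 or 1, got {label!r}")
        writer.writerow([message_id, label])
    return buffer.getvalue()


def write_predictions(path: str, predictions: Iterable[Tuple[str, int]]) -> None:
    """Write the formatted predictions to the given path."""
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(format_predictions(predictions))


# Hypothetical ids, for illustration only.
example = [("msg-001", 1), ("msg-002", 0)]
print(format_predictions(example), end="")
```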
The name of each output file is composed of the TeamName, an underscore character (_) and the indicator of the subtask (es, mx, cu). For example:
TeamName_es.txt for the Spanish (Spain) variant.
TeamName_mx.txt for the Mexican variant.
TeamName_cu.txt for the Cuban variant.
All subtask output files must be compressed into a .ZIP file named after the team (e.g., TeamName.zip). Note that only one output file per subtask is permitted. If more than one output file is submitted for the same subtask, we will consider only the last one. However, if a team is interested in investigating different methods or features, it will be possible to submit two runs per language variety.
The zip file must also contain a brief explanation of the authors' system. Concretely, for each subtask, the authors should explain whether they carried out any data preprocessing, which features were used to represent the texts, and which machine learning approach was applied.
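The naming and packaging rules above could be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names (`output_name`, `build_submission`) and the description file name passed in by the caller are hypothetical, and the `.txt` extension simply follows the file names listed above.

```python
import zipfile
from pathlib import Path

# Subtask indicators as listed in the naming convention.
SUBTASK_SUFFIXES = ("es", "mx", "cu")


def output_name(team: str, suffix: str) -> str:
    """Build the output file name: TeamName, underscore, subtask indicator."""
    if suffix not in SUBTASK_SUFFIXES:
        raise ValueError(f"unknown subtask indicator: {suffix!r}")
    return f"{team}_{suffix}.txt"


def build_submission(team: str, source_dir: str, description_file: str) -> Path:
    """Zip the existing subtask outputs plus the system description into TeamName.zip."""
    source = Path(source_dir)
    archive_path = source / f"{team}.zip"
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for suffix in SUBTASK_SUFFIXES:
            candidate = source / output_name(team, suffix)
            # A team may take part in only some subtasks, so missing files are skipped.
            if candidate.exists():
                archive.write(candidate, arcname=candidate.name)
        # The brief explanation of the system (file name chosen by the caller).
        archive.write(source / description_file, arcname=description_file)
    return archive_path

# Example call (paths are hypothetical):
# build_submission("TeamName", "runs", "description.txt")
```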
Submissions should be sent to irosva19 (at) gmail (dot) com
Participants will be given the opportunity to write a paper describing their system, resources used, results, and analysis;
the paper will be part of the official IberLEF 2019 proceedings.
The paper should use Springer style (https://www.springer.com/gp/computer-science/lncs/conference-proceedings-guidelines).
A regular paper should be at least 5 pages long. Papers must be written in English.
|Pos||Team||(CU) Cuba||(ES) Spain||(MX) Mexico||Average|