Conference Report: The 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023)`\footnote{Published as blog post at \url{}}`{=latex}


Last week, May 2-6, 2023, the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL) took place in Dubrovnik. I think this was my first EACL conference, and the program was very interesting: it offered a good mix of recent state-of-the-art research on fashionable topics while maintaining diversity across research fields. It’s been one of my favorite conferences so far.

Plenary Session

The conference had 281 papers accepted, out of which there were 229 long and 41 short papers. The acceptance rate was 24.1%. Such a comparatively low rate means that many good papers do not find a place in the conference. EMNLP invented the model of the Findings of ACL/EMNLP/EACL/NAACL, which include papers that might not fit in the main conference but are still worth publishing. At some conferences these papers are presented as posters, sometimes they are invited to be presented in workshops, and sometimes they are presented only as videos online. For EACL, all 201 Findings papers (149 long) were presented in a video on the Underline platform, and some papers were additionally presented in workshops.

I wonder whether the invitations to present in workshops shaped the perception that Findings papers sit somewhere between the main conference and the workshops. I heard this opinion multiple times at the conference. Personally, I see Findings papers as being in the same category as main conference papers rather than as similar to workshop papers. Workshop papers are not worse; they are more focused on specific topics. Findings papers often lack this special focus, and therefore they are submitted to the main conference rather than to a dedicated workshop.

Poster Session

In addition to many oral presentations, there were very nice poster sessions. At *ACL/EMNLP conferences, poster papers are considered to be of the same quality as those presented as talks: there is no difference in the proceedings.

The program of the conference, with talks and posters, was complemented by tutorials and so-called Birds-of-a-feather (BoF) sessions, networking events in which people interested in a specific topic introduced themselves to each other. This format was introduced (I think) during the online conferences of the COVID period and still exists. It’s actually quite nice to get in touch with a subcommunity that one did not know yet. I participated in such sessions for the first time and can only recommend them.

Contributions from University of Stuttgart


The IMS at the University of Stuttgart had many contributions, and it felt very nice to be at the conference with so many nice colleagues. I did not have this experience as a PhD student (where I was essentially the only person targeting ACL conferences for publications in the group), and I really appreciated it. It is so much easier to meet new people if you already know many.

The IMS contributed two tutorials. I was part of the Emotion Analysis Tutorial [@StajnerKlinger2023], offered together with Sanja Štajner. It was my first tutorial and, as usual, I did not plan enough time for the material that I wanted to cover. Thanks to Sanja, who was flexible enough with her timing, we did not overrun too much.


Another tutorial with substantial involvement by IMS people was given by Gabriella Lapesa, Eva Maria Vecchi, Serena Villata, and Henning Wachsmuth [@lapesa-etal-2023-mining]. The topic was argument mining. Unfortunately, it took place in parallel to ours, which was a pity because the same people might have been interested in both. Luckily, the tutorials were recorded and will be available online on the Underline platform.


We further had a set of papers in the main conference, the workshops, and Findings.

@wuehrl-etal-2023-entity proposed a real-world pipeline for biomedical fact-checking, based on the idea that reformulated claims can be better checked against scientific text than the original formulation of a claim as it occurs in social media. @miletic-schulte-im-walde-2023-systematic showed how compositionality information can be extracted from BERT. @eichel-etal-2023-made investigate how LLMs can be prompted for plausibility with applications in the material sciences. @falk-lapesa-2023-bridging show how adapters can be used to efficiently predict argument quality, based on a large set of datasets and quality dimensions. @nikolaev-pado-2023-representation study representation biases in sentence transformers, @gaser-etal-2023-exploring explore segmentation approaches for neural machine translation with code-switching, and @vath-etal-2023-conversational show how dialog systems allow for more complex tree-like conversations with intelligent agents.

My Favorite Contributions

In addition to the IMS contributions to the conference, I found a set of talks and papers very interesting. This selection only reflects my personal opinion, and if I do not mention a particular paper, that probably only means that I did not have the time to go to its presentation. There were many interesting papers in the program, and I have not gone through all of them yet.

Invited Talks

Before I say something about the papers I liked a lot, I would like to point out two of the three invited talks. Joyce Chai talked about embodied AI. As a student of computer science, I often heard the phrase “intelligence needs a body”, and I must say, I never really understood it. Now, with this talk and the nice demonstration videos that Joyce showed, I finally got a grasp of what’s behind this phrase. Full understanding of the whole context is only possible in multimodal interactions. That does not mean that every researcher needs to work on multimodal interaction analysis, but there need to be such integration efforts so that important aspects are not missed. I found that very intuitive.


The other keynote, given by Edward Grefenstette, included discussions on the efficiency of LLMs and their future use. He mentioned work by @Lyle2020, who studied why LLMs can actually generalize. Apparently, it is crucial to have only one epoch during pre-training. On a more entertaining note, he pointed out that LLMs currently mostly fail with pragmatics (“Have you seen my phone” – “Yes, I have seen your phone.”).



My favorite papers:

  • @eisenschlos-etal-2023-winodict study how LLMs can learn new words in-context at inference time and develop a method to measure such word acquisition (by prompt-based coreference resolution with new words). This paper also won a best paper award.
  • @ishibashi-etal-2023-evaluating analyze the robustness of prompts by prompt perturbation. One interesting finding is that manual prompts are more robust than automatically learned prompts in few-shot settings. A very interesting study for getting a better understanding of what “good prompts” are.


  • We know that LLMs tend to hallucinate content. This can also happen during machine translation (I did not know that, and it sounds pretty scary!). Understanding such hallucinations is the topic of the work by @guerreiro-etal-2023-looking. They also propose a method to mitigate the issue by regularization during inference.


  • @govindarajan-etal-2023-people point out that there is no such thing as unbiased language! They look at interpersonal bias and emotion.
  • @zhong-etal-2023-extracting study a task that sounds like it should be straightforward to solve: extracting mentions of counts from social media (here: victim counts). Apparently, the task is really difficult, because models need to understand enumerations and implicit references in addition to actual mentions of numbers. This is a very interesting paper, because it shows another case where general models fail and where models developed specifically for particular tasks are important.
  • @mohammad-2023-best wrote a paper about how to use emotion lexicons and how to build them. The paper style is worth pointing out: it’s written in a question-answer style, and I think that this is very accessible.
  • Most of our language models are huge these days, and luckily, there are methods to compress them so that they can be used not only on GPU clusters. @du-etal-2023-robustness show that such compression comes with disadvantages: it reinforces biases.
  • Recently, several methods have been proposed to automatically find well-performing prompts (for instance @shin-etal-2020-autoprompt, @ding-etal-2022-openprompt). In their paper, @prasad-etal-2023-grips focus on instruction tuning without a need to calculate gradients.
  • @narayanan-venkit-etal-2023-nationality look into nationality bias (instead of a bias towards particular languages).
  • @parmar-etal-2023-dont (also an outstanding paper) study whether instructions in crowd-sourcing tasks create biases (unfortunately, the answer is yes).
  • @cortal-etal-2023-emotion build on top of previous work that we published, on the emotion component process model and appraisal theories for emotion analysis, particularly @casel-etal-2021-emotion. They build on top of another appraisal theory that focuses on the cognitive component of emotions and create a corpus in French, with a focus on emotion regulation. This is the first work that I have seen that puts emotion regulation into focus in NLP work.


I won’t go into detail regarding the paper awards. Other people have already judged these papers to be interesting, so I’ll just list them here:

Outstanding papers:

Best papers:

Venue and Place

At the end of this blog post, I’d like to comment on the location. Personally, I like to categorize conference locations into three types:

  1. Nice conference centers in some city with hotels around, without a hotel directly associated with the venue. An example was COLING 2018 in a very nice place in downtown Santa Fe. The advantage is: such conference locations are nice! The disadvantage is: participants do not tend to hang out at the venue.
  2. Conference hotels in the downtown of some big city. Examples were NAACL-HLT 2019 in Minneapolis and ACL 2014 in Baltimore. If one can afford the conference hotel, that’s nice because people also hang around outside of the conference schedule. Unfortunately, these hotels are sometimes prohibitively expensive, and then people are just elsewhere. In contrast to (1), the conference center is typically not even nice but arbitrary.
  3. Conference hotels in places where nothing else nearby might motivate one to be elsewhere. This has been the setup of RANLP 2009 and RANLP 2011 (not sure about later). Of course both places had a lot to offer outside of the conference hotel, but given that these places were pretty empty outside of the respective season, one ran into conference people everywhere.

The conference hotel of EACL 2023, the Valamar Lacroma Dubrovnik Hotel, was something in-between. It was slightly too expensive for all participants to decide to stay there, but downtown was sufficiently far away that people did not go elsewhere during breaks, even if they had their accommodation elsewhere. I must say that I found this to be a very good setup. The city of Dubrovnik was beautiful, and in the evening we ran into hundreds of EACL people. But the conference venue/hotel had enough to offer that one could also stay there and talk.



