The Hungarian Research Centre for Linguistics launches language understanding benchmark kit named HuLu

19 Apr 2022

The Language Technology Research Group of the Hungarian Research Centre for Linguistics has created a corpus collection called the Hungarian Language Understanding Evaluation Benchmark Kit (HuLu) to measure, evaluate and compare the language comprehension of neural language models. In recent years, artificial intelligence has gained a prominent role in language technology. A growing number of language models deliver increasingly strong performance on a variety of tasks, ranging from machine translation to summarization.


The number of language models available for Hungarian (e.g., HuBERT, HILBERT or the experimental language models developed in the framework of the HILANCO project) is growing rapidly. Benchmark databases have been created to analyse and compare such language models. These databases are often corpus collections that measure the performance of the models on a variety of tasks. The first such collections, the English GLUE and SuperGLUE benchmarks, were soon followed by French, Spanish and Russian counterparts, and by XGLUE, which focuses on evaluating multilingual language models. The Hungarian benchmark corpus draws on the experience gained from these collections. The long-term goal is to create a well-functioning collection of corpora that can be used to evaluate several aspects of language understanding, such as robustness.


HuLu contains a steadily growing number of subcorpora. Some of these are translated and adapted versions of English benchmarks, while others were created specifically for Hungarian. At present, the corpora that make up HuLu are:

  • HuCOLA: 9,076 sentences with their grammaticality judgements (grammatical / ungrammatical)

  • HuCoPA: 1,000 premises, each with a pair of alternatives from which to choose the cause or consequence of that premise

  • HuSST: 11,680 sentences with sentiment labels (positive, negative, neutral)

  • HuRC: a reading comprehension corpus of 88,000 articles, each with a masked NP at the end of the article to be identified

  • HuWS: Hungarian translation of the Winograd schema collection


The corpora are made available by HRIL on the following sites:


HuLu was first presented at the Hungarian Conference for Computational Linguistics in 2022. The article (pp. 431-446) is available here: