Seminar on Speech and Language Technology Tools
A HunCLARIN event promoting the use of language technology software tools and corpora in humanities and social sciences research
Szeged, 19 October 2018
Organizing team
Tamás Erdei
Kinga Jelencsik-Mátyus
Eszter Simon
Program
Posters
Posters were also presented to introduce CLARIN, HunCLARIN and some important projects using or developing the HunCLARIN infrastructure in Szeged.
Kinga Jelencsik-Mátyus | HunCLARIN - an Introduction |
Kinga Jelencsik-Mátyus | About CLARIN |
Veronika Vincze | Language Resources at the University of Szeged |
Ildikó Hoffmann | Detecting Mild Cognitive Impairment by Exploiting Linguistic Information from Transcripts |
Csilla Horváth | Open corpora of Uralic Languages |
Goal of the event
In Hungary there are several centers at universities and research institutions where digital linguistic corpora and software tools are developed (see the members of HunCLARIN). However, outside these centers, digital linguistic services are still not very well known.
The main goal of the present event is to introduce the digital linguistic services to researchers, teachers and students already working on linguistic projects. The conference is planned to be followed by similar events in other cities as well as more practical workshops in Szeged.
Our aim is twofold. On the one hand, we want to show that there are several well-developed, state-of-the-art corpora and software tools in historical linguistics, child language research (both very popular and well-researched in Szeged) and other fields of the humanities and social sciences (HSS), and to introduce the very basics of using these services. On the other hand, we want to show that most of these services can be found at HunCLARIN, where professional help on how to use them is also available.
In the first section, linguistic corpora, their types and their possible uses are introduced, followed by practical examples of running simple and more complex searches in corpora. In the second part, three presentations are dedicated to software tools such as e-magyar (a digital language processing toolchain), MAXQDA (content analysis software) and some tools specially developed for HSS research.
Lecturers
Veronika Vincze
Researcher at the HAS-SZTE Research Group on Artificial Intelligence and at the University of Szeged, Human Language Technology Group
Veronika Vincze supervises and coordinates the linguistic aspects of the projects in the groups she works in. She also takes part in the well-known research on Alzheimer's disease (AD), which aims to detect the earliest signs of AD from speech. Her research interests include corpus and ontology building, word sense disambiguation and the NLP treatment of multiword expressions. She is also interested in computational morphology, parsing and information extraction.
Bálint Sass
PhD research fellow at the Research Group for Language Technology, Research Institute for Linguistics, Hungarian Academy of Sciences
Bálint Sass's fields of research are corpus query interfaces and corpus creation. He is one of the creators of the Hungarian National Corpus, the Old Hungarian Corpus and the corpus of the Budapest Sociolinguistic Interview, as well as the creator of query interfaces for all these corpora. He is also interested in predicate-argument structure (he built the Verb Argument Browser), in computer-aided dictionary creation and in Hungarian contracted Braille script. He has held several NLP courses for university students.
Eszter Simon
Senior research fellow at the Research Group for Language Technology, Research Institute for Linguistics, Hungarian Academy of Sciences
Her fields of research include named entity recognition, morphological analysis, corpus building, annotation, development of historical corpora and computational linguistics for Uralic languages. She has supervised and coordinated the computational linguistic work of several large-scale projects, such as the Hungarian Generative Diachronic Syntax and the Syntax of Uralic Languages. She has held numerous university courses in NLP.
Iván Mittelholcz
Software engineer at the Research Group for Language Technology, Research Institute for Linguistics, Hungarian Academy of Sciences
Iván Mittelholcz started working in human language technology in 2006. He has worked on software development tasks in several research projects. His research areas include tokenization, spell checking, ontology building and supervised machine learning. Recently he has held several university courses on logic, human language technology and programming.
Valéria Juhász
PhD, Head of the Department of Hungarian and Applied Linguistics
Her research interests include language and speech training, sociolinguistics, content analysis in media and computer-assisted communication. She is a member of the Hungarian Reading Association (HUNRA), focusing on research in the field of reading skills development and the promotion of the culture of reading in Hungary. Lately she has been working on and promoting MAXQDA, a content analysis software package.
Anna Babarczy
Senior research fellow at the Research Group for Psycholinguistics, Research Institute for Linguistics, Hungarian Academy of Sciences and lecturer at the Department of Cognitive Science at BUTE
Anna Babarczy is a renowned researcher and lecturer in child language and pragmatics, with a special interest in corpus linguistic methods in these fields. Her research interests include experimental pragmatics, the psycholinguistics of abstraction and the automatic identification of literal versus non-literal meaning. She has been the leader of several large-scale projects.
Róbert Péter
Associate Professor in the Department of English at the University of Szeged
He is the general editor of the five-volume primary resource collection entitled British Freemasonry, 1717-1813 (New York: Routledge, 2016), which contains hitherto unexplored and rare masonic texts. He also has a strong interest in digital humanities, in particular the development and use of quantitative methods for exploring long-term trends and patterns in historical and cultural processes by analysing the bibliographic data and metadata of a vast number of texts.
If you plan to organise a similar event
We had a really good experience with using a simple Google Form for registration. It not only helped us keep track of the audience, of course, but also made it easier to keep in touch with them and inform them about the availability of the training materials (slides, posters added to the event page, proceedings, videos, next events).
However, our choice of date was not so fortunate: it was the last day before a long weekend, so only some very dedicated students turned up.
Advertising the event at all the major universities and institutes dealing with linguistics in Hungary was a very good choice, as we had quite a few participants from other cities. What is more, seeing the success of this event, the University of Debrecen has already started organising a very similar seminar.