
On June 22–23, 2026, the 16th edition of the workshop titled “CLARIN in Research Practice” took place at Jan Kochanowski University in Kielce. The event was organized by the CLARIN-PL Language Technology Center (based at Wrocław University of Science and Technology) in cooperation with the Institute of Literary Studies and Linguistics at Jan Kochanowski University, as well as CLARIN-PL consortium partners: the Institute of Computer Science PAS, the Institute of Slavic Studies PAS, the Polish-Japanese Academy of Information Technology, the University of Łódź, and the University of Wrocław. The event drew significant interest from the scientific community, with approximately 150 participants from across the country attending the panels and practical sessions.
Researchers from the Institute of Slavic Studies PAS (IS PAN) took an active part in the workshop program, presenting the results of research conducted in collaboration with partners from other academic centres. The presentations were delivered by researchers affiliated with our Institute and the University of Warsaw, and all of them align with broader activities within the CLARIN-PL infrastructure.
Presentations by the IS PAN Semantics and Computational Linguistics Team
One of the main highlights of the program was a series of presentations concerning work with low-resource languages (languages with limited digital representation).
A team consisting of Danuta Roszko (UW), Roman Roszko (IS PAN), and Piotr Szatkowski (IS PAN) presented a methodology for creating datasets for the Masurian reading material. This work is closely linked to the ongoing CLARIN-PL-BIZ-Bis (FENG) project, which is a consortium undertaking. The presentation addressed the technical aspects of building data processing pipelines necessary for preparing Masurian texts for the training, evaluation, and testing of Large Language Models (LLMs).
Another presentation, delivered by a team comprising Danuta Roszko (UW), Roman Roszko (IS PAN), and Valéry TrânThiên (IS PAN), focused on machine translation testing within language groups with limited digital resources (including Slavic and Baltic languages). The speakers placed the issues of contemporary AI models within a broader historical and theoretical context, referencing classic works of translation theory. They noted that even in the age of artificial intelligence, the principles formulated by Étienne Dolet in 1540 in La Manière de Bien Traduire d’une Langue en Aultre remain relevant. His five rules – ranging from a perfect understanding of the author’s intention to avoiding literal “word-for-word” translation – serve as a crucial reference point for assessing the quality of today’s machine-generated translations.
As a striking example of how a translation error can impact the development of science, the team cited the issue raised by Daniel Hoek (2023) in the article Forced Changes Only: A New Take on the Law of Inertia. Linguistic analysis revealed that an imprecise English translation of the Latin lexeme quatenus, performed by Andrew Motte in 1729 in his translation of Newton’s Principia, led to a fundamental misunderstanding of Newton’s First Law of Motion. This seemingly minor linguistic error distorted the interpretation of a scientific principle for nearly three centuries, serving as an extremely important warning for contemporary researchers testing the precision and semantic vigilance of machine translation models.
Expert Consultations
A key element of the IS PAN researchers’ presence at the workshop was the opportunity to provide direct support to participants. Prof. Roman Roszko (IS PAN) conducted individual and group expert consultations, helping researchers integrate modern technologies into their daily scientific work. The consultation topics included:
- the use of artificial intelligence in text translation processes,
- the analysis of multilingual manuscripts using AI tools,
- advanced tagging and annotation of linguistic resources.
The participation of our researchers among such a large group of attendees confirms the growing role of the Institute of Slavic Studies PAS in projects that bridge traditional humanities with modern natural language processing (NLP) technologies.
CLARIN-PL Consortium Members: Wrocław University of Science and Technology (leader), Institute of Computer Science PAS, Institute of Slavic Studies PAS, Polish-Japanese Academy of Information Technology, University of Łódź, University of Wrocław.





























