CONTACT

Instytut Slawistyki PAN
ul. Bartoszewicza 1b/17
00-337 Warszawa
tel./fax: 22 826 76 88
22 828 44 75
sekretariat@ispan.waw.pl

 

Clarin-PL is part of CLARIN (Common Language Resources and Technology Infrastructure), a pan-European research infrastructure.
The goals of Clarin-PL are to develop a research infrastructure that yields tangible results in the humanities and social sciences, to extend and maintain the Polish node of the CLARIN ERIC infrastructure, to extend the fundamental language technologies for Polish, and to advance research in language engineering and computational linguistics.

 

Detailed information about the project, its tasks and its tools is available on the Clarin-PL website.

 

***

 

The Institute of Slavic Studies of the Polish Academy of Sciences, together with the Wrocław University of Technology (the coordinating institution), the Institute of Computer Science of the Polish Academy of Sciences (Instytut Podstaw Informatyki PAN), the Polish-Japanese Academy of Information Technology, the University of Łódź and the University of Wrocław, forms the Polish part of the Clarin structure.

To date, staff of the Institute of Slavic Studies of the Polish Academy of Sciences have created multilingual corpora of contemporary texts for the following languages: Polish, Bulgarian, Lithuanian and Russian.

 

These resources, with a total size of 18 MB, are available in the Clarin-PL digital repository:

 

Polish-Bulgarian-Russian Parallel Corpus
Polish-Lithuanian Parallel Corpus

 

The Polish-Bulgarian-Russian corpus of contemporary texts was prepared by Anna Kisiel, Violetta Koseska-Toszewa, Natalia Kotsyba, Joanna Satoła-Staśkowiak and Wojciech Sosnowski. The Polish-Lithuanian corpus of contemporary texts was prepared by Danuta Roszko and Roman Roszko.

The team of the Institute of Slavic Studies of the Polish Academy of Sciences, which includes Maksim Duškin, Joanna Satoła-Staśkowiak, Danuta Roszko, Roman Roszko, Wojciech Sosnowski and Roman Tymoshuk, is currently developing an extended version of a multilingual annotated corpus in which Polish is the dominant language. The team's main aims for 2016-2018 are to integrate the multilingual resources built within Clarin-PL with international resources, to develop these resources further, and to adapt their functionality for manual and machine translation, comparative linguistic research and multilingual information retrieval, all within the framework of an open, multifaceted Multilingual Platform (Platforma Wielojęzyczna).