The LangGener corpus is a multimedia linguistic corpus. Its main application is to investigate the frequency and variants of morphosyntactic calques in two generations of Polish-German bilinguals. With its annotated morphosyntactic calques, the corpus will contribute to studies of the linguistic outcomes of language contact. It can also be used in glottodidactics.
The corpus gives access to recordings of speech produced by bilinguals. These recordings can be used in phonetic research, including experimental phonetics.
The corpus is also a source for researching dialects of German, as it includes endangered varieties of German used in northern and western Poland.
The corpus can also be used for sociolinguistic studies, as it features fragments concerning the speakers’ language biographies. Researching language biographies can stimulate the development of cultural anthropology and ethnology, in particular studies on the identity of ethnic, cultural, and language minorities.
The project was conducted within the BEETHOVEN 2 program and jointly financed by the Deutsche Forschungsgemeinschaft (DFG), project no. HA 2659/9-1, and the National Science Centre (Narodowe Centrum Nauki), project no. 2016/23/G/HS2/04369. It was affiliated with the Institute of Slavic Studies of the Polish Academy of Sciences (PAN) and with the University of Regensburg. The LangGener corpus is stored on the server of the Institute of Polish Language of the Polish Academy of Sciences.