CONTEMPORARY CONTRASTIVE STUDIES OF POLISH, BULGARIAN AND RUSSIAN NEOLOGISMS VERSUS LANGUAGE CORPORA

In Slavonic linguistics, contrastive studies of neologisms occupy little space, and the newest words are insufficiently described and classified. The aim of this article is to draw attention to the need for a contrastive description of the newest lexis and to examine just one of the many ways of obtaining Polish, Bulgarian and Russian neologisms: language corpora. Corpora are not the only source from which the author obtains her research material, yet the growing interest in them has inspired her to use this method as well. The author wants to show the reader to what degree language corpora can help in building a thesaurus of Polish, Bulgarian and Russian neologisms. In attempting to confront collections of neologisms from contemporary Polish, Bulgarian and Russian, the author points out the need to standardize the description (identical for each of the analysed languages), which she intends to propose in further publications on neologisms in these languages. Applying the contrastive method to three different but related Slavonic languages will help, in her opinion, to uncover more of the mechanisms by which new words come into existence and to examine the newest derivational processes and their productivity.


Introduction
We encounter the newest words every day, irrespective of our level of education or participation in culture. They appear in all styles and concern every language user. Recently part of my research has been connected with the newest Slavonic lexis, with special consideration given to neologisms in the three languages listed in the title of this article, which represent well the West, East and South Slavonic groups. Each of them has a rich literary tradition and clearly formulated norms. Despite the general awareness that ever newer lexical units called neologisms keep appearing, I have not come across any publications that systematize the contrastive description of the newest lexemes in Polish, Bulgarian and Russian. I see here scope, for myself and for other linguists, for research which, I hope, will fill the current gap.
In this article I describe only a chosen fragment of the issues concerning neologisms. I deliberately do not focus on the origin or history of individual words in the analysed languages.
I do not indicate the directions of development of contemporary Slavonic languages (Polish, Bulgarian and Russian). I also realise that language corpora are only one of many sources from which the newest lexis can be excerpted, but a very interesting one. That is why they are worth presenting as an essential, though not the only, contemporary research tool.

2.3. The words analysed here come from the end of the 20th century at the earliest, although particular emphasis is put on the year 2000 and the following years.
In 2001 S. Grabias (2001, p. 247) indicated that neosemantisms constitute as much as 50% of all the words in the Dictionary of Students' Slang, which is evidence of young people's exceptional word-forming abilities and their need to depict the world in a vivid way. This is also shown by the popularity of dictionaries such as the Cool Dictionary of the Youngest Polish Language (Wyczesany słownik najmłodszej polszczyzny) by B. Chaciński (2001) or the Smashing Dictionary of the Youngest Polish Language (Wypasiony słownik najmłodszej polszczyzny) (2003) by the same author.
In the Bulgarian language4, according to the authors of the dictionary Речник на новите думи в българския език, published in 2010, the newest lexis constitutes about 5000 words (that is the number of entries in the above-mentioned dictionary), out of which 4300 are completely new lexical units and 700 are neosemantisms, whereas 600 neologisms are specialist terms connected with specific fields of life and 150 are idioms. A characteristic feature of the newest Bulgarian (as in the case of Polish and Russian) is a large number of words coming from youth language and an accumulation of technical terminology.
In the Russian language5, as in Polish and Bulgarian, we observe after the year 2000 a fast growth of lexis, mainly, as mentioned above, specialist terminology and the language of subcultures. We can also observe new lexemes being assimilated into the system and replacing ones which already exist in the language. This is the topic on which linguists describing, e.g., the influence of anglicisms on contemporary Russian focus. Different factors affect the formation of neologisms: expression is of great significance as a reason for creating new words, as are the development of technology and the need to name situations, events and objects which have not been named before. Small dictionaries of Russian neologisms are very popular on the Internet (cf. http://www.tutoronline.ru/blog/kratkij-slovar-neologizmov.aspx).
2.4. The origin of most neologisms is known. Thanks to linguists' research6 we know that many of them are formed in technical and medical fields (that is, the names of equipment and tools) and in economics, but most of all in youth language and the language of subcultures. One of the most important factors in the formation of the newest lexis is expression, examples of which can be found not only among the above-mentioned social groups but also in groups connected with artistic activity.

Language corpora
What is a language corpus? At present many promoters of the newest linguistic technologies express their opinions about it, among others Piotrowski (2003, p. 146), who defines a corpus as: "a collection of texts recorded in a digital form which constitute a representative sample of a given language. What makes a corpus different from any given collection of texts is the fact that it constitutes a certain well-thought-out whole. A distinctive feature of computer corpora is also their size".
Among the creators and users of corpora two quite different views are evident. Some claim that a corpus has to contain a representative sample of a language (that is, a large amount of text in its different language variants) in electronic form; others believe that even the smallest collection of texts, properly prepared, has the right to become a corpus (cf. McEnery & Wilson, 1996, p. 2; Podhajecka, 2006, p. 338-339).
A number of the newest specialist texts define the notion of a corpus much more precisely, giving the details necessary to build a collection of texts that can be called a "language corpus". These are, among other things, according to John Sinclair (1996): quantity, quality, simplicity and documentation; and according to McEnery & Wilson (1996): representative character, finite quantity, electronic format and standard reference. The author considers the representativeness of the texts collected in a corpus to be its most important feature, which is of great significance in seeking neologisms.
4.0. Neologisms and language corpora
4.1. Language corpora
Language corpora, due to the rapid development of corpus linguistics and computer technologies, have become important research material, used also in applied linguistics (cf. Hunston, 2002) as well as in lexicography (cf. Ooi, 1998).
Thanks to language corpora a given lexical unit is presented in a wider context. This enables researchers to perform a linguistic analysis, to assess the kind of text and its quality and, what is extremely important for research on neologisms, to check how many times a given lexeme is used. The number of uses makes it possible to state whether the lexeme really exists in the system of a given language and is a fully accepted lexical unit, or whether it is a language phenomenon with a very restricted lifespan, as e.g. occasionalisms7.
Occasionalisms constitute an exceptional group of new words formed in special contextual conditions. They are less significant than neologisms because their lifespan is most often strictly limited by a specific communicative situation and ends together with the phenomenon that inspired them. Among the latest interesting Polish examples we can count new lexemes which came into existence in recent months. They are all connected with the Euro 2012 European football championships, which took place in Poland and Ukraine, e.g. piłkoszał (the football craze) 'delight at everything connected with football and the championships, collective football euphoria' (from television and radio advertisements broadcast in Poland during Euro 2012); eurogedon 'the prediction of a difficult time for the Poles during Euro 2012, among other things, transport problems' (from a broadcast of Polish Radio Programme Three, just before Euro 2012); eurooszołomstwo (eurofanaticism) 'going crazy about only one topic and a lack of contact with the people who are not interested in it' (from a statement by the journalist Jarosław Kuźniar published on his blog, 19th June 2012).
Let us then use the PWN corpus to look for the word bankomat (a cash machine) and its surrounding context:
• Bank nie odpowiada za zatrzymanie karty w bankomacie z przyczyn technicznych lub w wyniku wadliwej obsługi bankomatu przez Posiadacza karty. (The bank is not responsible for the card being retained in the cash machine for technical reasons or as a result of incorrect operation of the cash machine by the card holder.)
• Jednocześnie dla kart PBK START został wprowadzony dzienny Limit Wypłat Gotówki w bankomacie, co oznacza, iż kwota dokonanych przez Państwa wypłat gotówki w bankomacie/bankomatach w ciągu jednego dnia może wynosić maksymalnie: (900 zł) (At the same time, a daily Cash Withdrawal Limit at cash machines has been introduced for PBK START cards, which means that the total amount of cash you withdraw from a cash machine / cash machines within one day may be at most: (900 zloty))
• Dzięki Karcie Emerytalnej Commercial Union, jako członek naszego funduszu, będziesz mógł w bankomacie sprawdzić wysokość wpłacanej przez pracodawcę składki. (Thanks to the Commercial Union Pension Card, as a member of our fund, you will be able to check at a cash machine the amount of the contribution paid by your employer.)
• Każda wypłata w bankomacie lub zakupy mogą dojść do skutku tylko wówczas, gdy na koncie posiadacza karty są pieniądze. (Every withdrawal from a cash machine or purchase can take place only when there is money in the card holder's account.)
Let us look at the Russian term клонирование (cloning):
• Самой затратной оказалась Москва - здесь на клонирование образа Беспалова ушло около $ 400 тысяч. (Moscow turned out to be the most expensive: cloning Bespalov's image cost about $400 thousand there.)
4.1.4. Using monolingual corpora to obtain neologisms and examples of their usage is very interesting. Nonetheless, in contrastive linguistics such a method can cause many problems and become time-consuming despite the modern way of obtaining material. After all, one needs to know both the language presented in the corpus and the word one is looking for. Therefore those who just want to expand their linguistic skills and find out whether a specific lexical unit exists in language B have a more difficult task. Language corpora can only confirm the use of a given lexeme in a wider context. They are therefore intended for those who know what they are looking for. The collections of contexts increase the possibilities of research into neologisms. They clearly indicate the scope of occurrence of neologisms, their frequency and their stylistic register.
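The kind of check described above (how often a lexeme occurs, in which forms and in what context) can be illustrated with a minimal sketch. It does not query the PWN corpus or any other real corpus; the list SAMPLE simply reuses sentences quoted in this article, and the function names concordance and form_frequencies, as well as the crude prefix matching on a stem, are assumptions made only for illustration.

```python
import re
from collections import Counter

# Hand-made sample standing in for corpus query results
# (sentences quoted in this article; not a real corpus API).
SAMPLE = [
    "Bank nie odpowiada za zatrzymanie karty w bankomacie z przyczyn "
    "technicznych lub w wyniku wadliwej obsługi bankomatu przez Posiadacza karty.",
    "Każda wypłata w bankomacie lub zakupy mogą dojść do skutku tylko wówczas, "
    "gdy na koncie posiadacza karty są pieniądze.",
    "Zajrzałem na jego bloga.",
]

TOKEN = re.compile(r"\w+")  # Unicode-aware for str patterns in Python 3


def concordance(sentences, stem, width=3):
    """Return (left context, keyword, right context) for each token starting with `stem`.

    Prefix matching is a rough assumption; a real study would use lemmatised data.
    """
    hits = []
    for sentence in sentences:
        tokens = TOKEN.findall(sentence)
        for i, token in enumerate(tokens):
            if token.lower().startswith(stem.lower()):
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                hits.append((left, token, right))
    return hits


def form_frequencies(sentences, stem):
    """Count how often each attested word form of `stem` occurs."""
    return Counter(keyword.lower() for _, keyword, _ in concordance(sentences, stem))


if __name__ == "__main__":
    for left, keyword, right in concordance(SAMPLE, "bankoma"):
        print(f"{left:>30} [{keyword}] {right}")
    print(form_frequencies(SAMPLE, "bankoma"))
```

The stem "bankoma" is used rather than "bankomat" because the locative form bankomacie does not contain the final t; this is exactly the kind of detail that requires the researcher to know the language and the word being sought.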

Parallel corpora
Parallel corpora are an important tool of contemporary comparative linguistics. They allow rich and reliable material to be obtained quickly and simply for research in translation theory and, thanks to the diversity of the collected material, also for research in other fields: cultural studies, sociology, sociolinguistics and political science.
Because of the diversity of the material they are an excellent source of neologisms. People learning a language can find parallel corpora enormously helpful, as they make it possible to compare the language they know (A) with the language or languages they are just learning (B) / (C). The number of parallel corpora has been increasing in recent years, which is a sign of their great popularity, and some of them also include Slavonic languages. Such corpora can facilitate linguists' work and quickly present parallel material in two, three or more languages.
Polish language: Nie wolno dotykać ekranu, ponieważ może to doprowadzić do uszkodzenia powłoki albo niektórych pikseli służących do generowania obrazu. (Touching the screen is not allowed as it may lead to damaging the coating or some pixels used for generating the picture.)
Bulgarian language: Винаги избягвайте да докосвате екрана, тъй като това може да доведе до повреда на екрана или в някои пиксели, които служат за изграждане на изображенията.
Commission in February 2012. The founders of CLARIN ERIC are Austria, Bulgaria, the Czech Republic, Denmark, Estonia, Germany, the Netherlands and Poland. CLARIN is a project from the so-called ESFRI roadmap (European Roadmap for Research Infrastructures, European Strategy Forum on Research Infrastructures). The main aim of the project is to combine the language resources and tools for European languages into one common standardized network, which is to become an important working tool for researchers in the broadly understood humanities.

Polish language: Nieobsługiwane pliki są wyświetlane w podglądzie tylko w postaci ikon. (The files which are not supported are displayed in preview only in the form of icons.) Nietypowe pliki są wyświetlane w postaci map bitowych. (Atypical files are displayed in the form of bitmaps.)
Bulgarian language: Неподдържаните файлове се показват само чрез предварително определена икона. Файловете с необичаен формат се показват във вид на растерно изображение
Russian language: Неподдерживаемые файлы отображаются при предварительном просмотре только в виде значка. Неправильные файлы отображаются в виде растрового изображения.
4.2.2. Apart from the advantages connected with the use of parallel corpora, I can also observe some technical problems. The main one is that a substantial part of the announced parallel corpora10 (bilingual or multilingual) are still under preparation, and the language material they include is not known to a wider audience since at this stage it is not available on the Internet. This poses a real obstacle for corpus users.
Parallel corpora can be helpful in contrastive studies on neologisms only when their resources include the newest texts from the 21st century. One of the corpora which fulfils this criterion is The Bulgarian-Polish Corpus, which is being prepared by L. Dimitrowa and V. Koseska-Toszewa and which includes, e.g., EU documents.
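How sentence-aligned material of this kind can be searched may be shown with a minimal sketch. It is not the interface of the Parallel Polish-Bulgarian-Russian Corpus or of any existing tool: the AlignedSegment class, the SEGMENTS list and the find_parallel function are hypothetical names, and the data are only the file-display sentences quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedSegment:
    """One aligned sentence in the three languages (hypothetical structure)."""
    pl: str
    bg: str
    ru: str


# The aligned examples quoted in this article, used as toy data.
SEGMENTS = [
    AlignedSegment(
        pl="Nieobsługiwane pliki są wyświetlane w podglądzie tylko w postaci ikon.",
        bg="Неподдържаните файлове се показват само чрез предварително определена икона.",
        ru="Неподдерживаемые файлы отображаются при предварительном просмотре только в виде значка.",
    ),
    AlignedSegment(
        pl="Nietypowe pliki są wyświetlane w postaci map bitowych.",
        bg="Файловете с необичаен формат се показват във вид на растерно изображение",
        ru="Неправильные файлы отображаются в виде растрового изображения.",
    ),
]


def find_parallel(segments, lang, fragment):
    """Return the aligned segments whose `lang` side contains `fragment`.

    Matching is a plain case-insensitive substring test, so the user still
    has to know which form of the word to look for.
    """
    needle = fragment.casefold()
    return [seg for seg in segments if needle in getattr(seg, lang).casefold()]


if __name__ == "__main__":
    # Look up a Polish term and read off its Bulgarian and Russian context.
    for seg in find_parallel(SEGMENTS, "pl", "map bitow"):
        print("PL:", seg.pl)
        print("BG:", seg.bg)
        print("RU:", seg.ru)
```

Searching the Polish side for "map bitow" returns the second segment, in which the Bulgarian and Russian sides show растерно изображение and растрового изображения in the same context. This is the practical advantage of parallel corpora: the units of interest appear next to each other.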

Briefly on the methodology of excerption of neologisms by means of corpora.
Not all the new words we hear can be called neologisms. Searching corpus resources should be based on certain rules for 'checking' new lexemes, to make sure that they are rooted in the system of a given natural language. The following facts can show whether a given lexical unit has permanently entered the grammatical system of the languages compared in my research:
5.1. In Polish these are the addition of inflectional endings to a new lexical unit, the inflection of new lexemes and the multiplication of words within a word-formation family. A Polish example of a borrowing which has entered the language system is blog:
• Otóż ukończywszy w pocie czoła olbrzymią kampanię dla naszych graczy, która przeprowadzi ich przez dwadzieścia poziomów podziemi i pół znanego w danym systemie świata, zakładamy blog, który będzie nas wspomagał. (So, having completed by the sweat of our brows a huge campaign for our players, which will lead them through twenty levels of the underworld and half of the world known in a given system, we're setting up a blog, which will support us.)
• Zajrzałem na jego bloga. (I took a look at his blog.)
• Są dwa sposoby: a) jednostkowy, b) wspólnotowy. Pierwszy to założenie własnego bloga lub strony internetowej i zamieszczenie tam napisanego dzieła. (There are two ways: a) individual, b) communal. The first is to set up one's own blog or website and publish the written work there.)
(As an attentive observer of your doings from the very beginning on this portal I wish to state that you are the most frequently visited blog among the columnists and journalists writing their blogs here.)
• Antoni Bielewicz nie tylko będzie blogował o tym, co ciekawego dzieje się na SAP TechEd, lecz także przeprowadzi szereg podcastowych wywiadów z wybranymi uczestnikami konferencji - dodaje Michał Kreczmar, wydawca serwisów online w International Data Group Poland S.A. Konferencja SAP TechEd odbędzie się w dniach 18-20 października. (Antoni Bielewicz will not only blog about what's interesting at SAP TechEd but will also conduct a number of podcast interviews with selected participants of the conference, adds Michał Kreczmar, a publisher of online services at International Data Group Poland S.A. The SAP TechEd conference will take place on 18-20 October.)
• Najbardziej wpływowy bloger 2012 roku. (The most influential blogger of 2012.)
Other examples of words 'assimilated' in Polish are laptop and hit:
• Włączyłem laptop. Wpatrywałem się w monitor, jakby to z niego, a nie z mojej głowy wyłaniały się pytania. (I switched on my laptop. I stared at the monitor, as if the questions were emerging from it and not from my head.)
• Włączył laptopa, poczekał, aż uruchomią się programy, i kliknął na ikonę wyszukiwarki. (He switched on his laptop, waited until the programs started and clicked on the icon of a search engine.)
• W laptopie włączył bostoński koncert zespołu Fleetwood Mac, z czasów gdy Peter Green nie popadł jeszcze w obłęd i był geniuszem gitary. (On his laptop he turned on the Boston concert of Fleetwood Mac, from the times when Peter Green hadn't gone mad yet and was a guitar genius.)
• Wiadomo, że baterie litowo-jonowe Sony są używane także w laptopach Lenovo, ale inżynierzy tej firmy są pewni, że nie dojdzie do wypadków dzięki odpowiedniej konfiguracji. (It is known that Sony lithium-ion batteries are also used in Lenovo laptops, but the engineers of this company are sure that accidents will not happen thanks to appropriate configuration.)
• Alfabetyczne hity ostatniego roku. (Last year's hits in alphabetical order.)
• Wiele teatrów przyjeżdża z teatralnymi hitami. (Many theatres come with theatrical hits.)
• Co miesiąc proponujemy Ci Płytę Miesiąca - prawdziwy hit. (Every month we offer you the Album of the Month, a real hit.)
• Nie, zaczął opowiadać, że to listy hitów. (No, he began to say that these are hit lists.)
5.3. In Russian, as in Polish, a lexeme is fully accepted by the grammatical system when it develops inflectional endings and is declined like other units of the same language. One can also observe the multiplication of words within the newly created derivational family and the case inflection of new lexemes.
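The criterion described in this section (a new lexeme counts as rooted once it is attested in several inflected or derived forms) can be expressed as a minimal sketch. The ATTESTED list reuses Polish sentences quoted in this article; the function names, the prefix matching on a stem and the threshold of two distinct forms are illustrative assumptions, not an established procedure.

```python
import re

# Polish sentences quoted in this article, used as toy corpus evidence.
ATTESTED = [
    "Zajrzałem na jego bloga.",
    "Najbardziej wpływowy bloger 2012 roku.",
    "Włączyłem laptop.",
    "Włączył laptopa, poczekał, aż uruchomią się programy, i kliknął na ikonę wyszukiwarki.",
    "Alfabetyczne hity ostatniego roku.",
    "Wiele teatrów przyjeżdża z teatralnymi hitami.",
    "Nie, zaczął opowiadać, że to listy hitów.",
]

TOKEN = re.compile(r"\w+")


def attested_forms(sentences, stem):
    """Return the set of distinct word forms beginning with `stem`.

    Prefix matching is a simplification: it groups inflected forms (bloga)
    and derivatives (bloger) together and may catch unrelated words.
    """
    stem = stem.lower()
    return {
        token.lower()
        for sentence in sentences
        for token in TOKEN.findall(sentence)
        if token.lower().startswith(stem)
    }


def looks_assimilated(sentences, stem, min_forms=2):
    """Heuristic: at least `min_forms` distinct forms are attested (assumed threshold)."""
    return len(attested_forms(sentences, stem)) >= min_forms


if __name__ == "__main__":
    for stem in ("blog", "laptop", "hit", "eurogedon"):
        print(stem, sorted(attested_forms(ATTESTED, stem)), looks_assimilated(ATTESTED, stem))
```

On this sample, hit is attested as hity, hitami and hitów and laptop as laptop and laptopa, whereas an occasionalism such as eurogedon has no attested forms at all. With a real corpus the same idea would need lemmatisation and frequency thresholds chosen by the researcher.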

Summary
6.1. In some countries (France is perhaps the best example), neologisms are created by language councils which, out of concern for the clarity of their language, look for new terms in order to avoid borrowings from foreign languages. In the three Slavonic languages which I analyse in this article (Polish, Bulgarian and Russian) the situation is different: new lexical units enter the language system when a new word is invented and frequently used, when words are borrowed from foreign languages, or when an existing word is used with a new meaning. One can observe many ways of enriching and diversifying vocabulary in the languages compared in the article. The most popular are derivation, borrowing and neosemantization. The processes of creating new words and of neosemantization, observed among other things thanks to the corpora, depend today on the strong influence of English. Native words are driven out by internationalisms (numerous examples of which can also be found in this text), which facilitate communication on Internet forums and the use of computer software, modern mobile phones, etc. It is one of many noticeable globalisation processes of contemporary times. These processes are unavoidable and result from a need for faster interpersonal and social communication.
6.2. Thanks to corpora, users receive reliable information about the frequency and range of use of a given language element. In order to obtain reliable contrastive material, all kinds of language corpora are useful. Certainly, parallel corpora are more practical, 'handy', because in one place, next to each other as it were, we have the units which are of interest to us. What is more, they appear in the same or similar contexts. However, there is no scientific evidence that parallel corpora are better than those which are 'based' on one language. Thanks to the latter, researchers will not overlook those aspects of units (neologisms) which do not have a parallel character (e.g. a different degree of semantic assimilation of polysemous units).
Diverse limitations are a big problem: the necessity to pay a subscription for online access to the whole corpus; severely restricted access to a corpus because of the work it is undergoing (as is the case with the parallel corpora with Slavonic languages described in this article); or a foreign interface language, as in the case of the corpus described above (cf. Intercorp). Very often, for lack of other materials, Internet resources available through search engines have to be used in the role of a corpus.

6.3. The corpora which will support my research on neologisms have been presented here as collections of contexts. The linguistic context confirms that the meanings of the lexemes being contrasted really correspond. This is of particular importance where a new word is not an anglosemantism and is not overused, yet is present in the system of the examined language, so that its meaning and frequency of use have to be verified.
6.4. In Polish, Bulgarian and Russian, as the language corpora used for comparison confirm, there is frequent overlap in the areas in which new meanings are created (specialised vocabulary: technical, computer, youth vocabulary, etc.), in the ways in which they are created (semantic borrowings, metonymy, metaphor, opposites, abbreviating, generalising) and in the reasons for creating them (language fashion, a need for naming new things and phenomena, a need for adding variety to the language of communication, e.g. expressiveness, and a desire to exist in virtual or media reality). All three languages described in the article assimilate many new words from the fields of economics, law, medicine and technical sciences, as well as popular colloquial expressions. The number of words created every year and the number of lexemes borrowed from English are similar, although not identical, in each of the languages. In spite of the high level of activity in creating and borrowing new words in all the languages described in the article, the users of Bulgarian seem to be a little less resistant to English-language influences (as the confrontation of a larger number of neologisms shows), whereas the users of Russian display less interest in linguistic borrowings. As the preliminary contrastive research shows, Polish is situated between the other two Slavonic languages, although it has many features similar to Bulgarian. An attentive observer can notice in Bulgarian computer terminology an exceptional popularization of words such as даунлоад, даунлоадвам, атачмънт, блутуут, etc., whereas in Polish it is words without a Polish equivalent that are more frequently assimilated, most often when we describe new devices, such as laptop, tablet and palmtop. In other cases words are used which already existed in the system of the Polish language, cf. pobierz (download), pobierać, załącznik (attachment), but also bluetooth (with its original spelling preserved, which may indicate a lack of acceptance in the system of the Polish language), etc. In Russian, equivalents such as скачай, скачивать, вложение (присоединенный файл) and блютус show tendencies similar to those in Polish.
English-language sports terminology is accepted and used in Bulgarian more than in Polish and Russian, e.g. Bulg. байк, байкът, байкове; даунхил. In Polish, sports terminology remains native where possible, cf. Pol. rower (górski) (mountain bike), góral (colloquially 'mountain bike', literally 'highlander'); kolarstwo górskie (mountain biking); Russ. горный велосипед; горный велоспорт; маунтинбайкинг. The increased assimilation of anglicisms in Bulgarian and Russian is noticeable in technical terminology. Polish is characterized by 'higher frequency of native conventional lexis and abstinence from borrowings' (cf. J. Mazurkiewicz-Sułkowska, 2013, p. 6). This regularity is confirmed by the neologisms I am studying in the language corpora.
6.5. This summary of an article on neologisms in three different, although related, Slavonic languages and on their excerption by means of the newest linguistic tools should be regarded as preliminary contrastive deliberations, which will become the starting point of a larger dissertation on neologisms. The newest lexis will be obtained not only from corpora but first of all from printed and electronic dictionaries, the Internet, subject literature and field research. I believe that scholars' intensive work on language corpora will enrich my studies with a lot of interesting and reliable language material which, through confrontation, will most probably allow me to draw interesting conclusions.
4.1.1. The most popular corpora can be helpful in searching for the newest Polish lexis: The Corpus of Polish IPI PAN (300 m. segments), The Corpus of Polish PWN (full net version 40 m. segments, demonstration version 7 250 000 segments), which is a part of The National Corpus of Polish (over one and a half billion words), and The Reference Corpus of Polish 'PELCRA' (100 m. words).
4.2.1. Below there are examples of neologisms excerpted from the 'Parallel Polish-Bulgarian-Russian Corpus', which is currently being prepared by V. Koseska-Toszewa, J. Satoła-Staśkowiak, W. Sosnowski, A. Kisiel and K. Staśkowiak as part of the European Project Clarin 9.
.1.2. When studying Bulgarian neologisms it is worth using The National Corpus of Bulgarian 8. Below are some examples with the Bulgarian word одит (audit).
For research on the newest Russian lexis, The National Corpus of Russian, which is open for general use, is one of the corpora that can facilitate the work.