Bulgarian sense-annotated corpus – between the tradition and novelty
DOI: https://doi.org/10.11649/cs.2012.012
Keywords: corpus studies, corpus annotation, annotation principles
Abstract
The Bulgarian Sense-annotated Corpus (BulSemCor) is compiled according to the general methodology established by the SemCor project. It is a subset of the Brown Corpus of Bulgarian, semantically annotated with the corresponding synonym sets (synsets) in the Bulgarian wordnet. Unlike most sense-annotated corpora, where only (sets of) content words are annotated, in BulSemCor each lexical unit has been assigned a sense. The main contributions of the work on BulSemCor are briefly described in this paper: definition of an annotation schema, compilation of an input corpus, development of a sense-annotated corpus, and enlargement of the Bulgarian wordnet.
License
Copyright (c) 2015 Svetla Koeva
This work is licensed under a Creative Commons Attribution 3.0 Unported License.