The Corpus of Beserman Udmurt, Kielipankki Version


Currently available versions of this resource

ShortnameName and metadataLicenseLocationCiteResource group and helpApplyPublication yearSupport level
ShortnameName and metadataLicenseLocationCiteResource group and helpApplyPublication yearSupport level

Resource information

The Corpus of Beserman Udmurt comprises 65 000 tokens. The Beserman dialect of Udmurt is used in daily communication approximately by 2 000 speakers (according to the 2010 census). The Beserman live in the basin of the Cheptsa river in the Republic of Udmurtia and in the Kirov Oblast of the Russian Federation. In the scientific literature Beserman is considered to be a dialect of the Udmurt language which is characterized by an unusual combination of specifically Beserman phenomena (concentrated in vocabulary and phonetics) with certain traits of Northern and Southern Udmurt dialects, mostly morphological and phonological. The dialect remains the main means of everyday communication in Beserman villages, at least for the older generation.

The texts contained in the corpus have been collected in the villages of Shamardan (109 texts of 117), Vortsa (4 of 117), Malaya Yunda (1 of 117) and Zhuvam (3 of 117) in the Republic of Udmurtia in the years 2003-2015. There are 33 informants in total. The texts have been recorded, transcribed and grammatically annotated in the SIL FieldWorks software. The corpus contains narratives, life stories, dialogues, recipes, and recordings of psycholinguistic experiments. Each sentence is provided with interlinear glossing (according to the Leipzig Glossing Rules) and translation. Both the full text version with audio files and the corpus version are available at http://beserman.ru/corpus/search/?interface_language=en

License and access

  • Some versions of this resource are available publicly (PUB), whereas others might require you to log in as an academic user (ACA) or to apply for individual access rights (RES).
  • Click on the license image to see the resource-specific license text.
  • Some/all versions of this resource may contain personal data (license condition +PRIV). The license may then include additional data protection terms and conditions that you must follow. If processing personal data, maintain a public Privacy Notice regarding your project and provide the link to the Language Bank of Finland, see instructions.
  • Of this language corpus different versions/subcorpora are (or might be in the future) published in the Language Bank of Finland. The versions are available through the Language Bank Download Service and/or through the Korp concordance tool. The links to the different versions can be found from the list above.

Detailed information on the content of each version, user rights and licenses can be found from it’s specific metadata record.


This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021052406

Last modified on 2025-02-21