These parallel corpora consist of Biblical verses (historical and contemporary, 1821–2019) in Erzya, Moksha, Olonets-Karelian (Livvi), Dvina-Karelian (North Karelian), Khanty, Komi-Permyak, Komi-Zyrian, Mansi, Udmurt and Veps. The majority of the newer translations come from the Institute for Bible Translation in Helsinki, Finland, as originally organized for the University of Helsinki Language Corpus Server (UHLCS). Finnish and Russian translations are also included.

The purpose of these parallel corpora is to further the study of translation in Uralic minority languages. At the same time, they provide an opportunity to follow changes in the lexical and syntactic strategies used in different versions of Biblical verses in one language, or to compare lexicon and structure between languages. Lemmatization and morphological analyses are provided for all languages except Dvina-Karelian, Khanty, Veps and Russian; the accuracy of the analyses for the remaining languages still needs to be improved as disambiguation resources are developed. The Finnish texts have been analyzed with TNPP (Turku Neural Parser Pipeline), which provides lemmatization, morphological analysis and syntactic annotation. The texts in Erzya and Moksha also have lemmatization, morphological analysis and syntactic annotation.

The 27 books of the New Testament are included for the following languages:

  • fin (1938)
  • koi (2019)
  • kpv (2008)
  • krl (2011)
  • mdf (2016)
  • myv (2006)
  • olo (2003)
  • rus (1876)
  • udm (1997)
  • vep (2013)

Additionally, the following books are included:

  • kca (2013): MRK
  • kca (2017): GEN
  • kca (2018): JON
  • koi (1996): MRK
  • kpv (1995): MRK
  • kpv (1997): JHN
  • mdf (1995): MRK
  • mns (2000): MRK
  • myv (1821): MAT
  • myv (1910): MAT, MRK, LUK, JHN
  • myv (1995): MRK
  • myv (1996): LUK, ACT
  • myv (1998): MAT
  • myv (2011): PSA
  • myv (2021): RUT, ECC, SNG, JON
  • vep (2013): JON
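
As an illustration of how the inventory above can be used to find parallel or successive versions of one book, the two lists can be written out as a small lookup table. The following is a minimal sketch, not part of the resource itself: the names NT_EDITIONS, EXTRA_BOOKS, NT_BOOKS and editions_with_book are hypothetical, and the New Testament book codes assume the same three-letter convention as the codes shown in the list (MRK, JHN, ACT and so on).

```python
from collections import defaultdict

# Full New Testament editions, copied from the list above: language -> year.
NT_EDITIONS = {
    "fin": 1938, "koi": 2019, "kpv": 2008, "krl": 2011, "mdf": 2016,
    "myv": 2006, "olo": 2003, "rus": 1876, "udm": 1997, "vep": 2013,
}

# Additional books, copied from the list above: (language, year) -> book codes.
EXTRA_BOOKS = {
    ("kca", 2013): ["MRK"],
    ("kca", 2017): ["GEN"],
    ("kca", 2018): ["JON"],
    ("koi", 1996): ["MRK"],
    ("kpv", 1995): ["MRK"],
    ("kpv", 1997): ["JHN"],
    ("mdf", 1995): ["MRK"],
    ("mns", 2000): ["MRK"],
    ("myv", 1821): ["MAT"],
    ("myv", 1910): ["MAT", "MRK", "LUK", "JHN"],
    ("myv", 1995): ["MRK"],
    ("myv", 1996): ["LUK", "ACT"],
    ("myv", 1998): ["MAT"],
    ("myv", 2011): ["PSA"],
    ("myv", 2021): ["RUT", "ECC", "SNG", "JON"],
    ("vep", 2013): ["JON"],
}

# Assumption: the 27 New Testament books use the usual three-letter codes.
NT_BOOKS = {
    "MAT", "MRK", "LUK", "JHN", "ACT", "ROM", "1CO", "2CO", "GAL",
    "EPH", "PHP", "COL", "1TH", "2TH", "1TI", "2TI", "TIT", "PHM",
    "HEB", "JAS", "1PE", "2PE", "1JN", "2JN", "3JN", "JUD", "REV",
}


def editions_with_book(book):
    """Return {language: sorted list of years} for editions containing `book`."""
    found = defaultdict(set)
    # Single-book editions that list this book explicitly.
    for (lang, year), books in EXTRA_BOOKS.items():
        if book in books:
            found[lang].add(year)
    # Every full New Testament edition contains every New Testament book.
    if book in NT_BOOKS:
        for lang, year in NT_EDITIONS.items():
            found[lang].add(year)
    return {lang: sorted(years) for lang, years in sorted(found.items())}


if __name__ == "__main__":
    # Which years of Mark are available for Erzya (myv)?
    print(editions_with_book("MRK")["myv"])
```

A lookup of this kind shows, for example, which languages offer more than one version of the same book for diachronic comparison, and which books are shared across languages for cross-linguistic comparison.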

This resource group page has a Persistent Identifier:

