Parallel Bible Verses for Uralic Studies

Latest versions/subcorpora:  

Parallel Bible Verses for Uralic Studies, Korp
Metadata and license
Attribution instructions

Select the corpus in Korp
Search for all versions in META-SHARE  

These parallel corpora consist of Biblical verses (historical and contemporary, 1821–2019) in Erzya, Moksha, Olonets-Karelian (Livvi), Dvina-Karelian (North Karelian), Khanty, Komi-Permyak, Komi-Zyrian, Mansi, Udmurt and Veps. Most of the newer translations come from the Institute for Bible Translation in Helsinki, Finland, as originally organized for the University of Helsinki Language Corpus Server (UHLCS). Finnish and Russian translations are also included.

The purpose of these parallel corpora is to further the study of translation in Uralic minority languages. At the same time, they provide an opportunity to follow changes in the lexical and syntactic strategies used in different versions of Biblical verses in one language, or to compare lexicon and structure between languages. Lemmatization and morphological analyses are provided for all languages except Dvina-Karelian, Khanty, Veps and Russian; the analyses for the remaining languages are not yet fully accurate and should be improved as disambiguation resources are developed. The Finnish texts have been analyzed with TNPP (Turku Neural Parser Pipeline), which provides lemmatization, morphological analysis and syntactic annotation. The texts in Erzya and Moksha also have lemmatization, morphological analysis and syntactic annotation.

The 27 books of the New Testament are included for the following languages:

  • fin (1938)
  • koi (2019)
  • kpv (2008)
  • krl (2011)
  • mdf (2016)
  • myv (2006)
  • olo (2003)
  • rus (1876)
  • udm (1997)
  • vep (2013)

Additionally, the following books are included:

  • kca (2013): MRK
  • kca (2017): GEN
  • kca (2018): JON
  • koi (1996): MRK
  • kpv (1995): MRK
  • kpv (1997): JHN
  • mdf (1995): MRK
  • mns (2000): MRK
  • myv (1821): MAT
  • myv (1910): MAT, MRK, LUK, JHN
  • myv (1995): MRK
  • myv (1996): LUK, ACT
  • myv (1998): MAT
  • myv (2011): PSA
  • myv (2021): RUT, ECC, SNG, JON
  • vep (2013): JON

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2023030902

The Helsinki Korp JRC-Acquis Bilingual Parallel Corpora

The Helsinki Korp JRC-Acquis Bilingual Parallel Corpora are:

The Helsinki Korp JRC-Acquis Finnish-English Corpus
The Helsinki Korp JRC-Acquis Finnish-Swedish Corpus
The Helsinki Korp JRC-Acquis Finnish-German Corpus
The Helsinki Korp JRC-Acquis Finnish-French Corpus
The Helsinki Korp JRC-Acquis Finnish-Spanish Corpus
The Helsinki Korp JRC-Acquis Finnish-Italian Corpus
The Helsinki Korp JRC-Acquis Finnish-Hungarian Corpus
The Helsinki Korp JRC-Acquis Finnish-Polish Corpus

The corpora contain texts of the JRC-Acquis Multilingual Parallel Corpus. The Acquis Communautaire (AC) is the total body of European Union (EU) law applicable in the EU Member States.

Please note that the Helsinki Korp JRC-Acquis Finnish-Estonian Corpus is currently unavailable for technical reasons and has therefore been removed from the list above.

For more information on the JRC-Acquis Multilingual Parallel Corpus see http://urn.fi/urn:nbn:fi:lb-20140730162 or https://ec.europa.eu/jrc/en/language-technologies/jrc-acquis

Latest versions/subcorpora:  
The Helsinki Korp JRC-Acquis Bilingual Parallel Corpora
Metadata and license
Attribution instructions
Select the corpus in Korp
The Finnish Sub-corpus of the JRC-Acquis Multilingual Parallel Corpus
Metadata and license
Attribution instructions
Select the corpus in Korp
The Finnish Sub-corpus of the JRC-Acquis Multilingual Parallel Corpus, Downloadable Version
Metadata and license
Attribution instructions
Download the resource
Search for all versions in META-SHARE  

Different versions/subcorpora of this language corpus are (or may in the future be) published in the Language Bank of Finland. The versions are available through the Language Bank Download Service and/or through the Korp concordance tool. The links to the different versions can be found in the list above.

Detailed information on the content of each version, user rights and licenses can be found in its specific metadata record in META-SHARE.

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021052404

The Helsinki Korp Europarl Bilingual Corpora

The Helsinki Korp Europarl Bilingual Corpora are:

The Helsinki Korp Europarl Finnish-English Corpus
The Helsinki Korp Europarl Finnish-Swedish Corpus
The Helsinki Korp Europarl Finnish-German Corpus
The Helsinki Korp Europarl Finnish-French Corpus
The Helsinki Korp Europarl Finnish-Spanish Corpus
The Helsinki Korp Europarl Finnish-Estonian Corpus

The corpora contain texts of the Europarl Parallel Corpus v7.

The Europarl parallel corpus is extracted from the proceedings of the European Parliament. The goal of the extraction and processing was to generate sentence aligned text for statistical machine translation systems. For this purpose matching items were extracted and labeled with corresponding document IDs. By using a preprocessor, sentence boundaries were identified. The data was sentence aligned by using a tool based on the Church and Gale algorithm.
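As an illustration of the length-based alignment mentioned above, the following is a minimal sketch of the Gale and Church approach. It is not the tool used to build Europarl. The function names (match_cost, align) are hypothetical, the constants are the ones published in Gale and Church (1993), and the deviation is normalized by the mean span length, a common simplification that avoids division by zero for insertions and deletions.

```python
import math

# Prior probability of each alignment type (source count, target count),
# as published by Gale & Church (1993).
PRIORS = {(1, 1): 0.89, (1, 0): 0.0099, (0, 1): 0.0099,
          (2, 1): 0.089, (1, 2): 0.089, (2, 2): 0.011}
C = 1.0    # expected number of target characters per source character
S2 = 6.8   # variance of that ratio


def match_cost(len_src, len_tgt, prior):
    """Negative log-probability of aligning two spans of the given lengths."""
    if len_src == 0 and len_tgt == 0:
        return 0.0
    mean = (len_src + len_tgt / C) / 2.0
    delta = (len_tgt - len_src * C) / math.sqrt(mean * S2)
    # Two-tailed probability of a deviation at least this large under N(0, 1).
    p_delta = max(math.erfc(abs(delta) / math.sqrt(2.0)), 1e-12)
    return -math.log(prior) - math.log(p_delta)


def align(src, tgt):
    """Return a list of (source_indices, target_indices) aligned groups."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    # Dynamic programming over prefixes of both sentence lists.
    for i in range(n + 1):
        for j in range(m + 1):
            for (di, dj), prior in PRIORS.items():
                if i < di or j < dj or cost[i - di][j - dj] == inf:
                    continue
                len_src = sum(len(s) for s in src[i - di:i])
                len_tgt = sum(len(t) for t in tgt[j - dj:j])
                candidate = cost[i - di][j - dj] + match_cost(len_src, len_tgt, prior)
                if candidate < cost[i][j]:
                    cost[i][j] = candidate
                    back[i][j] = (di, dj)
    # Follow the back-pointers from the end to recover the best path.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        di, dj = back[i][j]
        pairs.append((list(range(i - di, i)), list(range(j - dj, j))))
        i, j = i - di, j - dj
    pairs.reverse()
    return pairs


if __name__ == "__main__":
    # Made-up sentences, used only to show the call.
    finnish = ["Istunto on avattu.", "Kiitos."]
    english = ["The sitting is open.", "Thank you."]
    for src_ids, tgt_ids in align(finnish, english):
        print(src_ids, tgt_ids)
```

The aligner looks only at character lengths, so it needs no dictionary. The trade-off is that it can drift on passages where sentence lengths are uninformative.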

For more information on the Europarl Parallel Corpus see http://urn.fi/urn:nbn:fi:lb-20140730195 and http://www.statmt.org/europarl/

Latest versions/subcorpora:
The Helsinki Korp Europarl Bilingual Corpora
Metadata and license
Attribution instructions
Select the corpus in Korp
Search for all versions in META-SHARE

Different versions/subcorpora of this language corpus are (or may in the future be) published in the Language Bank of Finland. The versions are available through the Language Bank Download Service and/or through the Korp concordance tool. The links to the different versions can be found in the list above.

Detailed information on the content of each version, user rights and licenses can be found in its specific metadata record in META-SHARE.

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021052403

Opus, Helsinki Korp Version

The Helsinki Korp version of the Opus open parallel corpus (http://opus.lingfil.uu.se/), containing scrambled sentences, has been published in Kielipankki.

Latest versions/subcorpora:
Opus, Helsinki Korp Version
Metadata and license
Attribution instructions
Select the corpus in Korp
Search for all versions in META-SHARE

The subcorpora of Opus, Helsinki Korp Version are:

OPUS Finnish–Czech
OPUS Finnish–Danish
OPUS Finnish–Dutch
OPUS Finnish–English
OPUS Finnish–Estonian
OPUS Finnish–French
OPUS Finnish–German
OPUS Finnish–Greek
OPUS Finnish–Hungarian
OPUS Finnish–Italian
OPUS Finnish–Polish
OPUS Finnish–Portuguese
OPUS Finnish–Russian
OPUS Finnish–Swedish
OPUS Finnish–Spanish
OPUS Finnish–Turkish

Different versions/subcorpora of this language corpus are (or may in the future be) published in the Language Bank of Finland. The versions are available through the Language Bank Download Service and/or through the Korp concordance tool. The links to the different versions can be found in the list above.

Detailed information on the content of each version, user rights and licenses can be found in its specific metadata record in META-SHARE.

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021052402

The ”Hallituskausi” Translation Memory

The ”Hallituskausi” translation memory is intended for those translating administrative texts between Finnish and English. It includes key policy reports published by the Finnish ministries on their websites.

The translation memory is intended for use with the SDL Trados Studio programme.

Latest versions/subcorpora:
The ”Hallituskausi 2007–2011” Translation Memory
Metadata and license
Attribution instructions
Download the resource
The ”Hallituskausi 2011–2015” Translation Memory
Metadata and license
Attribution instructions
Download the resource
Search for these versions in META-SHARE

Different versions/subcorpora of this language corpus are published in the Language Bank of Finland. The versions are available through the Language Bank Download Service and/or through the Korp concordance tool. The links to the different versions can be found in the list above.

Detailed information on the content of each version, user rights and licenses can be found in its specific metadata record in META-SHARE.

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-202104143

ParFin 2016, ParRus 2016, Finnish-Russian / Russian-Finnish Parallel Corpus of Literary Texts

The corpus contains the sub-corpora ParFin 2016, Finnish-Russian Parallel Corpus of Literary Texts and ParRus 2016, Russian-Finnish Parallel Corpus of Literary Texts.

The sub-corpus ParRus 2016 contains Russian literary texts (classical and 20th-century literature) and their translations into Finnish, aligned at the paragraph level.

The sub-corpus ParFin 2016 contains Finnish literary texts from 1990–2010 and their translations into Russian, aligned at the sentence level.

Latest versions/subcorpora:
ParFin 2016, Finnish-Russian Parallel Corpus of Literary Texts
Metadata and license
Attribution instructions
Select the corpus in Korp
The Finnish Sub-corpus of ParFin 2016, Finnish-Russian Parallel Corpus of Literary Texts
Metadata and license
Attribution instructions
Select the corpus in Korp
The Russian Sub-corpus of ParFin 2016, Finnish-Russian Parallel Corpus of Literary Texts
Metadata and license
Attribution instructions
Select the corpus in Korp
ParRus 2016, Russian-Finnish Parallel Corpus of Literary Texts
Metadata and license
Attribution instructions
Select the corpus in Korp
The Finnish Sub-corpus of ParRus 2016, Russian-Finnish Parallel Corpus of Literary Texts
Metadata and license
Attribution instructions
Select the corpus in Korp
The Russian Sub-corpus of ParRus 2016, Russian-Finnish Parallel Corpus of Literary Texts
Metadata and license
Attribution instructions
Select the corpus in Korp
Search for these versions in META-SHARE

Different versions/subcorpora of this language corpus are published in the Language Bank of Finland. The versions are available through the Language Bank Download Service and/or through the Korp concordance tool. The links to the different versions can be found in the list above.

Detailed information on the content of each version, user rights and licenses can be found in its specific metadata record in META-SHARE.

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021092405

MULCOLD, Multilingual Parallel Corpus of Legal Texts

The corpus contains international conventions and treaties arranged as a parallel corpus aligned at the paragraph level. The languages covered are English, Finnish, German, Russian and Swedish.

The corpus was formerly named MultiJur: Multilingual Parallel Corpus of Legal Texts.

Latest versions/subcorpora:
MULCOLD, Multilingual Parallel Corpus of Legal Texts
Metadata and license
Attribution instructions
Select the corpus in Korp
The English Sub-corpus of MULCOLD, Multilingual Parallel Corpus of Legal Texts
Metadata and license
Attribution instructions
Select the corpus in Korp
The German Sub-corpus of MULCOLD, Multilingual Parallel Corpus of Legal Texts
Metadata and license
Attribution instructions
Select the corpus in Korp
The Russian Sub-corpus of MULCOLD, Multilingual Parallel Corpus of Legal Texts
Metadata and license
Attribution instructions
Select the corpus in Korp
Search for these versions in META-SHARE

Different versions/subcorpora of this language corpus are published in the Language Bank of Finland. The versions are available through the Language Bank Download Service and/or through the Korp concordance tool. The links to the different versions can be found in the list above.

Detailed information on the content of each version, user rights and licenses can be found in its specific metadata record in META-SHARE.

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021032305

KOTUS Finnish-Swedish Parallel Corpus

The corpus contains corporate press releases, surveys, reports, laws and regulations, as well as Government proposals to Parliament. The corpus was compiled at the Institute for the Languages of Finland (Kotus) and was completed in 2005. The main author of the corpus is Alexander Paile (Kotus). Mickel Grönroos and Eva Orava (Master’s Innovations Ltd.) also contributed to the project.

In most cases the source language texts (see the list of all texts below) are in Finnish and the translations in Finland Swedish, but the corpus also contains some Finland Swedish texts that have been translated into Finnish. The Finnish and Finland Swedish texts are linked to one another at the sentence level. The sentence order is not always the same as in the source language texts, and some sentences of the original texts have been omitted, so the corpus is primarily suitable for context-free study of sentence pairs.

Latest versions/subcorpora:
The Helsinki Korp Version of the KOTUS Finnish-Swedish Parallel Corpus
Metadata and license
Attribution instructions
Select the corpus in Korp
Helsinki Korp Version of the KOTUS Finnish-Swedish Parallel Corpus, Finnish Sub-corpus
Metadata and license
Attribution instructions
Select the corpus in Korp
Helsinki Korp Version of the KOTUS Finnish-Swedish Parallel Corpus, Swedish Sub-corpus
Metadata and license
Attribution instructions
Select the corpus in Korp
Search for these versions in META-SHARE

Different versions/subcorpora of this language corpus are published in the Language Bank of Finland. The versions are available through the Language Bank Download Service and/or through the Korp concordance tool. The links to the different versions can be found in the list above.

Detailed information on the content of each version, user rights and licenses can be found in its specific metadata record in META-SHARE.

The corpus contains the following texts:

Aktia Säästöpankki Oyj (1999–2004)
• Pörssitiedotteita 1999–2004 = Börsmeddelanden 1999–2004
• Osavuosikatsauksia 1999–2004 = Delårsrapporter 1999–2004
• Talouskatsauksia 1999–2002 = Ekonomiska översikter 1999–2002
• Lehdistötiedotteita 1999–2004 = Pressmeddelanden 1999–2004

Fennia (2003–2005)
• Esite 2005 = Broschyr 2005
• Vakuutusehtoja 2005 = Försäkringsvillkor 2005
• Toimintakertomus 2003 = Verksamhetsberättelse 2003

Neste (1993–1997)
• Taloustekstejä 1993–1997 = Ekonomiska texter 1993–1997

Stockmann (2000–2004)
• Lehdistötiedotteita 2000–2004 = Pressmeddelanden 2000–2004

Finlex
• Kirjanpitolaki 30.12.1997/1336 = Bokföringslag 30.12.1997/1336
• Laki takauksesta ja vierasvelkapanttauksesta 19.3.1999/361 = Lag om borgen och tredjemanspant 19.3.1999/361
• Asuntokauppalaki 23.9.1994/843 = Lag om bostadsköp 23.9.1994/843
• Laki asuinhuoneiston vuokrauksesta 31.3.1995/481 = Lag om hyra av bostadslägenhet 31.3.1995/481
• Laki oikeudenkäynnistä rikosasioissa 11.7.1997/689 = Lag om rättegång i brottmål 11.7.1997/689
• Laki holhoustoimesta 1.4.1999/442 = Lag om förmyndarverksamhet 1.4.1999/442
• Hallintolaki 6.6.2003/434 = Förvaltningslag 6.6.2003/434
• Hallintolainkäyttölaki 26.7.1996/586 = Förvaltningsprocesslag 26.7.1996/586
• Suomen perustuslaki 11.6.1999/731 = Finlands grundlag 11.6.1999/731
• Laki saatavien perinnästä 22.4.1999/513 = Lag om indrivning av fordringar 22.4.1999/513
• Maakaari 12.4.1995/540 = Jordabalk 12.4.1995/540
• Kuntalaki 17.3.1995/365 = Kommunallag 17.3.1995/365
• Konkurssilaki 20.2.2004/120 = Konkurslag 20.2.2004/120
• Yhdenvertaisuuslaki 20.1.2004/21 = Lag om likabehandling 20.1.2004/21
• Maankäyttö- ja rakennuslaki 5.2.1999/132 = Markanvändnings- och bygglag 5.2.1999/132
• Markkinaoikeuslaki 28.12.2001/1527 = Marknadsdomstolslag 28.12.2001/1527
• Kansalaisuuslaki 16.5.2003/359 = Medborgarskapslag 16.5.2003/359
• Ympäristönsuojelulaki 4.2.2000/86 = Miljöskyddslag 4.2.2000/86
• Laki viranomaisten toiminnan julkisuudesta 21.5.1999/621 = Lag om offentlighet i myndigheternas verksamhet 21.5.1999/621
• Järjestyslaki 27.6.2003/612 = Ordningslag 27.6.2003/612
• Henkilötietolaki 22.4.1999/523 = Personuppgiftslag 22.4.1999/523
• Laki velan vanhentumisesta 15.8.2003/728 = Lag om preskription av skulder 15.8.2003/728
• Kielilaki 6.6.2003/423 = Språklag 6.6.2003/423
• Laki valtioneuvostosta 28.2.2003/175 = Lag om statsrådet 28.2.2003/175
• Valtioneuvoston ohjesääntö 3.4.2003/262 = Reglemente för statsrådet 3.4.2003/262

Eduskunta / Riksdagen
• Hallituksen esitys Eduskunnalle kirjanpitolaiksi sekä laeiksi osakeyhtiölain 11 ja 12 luvun ja osuuskuntalain 79 c §:n muuttamisesta (HE 173/1997) = Regeringens proposition till Riksdagen med förslag till bokföringslag samt till lagar om ändring av 11 och 12 kap. lagen om aktiebolag och 79 c § lagen om andelslag (RP 173/1997)
• Hallituksen esitys Eduskunnalle takausta ja vierasvelkapanttausta koskevaksi lainsäädännöksi (HE 189/1998) = Regeringens proposition till Riksdagen med förslag till lagstiftning om borgen och tredjemanspant (RP 189/1998)
• Hallituksen esitys Eduskunnalle asuntokauppaa koskevaksi lainsäädännöksi (HE 14/1994) = Regeringens proposition till Riksdagen med förslag till lagstiftning om bostadsköp (RP 14/1994)
• Hallituksen esitys Eduskunnalle uusiksi huoneenvuokralaeiksi ja niihin liittyvien lakien muuttamisesta (HE 304/1994) = Regeringens proposition till Riksdagen med förslag till nya hyreslagar och till ändring av lagar i anslutning till dem (RP 304/1994)
• Hallituksen esitys Eduskunnalle rikosasioiden oikeudenkäyntimenettelyn uudistamista alioikeuksissa koskevaksi lainsäädännöksi (HE 82/1995) = Regeringens proposition till Riksdagen med förslag till lagstiftning om revision av rättegångsförfarandet i brottmål i underrätterna (RP 82/1995)
• Hallituksen esitys Eduskunnalle holhouslainsäädännön uudistamiseksi (HE 146/1998) = Regeringens proposition till Riksdagen med förslag till ny förmynderskapslagstiftning (RP 146/1998)
• Hallituksen esitys Eduskunnalle hallintolaiksi ja laiksi hallintolainkäyttölain muuttamisesta (HE 72/2002) = Regeringens proposition till Riksdagen med förslag till förvaltningslag och till lag om ändring av förvaltningsprocesslagen (RP 72/2002)
• Hallituksen esitys Eduskunnalle laiksi hallintolainkäytöstä ja siihen liittyväksi lainsäädännöksi (HE 217/1995) = Regeringens proposition till Riksdagen med förslag till förvaltningsprocesslag och till lagstiftning som har samband med den (RP 217/1995)
• Hallituksen esitys Eduskunnalle uudeksi Suomen Hallitusmuodoksi (HE 1/1998) = Regeringens proposition till Riksdagen med förslag till ny Regeringsform för Finland (RP 1/1998)
• Hallituksen esitys Eduskunnalle laiksi saatavien perinnästä ja laeiksi eräiden siihen liittyvien lakien muuttamisesta (HE 199/1996) = Regeringens proposition till Riksdagen med förslag till lag om indrivning av fordringar och lagar om ändring av vissa lagar som har samband med den (RP 199/1996)
• Hallituksen esitys Eduskunnalle maakaareksi ja eräiksi siihen liittyviksi laeiksi (HE 120/1994) = Regeringens proposition till Riksdagen med förslag till jordabalk och vissa lagar i samband med den (RP 120/1994)
• Hallituksen esitys Eduskunnalle kuntalaiksi (HE 192/1994) = Regeringens proposition till Riksdagen med förslag till kommunallag (RP 192/1994)
• Hallituksen esitys Eduskunnalle konkurssilainsäädännön uudistamiseksi (HE 26/2003) = Regeringens proposition till Riksdagen med förslag till revidering av konkurslagstiftningen (RP 26/2003)
• Hallituksen esitys Eduskunnalle laiksi yhdenvertaisuuden turvaamisesta sekä eräiden siihen liittyvien lakien muuttamisesta (HE 44/2003) = Regeringens proposition till Riksdagen med förslag till lag om tryggande av jämlikhet samt ändring av vissa lagar som har samband med den (RP 44/2003)
• Hallituksen esitys Eduskunnalle rakennuslainsäädännön uudistamiseksi (HE 101/1998) = Regeringens proposition till Riksdagen med förslag till reform av byggnadslagstiftningen (RP 101/1998)
• Hallituksen esitys Eduskunnalle markkinaoikeuslaiksi ja siihen liittyväksi lainsäädännöksi (HE 105/2001) = Regeringens proposition till Riksdagen med förslag till marknadsdomstolslag samt lagstiftning som har samband med den (RP 105/2001)
• Hallituksen esitys Eduskunnalle kansalaisuuslaiksi (HE 235/2002) = Regeringens proposition till Riksdagen med förslag till medborgarskapslag (RP 235/2002)
• Hallituksen esitys Eduskunnalle ympäristönsuojelu- ja vesilainsäädännön uudistamiseksi (HE 84/1999) = Regeringens proposition till Riksdagen med förslag till revidering av miljöskydds- och vattenlagstiftningen (RP 84/1999)
• Hallituksen esitys Eduskunnalle laiksi viranomaisten toiminnan julkisuudesta ja siihen liittyviksi laeiksi (HE 30/1998) = Regeringens proposition till Riksdagen med förslag till lag om offentlighet i myndigheternas verksamhet samt till lagar som har samband med den (RP 30/1998)
• Hallituksen esitys Eduskunnalle laiksi turvallisuuden edistämistä yleisillä paikoilla koskevien säännösten uudistamiseksi (HE 20/2002) = Regeringens proposition till Riksdagen med förslag till revidering av bestämmelserna för främjande av säkerheten på allmänna platser (RP 20/2002)
• Hallituksen esitys Eduskunnalle henkilötietolaiksi ja eräiksi siihen liittyviksi laeiksi (HE 96/1998) = Regeringens proposition till Riksdagen med förslag till personuppgiftslag och till vissa lagar som har samband med den (RP 96/1998)
• Hallituksen esitys Eduskunnalle velan vanhentumista ja julkista haastetta koskevan lainsäädännön uudistamisesta (HE 187/2002) = Regeringens proposition till Riksdagen med förslag till en reform av lagstiftningen om preskription av skulder och offentlig stämning (RP 187/2002)
• Hallituksen esitys Eduskunnalle uudeksi kielilaiksi ja siihen liittyväksi lainsäädännöksi (HE 92/2002) = Regeringens proposition till Riksdagen med förslag till ny språklag och lagstiftning som har samband med den (RP 92/2002)
• Hallituksen esitys Eduskunnalle laiksi valtioneuvostosta ja eräiksi siihen liittyviksi laeiksi (HE 270/2002) = Regeringens proposition till Riksdagen med förslag till lag om statsrådet och vissa lagar i samband med den (RP 270/2002)

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-201406036
