Proto-Indo-European Lexicon

Proto-Indo-European Lexicon is the generative etymological dictionary of Indo-European languages.

The current version, PIE Lexicon Pilot 1.1, presents digitally generated data for the hundred most ancient Indo-European languages, with three hundred new etymologies for the Old Anatolian languages (Hittite, Palaic, Cuneiform Luwian and Hieroglyphic Luwian), arranged under two hundred Indo-European roots.

More information on the lexicon: http://pielexicon.hum.helsinki.fi/

Latest versions/subcorpora:
Proto-Indo-European Lexicon
Metadata and license
Open the website
Look for other versions of this resource in META-SHARE

Different versions/subcorpora of this language corpus are published in the Language Bank of Finland. The versions are available through the Language Bank Download Service and/or through the Korp concordance tool, or they are offered by another member organisation of FIN-CLARIN. The links to the different versions can be found in the list above.

Detailed information on the content of each version, user rights and licenses can be found in its specific metadata record in META-SHARE.

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021093004

Dictionaries of Neahttadigisánit

This resource offers a collection of free digital dictionaries for small languages. The dictionaries are based on lexical material from a large array of dictionaries and written literature from the 20th and 21st centuries.

Latest versions/subcorpora:
The Erzya-Finnish-Erzya Dictionaries of Neahttadigisánit
Metadata and license
Open the website
The Hill Mari-Finnish-Hill Mari Dictionaries of Neahttadigisánit
Metadata and license
Open the website
The Livonian-Finnish-Livonian Dictionaries of Neahttadigisánit
Metadata and license
Open the website
The Livvi Karelian (Olonets)-Finnish-Livvi Karelian (Olonets) Dictionaries of Neahttadigisánit
Metadata and license
Open the website
The Moksha-Finnish-Moksha Dictionaries of Neahttadigisánit
Metadata and license
Open the website
The Nenets-Finnish-Nenets Dictionaries of Neahttadigisánit
Metadata and license
Open the website
The Northern Sami-Finnish-Northern Sami Dictionaries of Neahttadigisánit
Metadata and license
Open the website
Look for other versions of this resource in META-SHARE

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021093003

Etymological Reference Database

This resource is offered by Kotus, Kotimaisten kielten keskus, the Institute for the Languages of Finland. It consists of an Etymological Reference Database, which includes references to texts on the etymology of Finnish words.

Latest versions/subcorpora:
Etymological Reference Database
Metadata and license
Attribution instructions
Open the website
Look for other versions of this resource in META-SHARE

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021093002

Online Lexicon of Veps Language

This resource is offered by Kotus, Kotimaisten kielten keskus, the Institute for the Languages of Finland.

The resource contains the Online Lexicon of Veps Language, based on the handwritten dictionary and notes of Lauri Kettunen (1885-1963). Kettunen travelled twice to Veps areas, in 1917-1918 and in 1934 with Lauri Posti and Paavo Siro.

The lexicon, which is based on the field notes, has been digitized.

Latest versions/subcorpora:
Online Lexicon of Veps Language
Metadata and license
Attribution instructions
Open the website
Open the resource in Sanat
Search for these versions in META-SHARE

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021092001

Etymological Database of the Sami Languages

This resource is offered by Kotus, Kotimaisten kielten keskus, the Institute for the Languages of Finland.

The database is built on the ca. 25,000 North Saami entries of Nielsen’s Pohjoissaamen sanakirja [North Saami Dictionary]. The words were incorporated in the database in cooperation with the Finno-Ugric Department of the University of Helsinki. The database includes the variants, as well as the etymological and derivational references at the end of the word articles, given in Nielsen’s dictionary. Each word is also provided with English-language meanings based on the data in Nielsen’s dictionary; Finnish meanings based on Sammallahti’s Saamelais-suomalainen sanakirja [Saami-Finnish dictionary]; and German meanings based on Sammallahti and Nickel’s Saamisch-deutsches Wörterbuch.

More information on the Álgu project

Latest versions/subcorpora:
Etymological Database of the Sami Languages
Metadata and license
Attribution instructions
Open the website
Search for these versions in META-SHARE

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021091417

Dictionary of Old Literary Finnish

The Dictionary of Old Literary Finnish presents, as exhaustively as possible and from the point of view of both meaning and usage, all the words found in Finnish literary sources from 1543-1810.
This language resource was published on 21 November 2014 and is to be continuously updated.
The Dictionary of Old Literary Finnish is the product of a joint project between the Institute for the Languages of Finland and the Kone Foundation.

Latest versions/subcorpora:
Dictionary of Old Literary Finnish
Metadata and license
Attribution instructions
Open the website
Search for these versions in META-SHARE

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021091416

Dictionary of Finnish Dialects

This resource is offered by Kotus, Kotimaisten kielten keskus, the Institute for the Languages of Finland. The purpose of the Dictionary of Finnish Dialects is to present the vocabulary of the Finnish dialects based on a large amount of data collected.

The first part of the dictionary, which was planned to contain 20 volumes, was published in 1985. By 2010, eight volumes of approximately 1000 pages each had been published. These volumes covered the words from the letter “a” to the word “kurvottaa”. In 2010, it was decided that the dictionary should from then on be published in electronic form, as an online service that can be used free of charge. The first part of the online dictionary (kus-kyntsöttää) was published in early 2012, after which the dictionary is to be enlarged with both new and already printed alphabetical parts about once a year. On 22 October 2014 the online dictionary was enlarged to contain the words from “kala” to “käävätä”.

The dictionary presents the entire vocabulary of all the Finnish dialects. It does not make a distinction based on whether the word is dialectal or standard / literary. Such a distinction would be to a certain degree impossible, since the vocabulary of standard Finnish is mostly based on the vocabulary of different Finnish dialects.

Latest versions/subcorpora:
Dictionary of Finnish Dialects
Metadata and license
Attribution instructions
Open the website
Search for these versions in META-SHARE

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021091415

Dictionary of Contemporary Finnish

The Dictionary of Contemporary Finnish is a dictionary of standard Finnish made by the Institute for the Languages of Finland. It is based on an extensive, constantly expanding word archive of contemporary Finnish. The dictionary provides information on the meanings, usage and stylistic nuances of contemporary Finnish words, as well as on their inflection and spelling. The information provided by the dictionary is based on the decisions of the Finnish Language Board. The dictionary contains over 100 000 lemmas.

More information

Latest versions/subcorpora:
Dictionary of Contemporary Finnish
Metadata and license
Attribution instructions
Open the web page
Search for these versions in META-SHARE

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021091412

Finnish Wikipedia 2017

The Finnish Wikipedia 2017 source material corpus contains all Finnish articles from the online encyclopedia Wikipedia available on 1 January 2018. The text parts of the articles have been extracted from Wikipedia Dumps with WikiExtractor.

Latest versions/subcorpora:
Finnish Wikipedia 2017, source
Metadata and license
Attribution instructions
Download the resource
Search for these versions in META-SHARE

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021091411



Karelian Dictionary

The six volumes of the Karelian dictionary were published in 1968-2005 by the Institute for the Languages of Finland and the Finno-Ugrian Society.

The online dictionary is a project of the Institute for the Languages of Finland. It is updated as needs and resources allow.

More information on the dictionary: http://kaino.kotus.fi/kks

Website: https://kaino.kotus.fi/cgi-bin/kks/karjala.cgi

Downloadable in XML format: http://kaino.kotus.fi/kks/lataa/kksxml.zip

Latest versions/subcorpora:
Karelian Dictionary
Metadata and license
Attribution instructions
Download the resource
Headword List of the Karelian Dictionary
Metadata and license
Attribution instructions
Open web page
Search for these versions in META-SHARE

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021091405

ANEE lexical portals of Akkadian

Team 1 of the Centre of Excellence in Ancient Near Eastern Empires (ANEE) has created lexical portals that function as a graphic semantic dictionary. Via these portals the user can explore semantic networks for one or more words of interest. By following the links, one can also trace attestations back to the dataset in Korp and from there to Oracc. We have used the methods of language technology to trace paradigmatic and syntagmatic relationships in a large corpus of the Akkadian language.

The language technology tool fastText can be used to explore paradigmatic relations. The method called Pointwise Mutual Information (PMI) is able to capture the nuances of syntagmatic relations. The visualization of lexemes and their relationships as networks has proved to be the most fruitful approach to analyzing semantic domains created by our methods.
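For readers unfamiliar with PMI, the following is a minimal sketch of how the score can be computed from co-occurrence counts. It is not the ANEE pipeline: the function name `pmi_scores`, the window size, the probability estimates and the toy tokens are all assumptions made for illustration only.

```python
import math
from collections import Counter


def pmi_scores(sentences, window=2):
    """Return PMI (log base 2) for unordered word pairs.

    Hypothetical helper for illustration. A pair is counted when the
    second word appears within `window` tokens after the first one.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        for i, left in enumerate(tokens):
            for right in tokens[i + 1:i + 1 + window]:
                # Store pairs in sorted order so (a, b) and (b, a) coincide.
                pair_counts[tuple(sorted((left, right)))] += 1

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())

    scores = {}
    for (a, b), n in pair_counts.items():
        p_ab = n / total_pairs              # estimated joint probability
        p_a = word_counts[a] / total_words  # estimated marginal probabilities
        p_b = word_counts[b] / total_words
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# Toy, made-up token sequences (not taken from the ANEE dataset).
toy_corpus = [
    ["sharru", "rabu"],
    ["sharru", "rabu"],
    ["sharru", "dannu"],
    ["alu", "dannu"],
]

for pair, score in sorted(pmi_scores(toy_corpus).items()):
    print(pair, round(score, 3))
```

A higher score means the two words co-occur more often than their individual frequencies would predict, which is the sense in which PMI can point to syntagmatic relations. Real corpora usually call for frequency thresholds or smoothing, which this sketch omits.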

The full dataset for the ANEE lexical portals (both PMI and fastText graphs) can be found in a Zenodo repository: http://doi.org/10.5281/zenodo.4646662.

Latest versions/subcorpora:  
ANEE lexical portal of Akkadian: fastText
Metadata and license
Attribution instructions
Open the website
ANEE lexical portal of Akkadian: PMI
Metadata and license
Attribution instructions
Open the website
ANEE lexical portal of Akkadian: dataset
Metadata and license
Attribution instructions
Open the website
Look for all versions in META-SHARE  

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021082001