ANEE Idiolect Network Portal

This resource is a portal of more than 105,621 interlinked pages. The pages list the most similar neighbours, ranked by the Double Mutual Rank (DOMUR) similarity measure, of 105,621 cuneiform texts exported from Oracc.

Latest versions/subcorpora:
ANEE Idiolect Network Portal Metadata and license	Open the website

Word embeddings trained with word2vec (wordvec)

Currently available versions of this resource

Shortname	Name and metadata	License	Location	Cite	Resource group and help	Apply	Publication year	Support level

Upcoming versions of this resource

These resource versions are not yet available in the Language Bank of Finland.

Shortname	Name and metadata	License	Formats	Support level	Contact Person	Resource group and help	Location	Other information

Resource information

This resource collection contains word embeddings trained with word2vec from various corpora.

The embedding file uses the simple, easily parsed plain-text format produced by word2vec. The first line of the file gives the vocabulary size and the vector dimension. Each subsequent line begins with a vocabulary item, followed by a space and then 128 floating-point numbers in text form, each followed by a space.
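As a minimal sketch of how such a file could be read, assuming the layout described above (a header line with vocabulary size and dimension, then one item per line), the following Python function parses the text into a dictionary. The function name `load_word2vec_text` and the sample data are hypothetical, and the sketch assumes that vocabulary items contain no space characters.

```python
import io
from typing import Dict, List, TextIO


def load_word2vec_text(handle: TextIO) -> Dict[str, List[float]]:
    """Parse word2vec's plain-text format into a word -> vector mapping."""
    header = handle.readline().split()
    vocab_size, dim = int(header[0]), int(header[1])

    vectors: Dict[str, List[float]] = {}
    for line in handle:
        if not line.strip():
            continue  # skip blank lines
        # Drop the trailing space and newline, then split on single spaces.
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if len(values) != dim:
            raise ValueError(f"{word!r}: expected {dim} values, got {len(values)}")
        vectors[word] = [float(v) for v in values]

    if len(vectors) != vocab_size:
        raise ValueError(f"header promised {vocab_size} items, found {len(vectors)}")
    return vectors


# Tiny hypothetical sample with dimension 3 instead of 128.
sample = io.StringIO("2 3\nkissa 0.1 0.2 0.3 \nkoira -0.4 0.5 0.6 \n")
embeddings = load_word2vec_text(sample)
```

For a real file, the same function could be called on a handle opened with `open(path, encoding="utf-8")`; the encoding is an assumption and should be checked against the resource's metadata.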

License and access

All versions of this resource are available publicly (PUB).
Click on the license image to see the resource-specific license text.

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2022041401

FinnONTO

Latest versions/subcorpora:
FinnONTO – ONKI Metadata and license	Open the website

The ONKI service contains Finnish and international ontologies, vocabularies and thesauri needed for publishing content cost-efficiently on the Semantic Web. ONKI is published and maintained by the Semantic Computing Research Group (SeCo). It is part of FinnONTO, an ongoing project to build a national Semantic Web infrastructure for Finland.

The service offers ontologies in categories such as:
– General upper ontology
– Museum artifacts
– Music
– Design
– Health
– Photography
– Agriculture
– Government
– Literature
– Linguistics
– Literary research
– Cultural research
– Economics
– Seafaring
– Military

All of the ontologies are being merged into a single ontology that covers all the categories, the Finnish Collaborative Holistic Ontology (KOKO).

Most of the ontologies are multilingual. In the General upper ontology, the names of concepts are in Finnish, Swedish and English, whereas in the Linguistics ontology, for example, the languages used are Finnish, Swedish, English, German and Estonian.

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021093001

The Helsinki Term Bank for the Arts and Sciences

The Helsinki Term Bank for the Arts and Sciences (HTB) is a multidisciplinary project that aims to build a permanent terminological database for all fields of research in Finland. The project has created a Semantic MediaWiki platform that offers a collaborative environment: anyone can use it freely and take part in the discussion about terms.

The Helsinki Term Bank for the Arts and Sciences
Metadata and license
Attribution instructions

Open the website

Detailed information on the content, user rights and licenses can be found in the metadata record.

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021092002

FinnWordNet

Currently available versions of this resource

Shortname	Name and metadata	License	Location	Cite	Resource group and help	Apply	Publication year	Support level

Upcoming versions of this resource

These resource versions are not yet available in the Language Bank of Finland.

Shortname	Name and metadata	License	Formats	Support level	Contact Person	Resource group and help	Location	Other information

Resource information

The Finnish WordNet is a lexical database for Finnish. It is a part of the FIN-CLARIN infrastructure project.

FinnWordNet is licensed under the Creative Commons Attribution (CC-BY) 3.0 licence. As a derivative of the Princeton WordNet, FinnWordNet is also subject to the Princeton WordNet licence.

FinnWordNet contains words (nouns, verbs, adjectives and adverbs) grouped by meaning into synonym groups representing concepts. These synonym groups are linked to each other with relations such as hyponymy and antonymy, creating a semantic network.
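To make this structure concrete, here is a minimal sketch of synonym groups linked by typed relations. The class `Synset`, its fields, the relation names, and the sample Finnish words are illustrative assumptions; they do not reflect FinnWordNet's actual data model or file format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


# eq=False keeps identity comparison, which avoids recursing through cyclic links.
@dataclass(eq=False)
class Synset:
    """A synonym group representing one concept."""
    words: List[str]
    # Relation name -> linked synsets, e.g. "hypernym" or "hyponym".
    relations: Dict[str, List["Synset"]] = field(default_factory=dict)

    def link(self, relation: str, target: "Synset") -> None:
        self.relations.setdefault(relation, []).append(target)


def hypernym_chain(synset: Synset) -> List[List[str]]:
    """Follow the first hypernym link upward and collect each group's words."""
    chain: List[List[str]] = []
    seen = {id(synset)}
    current = synset
    while current.relations.get("hypernym"):
        current = current.relations["hypernym"][0]
        if id(current) in seen:
            break  # guard against cycles in malformed data
        seen.add(id(current))
        chain.append(current.words)
    return chain


# Hypothetical two-node network: "koira" (dog) is a kind of "eläin" (animal).
animal = Synset(["eläin"])
dog = Synset(["koira", "hauva"])
dog.link("hypernym", animal)
animal.link("hyponym", dog)
```

Calling `hypernym_chain(dog)` walks from the more specific concept to the more general one, which is the kind of traversal a thesaurus-style lookup relies on.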

FinnWordNet can be used in language technology research and applications. It can also be used interactively as an electronic thesaurus.

The first version of FinnWordNet was created by having professional translators translate the words of the original English Princeton WordNet (version 3.0) into Finnish.

Detailed information: http://www.kielipankki.fi/corpora/finnwordnet/

License and access

All versions of this resource are available publicly (PUB).
Click on the license image to see the resource-specific license text.

This page has a persistent identifier: http://urn.fi/urn:nbn:fi:lb-2014052714

Finnish FrameNet (finnframenet)

Currently available versions of this resource

Shortname	Name and metadata	License	Location	Cite	Resource group and help	Apply	Publication year	Support level

Upcoming versions of this resource

These resource versions are not yet available in the Language Bank of Finland.

Shortname	Name and metadata	License	Formats	Support level	Contact Person	Resource group and help	Location	Other information

Resource information

The database of Finnish semantic frames is based on the original English-language FrameNet housed at the International Computer Science Institute in Berkeley, California. The Finnish FrameNet project started by collecting 90,592 frame examples from the original Berkeley FrameNet. The examples represent 866 different frames and the elements that evoke them.

The FinnFrameNet project is a part of the FIN-CLARIN consortium.

License and access

All versions of this resource are available publicly (PUB).
Click on the license image to see the resource-specific license text.

Publications

Krister Lindén, Heidi Haltia, Juha Luukkonen, Antti O. Laine, Henri Roivainen & Niina Väisänen. 2017. FinnFN 1.0: The Finnish frame semantic database. Nordic Journal of Linguistics, 40(2), 1–25. http://hdl.handle.net/10138/228879
Krister Lindén, Heidi Haltia, Antti Laine, Juha Luukkonen, Jussi Piitulainen & Niina Väisänen. 2019. FinnTransFrame: translating frames in the FinnFrameNet project. Language Resources and Evaluation 53(1), 141–171. http://hdl.handle.net/10138/266770

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021091403

FinEst BERT (finestbert)

Currently available versions of this resource

Shortname	Name and metadata	License	Location	Cite	Resource group and help	Apply	Publication year	Support level

Upcoming versions of this resource

These resource versions are not yet available in the Language Bank of Finland.

Shortname	Name and metadata	License	Formats	Support level	Contact Person	Resource group and help	Location	Other information

Resource information

FinEst BERT is a multilingual Bidirectional Encoder Representations from Transformers (BERT) model trained from scratch on three languages: Finnish, Estonian, and English. It can be used for various NLP classification tasks in these three languages, supporting both monolingual and multilingual or cross-lingual (knowledge transfer) tasks. Whole-word masking was used during data preparation and training; the model was trained for 40 epochs with sequence length 128 and for another 4 epochs with sequence length 512. The FinEst BERT model published here is in PyTorch format.
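Whole-word masking can be illustrated with a short sketch: when a word is selected for masking, all of its subword pieces are masked together rather than individually. The code below assumes WordPiece-style tokens in which continuation pieces start with `##`; the function name, the masking probability, and the sample tokens are hypothetical, and the sketch does not reproduce the actual FinEst BERT preprocessing.

```python
import random
from typing import List


def whole_word_mask(tokens: List[str], mask_prob: float, rng: random.Random) -> List[str]:
    """Mask whole words: every piece of a selected word becomes [MASK]."""
    # Group token indices into words; a "##" piece continues the previous word.
    words: List[List[int]] = []
    for i, token in enumerate(tokens):
        if token.startswith("##") and words:
            words[-1].append(i)
        else:
            words.append([i])

    masked = list(tokens)
    for word in words:
        # One random draw per word, so its pieces are masked or kept together.
        if rng.random() < mask_prob:
            for i in word:
                masked[i] = "[MASK]"
    return masked


# Hypothetical tokenization in which the first word is split into two pieces.
tokens = ["kieli", "##pankki", "on", "avoin"]
all_masked = whole_word_mask(tokens, 1.0, random.Random(0))
none_masked = whole_word_mask(tokens, 0.0, random.Random(0))
```

This simplified version always substitutes `[MASK]`; BERT-style training pipelines typically also leave some selected tokens unchanged or replace them with random tokens, which is omitted here for brevity.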

Corpora used:
Finnish – STT articles, CoNLL 2017 shared task, Ylilauta downloadable version
Estonian – Ekspress Meedia articles, CoNLL 2017 shared task
English – English Wikipedia

License and access

This resource is available publicly (PUB).
Click on the license image to see the resource-specific license text.

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021091402

ANEE Lexical Networks

Team 1 of the Centre of Excellence in Ancient Near Eastern Empires (ANEE) has created a lexical portal that functions as a graphical semantic dictionary. In this portal, the user can explore semantic networks for one or more words of interest. By following the links, the user can also trace attestations back to the datasets in Korp (Oracc, Achemenet, and BALT) and from there to the Open Richly Annotated Cuneiform Corpus (Oracc) and other resources.

Latest versions/subcorpora:
Neo-Babylonian Lexical Networks Metadata and license Attribution instructions	Open the website
Neo-Babylonian Lexical Networks – the dataset Metadata and license Attribution instructions	Open the website
ANEE Lexical Networks v. 2.0 Metadata and license Attribution instructions	Open the website
ANEE Lexical Networks v. 2.0 – the dataset Metadata and license Attribution instructions	Open the website
Archived versions:
ANEE lexical portal of Akkadian: fastText Metadata and license Attribution instructions	Open the website
ANEE lexical portal of Akkadian: PMI Metadata and license Attribution instructions	Open the website
ANEE lexical portal of Akkadian: dataset Metadata and license Attribution instructions	Open the website

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021082001

Last modified on 2025-05-22

Contact

The Language Bank's technical support:
kielipankki (at) csc.fi
tel. +358 9 4572001

Requests related to language resources:
fin-clarin (at) helsinki.fi
tel. +358 29 4129317
