ANEE Idiolect Network Portal

This resource is a portal of more than 105,621 interlinked pages. The pages list the most similar neighbours, ranked by the Double Mutual Rank (DOMUR) similarity measure, for 105,621 cuneiform texts exported from Oracc.

Latest versions/subcorpora:  
ANEE Idiolect Network Portal

Word embeddings trained with word2vec (wordvec)

Resource information

This resource collection contains word embeddings trained with word2vec from various corpora.

The embedding file is in the simple, easily parsed text format produced by word2vec. The first line of the file gives the vocabulary size and the vector dimension. Each subsequent line begins with a vocabulary item, followed by a space, followed by 128 floating point numbers (written as text), each followed by a space.
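The format described above can be read with a few lines of Python. The following is a minimal sketch, not the resource's official loader: the function name `read_word2vec_text` is hypothetical, the sample uses 3 dimensions instead of 128 to stay readable, the two sample words and their values are invented, and vocabulary items are assumed to contain no spaces.

```python
import io


def read_word2vec_text(stream):
    """Parse word2vec text-format embeddings from an open text stream.

    Returns (declared_vocab_size, dimension, {word: [floats]}).
    """
    # Header line: "<vocabulary size> <dimension>"
    header = stream.readline().split()
    vocab_size, dim = int(header[0]), int(header[1])

    vectors = {}
    for line in stream:
        # Each value is followed by a space, so strip the trailing
        # space and newline before splitting.
        parts = line.rstrip().split(" ")
        if len(parts) < dim + 1:
            continue  # skip blank or truncated lines
        word = parts[0]
        vectors[word] = [float(x) for x in parts[1:dim + 1]]
    return vocab_size, dim, vectors


# Toy input in the same layout (invented values, 3 dimensions).
sample = "2 3\nkissa 0.1 0.2 0.3 \nkoira -0.5 0.0 1.5 \n"
vocab_size, dim, vectors = read_word2vec_text(io.StringIO(sample))
print(vocab_size, dim, sorted(vectors))
```

For a real file, the stream would come from something like `open(path, encoding="utf-8")`; the UTF-8 encoding is an assumption and should be checked against the metadata of the specific resource version.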

License and access

  • All versions of this resource are available publicly (PUB).
  • Click on the license image to see the resource-specific license text.

 


This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2022041401

FinnONTO

Latest versions/subcorpora:  
FinnONTO – ONKI

The ONKI service contains Finnish and international ontologies, vocabularies and thesauri needed for publishing content cost-efficiently on the Semantic Web. ONKI is published and maintained by the Semantic Computing Research Group (SeCo). It is part of the ongoing project to build a national Semantic Web infrastructure for Finland (FinnONTO).

The service offers ontologies in categories such as:
– General upper ontology
– Museum artifacts
– Music
– Design
– Health
– Photography
– Agriculture
– Government
– Literature
– Linguistics
– Literary research
– Cultural research
– Economics
– Seafaring
– Military

All ontologies are being merged into a single ontology covering all the categories, the Finnish Collaborative Holistic Ontology (KOKO).

Most of the ontologies are multilingual. In the General upper ontology, the names of concepts are in Finnish, Swedish and English, while the Linguistics ontology, for example, uses Finnish, Swedish, English, German and Estonian.


This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021093001

The Helsinki Term Bank for the Arts and Sciences

The Helsinki Term Bank for the Arts and Sciences (HTB) is a multidisciplinary project that aims to build a permanent terminological database covering all fields of research in Finland. The project has created a Semantic MediaWiki platform that offers a collaborative environment: anyone can use it freely and take part in the discussion about terms.

Detailed information on the content, user rights and licenses can be found in the metadata record.

 


This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021092002

 

Finnish WordNet (FinnWordNet)

Resource information

The Finnish WordNet is a lexical database for Finnish. It is a part of the FIN-CLARIN infrastructure project.

FinnWordNet is licensed under the Creative Commons Attribution (CC-BY) 3.0 licence. As a derivative of the Princeton WordNet, FinnWordNet is also subject to the Princeton WordNet licence.

FinnWordNet contains words (nouns, verbs, adjectives and adverbs) grouped by meaning into synonym groups representing concepts. These synonym groups are linked to each other with relations such as hyponymy and antonymy, creating a semantic network.
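The structure described above (synonym groups as nodes, semantic relations as labelled links) can be sketched as a small graph. This is an illustration only, not FinnWordNet's actual file format or API: the class `Synset`, the helpers `link` and `all_hyponyms`, the identifiers such as `n1`, and the toy Finnish entries are all hypothetical and were not taken from the database.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """A group of synonymous words that together represent one concept."""
    id: str
    pos: str  # part of speech: noun, verb, adjective or adverb
    words: list
    # Relation name -> list of target synset ids.
    relations: dict = field(default_factory=dict)


def link(network, source, relation, target):
    """Add a labelled, directed link between two synsets."""
    network[source].relations.setdefault(relation, []).append(target)


def all_hyponyms(network, synset_id):
    """Collect the ids of all direct and indirect hyponyms of a synset."""
    found, seen = [], set()
    stack = list(network[synset_id].relations.get("hyponym", []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles and repeated visits
        seen.add(current)
        found.append(current)
        stack.extend(network[current].relations.get("hyponym", []))
    return found


# Toy network (invented entries, for illustration only).
network = {
    "n1": Synset("n1", "noun", ["eläin"]),
    "n2": Synset("n2", "noun", ["koira", "hauva"]),
    "n3": Synset("n3", "noun", ["villakoira", "puudeli"]),
    "a1": Synset("a1", "adjective", ["suuri", "iso"]),
    "a2": Synset("a2", "adjective", ["pieni"]),
}
link(network, "n1", "hyponym", "n2")
link(network, "n2", "hyponym", "n3")
link(network, "a1", "antonym", "a2")
link(network, "a2", "antonym", "a1")

for sid in all_hyponyms(network, "n1"):
    print(sid, network[sid].words)
```

Keeping relations as labelled lists of ids means a new relation type needs no change to the data structure, which suits a network with several kinds of links.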

FinnWordNet can be used in language technology research and applications. It can also be used interactively as an electronic thesaurus.

The first version of FinnWordNet was created by having professional translators translate the words of the original English Princeton WordNet (version 3.0) into Finnish.

Detailed information: http://www.kielipankki.fi/corpora/finnwordnet/

License and access

  • All versions of this resource are available publicly (PUB).
  • Click on the license image to see the resource-specific license text.

 


This page has a persistent identifier: http://urn.fi/urn:nbn:fi:lb-2014052714

Finnish FrameNet (finnframenet)

Resource information

The database of Finnish semantic frames is based on the original English-language FrameNet housed at the International Computer Science Institute in Berkeley, California. The Finnish FrameNet project started by collecting 90,592 frame examples from the original Berkeley FrameNet. The examples represent 866 different frames and the elements that evoke them.

The FinnFrameNet project is a part of the FIN-CLARIN consortium.

License and access

  • All versions of this resource are available publicly (PUB).
  • Click on the license image to see the resource-specific license text.

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021091403

FinEst BERT (finestbert)

Resource information

FinEst BERT is a multilingual Bidirectional Encoder Representations from Transformers (BERT) model trained from scratch on three languages: Finnish, Estonian, and English. It can be used for various NLP classification tasks in these three languages and supports both monolingual and multilingual/cross-lingual (knowledge transfer) tasks. Whole-word masking was used during data preparation and training. The model was trained for 40 epochs with sequence length 128 and for another 4 epochs with sequence length 512. The FinEst BERT model published here is in PyTorch format.
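Whole-word masking means that when a word is chosen for masking, all of its subword pieces are masked together. The sketch below illustrates the idea under stated assumptions and is not the FinEst BERT training code: it assumes WordPiece-style tokens in which continuation pieces start with `##`, a masking probability of 0.15 (the usual BERT default, not stated in the source), and input without special tokens. The function names and the example tokens are hypothetical. It also omits the random-replacement and keep-original variants that full BERT masking applies.

```python
import random


def group_whole_words(tokens):
    """Group token indices so that each group covers one whole word."""
    groups = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and groups:
            groups[-1].append(i)  # continuation piece joins the previous word
        else:
            groups.append([i])    # a new word starts here
    return groups


def whole_word_mask(tokens, mask_prob=0.15, mask_token="[MASK]", rng=None):
    """Mask whole words: every piece of a selected word is replaced."""
    rng = rng or random.Random()
    masked = list(tokens)
    for group in group_whole_words(tokens):
        if rng.random() < mask_prob:
            for i in group:
                masked[i] = mask_token
    return masked


# Invented subword split, for illustration only.
tokens = ["helsinki", "##ssä", "on", "kylmä", "##ä"]
print(group_whole_words(tokens))
print(whole_word_mask(tokens, mask_prob=0.5, rng=random.Random(0)))
```

Passing an explicit `random.Random` instance keeps the masking reproducible, which is convenient when preparing training data once and reusing it.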

Corpora used:
Finnish – STT articles, CoNLL 2017 shared task, Ylilauta downloadable version
Estonian – Ekspress Meedia articles, CoNLL 2017 shared task
English – English Wikipedia

License and access

  • This resource is available publicly (PUB).
  • Click on the license image to see the resource-specific license text.

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021091402

ANEE Lexical Networks

Team 1 of the Centre of Excellence in Ancient Near Eastern Empires (ANEE) has created a lexical portal that functions as a graphic semantic dictionary. Through this portal, the user can explore semantic networks for one or more words of interest. By following the links, one can also trace attestations back to the datasets in Korp (Oracc, Achemenet, and BALT) and from there to the Open Richly Annotated Cuneiform Corpus (Oracc) and other resources.

Latest versions/subcorpora:
Neo-Babylonian Lexical Networks
Neo-Babylonian Lexical Networks – the dataset
ANEE Lexical Networks v. 2.0
ANEE Lexical Networks v. 2.0 – the dataset
Archived versions:
ANEE lexical portal of Akkadian: fastText
ANEE lexical portal of Akkadian: PMI
ANEE lexical portal of Akkadian: dataset

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021082001

Last modified on 2025-05-22

Contact

The Language Bank's technical support:
kielipankki (at) csc.fi
tel. +358 9 4572001

Requests related to language resources:
fin-clarin (at) helsinki.fi
tel. +358 29 4129317
