PLEASE NOTE: FinnWordNet information moved to this location on 2019-09-16. The URLs of FinnWordNet demos and file downloads have changed, and the current URLs are not necessarily final. The demos and file downloads were not available between March and June 2018. Some demos still do not work, and the feedback form is unavailable. We apologize for the situation. (Updated 2019-09-16.)
FinnWordNet, the Finnish WordNet, is a lexical database for Finnish. It is part of the FIN-CLARIN infrastructure project.
FinnWordNet contains words (nouns, verbs, adjectives and adverbs) grouped by meaning into synonym groups representing concepts. These synonym groups are linked to each other with relations such as hyponymy and antonymy, creating a semantic network.
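The structure described above can be pictured as a small graph. The following is only a minimal sketch and not the FinnWordNet data format: the `Synset` class, the `link` and `synonyms` helpers, the identifiers `s1` and `s2`, and the sample words are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Synset:
    """A group of synonymous words that together represent one concept."""
    synset_id: str          # hypothetical identifier, not a real FinnWordNet ID
    pos: str                # part of speech: "n", "v", "a" or "r"
    words: List[str]
    # relation name -> identifiers of the target synsets
    relations: Dict[str, List[str]] = field(default_factory=dict)


def link(network: Dict[str, Synset], source: str, relation: str, target: str) -> None:
    """Add a directed, labelled edge from one synset to another."""
    network[source].relations.setdefault(relation, []).append(target)


def synonyms(network: Dict[str, Synset], word: str) -> List[str]:
    """Return the other words that share at least one synset with `word`."""
    found = set()
    for synset in network.values():
        if word in synset.words:
            found.update(w for w in synset.words if w != word)
    return sorted(found)


# Illustrative sample data only.
network = {
    "s1": Synset("s1", "n", ["eläin"]),
    "s2": Synset("s2", "n", ["koira", "hauva"]),
}
link(network, "s2", "hypernym", "s1")  # "koira" is a kind of "eläin"
link(network, "s1", "hyponym", "s2")   # the inverse relation

print(synonyms(network, "koira"))
print(network["s2"].relations)
```

Looking up a word's synonyms then amounts to finding the synsets that contain it, and following the relation edges gives the broader or narrower concepts.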
FinnWordNet can be used in language technology research and applications. It can also be used interactively as an electronic thesaurus.
The first version of FinnWordNet was created by having professional translators translate the words of the original English (Princeton) WordNet (version 3.0) into Finnish.
The most recent version of FinnWordNet is 2.0, released in October 2012. The persistent identifier of this version is urn:nbn:fi:lb-2014052714.
Even though FinnWordNet is not currently being actively developed, you can send feedback on it to fin-clarin (at) helsinki.fi.
Please note that the name of the resource is FinnWordNet (with a double n), not FinWordNet.
FinnWordNet data can be searched or viewed in a couple of different search interfaces or demos:
Dictionaries as HFST transducers: The FinnWordNet content has been encoded in various HFST finite-state transducers working as thesauri and translation dictionaries:
Please note that the transducers used in these demos are still based on FinnWordNet version 1.1.2, not the most recent 2.0.
The FinnWordNet data can be downloaded from the download service of the Language Bank of Finland as a ZIP package that contains the data in a couple of different formats:
For more information, please see the README file.
In addition, the package contains the WordNet 3.0 Grind program modified to support FinnWordNet data. Since the downloadable package contains the compiled database, you probably do not need the modified Grind unless you modify the FinnWordNet data (lexicographer files). Compiling the program requires a Unix, Linux or similar environment and a C compiler; please see the associated README file for more information.
Please also note that searching the FinnWordNet data files with the wn search program requires a version patched by Debian.
Please note that the transducers are currently not available for download. We apologize for the situation. (2019-09-13)
The FinnWordNet (and Princeton WordNet) data is also used in the HFST finite-state transducers that work as Finnish or English thesauri or Finnish–English or English–Finnish translation dictionaries. The transducers recognize inflected forms of words, and the thesauri have variants generating synonyms in the same inflected form as the input word. More information about the transducers is available in the README file.
To use the transducers, you need either the full HFST library and tools (version 3.2.0 or later), the stand-alone HFST optimized lookup (version 1.3 or later) or the Java implementation of the optimized lookup (2011-05-23 or later).
The FinnWordNet data is also downloadable in the WN-LMF (WordNet Lexical Markup Framework) and Lemon (The Lexicon Model for Ontologies) XML formats from the Open Multilingual Wordnet site: download data ZIP package.
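As a rough illustration of working with such a file, the sketch below groups lemmas by synset using Python's standard XML parser. It rests on assumptions: the inline `SAMPLE` document is invented, and its element and attribute names (`LexicalEntry`, `Lemma` with `writtenForm`, `Sense` with `synset`) follow the general WN-LMF pattern and may differ from the actual file in the package. The function name `words_by_synset` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Invented sample; real identifiers and attributes in the package may differ.
SAMPLE = """<LexicalResource>
  <Lexicon language="fin">
    <LexicalEntry id="w1">
      <Lemma writtenForm="koira" partOfSpeech="n"/>
      <Sense id="w1_s1" synset="fin-example-1"/>
    </LexicalEntry>
    <LexicalEntry id="w2">
      <Lemma writtenForm="hauva" partOfSpeech="n"/>
      <Sense id="w2_s1" synset="fin-example-1"/>
    </LexicalEntry>
  </Lexicon>
</LexicalResource>"""


def words_by_synset(xml_text):
    """Map each synset identifier to the lemmas whose senses point to it."""
    root = ET.fromstring(xml_text)
    result = {}
    for entry in root.iter("LexicalEntry"):
        lemma = entry.find("Lemma")
        if lemma is None:
            continue  # skip entries without a lemma
        for sense in entry.findall("Sense"):
            result.setdefault(sense.get("synset"), []).append(lemma.get("writtenForm"))
    return result


print(words_by_synset(SAMPLE))
```

For the full data file, `ET.parse(path).getroot()` or `ET.iterparse` would replace `ET.fromstring`; the latter avoids holding the whole document in memory.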
Frankie Robertson has made some technical corrections and changes to the FinnWordNet data to make it work with the NLTK and extJWNL libraries. The corrected version is available on GitHub. The corrections have not yet been integrated into the official FinnWordNet data.
The FinnWordNet development project ran in 2010–2012. The development of FinnWordNet was funded by the FIN-CLARIN and META-NORD projects. The META-NORD project received funding from the European Union’s ICT Policy Support Programme as part of the Competitiveness and Innovation Framework Programme under grant agreement no. 270899.
The following people participated in the FinnWordNet project:
The FinnWordNet information page has been moved to the Portal of the Language Bank of Finland, and the download location to the Download service of the Language Bank of Finland. The information page has also been updated. Updating the demos and transducers is still partly in progress.
Version 2.0 of FinnWordNet data has been released with thousands of new word senses added and hundreds of existing ones corrected. The data is downloadable and in use in the Web search interface. The search interface no longer asks you to rate the synonymy of random words.
As of version 2.0, FinnWordNet has been extended beyond being a translation of Princeton WordNet by adding new synsets as hyponyms of existing synsets (without glosses and English translations). The new synsets correspond to senses of common Finnish compound words.
The primary data format is now a relational format. See the corresponding README file in the downloadable package for more information.
FinnWordNet 2.0 contains 120,449 synsets (2,790 more than version 1.1.2), 208,645 word senses (16,845 more), 140,515 unique words (9,251 more) and 244,742 translation relations (14,695 more). Some of the additions and corrections are based on the suggestions received from users of FinnWordNet. All feedback is welcome: fin-clarin (at) helsinki.fi.
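The totals for version 1.1.2 are not stated directly, but they follow from the figures above by subtraction. The short sketch below does that arithmetic and also derives the relative growth; the variable names are only illustrative.

```python
# Figures for version 2.0 and the reported increases over version 1.1.2.
v2_0 = {
    "synsets": 120_449,
    "word senses": 208_645,
    "unique words": 140_515,
    "translation relations": 244_742,
}
added = {
    "synsets": 2_790,
    "word senses": 16_845,
    "unique words": 9_251,
    "translation relations": 14_695,
}

# Implied totals for version 1.1.2: the 2.0 total minus the increase.
v1_1_2 = {name: v2_0[name] - added[name] for name in v2_0}

for name, old_total in v1_1_2.items():
    growth = 100 * added[name] / old_total
    print(f"{name}: {old_total} in 1.1.2, {v2_0[name]} in 2.0 (+{growth:.1f} %)")
```

This gives 117,659 synsets, 191,800 word senses, 131,264 unique words and 230,047 translation relations for version 1.1.2.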
For more information, please see the NEWS file in the downloadable package.