Mink

Mink is a browser-based tool, available at kielipankki.fi/future/mink, to which users logged in via Haka can upload their own text materials for processing. Mink supports the following file formats: plain text (UTF-8), XML (the analysis pipeline preserves the XML structures), Microsoft Word (.docx), Open Document (.odt), PDF, and CoNLL-U.
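Before uploading, it can help to check locally that a file looks like one of the formats listed above. The following is a minimal sketch and not part of Mink: the extension mapping (in particular `.txt` and `.conllu`) and the function name `check_upload_candidate` are assumptions made for illustration, and Mink itself may identify formats differently.

```python
from pathlib import Path

# Assumed mapping from file extension to the formats named above.
SUPPORTED_EXTENSIONS = {
    ".txt": "plain text (UTF-8)",
    ".xml": "XML",
    ".docx": "Microsoft Word",
    ".odt": "Open Document",
    ".pdf": "PDF",
    ".conllu": "CoNLL-U",
}

# Formats that are expected to be UTF-8 encoded text files.
UTF8_TEXT_EXTENSIONS = {".txt", ".conllu"}


def check_upload_candidate(path):
    """Return (ok, message) for a file a user intends to upload."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        return False, f"unsupported extension: {suffix or '(none)'}"
    if suffix in UTF8_TEXT_EXTENSIONS:
        try:
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            return False, "file is not valid UTF-8"
    return True, SUPPORTED_EXTENSIONS[suffix]
```

The check is deliberately shallow: it looks only at the extension and, for text formats, at the encoding, so it cannot tell whether a file is well-formed XML or valid CoNLL-U.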

You can run advanced searches on your own text corpora in the Korp environment that is accessible within the Mink service. If necessary, the texts can first be automatically parsed and annotated in Mink, which improves Korp’s search capabilities. For now, Mink supports lemmatization (reducing words to their base forms) as well as morphological and dependency-based syntactic analysis for Finnish, Swedish, and English text, and named-entity recognition for English text. In addition to viewing the analysis results in Korp, you can save them to your own computer.
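To make these annotation layers concrete, here is a sketch of reading lemmas, morphological features, and dependency relations from CoNLL-U text, one of the formats Mink accepts. The ten column names follow the CoNLL-U specification. The sample sentence and its annotations are hand-written for illustration and are not Mink output, and nothing here implies that Mink's downloadable results use this format.

```python
# The ten tab-separated columns defined by the CoNLL-U format.
CONLLU_FIELDS = ["id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc"]


def parse_conllu(text):
    """Parse CoNLL-U text into a list of sentences (lists of token dicts)."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
        elif line.startswith("#"):
            continue  # sentence-level comment or metadata
        else:
            values = line.split("\t")
            if len(values) != len(CONLLU_FIELDS):
                raise ValueError(f"expected 10 columns, got {len(values)}")
            current.append(dict(zip(CONLLU_FIELDS, values)))
    if current:
        sentences.append(current)
    return sentences


# Hand-written illustrative sample (not produced by Mink).
SAMPLE = (
    "# text = Dogs bark.\n"
    "1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\t_\tMood=Ind|Tense=Pres|VerbForm=Fin\t0\troot\t_\tSpaceAfter=No\n"
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
)

for token in parse_conllu(SAMPLE)[0]:
    print(token["form"], token["lemma"], token["upos"],
          token["feats"], token["head"], token["deprel"])
```

Each token carries its base form (`lemma`), its morphological analysis (`upos`, `feats`), and its place in the dependency tree (`head`, `deprel`), which are the kinds of annotation a search can then be built on.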

With Mink, users can prepare, test, and explore their own Korp corpus. For now, only the user themselves can access the materials they have transferred to Mink’s Korp environment. However, a corpus can be made available to other researchers through the Language Bank’s shared Korp service by separate arrangement. At a later stage, the plan is to make it possible to share data stored in Mink with, for example, members of one’s own research group.

For now, more detailed instructions on how to use Mink can be found on the Swedish Språkbanken website. Please note that the Mink environment developed by Språkbanken has been slightly adapted for users of the Finnish Kielipankki, so not all features may work in exactly the same way in both Mink services.

The Mink platform is currently being further developed, and the Language Bank welcomes feedback on its functionality; see contact information.

Access Mink

Mink (Språkbanken Text)

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2026042421

Mink: analyzing your own materials and exporting them to Korp

At kielipankki.fi/future/mink you can use Mink, a browser-based tool to which users logged in via Haka can upload their own text materials for processing. The formats supported in Mink are plain text (UTF-8), XML (whose structures the analysis pipeline preserves), Microsoft Word (.docx), Open Document (.odt), PDF, and CoNLL-U.

You can run advanced searches on your own text materials in the Korp environment shown inside the Mink service. If needed, the texts can first be automatically parsed and annotated in Mink, which improves the search options in Korp. For now, the Mink platform supports lemmatization (reducing words to their base forms) and morphological and dependency-syntactic analysis for Finnish, Swedish, and English text, as well as named-entity recognition for English text. Besides being viewed in Korp, the analysis results can also be saved back to your own computer.

With Mink, users can thus prepare, try out, and explore their own Korp corpus. For now, only the user can access the material they have transferred to Mink’s Korp environment. However, it can be separately agreed that the corpus is made available to other researchers through Kielipankki’s shared Korp service. At a later stage, the intention is that material in Mink could be shared with, for example, members of one’s own research group.

For now, more detailed instructions for using Mink can be found on the website of the Swedish Språkbanken. Note that the Mink environment developed at Språkbanken has been adapted somewhat for users of the Finnish Kielipankki, so not all features necessarily work the same way in both Mink services.

The Mink environment is still being developed, and Kielipankki welcomes feedback on how Mink works; see contact information.

Open Mink

Mink (Språkbanken Text)

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2026042422

COMEDI

COMEDI is a Web-based editor for CMDI-conformant metadata, as adopted by CLARIN, hosted by the CLARINO Bergen CLARIN Centre.

With COMEDI, you can interactively create new CMDI Metadata records, or upload and modify existing metadata. A metadata record in COMEDI can be exported as a CMDI XML file. It can also be harvested with OAI-PMH.
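As a rough illustration of OAI-PMH harvesting, the sketch below builds a `ListRecords` request URL and extracts record identifiers from a response. The verb and argument names come from the OAI-PMH 2.0 protocol. The base URL `https://example.org/oai` and the metadata prefix `cmdi` are placeholders rather than COMEDI's actual values, and the sample response is hand-written.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespace of OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # In the protocol, resumptionToken is an exclusive argument:
        # it replaces metadataPrefix when continuing a partial list.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def record_identifiers(response_xml):
    """Return the header identifiers found in a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [el.text
            for el in root.iterfind(".//oai:header/oai:identifier", OAI_NS)]


# Hand-written, heavily abbreviated sample response (placeholder identifiers).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:record-1</identifier></header></record>
    <record><header><identifier>oai:example.org:record-2</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("https://example.org/oai", "cmdi"))
print(record_identifiers(SAMPLE_RESPONSE))
```

A real harvester would fetch the URL, check for a `resumptionToken` element in the response, and repeat the request until the list is complete; consult the repository's own documentation for its base URL and supported metadata prefixes.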

To use COMEDI, you have to sign in. Signing in is required so that your metadata records are write-protected from other users.

The metadata for all resources provided by the Language Bank of Finland is maintained and provided using COMEDI.

For further information on how to use COMEDI, please see the documentation.

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2026012901

Lakitutka

Lakitutka, “the Law Radar”, compiles public documents generated during the legislative process into a single search service, covering the entire preparatory process. At the same time, Lakitutka offers everyone better access to background material on legislation that affects their everyday lives.

Lakitutka’s document materials are retrieved from the Government Project Register Hankeikkuna, the Eduskunta Open Data Online Service, and the Finlex Open Data service.

This content search tool was developed at the University of Turku.

Access the website

More information about the project

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2026012903

Lääketutka

Lääketutka, “the Medicine Radar”, is a real-time, open web service that provides analytics about discussions of health, medicines, and symptoms in the Suomi24 discussion forum. It allows anyone to discover connections between drugs, symptoms, and dosages as they appear in the discussion data.

This content search tool was developed within a data science project by Futurice’s Chilicorn Fund and Citizen Mindscapes with data provided by Aller.

Access the website

More information about the project

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021101305

CLARIN Federated Content Search

This tool lets you run a single, centralized query across all the resources provided by CLARIN centres.

The Aggregator application is a part of the CLARIN-FCS common federated content search infrastructure. It serves as a user interface for querying CLARIN resources and displaying the search results. The Aggregator communicates with components called endpoints, which are provided as a service by all centres that participate in the federated content search. Each endpoint provides access to one or more searchable resources. The user can select a specific resource or resources, based on the resource name or on the language, or search through all of them. The content of these resources is searched with the query supplied to the endpoint. Each endpoint returns its results for the query, and the Aggregator collects the responses from all the endpoints and displays them to the user.
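The flow described above can be sketched as a simple fan-out and collect loop. This is a toy model under stated assumptions, not the Aggregator's implementation: the `Endpoint` class, the centre and resource names, the sample texts, and the substring matching are all hypothetical, and real endpoints are queried over the network rather than in memory.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    """Hypothetical stand-in for the endpoint run by one centre."""
    centre: str
    # resource name -> (language code, list of text snippets)
    resources: dict = field(default_factory=dict)

    def search(self, query, resource_names):
        """Search the selected resources and return matching snippets."""
        hits = []
        for name in resource_names:
            _, texts = self.resources[name]
            hits += [(self.centre, name, text)
                     for text in texts if query.lower() in text.lower()]
        return hits


def aggregate(endpoints, query, language=None, names=None):
    """Send the query to each endpoint that has selected resources; collect hits."""
    results = []
    for endpoint in endpoints:
        # Select resources by language and/or name; None means "no filter".
        selected = [
            name for name, (lang, _) in endpoint.resources.items()
            if (language is None or lang == language)
            and (names is None or name in names)
        ]
        if selected:
            results.extend(endpoint.search(query, selected))
    return results


endpoints = [
    Endpoint("Centre A", {"news-fi": ("fi", ["kissa istuu", "koira juoksee"])}),
    Endpoint("Centre B", {"blogs-sv": ("sv", ["en katt sover"]),
                          "forum-fi": ("fi", ["musta kissa"])}),
]
print(aggregate(endpoints, "kissa", language="fi"))
```

The point of the design is that the user interface never touches the resources directly: it only decides which endpoints and resources a query goes to and merges whatever comes back.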

Access the FCS Aggregator

Content Search Tutorial

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021101304

Text reuse in the Swedish-language press, 1645-1918

This search engine for finding and analyzing text reuse clusters in the Swedish-language press from 1645 to 1918 covers material from Finland, Sweden, and the United States. It is offered by the Society of Swedish Literature in Finland (SLS).

See the guidelines for using the search engine.

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021101302

Texthammer

Texthammer is a search and analysis toolkit for parallel corpora provided by the University of Tampere.

For more details, please see the user manual (pdf).

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021101111

Korp, Kielipankki version

Korp is a web-based concordance search service developed by Språkbanken (Dept. of Swedish, University of Gothenburg, Sweden). It can be used for corpus queries based on morphosyntactic analysis and various other features. Korp, Kielipankki version, gives its users access to extensive collections of texts in Finnish and Finland Swedish.
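Queries over annotated tokens of this kind are commonly written in the CQP query language of the Corpus Workbench, which Korp builds on. The sketch below assembles such expressions as plain strings. The helper names `cqp_token` and `cqp_sequence` are hypothetical, and the attribute names `lemma` and `pos` and their values are assumptions, since the attributes actually available depend on the corpus.

```python
def cqp_token(**constraints):
    """Build one CQP token expression from attribute="value" constraints."""
    if not constraints:
        return "[]"  # an empty token expression matches any token
    parts = []
    for attribute, value in constraints.items():
        if '"' in value:
            # Quoting rules are left out of this sketch on purpose.
            raise ValueError("double quotes in values are not handled here")
        parts.append(f'{attribute} = "{value}"')
    return "[" + " & ".join(parts) + "]"


def cqp_sequence(*tokens):
    """Join token expressions into a query matching consecutive tokens."""
    return " ".join(tokens)


# A token whose lemma is "talo" and whose part of speech is "N"
# (assumed attribute names and tag values).
print(cqp_token(lemma="talo", pos="N"))

# Any token tagged "A" immediately followed by a form of "talo".
print(cqp_sequence(cqp_token(pos="A"), cqp_token(lemma="talo")))
```

Note that CQP treats the quoted values as regular expressions, so characters such as `.` or `*` in a value have their regular-expression meaning.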

Manual for the version of Korp used by Kielipankki, the Language Bank of Finland

Latest version: Korp version 9.1.0
Links: Metadata and license; Attribution instructions; Access to Korp

This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2021101103

Last modified on 2026-02-18

Contact

The Language Bank's technical support:
kielipankki (at) csc.fi
tel. +358 9 4572001

Requests related to language resources:
fin-clarin (at) helsinki.fi
tel. +358 29 4129317

More contact information