New corpus version (beta): The Finnish Sub-corpus of the Newspaper and Periodical Corpus of the National Library of Finland, version 2

The Finnish Sub-corpus of the Newspaper and Periodical Corpus of the National Library of Finland version 2, Korp (klk-fi-v2-korp) is now available in Korp as a beta test version. The resource is part of the resource group "KLK" in Kielipankki.

The corpus contains newspapers and periodicals from the digital collections of the National Library of Finland, published in the years 1771–2021. It contains over 22 billion words in total, more than four times as many as the previous version of the corpus. In absolute terms, a particularly large amount of the new material is from the years 1911–1945; in relative terms, the increase is also large for 1946–2000; and the material from 2001–2021 is entirely new. In the Korp corpus selector, the corpus is split into subcorpora by year of publication.

Resource metadata: http://urn.fi/urn:nbn:fi:lb-202009152
Open the corpus in Korp

Good to know

Please note that Korp searches may be slow because of the large size of the corpus, so it is best to start by searching a small subset of the corpus. If selecting a subset (years) of the corpus in the Korp corpus selector appears slow, try restarting the browser or using a different browser.

The word picture feature for this corpus in Korp is not yet available for all years. The data for the word pictures is still being imported, and the process will take a few more weeks to complete.

The language of each sentence in the corpus has been identified with the HeLI-OTS language identifier. The corpus includes every text (publication page) on which at least one sentence was identified as Finnish. The texts come from the following sources: The Finnish Sub-corpus of the Newspaper and Periodical Corpus of the National Library of Finland (version 1); The Swedish Sub-corpus of the Newspaper and Periodical Corpus of the National Library of Finland (version 1); and material produced by the National Library of Finland after 2013 that has not previously been available in the Language Bank of Finland.
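
The page-level inclusion rule described above can be illustrated with a minimal Python sketch. It is not the pipeline used to build the corpus: the function names, the toy identifier and the language label "fin" are assumptions made for illustration only, and a real setup would call an actual language identifier such as HeLI-OTS in place of the stand-in.

```python
from typing import Callable, Iterable, List


def select_finnish_pages(
    pages: Iterable[List[str]],
    identify_language: Callable[[str], str],
    target: str = "fin",  # assumed label for Finnish; adjust to the identifier used
) -> List[List[str]]:
    """Keep every page on which at least one sentence is identified as `target`."""
    return [
        page
        for page in pages
        if any(identify_language(sentence) == target for sentence in page)
    ]


def toy_identifier(sentence: str) -> str:
    """Hypothetical stand-in for a real language identifier (not HeLI-OTS)."""
    finnish_markers = {"ja", "on", "ei", "että"}
    words = set(sentence.lower().split())
    return "fin" if words & finnish_markers else "und"


if __name__ == "__main__":
    # Each page is a list of sentences.
    pages = [
        ["Sää on kaunis.", "Vädret är vackert."],  # mixed page: kept
        ["Det här är en svensk sida."],            # no Finnish sentence: dropped
    ]
    for page in select_finnish_pages(pages, toy_identifier):
        print(page)
```

Because a single Finnish sentence is enough for a page to be included, a page kept by this rule may also contain sentences in other languages, which is worth bearing in mind when interpreting search results.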

The links in the corpus to page images and PDF files lead to the digital collections of the National Library of Finland. For most publications since 1940, page images and PDF files may only be used for research purposes, and accessing them requires logging in to the service of the National Library and submitting an application.

As always, feedback on the corpus is welcome, either via the feedback form or by email to fin-clarin (at) helsinki.fi.

A trend diagram of the hits for different forms of the word 'korona' in the corpus klk-fi-v2, available via Korp.

Concordance view of different forms of the word 'korona' in the corpus klk-fi-v2, available via Korp.