Finnish OpenSubtitles 2017 and Finnish Wikipedia 2017 (source materials) in the download service

The corpora "Finnish OpenSubtitles 2017, source" and "Finnish Wikipedia 2017, source" are available in Kielipankki’s download service at korp.csc.fi/download.

Finnish OpenSubtitles 2017 contains Finnish subtitles for movies and TV series from Opensubtitles.org. Resource metadata: http://urn.fi/urn:nbn:fi:lb-2019110801

Finnish Wikipedia 2017 contains all Finnish-language articles of the online encyclopedia Wikipedia that were available on 1 January 2018. Resource metadata: http://urn.fi/urn:nbn:fi:lb-2019110803

Both corpora are part of Tatu Huovilainen’s research data. He parsed the corpora in 2018 with the Turku Dependency Treebank (TDT) parser, developed by the Turku BioNLP group, and deposited the data in Kielipankki as source material. See Tatu Huovilainen’s interview at: https://www.kielipankki.fi/uutiset/kuukauden-tutkija-tatu-huovilainen/

The corpora are also part of the Finnish word frequency corpus Psycholinguistic Descriptives (http://urn.fi/urn:nbn:fi:lb-2018081601).