Downloads
Name
Size
Description
Digilib/
-
HFST-SweNER/
-
snowfrog/
-
SFNET/
-
SFNET Corpus
ELFA/
-
ELFA corpus
FBC/
-
Finnish Broadcast Corpus
FTC-B/
-
Finnish Text Collection
FTC/
-
Finnish Text Collection
UHLCS/
-
University of Helsinki Language Corpus Server
taajuussanasto9996/
-
Frequency Lexicon of the Finnish Newspaper Language
LAS2/
-
Advanced Finnish Learners’ Corpus
giellagas-north/
-
North Saami Sample Corpus
ScotsCorr/
-
Helsinki Corpus of Scottish Correspondence
GeM-HTB/
-
A Multimodal Corpus of Tourist Brochures
helpuhe/
-
The Longitudinal Corpus of Finnish Spoken in Helsinki
Fenno-Ugrica/
-
Fenno-Ugrica
HCS/
-
Helsinki Corpus of Swahili 2.0
FinnWordNet/
-
FinnWordNet
la-murre/
-
The Finnish Dialect Syntax Archive
SSDC/
-
Skolt Saami Documentation Corpus
SNC1/
-
Swedish-language N-grams from the Newspaper Collection of the National Library of Finland, 1770-1940
FNC1/
-
Finnish-language N-grams from the Newspaper Collection of the National Library of Finland, 1820-2000
reittidemo/
-
Route to the A Wing
AMPH/
-
amph Corpus
DSPCON/
-
Aalto University DSP Course Conversation Corpus
italian-letters/
-
Italian Letters from the 16th Century
SKN/
-
Samples of spoken Finnish
Ylilauta/
-
Ylilauta Corpus
CEAL/
-
CEAL corpus
acquis-ftb3/
-
The Finnish Sub-corpus of the JRC-Acquis Multilingual Parallel Corpus
Suomi24/
-
The Suomi 24 Corpus
coca/
-
Corpus of Contemporary American English
coha/
-
Corpus of Historical American English
glowbe/
-
Corpus of Global Web-Based English
fvcc_v1/
-
Finnish Verbal Colorative Constructions
finka/
-
Corpus of Border Karelia
opusparcus/
-
Open Subtitles Paraphrase Corpus for Six Languages
nlfcl/
-
Classics Library of the National Library of Finland
avoid/
-
Corpus of Age-related Voice Disguise
psychlingdesc/
-
Psycholinguistic Descriptives
rel-freq-fi-lit/
-
Relative frequencies in native and translated Finnish literary prose
opensubtitles-fi/
-
Finnish OpenSubtitles
wikipedia-fi/
-
Finnish Wikipedia
ORACC/
-
Open Richly Annotated Cuneiform Corpus
AI2D-RST/
-
A multimodal corpus of 1000 primary school science diagrams
finestbert/
-
FinEst BERT
montint-src/
-
Yves Montand in the USSR interviews, source
kipo/
-
Language Policy Programme for the Sign Languages of Finland
aku-egg/
-
Simultaneous recordings of speech and EGG
seuruu/
-
Follow-up Study of Dialects of Finnish, downloadable version
finntreebank/
-
Finnish TreeBank (FTB)
hc/
-
Helsinki Corpus of English Texts
termforum-src/
-
Terminology Forum Glossaries (selection), source
fi-parliament-asr/
-
Aalto Finnish Parliament ASR Corpus 2008-2020
wordvec/
-
Word embeddings trained with word2vec
tallvocabl2fi/
-
Measurements of 15 L2 Finnish learners' vocabularies
puhelahjat/
-
Donate Speech Corpus
coronavirus-ecorg/
-
The Coronavirus Corpus
movie-ecorg/
-
The Movie Corpus
uspenskij-4bat-par/
-
Parallel Corpus of the book "Four Battles", written by L. Uspenskij
textreuse-sv-src/
-
Text reuse clusters in the Swedish-language press 1645-1918
xmas-gospel-tts/
-
Christmas Gospel text-to-speech in four Uralic languages
tboneslim/
-
T-Bone Slim Corpus
digitala/
-
DigiTala (2019–2023)
ccmh-src/
-
Corpus of Old Church Slavonic Texts, source
KKS/
-
Karjalan kielen sanakirja (Dictionary of the Karelian Language, XML)
iijoki/
-
Iijoki Collection
YLE/
-
Archives of the Finnish Broadcasting Company
Wanca/
-
Collection of Uralic Languages
lonnrot/
-
Elias Lönnrot's letters
finsen/
-
FinnSentiment
lehdet90ff/
-
Finnish Magazines and Newspapers from the 1990s and 2000s
STT/
-
Finnish News Agency
finnish-tagtools/
-
Finnish Tagtools
finchat/
-
Finnish conversational chat corpus
semfinlex/
-
Finnish parliament and court documents
hfst-morphologies/
-
HFST morphologies for various languages
hallituskausi/
-
Hallituskausi Translation Memories
klk/
-
Newspaper Collection of the National Library of Finland
eduskunta/
-
Plenary Sessions of the Parliament of Finland
TSK/
-
Sanastokeskus TSK
cfinsl/
-
Sign Language Corpora (in Finland)