22.12.2025
Newsletter of the Language Bank of Finland 2/2025
Researchers of the Month in 2025
- Tamás Grósz – automatic speech recognition (ASR)
- Marko Jouste – Giellagas Corpus of Spoken Sámi languages
- Simo Määttä – translation studies, interpreting, critical sociolinguistics
- Pekka Posio – Ibero-Romance languages, CoLaGe Corpus
- Daniela Piipponen – historical linguistics, Digisvenska project
- Jörg Tiedemann – language technology, OPUS ecosystem
- Rea Peltola – semantics of animacy
- Idastiina Valtasalmi – linguistic affect, Easy Language
- Inka Rantakallio – Finnish female and non-binary rap artists
- Dejan Porjazovski – systems for spoken language understanding (SLU)
- Krista Ojutkangas – corpus-based Finnish language research
- Satu Siltaloppi – Finland-Swedish Sign Language
All previous researchers of the month can be found in the archive.
Do you know researchers who use the Language Bank of Finland and who might be good candidates for Researcher of the Month? Could you be one of them? Let us know: https://www.kielipankki.fi/support/contact-us/
New, updated or extended corpora in 2025
- The Parsed Corpus of Scottish Correspondence, source
- Achemenet Babylonian texts – Kielipankki version 2020-12, Korp
- BALT: Babylonian Administrative and Legal Texts – Kielipankki version 2025-02, Korp
- The extension of the Suomi24 Corpus covering the years 2021-2023, as well as an augmented version of the years 2001-2020 in Korp
- The extended and updated corpus collection Suomi24 covering the years 2001-2023 available in VRT format via the download service
- Neo-Babylonian Lexical Networks and the underlying dataset
- Oracc 2024 (Open Richly Annotated Cuneiform Corpus, Korp version, November 2024)
- The Donera Prat corpus of Finland-Swedish speech, for academic research use
- New resource: The Corpus of Finnish Sign Language, part 3
- The Corpus of Finland-Swedish Sign Language: conversations, part 2
- Yle Finnish News Archive 2022-2024, source
- Linguistic Landscape of Bishkek, source
- Corpus for the study of Language and Gender in Mexico and Spain (CoLaGe), source and text-only versions
- The Corpus of Sociolinguistic Variation in the Province of Satakunta, source
- Citation Database of Fennistic Dialect Dissertations, open version, source
- Feminist women and non-binary rappers in Finland, source
- Yle Finnish News Archive 2022–2024 (release candidate)
Would you like to offer your own resource to be distributed via Kielipankki?
Submit the basic details about your own resource to the Language Bank of Finland: http://urn.fi/urn:nbn:fi:lb-2021121422
Corpora from the Language Bank are now available for Federated Content Search
The Language Bank of Finland is connected to CLARIN Federated Content Search (FCS). Via FCS, you can run the same query across text corpora held in CLARIN repositories in different countries. From the Language Bank, the Finnish and Swedish subcorpora of The Newspaper and Periodical Corpus of the National Library of Finland and the latest Suomi24 corpus are currently included. Try Federated Content Search
Updates coming soon: Korp and Mink
- A new version of the Korp concordancing service will be made available for beta testing in spring 2026. The main goal of the upgrade is to synchronize the Korp version in the Language Bank of Finland with the most recent version of the Korp platform developed and provided by Språkbanken (the Language Bank in Sweden). Please note that it will not be possible to make all corpora with all the features instantly available on the new Korp platform. During the transition period, the new beta version and the old Korp version will be available in parallel. Please follow the Language Bank website for further information on the schedule.
- The Mink platform will also be made available for the users of the Language Bank of Finland in spring 2026. In Mink, users can log in with their home university user account, upload their own texts, annotate them with automatic tools if required, and query the resulting private corpus via Korp. Read more about Mink at Språkbanken.
Events in 2025
- The Origins of Emesal workshop (Helsinki, June 2025) was attended by about 50 Assyriologists both on-site and via livestream. Several resources produced by the ANEE Centre of Excellence are available via the Language Bank: Achemenet, BALT, ORACC, and the ANEE Lexical Portal.
- The LAREINA workshop Accessing Data for Large Language-based Text and Speech Models (Helsinki, November 2025) brought together experts from industry, academia, and public administration. The workshop addressed issues of data access and related legislation, and showcased examples of language technology solutions that can be built in collaboration with language minorities, by leveraging archives, or to meet public sector service needs.
- The FIN-CLARIAH plenary meeting Annotating Social Data (Helsinki, November 2025) discussed current practices in annotating datasets across the humanities and social sciences, the possibilities for processing sensitive data, and the responsible use of AI tools in the annotation pipeline.
The Language Bank is a beta tester of the Language Data Space Connector
The European Language Data Space (LDS) aims to build a data marketplace for the exchange of language resources in the public and private sectors, in line with the EU Data Strategy. The Language Bank of Finland is involved in testing the LDS Connector software that is being developed by the LDS project.
Courses and training materials
The online course Corpus Linguistics and Statistical Methods (5 ECTS) will be offered again from 12.1. to 27.2.2026. The course is open free of charge to degree students from all universities, provided that the maximum group size is not exceeded.
News from the FIN-CLARIAH research infrastructure
FIN-CLARIAH Roadshow
The FIN-CLARIAH project will be presented in different locations through a series of roadshow events where you can learn more about current research, methods, tools and services in the humanities and social sciences. Of course, the Language Bank of Finland will also be there! The next event is currently being planned for early 2026 at Aalto University. More information about the roadshow events will be posted on the Language Bank website.
CLARIN funding opportunities
Did you know that CLARIN offers grants for, e.g., researcher and teacher mobility, events and training activities? Check out the funding opportunities and current calls: https://www.clarin.eu/funding
The Language Bank of Finland is on vacation from 23.12.2025 to 6.1.2026
We wish you a relaxing holiday season!
Mietta Lennes and Wilhelmina Dyster
Project Planners
fin-clarin@helsinki.fi
Subscribe/unsubscribe to this newsletter: https://www.kielipankki.fi/language-bank/newsletter-subscription/
See also CLARIN Newsflash: https://www.clarin.eu/content/newsflash
