FIN-CLARIAH Deliverables (2024-2025)

This page outlines the project deliverables for 2024-2025 (completed).

FIN-CLARIAH Funding period 2024-2025

Module 1: Natural Language Processing (NLP)

W1.1 Text processing and annotation environments

D1.1.1 Named-entity annotation 2024-09
D1.1.2 Ingesting new unstructured resources 2025-11

W1.2 Speech processing and annotation

D1.2.1 Data collection for minority languages 2024-09
D1.2.2 Transcription service for minority languages 2025-09 2025-11

W1.3 Video processing and annotation

D1.3.1 Tools and guidelines for video processing 2025-06

Module 2: Language Research Infrastructure (LRI)

W2.1 Personal and Copyrighted Research Data

D2.1.1 Integrate environment for personal data 2024-09
D2.1.2 Framework for processing copyrighted data for verification of research 2025-09 2025-11

W2.2 Training environments

D2.2.1 Transformer training for specialised data 2024-12 2025-06
D2.2.2 Transformer adaptation for specialised data 2025-12

W2.3 Translation and Interpretation

D2.3.1 Remote access to text data repositories 2024-12 2025-09
D2.3.2 Remote access to video data repositories 2025-12

W2.4 Terminology

D2.4.1 Term definition discovery procedures 2024-09
D2.4.2 Initializing terminology collections 2025-12

Module 3: Structuring Data

W3.1 Data Management

D3.1.1 Comprehensive data versioning 2024-09
D3.1.2 Workflow automation and version syncing 2025-09

W3.2 Data Ingestion

D3.2.1 Ingestion of structured data from Finna (NLF) 2025-03 2025-06
D3.2.2 Ingestion of heritage and societal data from Sampo 2025-06
D3.2.3 Ingestion of multimodal societal data from the Web 2025-12

W3.3 Enrichment

D3.3.1 Automated metadata of archival data from NAF 2025-03 2025-06
D3.3.2 Automated harmonisation and enrichment of metadata 2024-12 2025-03
D3.3.3 Machine-learning-based enrichment of social media 2025-06 2025-09
D3.3.4 Machine-learning-based enrichment of textual and audio-visual social media contents 2025-11
D3.3.5 Forensic linguistics corpus and search interface C.R.I.M.E 2025-09
D3.3.6 Reliable image labelling with computer vision 2025-09

Module 4: Analyzing Structured Data

W4.1 Analytical Support for computational SSH

D4.1.1 Analysis of video stream interactions with AI solutions 2025-06 2025-09
D4.1.2 Analysis Tools for Multimodal Born-digital Social Media 2024-12
D4.1.3 Advanced analytic social media tools and data 2025-12
D4.1.4 Analysis of multimodal properties of naturalistic speech 2025-12
D4.1.5 Analysis of multimodal cultural heritage 2025-12
D4.1.6 Enrich survey data with register data and unstructured text 2025-06

Module 5: Information Interaction (IIA)

W5.1 Evidence-Based Infrastructure Development

D5.1.1 Community engagement: multimodal societal data researchers 2024-09
D5.1.2 Community engagement: multimodal heritage researchers 2025-06
D5.1.3 Evidence-based infrastructure development 2024-12
D5.1.4 Educational resource development 2025-12

 



Last modified on 2026-06-05
