
This page showcases the project deliverables (see template and instructions for reporting).
FIN-CLARIAH Funding period 2024-2025 (in progress)
FIN-CLARIAH Funding period 2026-2029 (forthcoming)
FIN-CLARIAH Funding period 2022-2023 (completed)
In progress (funding period 2024-2025)
| D1.1.1 | Named-entity annotation | 2024-09 |
| D1.1.2 | Ingesting new unstructured resources | 2025-11 |
| D1.2.1 | Data collection for minority languages | 2024-09 |
| D1.2.2 | Transcription service for minority languages | |
| D1.3.1 | Tools and guidelines for video processing | 2025-06 |
| D2.1.1 | Integrate environment for personal data | 2024-09 |
| D2.1.2 | Framework for processing copyrighted data for verification of research | |
| D2.2.1 | Transformer training for specialised data | |
| D2.2.2 | Transformer adaptation for specialised data | 2025-12 |
| D2.3.1 | Remote access to text data repositories | |
| D2.3.2 | Remote access to video data repositories | 2025-12 |
| D2.4.1 | Term definition discovery procedures | 2024-09 |
| D2.4.2 | Initializing terminology collections | 2025-12 |
| D3.1.1 | Comprehensive data versioning | 2024-09 |
| D3.1.2 | Workflow automation and version syncing | 2025-09 |
| D3.2.1 | Ingestion of structured data from Finna (NLF) | |
| D3.2.2 | Ingestion of heritage and societal data from Sampo | 2025-06 |
| D3.2.3 | Ingestion of multimodal societal data from the Web | 2025-12 |
| D3.3.1 | Automated metadata of archival data from NAF | |
| D3.3.2 | Automated harmonisation and enrichment of metadata | |
| D3.3.3 | Machine-learning-based enrichment of social media | |
| D3.3.4 | Machine-learning-based enrichment of textual and audio-visual social media contents | 2025-11 |
| D3.3.5 | Forensic linguistics corpus and search interface C.R.I.M.E | 2025-09 |
| D3.3.6 | Reliable image labelling with computer vision | 2025-09 |
| D4.1.1 | Analysis of video stream interactions with AI solutions | |
| D4.1.2 | Analysis Tools for Multimodal Born-digital Social Media | 2024-12 |
| D4.1.3 | Advanced analytic social media tools and data | 2025-12 |
| D4.1.4 | Analysis of multimodal properties of naturalistic speech | 2025-12 |
| D4.1.5 | Analysis of multimodal cultural heritage | 2025-12 |
| D4.1.6 | Enrich survey data with register data and unstructured text | 2025-06 |
| D5.1.1 | Community engagement: multimodal societal data researchers | 2024-09 |
| D5.1.2 | Community engagement: multimodal heritage researchers | 2025-06 |
| D5.1.3 | Evidence-based infrastructure development | 2024-12 |
| D5.1.4 | Educational resource development | 2025-12 |
Completed (funding period 2022-2023)
| D1.1.1 | Updating LBF resource selection | 2022-09 |
| D1.1.2 | Ingesting new unstructured resources | 2023-12 |
| D1.2.1 | Forced-Alignment Service | 2022-09 |
| D1.2.2 | Transcription Service for Finnish Interviews | 2023-09 |
| D1.3.1 | Corpora of non-standard language | 2022-09 |
| D1.3.2 | System for detecting toxic language | 2023-06 |
| D1.3.3 | Models for retrieving QA pairs from the web | 2023-09 |
| D1.3.4 | QA pair corpora | 2023-12 |
| D2.1.1 | Licensing agreements for personal data | 2022-09 |
| D2.1.2 | Licensing agreements for special categories | 2023-06 |
| D2.2.1 | Speech recognition for L2 | 2022-12 |
| D2.2.2 | Speech recognition for L2 update | 2023-12 |
| D2.3.1 | Licensing interpretation sessions | 2022-12 |
| D2.3.2 | Aligning and retrieving | 2023-12 |
| D2.4.1 | Term discovery procedures | 2022-09 |
| D2.4.2 | Terminology application | 2023-06 |
| D2.4.3.1 | Initializing terminology collections | 2022-09 |
| D2.4.3.2 | Initializing terminology collections | 2023-06 |
| D2.4.3.3 | Initializing terminology collections | 2023-12 |
| D2.5.1 | Test performances storage | 2022-12 |
| D2.5.2 | Analysis and annotation tools for learner performances | 2023-12 |
| D3.1.1 | Initial NLF data | 2022-09 |
| D3.1.2 | Ingestion framework | 2022-12 |
| D3.1.3 | Versioning support | 2023-06 |
| D3.1.4 | Incremental update process | 2023-12 |
| D3.2.1 | Pipeline for transferring archival data | |
| D3.2.2 | Annotation & analysis tools for NARC data | 2023-12 |
| D3.3.1 | Qualitative survey data concept network | 2022-09 |
| D3.3.2 | R package for data concept network | |
| D3.4.1 | Livestream data collector | 2022-12 |
| D3.5.1 | Text network analysis of political texts | |
| D3.5.2 | Text network analysis of political texts | |
| D4.1.1 | Harmonized FNB | 2022-09 |
| D4.1.2 | Harmonization code | 2022-12 |
| D4.1.3 | Visualisation workflow | 2023-06 |
| D4.1.4 | R/Python module | 2023-12 |
| D4.2.1 | LDF knowledge extraction tools | 2022-12 |
| D4.2.2 | Parliament of Finland Ontology | 2023-12 |
| D4.3.1 | Subsetting tool | 2022-09 |
| D4.3.2 | Statistical overviews and bias detection | 2023-06 |
| D4.3.3 | Representative Twitter dataset | 2023-12 |
| D5.1.1 | User experience questionnaire | 2022-09 |
| D5.1.2 | Log data collection and analysis | 2023-06 |
| D5.1.3 | Protocol for collecting workshop data | 2023-12 |
| D5.2.1 | Actor network | 2022-12 |
| D5.2.2 | Educational material | 2023-12 |
Last modified on 2025-11-27
