
D3.1.1: Initial NLF Data

Grant agreement: Academy of Finland no. 345610
Start date: 01-01-2022
Duration: 24 months

WP 3.1: Report on Initial NLF Data
Date of reporting: 2022-09

Report authors: Johanna Lilja (National Library of Finland), Tuula Pääkkönen (National Library of Finland)
Contributors: Martin Matthiesen (CSC)
Deliverable location: https://github.com/CSCfi/kielipankki-nlf-harvester


A basic concept for downloading the data exists. Apache Airflow has been chosen as the workflow management technology. A script has been created that downloads the METS XML files and then the ALTO XML files via Airflow. A CSC project has been created with the necessary data requests.
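
The "METS first, then ALTO" order can be illustrated with a minimal sketch. It is not the code from the harvester repository. It only assumes that a METS file lists its page-level files in a standard `fileSec`, so that the ALTO locations can be read from the METS file before anything else is fetched. The sample document, the `USE` values and the function name `list_file_locations` are hypothetical and chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the METS and XLink standards.
METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical, heavily trimmed METS document (illustration only;
# real METS files contain far more metadata).
SAMPLE_METS = """
<mets xmlns="http://www.loc.gov/METS/"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <fileSec>
    <fileGrp USE="alto">
      <file ID="ALTO00001">
        <FLocat LOCTYPE="URL" xlink:href="file://./alto/00001.xml"/>
      </file>
      <file ID="ALTO00002">
        <FLocat LOCTYPE="URL" xlink:href="file://./alto/00002.xml"/>
      </file>
    </fileGrp>
    <fileGrp USE="image">
      <file ID="IMG00001">
        <FLocat LOCTYPE="URL" xlink:href="file://./img/00001.jp2"/>
      </file>
    </fileGrp>
  </fileSec>
</mets>
"""


def list_file_locations(mets_xml: str, use: str) -> list[str]:
    """Return the xlink:href of every file in the file groups whose USE matches."""
    root = ET.fromstring(mets_xml)
    locations = []
    for group in root.iterfind(f".//{{{METS_NS}}}fileGrp"):
        if group.get("USE") != use:
            continue  # skip groups of other kinds, e.g. page images
        for flocat in group.iterfind(f".//{{{METS_NS}}}FLocat"):
            href = flocat.get(f"{{{XLINK_NS}}}href")
            if href is not None:
                locations.append(href)
    return locations


if __name__ == "__main__":
    # Step 1 would download the METS file; step 2 would fetch each location below.
    for location in list_file_locations(SAMPLE_METS, "alto"):
        print(location)
```

In a workflow manager such as Airflow, the METS download and the ALTO downloads would typically be separate tasks, with the second depending on the output of the first.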

More information

FIN-CLARIAH WP3.1 presentation from DARIAH-FI workshop on November 9th, 2022.
