Project: FIN-CLARIAH
Grant agreement: Research Council of Finland no. 358720
Start date: 01-01-2024
Duration: 24 months
WP 3.3: Report on Automated harmonisation and enrichment of metadata
Date of reporting: 18-03-2025
Report authors: Akewak Jeba (University of Turku), Leo Lahti (University of Turku)
Contributors: Julia Matveeva (University of Turku), Muluh Geraldson (University of Turku)
Deliverable location: github.com/fennicahub (see below for specific outputs)
Keywords: data science, metadata, bibliographies, enrichment
This deliverable provides resources for gathering, harmonizing, enriching, and summarizing structured metadata from the National Library of Finland, in particular the Finnish National Bibliography Fennica. The open data and workflows can be used in research, training, and outreach. Further metadata resources are available for complementary cultural heritage collections from archives, libraries, museums, and other actors. This deliverable expands the scope of the metadata collections that are interlinked with the R statistical environment, strengthening the integration of Finna and Finto with Fennica.
Earlier work with Fennica, including metadata harmonization and visualization workflows, is described in FIN-CLARIAH (2022–23) Deliverable D4.1.3, which focused on preparing and publishing the cleaned Fennica dataset along with interactive tools for analysis and presentation.
This deliverable consists of the following resources:
1. A systematic approach to retrieving Finna metadata into open computing environments is implemented in the open-source R package finna. The package retrieves data through the Finna REST API and the OAI-PMH API. The release version is available from the CRAN repository.
2. Data science methods for enriching structured metadata from Finna and Fennica, based on actor cross-linking, are provided by the finto R package. The package gives access to the Finto keyword service (finto.fi) from the R statistical environment and allows interaction with the service. Examples of Fennica author enrichment using Kanto/Finto are available in the package vignette.
3. Data analysis and visualization techniques that support the research use of cultural heritage metadata collections are provided by the finna package and demonstrated in the package vignettes. Geospatial analysis and visualization of metadata from Finna and Fennica are further supported by the maintained geofi package.
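As a rough illustration of the two retrieval routes named in item 1, the following Python sketch only builds request URLs and performs no network access. It is not the finna package's implementation. The search endpoint and the lookfor, type, limit, page, and field[] parameters follow the public Finna REST API; the helper function names, the default field list, and the OAI-PMH base URL and set name in the usage note are hypothetical and chosen for illustration only.

```python
from urllib.parse import urlencode

# Public Finna REST search endpoint.
FINNA_SEARCH_URL = "https://api.finna.fi/v1/search"


def build_finna_search_url(query, limit=20, page=1, fields=("id", "title")):
    """Return a Finna search URL (hypothetical helper; no request is sent)."""
    params = [
        ("lookfor", query),
        ("type", "AllFields"),
        ("limit", limit),
        ("page", page),
    ]
    # Each requested record field is passed as a repeated field[] parameter.
    params.extend(("field[]", name) for name in fields)
    return f"{FINNA_SEARCH_URL}?{urlencode(params)}"


def build_oai_pmh_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Return a standard OAI-PMH ListRecords URL for a given endpoint.

    base_url is supplied by the caller; this sketch does not assume
    any particular repository address.
    """
    params = [("verb", "ListRecords"), ("metadataPrefix", metadata_prefix)]
    if set_spec is not None:
        params.append(("set", set_spec))
    return f"{base_url}?{urlencode(params)}"
```

For example, build_finna_search_url("sibelius", limit=5) yields a URL that can be opened in a browser or passed to any HTTP client, and build_oai_pmh_url("https://example.org/oai", set_spec="fennica") shows the shape of a harvesting request against a placeholder endpoint.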
Resource links:
The FIN-CLARIAH project has received funding from the European Union – NextGenerationEU instrument and is funded by the Research Council of Finland under grant number 358720.