
D3.3.2: Automated harmonisation and enrichment of metadata

Project: FIN-CLARIAH
Grant agreement: Research Council of Finland no. 358720
Start date: 01-01-2024
Duration: 24 months

WP 3.3: Report on Automated harmonisation and enrichment of metadata
Date of reporting: 18-03-2025

Report authors: Akewak Jeba (University of Turku), Leo Lahti (University of Turku)
Contributors: Julia Matveeva (University of Turku), Muluh Geraldson (University of Turku)
Deliverable location: github.com/fennicahub (see below for specific outputs)

Keywords: data science, metadata, bibliographies, enrichment

Description

This deliverable provides resources for gathering, harmonizing, enriching, and summarizing structured metadata from the National Library of Finland, in particular the Finnish National Bibliography, Fennica. The open data and workflows can be used in research, training, and outreach. Further metadata resources cover complementary cultural heritage material from archives, libraries, museums, and other actors. The deliverable expands the scope of the metadata collections that are interlinked with the R statistical environment, and strengthens the integration of Finna and Finto with Fennica.

Earlier work with Fennica, including metadata harmonization and visualization workflows, is described in FIN-CLARIAH (2022–23) Deliverable D4.1.3, which focused on preparing and publishing the cleaned Fennica dataset along with interactive tools for analysis and presentation.

This deliverable consists of the following resources:

1. A systematic approach for retrieving Finna metadata into open computing environments is implemented in the open source R package finna. The package retrieves data through the Finna REST API and the OAI-PMH API. The release version is available on CRAN.

2. Data science methods for enriching structured metadata from Finna and Fennica, based on actor cross-linking, are provided by the finto R package. The package gives direct access to the Finto keyword service (finto.fi) from the R statistical environment. Examples of Fennica author enrichment using Kanto/Finto are available in the package vignette.

3. Data analysis and visualization techniques that support the research use of cultural heritage metadata collections are provided by the finna package and demonstrated in the package vignettes. Geospatial analysis and visualization of metadata from Finna and Fennica are further supported by the geofi package, which is maintained as part of this work.
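The retrieval step in item 1 can be illustrated outside R with a minimal Python sketch. It is not the finna package's implementation. The endpoint URL, the query parameter names (`lookfor`, `type`, `limit`, `page`), and the response fields (`resultCount`, `records`, `title`) are assumptions based on the public Finna API and should be checked against its documentation. The helper functions and the sample payload are hypothetical, and the sketch makes no network request.

```python
from urllib.parse import urlencode

# Assumed base URL of the public Finna search endpoint (verify before use).
FINNA_SEARCH_URL = "https://api.finna.fi/v1/search"


def build_search_url(query, limit=20, page=1):
    """Build a search URL for a free-text query (hypothetical helper)."""
    params = {
        "lookfor": query,      # assumed name of the search-term parameter
        "type": "AllFields",   # assumed value for searching all fields
        "limit": limit,        # records per page
        "page": page,          # 1-based page index
    }
    return FINNA_SEARCH_URL + "?" + urlencode(params)


def summarise_response(payload):
    """Extract the hit count and record titles from a decoded JSON response."""
    records = payload.get("records", [])
    return {
        "result_count": payload.get("resultCount", 0),
        "titles": [record.get("title", "") for record in records],
    }


# Made-up payload in the assumed response shape, for illustration only.
sample_payload = {
    "resultCount": 2,
    "records": [
        {"id": "example.1", "title": "Kalevala"},
        {"id": "example.2", "title": "Seitsemän veljestä"},
    ],
    "status": "OK",
}

print(build_search_url("kalevala", limit=2))
print(summarise_response(sample_payload))
```

In practice the URL would be fetched with an HTTP client and the decoded JSON passed to `summarise_response`. Within R, the finna package covers this retrieval step.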

Resource links:

  • finna, an open source R package for collecting cultural metadata using the Finna API.
  • finto, an open source R package for retrieving vocabulary data and for enriching the metadata using the Finto API.
  • geofi, an open source R package for accessing and visualising Finnish geospatial data; this deliverable supports its maintenance.

The FIN-CLARIAH project has received funding from the European Union – NextGenerationEU instrument and is funded by the Research Council of Finland under grant number 358720.
