FIN-CLARIAH D3.5.1: Text network analysis of political texts

Grant agreement: Academy of Finland no. 345610
Start date: 01-01-2022
Duration: 24 months

WP 3.5: Report on Text network analysis of political texts
Date of reporting: 06-06-2023

Report author: Kimmo Elo (University of Turku)
Contributors: Kimmo Elo, Veronika Laippala, Otto Tarkka (University of Turku)
Deliverable location: Not yet available; the R Shiny GUI and the GitHub repository will be made public in Q3/2023.


The WP’s main objective is to develop network-analysis tools for the study of political texts. The tools will be made available both via a web interface and as dedicated R packages. Three tools are currently under development:

  1. A KWIC tool for the FinParl corpus: This tool provides a user interface for querying word embeddings with the KWIC (Key Word In Context) method. It offers a simple yet intuitive user interface built with R Shiny, with which the user can analyse key word embeddings in the FinParl corpus of plenary debates of the Finnish parliament (eduskunta). A beta version of this tool is already being tested, and the release is planned for Q3/2023.
  2. A tool for semantic and text network analysis and visualisation: Building on the KWIC tool, this tool will provide functionality for vocabulary-based content analysis of political texts, for the comparison of different text networks, and for dynamic text network analysis, together with a set of visualisation tools. These tools are currently under active development and testing, and the production phase is expected to be completed in Q3/2023.
  3. A tool for analysing text reuse: This tool will offer functionality to identify and analyse structural similarities between vocabulary-based text networks. Such structural patterns can help to identify how phrases or longer text passages are reused over time. The tool will also make it possible to identify patterns in concept embedding, a widely used strategy in political texts for framing different issues in the same (or a similar) context. This tool is currently in the planning phase, and active development and coding are expected to be completed in Q4/2023.
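
To make the three ideas above concrete, the following is a minimal, language-neutral sketch in Python. It is not the WP's R implementation, which is not yet public. All function names (tokenize, kwic, cooccurrence_edges, edge_jaccard) and the toy sentences are invented for illustration, and Jaccard overlap of edge sets is only one simple, assumed way to compare the structure of two text networks.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase and split into word tokens (Unicode-aware, so ä/ö survive)."""
    return re.findall(r"\w+", text.lower())


def kwic(tokens, keyword, window=2):
    """Return (left context, keyword, right context) for each occurrence."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            hits.append((" ".join(left), tok, " ".join(right)))
    return hits


def cooccurrence_edges(tokens, window=2):
    """Count undirected word pairs occurring within `window` tokens of each other."""
    edges = Counter()
    for i, tok in enumerate(tokens):
        for other in tokens[i + 1:i + 1 + window]:
            if other != tok:
                edges[tuple(sorted((tok, other)))] += 1
    return edges


def edge_jaccard(edges_a, edges_b):
    """Share of edges common to both networks (0 = disjoint, 1 = identical)."""
    a, b = set(edges_a), set(edges_b)
    return len(a & b) / len(a | b) if a | b else 0.0


# Invented toy sentences, not FinParl data.
speech_1 = tokenize("The budget debate continues and the budget grows")
speech_2 = tokenize("The budget debate ends today")

# 1. Keyword in context
for left, key, right in kwic(speech_1, "budget"):
    print(f"{left:>20} | {key} | {right}")

# 2. Text networks as weighted edge lists
net_1 = cooccurrence_edges(speech_1)
net_2 = cooccurrence_edges(speech_2)

# 3. Structural overlap as a crude indicator of reused wording
print(round(edge_jaccard(net_1, net_2), 2))
```

A high overlap between the networks of two speeches indicates shared word pairings, which is the kind of structural signal a text reuse analysis could start from. A real analysis would likely work on lemmas and filter function words first.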

All these tools will be developed for and tested with the FinParl corpus, which consists of all plenary speeches of the Finnish eduskunta since 1907. All tools will access a tailored dataset maintained on a server at the University of Turku.

The FinParl corpus used by this WP is structured according to the ParlaMint XML schema, so the tools should, at least in theory, be compatible with any corpus that follows the same schema. Our plan, however, is not to limit the analytical tools to the FinParl corpus. Instead, the tools will be designed to work with tidy data, and the WP provides tools to access relevant resources and to convert the working data into tidy data for further analysis.
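
As a rough illustration of what converting to tidy data can mean here, the Python sketch below flattens a small TEI-style fragment into one row per token. It is not the WP's converter. The sample XML is invented and heavily simplified: it assumes utterances in `u` elements with `who` and `xml:id` attributes and tokens in `w` elements with a `lemma` attribute, which may differ from the actual FinParl files. The function name tidy_tokens and the column names are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"                   # TEI namespace
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"     # expanded form of xml:id

# Invented, simplified sample; real corpus files carry far more markup.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body><div>
    <u xml:id="u1" who="#SpeakerA">
      <seg><s><w lemma="budget">budgets</w> <w lemma="grow">grow</w></s></seg>
    </u>
    <u xml:id="u2" who="#SpeakerB">
      <seg><s><w lemma="debate">debates</w></s></seg>
    </u>
  </div></body></text>
</TEI>"""


def tidy_tokens(xml_text):
    """Return one dict per token: each row is an observation, each key a variable."""
    root = ET.fromstring(xml_text)
    rows = []
    for u in root.iter(TEI + "u"):
        for pos, w in enumerate(u.iter(TEI + "w"), start=1):
            rows.append({
                "utterance": u.get(XML_ID),
                "speaker": (u.get("who") or "").lstrip("#"),
                "position": pos,
                "token": w.text,
                "lemma": w.get("lemma"),
            })
    return rows


rows = tidy_tokens(SAMPLE)
for row in rows:
    print(row)
```

Once the data has this one-token-per-row shape, the same downstream analysis can run on any corpus that can be flattened in the same way, regardless of its original markup.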

Overall, the WP is proceeding well and is mostly on schedule. We have a small yet active research team that brings together expertise from the social sciences and computational linguistics and is capable of developing tools for a wide audience. Team dynamics are good, and regular internal meetings are used to discuss current issues, problems, and solutions. The WP also benefits from a large FIRI research grant from the Academy of Finland covering the years 2023–2025, which gives us greater room for manoeuvre in planning the WP’s future development.