Researcher of the Month: Simon Hengchen

Photo: Simon Hengchen


Kielipankki – The Language Bank of Finland is a service for researchers using language resources. Simon Hengchen, a postdoctoral researcher at the University of Helsinki, tells us about his ongoing research on The Finnish Sub-corpus of the Newspaper and Periodical Corpus of the National Library of Finland, Kielipankki Version.

Who are you?

I am Simon Hengchen, a postdoctoral researcher at the University of Helsinki, where I work in COMHIS, the Helsinki Computational History group. I am originally from Brussels, where I obtained my PhD in Information Science.

What is your research topic?

For the past few years, I have been focusing on computational approaches to lexical semantic change: when, how, and why words change meaning(s). In my doctoral work I focused on the detection of meaning change, but more recently I have been looking at sense dynamics as well as proper methods to model them.

How is your research related to Kielipankki?

For a paper with historian colleagues Jani Marjanen and Mikko Tolonen from the University of Helsinki and Ruben Ros from Utrecht University, I used newspaper data from the National Library of Finland (2011), made available by Kielipankki as The Finnish Sub-corpus of the Newspaper and Periodical Corpus of the National Library of Finland, Kielipankki Version, to train diachronic word embeddings for Finnish. This helped us model ‘semantic’ relations between certain nouns related to nation-building in Finland, which we extracted through the Korp API, an interface that allows us to query the annotation layers provided by Kielipankki. With these semantic relations between words of interest, we were then able to model the discourse about the concept of the ‘nation’ in Finnish newspapers. This work took place within the context of the NewsEye project, which has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement number 770299.

Publications related to the corpora

HENGCHEN, S., ROS, R., and MARJANEN, J. (2019). A data-driven approach to the changing vocabulary of the ‘nation’ in English, Dutch, Swedish and Finnish newspapers, 1750-1950. In Proceedings of the Digital Humanities (DH) conference 2019, Utrecht, The Netherlands.


The FIN-CLARIN consortium consists of a group of Finnish universities along with CSC – IT Center for Science and the Institute for the Languages of Finland (Kotus). FIN-CLARIN helps researchers in Finland to use, refine, preserve and share their language resources. The Language Bank of Finland is the collection of services that provides language materials and tools for the research community.

All previously published Language Bank researcher interviews are stored in the Researcher of the Month archive.