Kielipankki – The Language Bank of Finland is a service for researchers using language resources. Krista Lagus, docent of the University of Helsinki, tells us about her research on The Suomi 24 Corpus.
I am Krista Lagus, professor of digital social science at the Centre for Research Methods, Faculty of Social Sciences, University of Helsinki. I also participate in HELDIG, the collaboration network in Digital Humanities.
At the moment I am doing research within the Citizen Mindscapes consortium, focusing on emotional waves, types of interaction and topics of discussion in social media. We focus specifically on the Suomi24 discussions, with the aim of identifying different perspectives and research approaches, especially those motivated by the social sciences. In time, we aim to develop interfaces and tools based on some of these approaches for social science researchers, as part of digital humanities. The latest result is Lääketutka, which sheds light on people's discussions about medication, symptoms and health from a completely novel perspective. It is available at www.laaketutka.fi.
I have also done research on adapting methods from machine learning and neural networks to modelling in different fields of language research. Examples include inducing morphological segmentation with adaptive methods, modelling concept systems, and modelling discussion topics. Practical applications have often called for these methods; topic modelling, for example, was applied to data mining and to exploring large text corpora. My background is in information technology, the field in which I obtained my doctorate at the Helsinki University of Technology in 2000.
When it became evident that we wanted to make the Suomi24 discussions available to all researchers, the size of the resource, approximately 70 million messages, was both a special challenge and an opportunity. The Language Bank of Finland, together with FIN-CLARIN, was a natural and sufficiently solid partner for our project. The collaboration led to the publication of the Suomi24 resource, owned by Aller, which is available to all interested parties through the interactive user interface of the concordance tool Korp. The entire language resource is also licensed for download for academic, non-commercial research purposes.
The FIN-CLARIN consortium consists of a group of Finnish universities along with CSC – IT Center for Science and the Institute for the Languages of Finland (Kotus). FIN-CLARIN helps researchers in Finland to use, refine, preserve and share their language resources. The Language Bank of Finland is the collection of services that provides language materials and tools for the research community.
All previously published Language Bank researcher interviews are stored in the Researcher of the Month archive.