Kielipankki – The Language Bank of Finland is a service for researchers using language resources. Katri Leino, a doctoral candidate at Aalto University, tells us how she uses the resources Yle News Archive Easy-to-read Finnish 2011-2018, source and The Suomi 24 Corpus in her research.
I am Katri Leino, a PhD student in Mikko Kurimo’s Speech Recognition group at Aalto University. I earned my master’s degree (tech.) from Aalto University. My master’s thesis was about adapting speech recognition models to a particular environment or speaker. User experience has always been one of my interests. For my PhD, I wanted to combine the technical knowledge and methods of natural language processing (NLP) with the human perspective of human-computer interaction (HCI). Therefore, I asked Antti Oulasvirta, leader of the User Interfaces group at Aalto University, to be my instructor, while Mikko Kurimo supervises my studies. Oulasvirta’s group is highly focused on user modelling, which provides a new perspective on NLP research.
In my main research project, I study how Finns type on their smartphones. The project is funded by the Foundation of Emil Aaltonen. Finns often complain that typing on a touch keyboard is frustrating because it is difficult to hit the right keys and because predictive methods such as auto-correction do not work well for Finnish. I want to find ways to improve the typing experience and also see how the style of the language affects typing.
We are currently collecting typing samples with the Typing Test at http://typingtest.aalto.fi (the Finnish version: kirjoitustesti.aalto.fi). The Typing Test runs in a browser. The task for the participants is to type the given sentences as accurately and quickly as possible. All keypresses and timestamps are saved as a data set, which will be published in 2020. The English data set was published this year (https://userinterfaces.aalto.fi/typing37k/). Our aim for the Finnish data is to collect typing samples from a wide variety of skill levels to better understand successful strategies and challenges.
In the Finnish Typing Test, we sampled the sentences from two Kielipankki resources: Yle News Archive Easy-to-read Finnish 2011-2018, source and The Suomi 24 Corpus. When measuring typing speed, easy-to-remember sentences are recommended so that participants do not have to check the sentence many times while typing. The easy-to-read news was suitable for the test for that reason. The Suomi24 corpus was selected because I also wanted to include real conversational sentences to see whether the text style affects typing speed.
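As a rough illustration of how typing speed and accuracy could be derived from logged keypresses and timestamps, here is a minimal Python sketch. It is not the Typing Test's actual code or data format: the `KeyPress` record, its field names, and the function names are hypothetical. The speed measure uses the common text-entry convention that one word equals five characters, and accuracy is approximated with a character-level edit distance between the target sentence and what was typed.

```python
from dataclasses import dataclass


@dataclass
class KeyPress:
    # Hypothetical log record; the real data set may be structured differently.
    timestamp_ms: int  # time of the keypress in milliseconds
    key: str           # character produced by the keypress


def words_per_minute(presses, typed_text):
    """Typing speed, assuming the convention that one word is five characters."""
    if len(presses) < 2 or len(typed_text) < 2:
        return 0.0
    elapsed_s = (presses[-1].timestamp_ms - presses[0].timestamp_ms) / 1000
    if elapsed_s <= 0:
        return 0.0
    # The first character starts the clock, so it is not counted.
    return (len(typed_text) - 1) / elapsed_s * 60 / 5


def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def error_rate(target, typed):
    """Share of characters that differ, relative to the longer string."""
    longest = max(len(target), len(typed))
    return edit_distance(target, typed) / longest if longest else 0.0
```

Under these assumptions, eleven characters typed at a steady 200 ms per key come out at 60 words per minute, and typing "kisa" for the target "kissa" gives an error rate of 0.2.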
The FIN-CLARIN consortium consists of a group of Finnish universities along with CSC – IT Center for Science and the Institute for the Languages of Finland (Kotus). FIN-CLARIN helps researchers in Finland to use, refine, preserve and share their language resources. The Language Bank of Finland is the collection of services that provides language materials and tools for the research community.
All previously published Language Bank researcher interviews are stored in the Researcher of the Month archive.