The online course Corpus Linguistics and Statistical Methods is intended for students in languages or other fields who wish to learn the basics of using corpora.
The course is offered in Finnish and in English, and it is open to all university students in and outside Finland. The course started on Monday 4th September, but it is still possible to join the course area until 15th September.
Read more and register for the course!
NB: The same course will be organized again in period 3, starting on 15 Jan 2024 (see the course page).
The 24th INTERSPEECH Conference was held on 20-24 August 2023 in Dublin, Ireland. At the conference, Mietta Lennes from the Language Bank of Finland presented a poster, based on the following conference article:
Lennes, M., Toivola, M. (2023). Pitch distributions in a very large corpus of spontaneous Finnish speech. Proceedings of INTERSPEECH 2023, 4778-4782, doi: 10.21437/Interspeech.2023-1822.
This page has a persistent identifier: http://urn.fi/urn:nbn:fi:lb-2023081621
Poster and supplementary materials presented at Interspeech 2023, 20.-24.8.2023, Dublin, Ireland.
Last updated: 2023-08-25
This page contains a picture of the poster presented at the conference and some additional figures and details about the piece of research in question. For further information, please contact Mietta Lennes.
The pitch data calculated for this paper will be published as an online dataset. The link to the data will be added on this page.
Mietta Lennes & Minnaleena Toivola (2023). Pitch distributions in a very large corpus of spontaneous Finnish speech. Poster and supplementary materials. Interspeech 2023, 20.-24.8.2023, Dublin, Ireland. Available: http://urn.fi/urn:nbn:fi:lb-2023081621.
Lennes, M., Toivola, M. (2023). Pitch distributions in a very large corpus of spontaneous Finnish speech. Proc. INTERSPEECH 2023, 4778-4782, doi: 10.21437/Interspeech.2023-1822
11.00-11.10 Welcoming Words by Katri Tegel, Development Manager, CSC
11.10-12.00 Keynote: Mikko Kurimo, Professor of Speech and Language Processing, Aalto University
12.00-13.00 Lunch
13.00-15.00 Thematic Groups
15.00-15.30 Coffee
15.30-16.15 Sharing the Results from the Groups
16.15-17.00 Free Chilling & Refreshments / Parallel session: Executive Board Meeting (with Zoom option)
11.00-12.00 Jari Ojala: Welcoming words + Pasi Tyrväinen: Keynote
12.00-13.00 Lunch
13.00-13.15 Anna Sendra Toset: Results from FIN-CLARIAH interviews
13.15-14.30 Teamwork in thematic groups:
14.30-15.00 Coffee
15.00-16.00
16.00-17.00 Socializing & refreshments (Executive board meeting)
The open online course Data Clinic kicks off on 11th November 2022 and ends in late April 2023. During the winter and spring, you will learn to write a Data Management Plan and get practical advice and support for collecting, processing and managing your research data. Participants will work partly independently and partly in small groups of peers. You may attend the entire course remotely.
The course materials will be provided mainly in English. Students from all universities and all fields are welcome if space allows. The only prerequisite is that you are already starting a research project where you need to process and manage a data set that contains text documents or speech recordings, i.e., some language data.
Read more and join the course by 28.11.2022!
In this online course, you will get to grips with the specialized tools available for transcribing and studying speech samples. You will also learn about collecting and managing a speech corpus of your own. During the course, you will actively use the Praat program and become familiar with ELAN, too.
The course is open to students in all universities and you can take it either in Finnish or in English. The number of participants may be restricted if required. The course will be taught by Mietta Lennes and Juraj Šimko at the University of Helsinki.
The course has already begun, but you may still enrol and join in until 11th November.
Further information and link to the course on Moodle
To view or download the PDF version, click on the image.
The kick-off get-together of the FIN-CLARIAH infrastructure project is held at the premises of the National Library on 3.6.2022. You can see the posters online on the event page.
The online course Corpus Linguistics and Statistical Methods (Korpuslingvistiikka ja tilastolliset menetelmät, 5 credits) will be offered again during 17.1.–6.3.2022. This course can be taken either in Finnish or in English.
The total number of participants will be restricted, but it will be possible to participate in the course from outside the University of Helsinki and even from outside Finland. If you are a student from outside the University of Helsinki, please find further details and the link for joining the Moodle area on the course home page (see below). Students from the University of Helsinki should first register via Sisu.
Registration for the course is open until 28.1.2022 (unless the maximum number of participants is exceeded before then).
Find more courses and training by Kielipankki
The Donate Speech campaign, in which the Language Bank of Finland has been involved, was awarded the PRIX EUROPA for Best European Digital Audio Project of the Year 2021 (see https://www.prixeuropa.eu/news/2021/10/15winners-y4emh). The award ceremony took place in Potsdam, Germany, on 15th October 2021.
Earlier this year, Donate Speech also won the national Grand One award for Best Mobile Service of the Year, including a distinction for Best Use of Data.
Donate Speech is a joint project of Yle – the Finnish Broadcasting Company, Vake Oy (currently Ilmastorahasto), Solita, Aalto University and the University of Helsinki.
If you speak and understand Finnish, you can donate your speech here!
On 29th October 2021, the Language Bank of Finland and the Donate Speech campaign (Lahjoita puhetta) received an award from the University of Helsinki in recognition of exceptional work in promoting the accessibility and reusability of research data. In addition to the Language Bank, the award was also given to Research Coordinator Kati Lassila-Perini.
At the award ceremony, Research Director Krister Lindén gave a presentation that is now available on YouTube with English subtitles. Read more about the award on the website of the University of Helsinki.
In this online course, you will get to grips with the specialized tools available for transcribing and studying speech samples. You will also learn about collecting and managing a speech corpus of your own. During the course, you will actively use the Praat program and become familiar with ELAN, too.
The course is open to students in all universities and you can take it either in Finnish or in English. The number of participants may be restricted if required. The course will be taught by Mietta Lennes and Juraj Šimko at the University of Helsinki.
Join the course by 12th November!
Further information and link to the course on Moodle
The online course Natural Language Processing for Linguists will be taught by Tuomo Hiippala at the University of Helsinki during 15.3.–10.5.2021.
The course is also open to students from universities outside Helsinki, if space allows. Registration is open until 16th March.
Note also that all the course materials will be available online and you can use them even if you cannot make it to the course this time!
The next Kielipankki Live event will be held on Monday 14th December starting at 13:00 via Zoom. The event will be in English, but questions are welcome in Finnish as well! The main themes are speech corpora and personal data practices. Join us for the interviews and presentations of special guests and for good discussions! Please register by 11th December if possible.
Program and registration details
The European Language Grid (ELG) aims to provide a digital marketplace where European companies, organizations and citizens can both offer and efficiently use language technologies, data sets and services. The ELG workshop presents an overview of the ELG platform and the ELG pilot projects. Welcome to see what ELG has to offer for you!
The workshop is a free online event, but registration is required. Please register via the ELRC website by 10th December. NB: In case you wish to participate in the ELG tutorial session that may be arranged after the workshop, please indicate this in the field for additional information on the registration form. Thanks!
Note that the third ELRC Workshop in Finland will also take place online, in the same virtual room, on the same day at 9.30-12.40. Welcome to participate in both events!
14:00 | Welcome and introduction |
14:05 | ELG Overview (Katrin Marheinecke) |
14:30 | ELG online demo (Nils Feldhus) |
14:50 | Presentations of Finnish Pilot Projects funded in ELG: PARA4DLM (University of Turku), LSDISCO (Lingsoft), OPUS-MT (University of Helsinki) |
15:20 | Expectations/requirements of Finnish Language Technology providers (Marko Turpeinen, 1001Lakes) |
15:40 | Summary and discussion |
16:00 | End of workshop |
16:15 | Tutorial: How to integrate a service into ELG. This tutorial may be organized according to requests from the participants. Please indicate your interest in the registration form. |
Last updated: December 7, 2020
This online course can support you with practical issues in managing the research data you need for your MA thesis or PhD project. You can join the course from any university, provided that you fulfil the criteria. There is plenty of room left at the moment. Note, however, that the number of participants is restricted and students in the LingDig MA programme at the University of Helsinki have priority.
See all online courses and training
FIN-CLARIN is planning an online event together with ELRC (European Language Resource Coordination) and ELG (European Language Grid), to be organized on 15th December 2020.
Mark your calendars! Further information will be updated on the event page.
Organizers:
The European Language Resource Coordination (ELRC) consortium
Department of Digital Humanities, University of Helsinki
Language Technology is shaping our multilingual future. It has already been transforming the way we interact with our devices and with each other, the way we shop, work and travel. More and more it reshapes our interaction with service providers, either public or private. Programs that automatically correct spelling errors and aid sophisticated writing, digital assistants that transform our voices to text messages on mobile phones, bots that answer our calls to the bank or to our social security organisation, systems that automatically translate from a foreign language, and much more, are already empowering our everyday lives, our businesses and our administrations. But can we fully use our own language in our digital interactions? Is our language adequately supported and ready to keep pace with the technological advancements of the AI era?
The third Finnish European Language Resource Coordination (ELRC) workshop will address these questions and seek to engage participants in a fruitful discussion on the status and prospects of Language Technology for Finnish. Developers, integrators and users of Language Technology from both the private and public sectors will share experiences, requirements and ways of transforming digital interaction in our multilingual Europe with Language Technologies. Finally, we will discuss how language data, i.e. texts and speech, can fuel development in Artificial Intelligence.
This workshop continues the series of previous ELRC workshops that were organized in Finland on 19.2.2016 and 24.10.2018.
This ELRC workshop is organized in collaboration with the European Language Grid (ELG). The 4th Regional ELG Workshop will take place in the afternoon, starting at 14:00. For details, see the ELG workshop page. Welcome to register and attend both events!
The ELRC workshop is a free event, but registration is required. You can use the same form to register for both the ELRC workshop (morning sessions) and the ELG workshop (afternoon sessions).
Please register via the ELRC website by 10th December. Welcome!
09:30 – 09:40 | Welcome and introduction (video, pdf) |
09:40 – 10:00 | The potential of Language Technology and AI – where we are, where we should be heading (video, pdf) |
10:00 – 10:30 | Language Technologies for the Languages of Finland – Panel session (video, pdf) |
10:30 – 10:45 | Coffee Break |
10:45 – 11:15 | The CEF AT Platform (video, pdf) |
11:15 – 11:45 | Language technologies by/for the public sector – Panel session (video, pdf) |
11:45 – 12:15 | Language data creation, management and sharing: existing practices and challenges – Panel session (video) |
12:15 – 12:30 | The EU Council Presidency Translator – Finnish presidency success story and what’s beyond (video, pdf) |
12:30 – 12:40 | Conclusions (video, pdf) |
12:40 – 14:00 | Break |
14:00 – 16:30 | European Language Grid (ELG): Introduction and overview. The ELG workshop is organized in collaboration with the European Language Grid (ELG) and it will take place in the same online meeting room as the ELRC workshop. Please note that the ELG workshop will be held in English only. Welcome to register and participate in both events! The detailed program for the ELG workshop is updated at https://www.kielipankki.fi/elg-workshop-2020/. |
Mietta Lennes and Tommi Jauhiainen
University of Helsinki / FIN-CLARIN
fin-clarin [ATT] helsinki.fi
Last updated: December 8, 2020
The open online course Introduction to Speech Analysis (5 ECTS) has just started. The course is now offered for the first time in both Finnish and English. Within the group size limits, you can join in from any university until 6th November 2020. See the course home page for instructions on how to enrol in the course area on Moodle.
During the course, you will learn to transcribe and annotate speech and to understand some of the most important acoustic displays and measurement methods that can be used in speech research. The main tool of the course is the Praat analysis program, but we will also take a look at ELAN. The course can be relevant for students in phonetics, linguistics and languages, but also in other fields where audio recordings of speech are used for research.
All the courses offered by FIN-CLARIN can be found on the Training page.