Workshop: ”Accessing Data for Large Language-based Text and Speech Models”


Wednesday 05.11.2025 at 08:30–14:00, Helsinki

Organizers:  

Department of Digital Humanities, University of Helsinki
LAREINA project

Location:  

University of Helsinki, Main Building (Unioninkatu 34 entrance)

Welcome to the Workshop!

The development of language-centric AI over the past few years has been remarkable. It poses challenges but also creates opportunities for organizations in both the private and the public sectors. Many of us are curious about how to harness the power of AI in our own work.

Our workshop on Accessing Data for Large Language-based Text and Speech Models will explore the potential benefits of copyrighted data versus freely available data, as well as recent results in training speech models on massive data sets.

This workshop is aimed at developers, integrators and users of language technologies and AI solutions in Finland. It will be held in English, on-site only.

Registration

Registration has been closed.
Participation is free of charge, but registration is required. We have 50 seats available. Lunch and coffee breaks are included in the workshop.

If you have any questions, please contact the organizers via email: lareina-office AT helsinki.fi

 


Programme for the Workshop on Wednesday 05.11.2025

 

08:30 – 09:00

Registration and Coffee
University of Helsinki, Main Building (Unioninkatu 34, Senate Square entrance), 3rd floor

09:00 – 10:30

Session 1: ”Copyright & LLMs”, room: Karolina Eskelin (U3032), 3rd floor

09:00–09:15

Welcome and Introduction
Krister Lindén, University of Helsinki

09:15–09:45

”The AI Act and its impact on LLMs” (remote)
Paweł Kamocki, CLARIN Legal and Ethical Issues Committee

09:45–10:15

”The Legacy of Mímir: LLMs and Copyright at the National Library of Norway” (remote)
Javier de la Rosa, National Library of Norway

10:15–10:30

Discussion

10:30 – 11:30

Coffee break with standing tables / demo presentations, room: Christina (U2085), 2nd floor

11:30 – 13:00

Session 2: ”Speech Technology in Society”, room: Karolina Eskelin (U3032), 3rd floor

11:30–12:00

”Speech synthesis in Sámi and Karelian – Bringing minority voices into innovation projects”
Tove Mylläri, Yle

12:00–12:30

”AI-assisted customer call transcription”
Henry Granholm, Kela

12:30–13:00

”Unlocking the Potential of Radio and Television Archives for Automatic Speech Recognition”
Yaroslav Getman, Aalto University

13:00 – 14:00

Lunch

Restaurant Flora, 2nd floor

 

This workshop is organized by the LAREINA project and the University of Helsinki.

Contact the organizers for further details:

lareina-office AT helsinki.fi

 


 

Last modified on 2025-11-05