University of Helsinki, Coordinator
LAREINA – Language Resource Infrastructure for AI is a Business Finland-funded project (2023–25) in which the University of Helsinki and Aalto University collaborate with companies to research, produce, test and pilot language technology components. The goal of the LAREINA project is to develop a commercially replicable model for building speech interfaces for small and medium-sized languages, such as Finnish, Finland-Swedish, Swedish and the Sámi languages.
LAREINA organization
The LAREINA project has received funding from Business Finland for 2023–25. The University of Helsinki is the coordinator of the LAREINA project. Contact information and more information about the LAREINA Project Partners can be found here.
LAREINA – Language Resource Infrastructure for AI (2023–25)
University of Helsinki, coordinator, speech synthesis
Contacting the LAREINA project coordinator and research organizations:
Public sector actors
LAREINA – Language Resource Infrastructure for AI
LAREINA – Language Resource Infrastructure for AI is a project funded by Business Finland (2023–25) in which the University of Helsinki and Aalto University, in cooperation with companies, research, produce, test and pilot speech technology components.
During the LAREINA project, speech synthesis and speech recognition are developed for Finnish, Finland-Swedish and the Sámi languages. The methods developed and the results achieved during the project are also applicable to other small and medium-sized languages.
The goal of the LAREINA project is to develop a commercially replicable model for building speech interfaces for small and medium-sized languages. According to earlier reports (see 2019), there is a clear need for this, because so far the large international commercial players have focused on developing language technology solutions primarily for the needs of dominant languages (such as English).
The results of the LAREINA project will be published openly, including under a license that permits commercial use.
The University of Helsinki acts as the coordinator of the LAREINA project and as a research organization, with speech synthesis as its research focus.
Aalto University acts as a research organization in the LAREINA project and develops automatic speech recognition during the project.
The LAREINA project has seven partner companies and organizations operating in Finland. The project has received its funding from Business Finland for the period 2023–25.
Contact information and further details about the LAREINA project participants can be found here.
General Terms of Speech Material Use (commercial use)
The text below on this page is a copy of the “General Terms of Speech Material Use” that are included in the agreement made with a company or an organization regarding the use of the “Speech Material”.
Definitions used in the text:
Speech Material = According to the agreement, one of the following:
The Language Bank = University of Helsinki
Licensee = The company or the organization with which the Agreement was signed
General Terms of Speech Material Use
1.1. “Speech Material” refers to material collected in the Lahjoita puhetta (Donate Speech) campaign, as defined in the signature part of this Agreement, which the Language Bank distributes for the purpose of research and development of applications and services that are capable of interpreting and producing speech, as well as for the purpose of language research.
“Data Protection Legislation” refers to the EU General Data Protection Regulation (2016/679) (“GDPR”) or any subsequent law that supersedes it, and the national data protection legislation applicable to the Licensee. This Agreement refers to terms that have been defined in the GDPR, including “personal data”, “data subject”, “controller”, “processing” and “processor”. In this Agreement, they are given the same meaning as in the GDPR.
2.1. The Language Bank hereby grants the Licensee a non-exclusive, non-transferable and non-sublicensable license to use the Speech Material for the purpose of research and development of applications and services that are capable of interpreting and producing speech in accordance with the terms specified in this Agreement.
The license is valid for as long as this Agreement remains in force.
2.2. The license only applies to the Licensee defined in the signature part of this Agreement. Disclosing or transferring the Speech Material to a third party (including to a company affiliated with the Licensee) is prohibited, except as set forth in Section 4.3.
The Language Bank retains proprietary rights to the Speech Material.
2.3. For the avoidance of doubt, any results created by the Licensee (such as software and models) from which the Speech Material, the personal data included in the Speech Material, and the voices of the speakers cannot be restored shall belong to the Licensee, and the Licensee may continue to use such results after the term of the license.
The Language Bank shall provide a copy of the Speech Material to the Licensee after the Licensee has paid the fee set forth in the signature part of this Agreement.
2.4. The Licensee must comply with the data protection terms and conditions presented in Section 4 of this Agreement when processing the Speech Material.
The Licensee must use the Speech Material in accordance with good practice while respecting equality and human rights. The use of the Speech Material for discriminatory purposes or purposes that are derogatory to a specific group of people is prohibited.
3. License Fee
3.1. The Licensee shall pay the license fee set forth in the signature part of the Agreement.
4. Confidentiality and Data Protection
4.1. The Licensee understands that the Speech Material includes personal data subject to Data Protection Legislation. When processing the Speech Material, the Licensee is considered the controller. The Licensee undertakes to process the Speech Material in accordance with Data Protection Legislation solely for the purpose determined in Section 2.1 of this Agreement. The Licensee shall comply with any obligations imposed on the controller by Data Protection Legislation in the processing of the Speech Material.
The data included in the Speech Material are confidential. The Licensee must implement any technical and organisational measures required to ensure that only the relevant persons have access to the Speech Material. Employees of the Licensee who process the Speech Material must be bound by an obligation of confidentiality pertaining to the content of the Speech Material. The confidentiality obligation must remain in effect after the end of the employment relationship.
4.2. The Licensee shall not disclose or provide access to the Speech Material to any third party. Publication of the Speech Material is prohibited. Notwithstanding the aforementioned, the Licensee may transfer the Speech Material to its subcontractors or service providers that act as processors of personal data for purposes consistent with this Agreement. When employing processors, the Licensee shall comply with the requirements of Data Protection Legislation regarding processors of personal data, and conclude an agreement on the processing of personal data in accordance with Article 28 of the GDPR with the processors. If the processors process personal data outside the European Economic Area, the Licensee shall comply with the provisions of Chapter V of the GDPR on the transfer of personal data to third countries or international organisations. The processing of the Speech Material in cloud-based services aimed at consumers is prohibited.
The Speech Material shall not be used to identify speech donors. Recordings included in the Speech Material must not be combined with personal data available elsewhere, nor may recordings included in the Speech Material be compared to recordings available elsewhere to determine whether the speaker is the same person in both.
4.3. The Licensee shall maintain an up-to-date data protection statement online on the use of the Speech Material. The data protection statement must contain all information which must be supplied to data subjects according to Data Protection Legislation. The Licensee shall submit to the Language Bank the URL address of the statement before commencing the processing of the Speech Material. The Language Bank will publish the URL address on its website.
If the Licensee processes personal data outside the European Economic Area, the Language Bank and the Licensee shall put in place the safeguards required by Chapter V of the GDPR before the disclosure of the Speech Material to the Licensee. The Language Bank has the right to refrain from transferring the Speech Material for processing outside the European Economic Area if it deems that such transfer in accordance with Chapter V of the GDPR is not possible by reasonable means.
4.4. The Licensee shall notify the Language Bank without undue delay if the Speech Material is subjected to a personal data breach which results in the accidental or unlawful destruction, loss, alteration, unauthorised disclosure of, or access to, transferred, stored, or otherwise processed personal data.
The Licensee shall securely delete the Speech Material when it no longer has grounds based on Data Protection Legislation to process the Speech Material. In any event, the Licensee shall delete the Speech Material upon expiration or termination of the license granted in this Agreement. The Licensee shall document the deletion of the Speech Material. The Language Bank has the right to request and receive this documentation and an assurance given by the Licensee indicating that the Speech Material has been deleted.
5. Updates to the Speech Material, Obligation to Notify
5.1. The Language Bank may produce new versions of the Speech Material to ensure, for example, that the rights of data subjects in accordance with Data Protection Legislation are fulfilled and that there is no unlawful content in the Speech Material. When the Language Bank produces a new version of the Speech Material and notifies the contact person of the Licensee by email, the Licensee shall, without delay, delete the old version of the Speech Material and replace it with the new version. In accordance with Section 8, the Licensee must submit a functional and valid email address to the Language Bank to which notifications of updates to the Speech Material are to be sent.
The Licensee shall notify the Language Bank without delay if it identifies or suspects the presence of the following content in the Speech Material:
(a) Unauthorised, inaccurate, unnecessary or outdated personal data (such as direct identifiers, including names and contact details, information pertaining to the private life of individuals, rumours or defamatory speech),
(b) Unauthorised copies of works or other objects protected by copyright or related rights,
(c) Trade secrets,
(d) Data whose disclosure would constitute an offence against privacy, public peace or personal reputation (Chapter 24 of the Criminal Code of Finland), incitement to hatred or ethnic agitation (Sections 10 and 10a, Chapter 11 of the Criminal Code of Finland) or another offence, or
(e) Recordings where speech has been recorded without the speaker’s knowledge, or the recording has been started by accident.
The notification made by the Licensee shall include information that enables the Language Bank to identify the relevant recording.
6. No Warranty
6.1. The Language Bank provides the Speech Material to the Licensee “as is”. The Language Bank provides no warranty on the Speech Material and specifically disclaims any warranties of accuracy, completeness or fitness for a particular purpose, or non-infringement upon the rights of any third parties. The Licensee shall use the Speech Material at its own risk. The Language Bank is not responsible for any damage or losses incurred by the Licensee through the use of the Speech Material.
7. Liability for Damages
7.1. The Licensee is solely liable for ensuring that it uses the Speech Material in accordance with Data Protection Legislation and any other applicable legislation.
The Parties shall be liable towards each other for the damage they have caused by a breach of contract. The Language Bank shall not be liable for indirect or consequential damage. In all cases, the total liability of the Language Bank is limited to the amount of the license fee paid by the Licensee to the Language Bank. The above limitations of liability do not apply if the damage was caused wilfully or by gross negligence.
7.2. Neither Party is liable towards the other Party if a failure to fulfil an obligation set out in this Agreement is caused by a force majeure event. Force majeure includes, but is not limited to, fires, floods, explosions, lightning, storms, earthquakes, landslides, shortages of energy supply, interventions by government, revolutions, riots, wars, strikes, labour disputes, transport disruptions, shortages of labour, or another factor beyond the reasonable control of the relevant Party.
8.1. Any notices relating to this Agreement shall be sent by post, courier or email to the relevant Party’s contact person indicated in the signature part of this Agreement. If the contact details of a Party change, the Party shall submit new contact details to the other Party without undue delay.
9. Term and Termination
9.1. The Agreement shall enter into force on the date of the last signature and shall remain in effect for 10 years from the effective date. The Licensee may terminate the Agreement by giving written notice of termination to the Language Bank, in which case the Agreement shall terminate after 30 days have passed from the date of the notice. The Language Bank shall have no obligation to return any fees paid by the Licensee for the license to use the Speech Material.
As stated in Section 4.8 above, the Speech Material must be deleted when there are no legal grounds under Data Protection Legislation for the processing of personal data. During the term of the license to use the Speech Material, the Language Bank shall provide a new copy of the Speech Material to the Licensee upon request and within a reasonable time if the Licensee wishes to resume processing of the Speech Material.
9.2. If a Party materially breaches this Agreement and does not remedy the breach within thirty (30) days of receiving written notice concerning the breach, or if the nature of the breach makes it incapable of being remedied, the other Party may terminate this Agreement. For the avoidance of doubt, failure to pay compensation for the use of the Speech Material and a material breach of Data Protection Legislation shall be considered a material breach of contract.
If a Party is evidently insolvent or becomes subject to bankruptcy, composition, insolvency, administration, administrative receivership or other similar proceedings, the other Party may terminate the Agreement with immediate effect.
9.3. In accordance with Section 4, the license to use the Speech Material ends immediately at the termination or expiry of the Agreement.
Sections 6, 7, 10 and 11.5 of the Agreement as well as provisions which are intended to remain in effect due to their nature shall remain in effect also after the termination or expiry of the Agreement.
10. Governing Law and Dispute Resolution
10.1. This Agreement is governed by the laws of Finland excluding its conflict of law provisions.
Any disputes relating to this Agreement, which cannot be solved amicably, shall be resolved by the Helsinki District Court.
10.2. If the Licensee’s registered office is in a country in which a judgment from the courts of Finland would not be enforceable, any disputes arising from the Agreement shall be finally settled by arbitration under the rules of arbitration of the Central Chamber of Commerce of Finland. The arbitral tribunal is composed of a sole arbitrator. The seat of arbitration is Helsinki, Finland. The language of arbitration is Finnish.
11.1. The Language Bank has the right to amend these general terms on legitimate grounds, which may relate to, among other things, instructions given by authorities, best practice, or changes in Data Protection Legislation or other applicable laws. A notification of any changes made shall be sent by email to the contact address provided by the Licensee sixty (60) days before their entry into force. If the Licensee does not accept the changes made to the general terms, it may terminate the Agreement before the entry into force of the amendments by giving written notice to the Language Bank no later than thirty (30) days before their entry into force. The Language Bank shall have no obligation to return any fees paid by the Licensee for the license to use the Speech Material.
The Licensee shall not transfer this Agreement or parts thereof to third parties without the express written consent of the Language Bank.
11.2. This Agreement cancels all prior agreements and together with its appendices represents the entire Agreement between the Parties relating to the subject matter thereof.
If any of the terms of this Agreement are or become invalid, the remainder of the Agreement shall remain valid. If any invalid, unenforceable or illegal provision of this Agreement would be valid, enforceable and legal if some part of it were deleted, the provision shall apply with the minimum modification necessary to make it legal, valid and enforceable.
11.3. Neither Party may use the name or logo of the other Party in product marketing, media releases or for other similar purposes, unless specifically agreed between the Parties in writing. However, the Licensee has the right to refer, as appropriate, to the Lahjoita puhetta (Donate Speech) campaign and the Language Bank as the source of the Speech Material. For the avoidance of doubt, the Language Bank has the right to mention the Licensee as a recipient of the Speech Material on the Lahjoita puhetta data protection webpage, as well as other contexts where mentioning recipients of the Speech Material is necessary for the Language Bank to comply with its obligations.
This page was last updated on 6.9.2022.
Persistent Identifier of this page: urn:nbn:fi:lb-2022060130
Contact information
Technical maintenance of the Language Bank (Kielipankki):
kielipankki (at) csc.fi
tel. 09 4572001
Matters related to resources and other content:
fin-clarin (at) helsinki.fi
tel. 029 4129317