All previous researchers of the month can be found in the archive.
Do you know researchers who use the Language Bank of Finland and who might be good candidates for Researcher of the Month? Are you one of them? Let us know: https://www.kielipankki.fi/support/contact-us/
The Finnish Sub-corpus of the Newspaper and Periodical Corpus of the National Library of Finland version 2, Korp (klk-fi-v2-korp) is now available in Korp as a beta test version. The corpus contains newspapers and periodicals of the digital collections of the National Library of Finland from the years 1771–2021. The corpus contains over 22 billion words in total, which is over four times as many as in the previous version of the corpus. The language of the sentences of the corpus has been identified with the HeLI-OTS language identifier. Read more about the KLK update here.
Submit the basic details about your own resource to the Language Bank of Finland: http://urn.fi/urn:nbn:fi:lb-2021121421
When you obtain a resource containing personal data via the Language Bank of Finland and start processing it for a new purpose, you must prepare a privacy notice regarding the purpose of processing, publish the notice openly in electronic format, and provide a link to the notice to the Language Bank. The purpose of a privacy notice is to help data subjects understand the purposes for which their data is used. You should always primarily follow the data protection guidelines of your own organisation. In addition, the Language Bank offers some instructions to help you collect the pieces of information that are usually required for a privacy notice regarding research purposes. Read more
The online course Corpus Linguistics and Statistical Methods (5 ECTS) will be offered again in Sep-Oct 2023 and it can be taken either in Finnish or in English. The course is open to all universities and you can also participate in it from outside Finland. Course details
FIN-CLARIAH, the national research infrastructure for Social Sciences and Humanities in Finland, is funded by the Research Council of Finland. FIN-CLARIAH consists of two components, FIN-CLARIN and DARIAH-FI. The FIN-CLARIAH people gathered for a workshop day held at CSC in Espoo on 6 June. On the Language Bank website, you can find a number of deliverables produced in the FIN-CLARIAH project.
Did you know that CLARIN offers grants for, e.g., researcher and teacher mobility, events and training activities? Check out the funding opportunities and current calls: https://www.clarin.eu/funding
We are at your service in summer as well, and messages will be answered as soon as possible.
We wish you a relaxing summer!
Mietta Lennes
Project Planning Officer
fin-clarin@helsinki.fi
Subscribe/unsubscribe to this newsletter: https://www.kielipankki.fi/language-bank/newsletter-subscription/
See also the CLARIN Newsflash: https://www.clarin.eu/content/newsflash
All previous researchers of the month can be found in the archive.
Do you know researchers who use the Language Bank of Finland and who might be good candidates for Researcher of the Month? Are you one of them? Let us know: https://www.kielipankki.fi/support/contact-us/
Submit the basic details about your own resource to the Language Bank of Finland: http://urn.fi/urn:nbn:fi:lb-2021121421
The first version of the complete dataset includes the speech samples that were donated between 16 June 2020 and 14 September 2021. The total duration of the recordings in this version is approximately 3,200 hours, of which approximately 1,600 hours have been manually transcribed.
Researchers may already apply for access to Puhelahjat data. Research use in academic organizations is free of charge. Read more about using the data for research
Companies and other non-academic organizations may acquire a paid license for using one of the Puhelahjat datasets. Some of the data packages intended for commercial use are still in preparation. For further details, organizations and companies interested in using the data may already contact us by email at lahjoita-puhetta@helsinki.fi. Read more about commercial use of the data
The Donate Speech campaign is still on and you can still donate your speech in Finnish or in Swedish at https://lahjoitapuhetta.fi/. When the campaign ends, all of the data will be made available via the Language Bank.
If you use a resource obtained via Kielipankki that contains personal data (the license includes a "+PRIV" tag), you are required to submit the title of your project and a public link to the Privacy Notice regarding the purpose for which you are using the resource. Submit the information via this e-form.
Write the Privacy Notice according to the instructions given by your home organization. It is a good idea to store the document in a place where you are able to update the information when needed.
See also the guidelines for processing corpora stored in the Language Bank of Finland that contain personal data.
A new automatic speech recognition service, Tekstiks, is now up and running for test users. The automated system can recognise spoken Estonian and Finnish and produce a transcript of the recording. The Tekstiks service is the result of a collaboration between the Tallinn University of Technology, the Language Bank of Finland and Aalto University. Read more about Tekstiks and try it out!
Kielipankki – Language Bank of Finland has joined the open-source social network Mastodon. You are welcome to follow us: @kielipankki@toot.community
FIN-CLARIAH, the national research infrastructure for Social Sciences and Humanities in Finland, received funding from the Academy of Finland for the years 2022-23. FIN-CLARIAH consists of two components, FIN-CLARIN and DARIAH-FI. The FIN-CLARIAH people gathered for a workshop day held in Jyväskylä on 18 November. On the Language Bank website, you can find the presentation materials and a number of deliverables produced in the FIN-CLARIAH project.
Did you know that CLARIN offers grants for, e.g., researcher and teacher mobility, events and training activities? Check out the funding opportunities and current calls: https://www.clarin.eu/funding
We wish you a relaxing holiday season!
Mietta Lennes
Project Planning Officer
fin-clarin@helsinki.fi
All previous researchers of the month can be found in the archive.
Do you know researchers who use the Language Bank of Finland and who might be good candidates for Researcher of the Month? Are you one of them? Let us know: https://www.kielipankki.fi/support/contact-us/
Submit the basic details about your own resource to the Language Bank of Finland: http://urn.fi/urn:nbn:fi:lb-2021121421
HeLI-OTS is a general-purpose language identifier that can automatically detect the language used in a text. This ELG-compatible tool selects the most suitable option from a list of 200 languages. HeLI-OTS has been developed as part of a collaborative project between University of Helsinki and Lingsoft on text and speech recognition, funded by the Finnish Research Impact Foundation. Read more
Korp has been updated to version 9. In addition to bug fixes, the new Korp has some new features, although some of them will be activated only once the corpora have been updated to support them. Please report any bugs and deficiencies in the new Korp, as well as any feature requests, either via the feedback form or by email to fin-clarin (at) helsinki.fi.
The Donate Speech campaign (Lahjoita puhetta) is still on. Of the 4000 hours of Finnish speech that have been donated so far, 1500 hours have been manually transcribed. The donated speech material will be made available for restricted research and development purposes via the Language Bank of Finland in autumn 2022.
LUMI is owned by the EuroHPC Joint Undertaking, and it is run by a consortium of 10 countries with long traditions and knowledge of scientific computing. LUMI is an ecosystem for high-performance computing, artificial intelligence, and data-intensive research, which enables breakthroughs in several branches of academic research. In addition, a fifth of LUMI’s capacity is targeted to companies. Read more
Within the COST Action "NexusLinguarum", centered around linguistic data science, a new call for Virtual Mobility Grants (VMGs) has been issued with a collection date of 30 June. VMGs are a networking tool launched by the COST Association. They aim to support individual participants in fostering collaborative research activities, networking with other researchers and exchanging knowledge in a virtual setting. Moreover, you can still become a member of one of the Working Groups within the Action. Read more
Did you know that CLARIN offers grants for, e.g., researcher and teacher mobility, events and training activities? Check out the funding opportunities and current calls: https://www.clarin.eu/funding
FIN-CLARIAH, the national research infrastructure for Social Sciences and Humanities in Finland, received funding from the Academy of Finland for the years 2022-23. FIN-CLARIAH consists of two components, FIN-CLARIN and DARIAH-FI. We organized a kick-off event where posters were presented to introduce the goals of the infrastructure and the work it will carry out. See the posters here.
Read more about FIN-CLARIN: FIN-CLARIN
Read more about DARIAH-FI: DARIAH-FI
For the roadmap of FIN-CLARIAH, see also: FIN-CLARIAH
The Language Bank of Finland wishes you a relaxing summer!
Mietta Lennes
Project Planning Officer
fin-clarin@helsinki.fi
All previous researchers of the month can be found in the archive.
Do you know researchers who use the Language Bank of Finland and who might be good candidates for Researcher of the Month? Are you one of them? Let us know: https://www.kielipankki.fi/support/contact-us/
The resource-specific license terms and conditions will be updated in the near future. The most prominent change is that resource-specific data protection terms and conditions will be included in the licenses of those resources that contain personal data. Information about the license updates will be published on the Language Bank website. Read more about what to expect: https://www.kielipankki.fi/news/updates-to-resource-specific-licenses-and-data-protection-terms-and-conditions/
Those corpora that were previously available via the LAT platform (discontinued in 2020) have been moved to the download service. The content of the downloadable corpora is essentially the same as in LAT, and the samples can be studied with, e.g., Praat or ELAN. At a later stage, we intend to make some speech corpora accessible via Korp as well. The current status and access location of each corpus can be seen on its metadata record and on the page of the resource group in question.
Submit the basic details about your own resource to the Language Bank of Finland: http://urn.fi/urn:nbn:fi:lb-2021121421
A given resource may be available as several different versions or variants that are provided for different purposes. The new resource group pages provide an overview of all the available versions. Read more
The official Korp update has been postponed until January. However, many of the new features and improvements can already be tested in Korplab. Your feedback is welcome! Read more
The upgraded Aalto-ASR 2.1 is available for testing in the Puhti environment at CSC. If required, it is also possible to install the system in a local environment from a Docker container. Read more
The online course Corpus Linguistics and Statistical Methods (5 ECTS) will be offered again in Jan-Mar 2021 and it can be taken either in Finnish or in English. The course is open to all universities and you can also participate in it from outside Finland. Course details
Did you know that CLARIN offers grants for, e.g., researcher and teacher mobility, events and training activities? Check out the funding opportunities and current calls: https://www.clarin.eu/funding
The Donate Speech campaign (Lahjoita puhetta) is still on. You may now donate your speech in Swedish, too! Of the 4000 hours of Finnish speech that have been donated so far, 1500 hours have been manually transcribed. Starting from spring 2022, the donated speech material will be made available for restricted research and development purposes via the Language Bank of Finland.
FIN-CLARIAH, the national research infrastructure for Social Sciences and Humanities in Finland, was granted 4.6 M€ by the Academy of Finland for the years 2022-23. FIN-CLARIAH consists of two components, FIN-CLARIN and DARIAH-FI.
Read more about FIN-CLARIN: FIN-CLARIN
Read more about DARIAH-FI: DARIAH-FI
For the roadmap of FIN-CLARIAH, see also: FIN-CLARIAH
The Language Bank of Finland wishes you a nice and relaxing Christmas time!
Mietta Lennes
Project Planning Officer
fin-clarin@helsinki.fi