A demo service where you can try out automatic speech transcription and edit the automatically generated transcript via a browser interface.
Note: For the time being, this service is intended for trial use with individual audio files only. It is not designed to process large amounts of data and, for data protection reasons, should not be used for confidential speech recordings.
Tekstiks.ee is a web browser-based speech recognition service for transcribing speech in Estonian or Finnish.
The Tekstiks service is part of the international CLARIN cooperation. The text editor for editing transcripts and the interface for running speech recognition tools have been developed at the Laboratory of Language Technology at the Tallinn University of Technology (TalTech). TalTech’s own speech recogniser for the Estonian language is connected to the service, as well as the speech recogniser for Finnish provided through the Language Bank of Finland, which uses speech recognition models developed at Aalto University.
The system can handle several files simultaneously. As of November 2022, the average processing time is about half the length of the recording. The language of the browser interface can be changed from Estonian to English or Finnish.
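As a rough illustration of the processing-time figure above, the following minimal sketch estimates how long a transcription might take. The function name and the `ratio` parameter are hypothetical and not part of the Tekstiks service; the default of 0.5 simply reflects the "about half of the recording's length" average reported for November 2022, and actual times may differ.

```python
def estimate_processing_seconds(recording_seconds: float, ratio: float = 0.5) -> float:
    """Return a rough processing-time estimate in seconds.

    `ratio` is an assumption based on the reported average
    (processing time is about half of the recording's length).
    """
    if recording_seconds < 0:
        raise ValueError("recording length must not be negative")
    return recording_seconds * ratio


# Example: a one-hour recording, with the estimate expressed in minutes
one_hour = 60 * 60
print(estimate_processing_seconds(one_hour) / 60)  # 30.0
```

The estimate is only an average: it does not account for server load or for several files being processed at the same time.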
To use the service, you first need to create a local user account on a server managed by the Tallinn University of Technology in Estonia. To create an account, all you need is a working email address, a username and a password. The audio files to be processed are uploaded to the Tekstiks server in Estonia. Once logged in, you can manage and delete the files you have uploaded to the Tekstiks server.
If Finnish speech recognition is selected and activated in the Tekstiks service, the speech recordings are transferred over the network to a CSC-hosted server in Finland, where they are processed. The recognised text is then transferred from the CSC server back to the Tekstiks server in Estonia, where the user can edit the text further and, if they wish, download it. Currently, the only supported download format is .docx (MS Word document).
Please note that the level of security of this test service is currently not sufficient to handle confidential speech data.
This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-2022112802