Aalto-ASR – Aalto University Automatic Speech Recognition System v2.1 is available

The upgraded version 2.1 of the Aalto University Automatic Speech Recognition System (Aalto-ASR) is now available for use on the CSC Puhti server. Instructions for using the toolkit are currently available in Finnish only (English translation forthcoming).

Aalto-ASR currently provides two main functions:

  • Speech recognition (kaldi-rec): creates a preliminary transcript of Finnish speech recordings in WAV format and writes it to plain-text files and/or annotation files.
  • Forced alignment (kaldi-align): if you already have a plain-text transcript of an audio recording, the tool can automatically align the text with the corresponding portions of the sound signal. The aligner currently supports Finnish, Swedish, Northern Sámi, Estonian, Komi and English.
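
Because the two tools expect different inputs (WAV audio alone for recognition, WAV audio plus a plain-text transcript for alignment), it can help to sort a folder of recordings before submitting anything. The following is a minimal local sketch, not part of Aalto-ASR: it uses only the Python standard library, the function names inspect_recording and plan_jobs are hypothetical, and the convention that a transcript shares the recording's file name with a .txt extension is an assumption made for illustration. It does not call kaldi-rec or kaldi-align; it only suggests which of the two would apply to each file.

```python
import wave
from pathlib import Path


def inspect_recording(wav_path):
    """Return basic header info for a WAV file, or None if it cannot be read as WAV."""
    try:
        with wave.open(str(wav_path), "rb") as wf:
            frames = wf.getnframes()
            rate = wf.getframerate()
            return {
                "channels": wf.getnchannels(),
                "sample_rate": rate,
                "duration_s": frames / rate if rate else 0.0,
            }
    except (wave.Error, EOFError):
        # Not a RIFF/WAVE file, or the file is empty or truncated.
        return None


def plan_jobs(directory):
    """Suggest a tool per recording (assumed convention: transcript = same name + .txt)."""
    plan = {}
    for wav_path in sorted(Path(directory).glob("*.wav")):
        info = inspect_recording(wav_path)
        if info is None:
            plan[wav_path.name] = "skip (not a readable WAV file)"
        elif wav_path.with_suffix(".txt").exists():
            # A transcript already exists, so forced alignment applies.
            plan[wav_path.name] = "kaldi-align"
        else:
            # No transcript yet, so speech recognition applies.
            plan[wav_path.name] = "kaldi-rec"
    return plan
```

Calling plan_jobs("recordings") on a folder returns a dictionary mapping each .wav file name to a suggested tool. How the tools are actually invoked on Puhti is described in the toolkit's own instructions.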

The new version of Aalto-ASR is also available as a Docker container, which can be installed on other systems if required.

Metadata and citation instructions for Aalto-ASR 2.1