
This is a user guide for running speech recognition from a pre-built Apptainer container. If you don't already have the container, you should be able to download it from https://a3s.fi/kielipankki-containers/samiasr.sif. It is based on a wav2vec model by Yaroslav Getman and Tamas Grosz, https://huggingface.co/GetmanY1/wav2vec2-large-sami-cont-pt-22k-finetuned.
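For example, from the command line (wget is just one option; curl or an ordinary browser download works equally well):
wget https://a3s.fi/kielipankki-containers/samiasr.sif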
samiasr is a command-line only program. To run it, you have to have the Apptainer software installed. If you can run the command apptainer in your command line, and get a usage message, you have it installed. If not, and you are able to install software on your computer, refer to https://apptainer.org/docs/admin/main/installation.html.
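After installation, a quick way to verify the setup is to ask Apptainer for its version number:
apptainer --version
If this prints a version number, you are ready to go.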
samiasr-cpu.zip contains samiasr.sif, which is really all you need to run the speech recogniser. It also contains a supplementary Python program samiasr.py, which is only needed if you want to modify the code.
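If you have the zip archive rather than the bare .sif file, unpack it first, for example with:
unzip samiasr-cpu.zip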
You can either use apptainer to run the container, like this:
apptainer run samiasr.sif
or run the container on its own, like this:
./samiasr.sif
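Running the .sif file directly requires executable permissions; on a typical Linux system you can set them with the standard command
chmod +x samiasr.sif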
If everything is working correctly, these commands will produce a usage message that looks something like this:
usage: samiasr.py [-h] [--no-segment] [--json FILE]
                  [--silence-threshold SILENCE_THRESHOLD]
                  [--min-silence MIN_SILENCE]
                  [--min-segment MIN_SEGMENT] [--max-segment MAX_SEGMENT]
                  [-v] [--debug]
                  input_files [input_files ...]
samiasr.py: error: the following arguments are required: input_files
The error message at the bottom indicates that we need to add the one required argument, input_files, meaning one or more files that we want to run the speech recognition on. If we add an input file, for example:
./samiasr.sif ~/samiasr_test/input.wav
we get output that looks something like this:
ASR result for /home/user/samiasr_test/input.wav:
[00:00.000 -> 00:04.032] buorre beaivi mu namma lea irena
[00:04.032 -> 00:08.832] eh ja mun human davvisámegiela
[00:08.832 -> 00:12.768] eh dat ii leat mu eatnigiella eh mun lean rosas
…
The input has been chopped up into segments at relatively silent places by the processing script, and the segments have been decoded by the wav2vec model. Many of the options of the processing script control this chopping up. For example, if you want to disable segmenting and just use the pure model, try
./samiasr.sif --no-segment input.wav
Note that for long inputs this may require a lot of memory and/or processing time.
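If you instead want to keep segmentation but tune it, the segmentation options from the usage message (--silence-threshold, --min-silence, --min-segment, --max-segment) can be set explicitly. The values below are placeholders only; check --help for the units and defaults the script actually uses:
./samiasr.sif --min-silence 0.5 --max-segment 20 input.wav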
The runscript also supports multiple input files and JSON output. See
./samiasr.sif --help
for more information on the runtime options.
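For example, a command like
./samiasr.sif --json results.json first.wav second.wav
would run recognition on both input files and, given the --json FILE option in the usage message, presumably also write the results to results.json in JSON format (the file names here are just an illustration).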
If you are processing sensitive data non-locally, and would like extra protection for data secrecy, you can use CSC's SD (Sensitive Data) Desktop service. To use samiasr there, you can either transfer the container to your SD Desktop instance the same way as your data (see SD Connect), or use the ready-made installer.
The installer is a program called auto-apptainer, which is available on SD Desktop via the SD Software Installer. Use the SD Software Installer to install auto-apptainer, and then launch it in the terminal. Auto-apptainer will present you with a menu of available containers, from which you can choose samiasr. In SD Desktop the samiasr container is then invoked from the command line as samiasr, which runs the tool as described under "Usage" above.
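For example, inside SD Desktop
samiasr input.wav
should behave the same way as ./samiasr.sif input.wav does outside it (the file name is just an illustration).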
If you want to modify the runscript, you can extract it for viewing and editing with
apptainer exec samiasr.sif get_runscript samiasr.py
This will write the file samiasr.py in the current directory. If you then edit it and run
apptainer exec samiasr.sif python samiasr.py input.wav
you will be able to run your code with the built-in models.
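As a recap of the whole modification workflow (the --json flag and file names are just an example, and assume your edits keep the original argument parsing intact):
apptainer exec samiasr.sif get_runscript samiasr.py
# edit samiasr.py in any text editor, then:
apptainer exec samiasr.sif python samiasr.py --json out.json input.wav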
