For technical reasons, the LAT service (lat.csc.fi) will be discontinued in the Language Bank of Finland as of November 30, 2020.
The LAT platform itself is no longer developed in its present form by MPI, and the media browser component Annex (part of LAT) is based on deprecated Adobe Flash technology that will stop working at the end of 2020.
A replacement service for LAT has not yet been selected by the Language Bank of Finland. However, all the speech and sign language corpora that were previously available in LAT will be made available by alternative means.
All of the LAT corpora can be offered as downloadable packages that can be used and studied directly on the user’s local computer. In the download service, the corpora can be accessed under the same conditions as via LAT. For some corpora, more advanced solutions might already be available.
The LAT instances of the following corpora will be affected:
In order to see where each corpus will be located in the future, please refer to the corresponding metadata page where this information will be updated. The relevant metadata links are provided on the list above.
LAT assigned 25,000 persistent identifiers (PIDs) to individual files. We have no automatic mapping for these PIDs, but we can help you find a file if you need it. We aimed to structure the downloadable packages similarly to the dataset structure in LAT. To locate a file, look at the URL field of this page, where you will find a "?path=..." parameter.
Example: ?path=demo/TRASH/2017-01/526/v7556__.C_4.4_Viittomakielisten_kielelliset_oikeudet.imdi
This should help you locate your file. Please contact us if you have any questions.
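As a rough illustration of the lookup described above, the following sketch extracts the "?path=" value from a URL and splits it into folder names and a file name. The host and page in the sample URL are hypothetical placeholders, and the function name is made up for this example. Only the path value itself comes from the example above, and it is an assumption that the folders inside a downloaded package follow the same layout.

```python
from pathlib import PurePosixPath
from urllib.parse import parse_qs, urlsplit


def lat_path_from_url(url):
    """Return the value of the ?path= query parameter, or None if it is absent."""
    values = parse_qs(urlsplit(url).query).get("path")
    return values[0] if values else None


# Hypothetical host and page; only the ?path= value is taken from the example above.
url = (
    "https://example.org/page"
    "?path=demo/TRASH/2017-01/526/"
    "v7556__.C_4.4_Viittomakielisten_kielelliset_oikeudet.imdi"
)

path = lat_path_from_url(url)
folders = PurePosixPath(path).parent.parts  # folder names, outermost first
file_name = PurePosixPath(path).name        # the file to look for in the package

print("Folders:", " / ".join(folders))
print("File:", file_name)
```

The folder names can then be compared with the directory tree of the unpacked download package to narrow down where the file is.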
August-September 2020:
September-October 2020:
30th November 2020:
Year 2021 (and later):
Further details on the schedule of the aforementioned process will be updated on this page. In case you need additional information at this point, please contact FIN-CLARIN directly.
The latest versions:
Plenary Sessions of the Parliament of Finland, Kielipankki Korp Version 1.5 (Metadata and license; How to cite this version) | Open the corpus in Korp
Plenary Sessions of the Parliament of Finland, Downloadable Version 1.5 (Metadata and license; How to cite this version) | Download the corpus
Locate other versions of the same resource
Plenary Sessions of the Parliament of Finland contains audio and video recordings of the parliamentary sessions and the transcripts that have been aligned with the audio. Both the media files and the original transcripts have been obtained directly from the online public services of the Parliament. The content is openly available via the Language Bank of Finland without logging in.
Via the Korp service in the Language Bank of Finland, it is possible to perform various kinds of content searches on the corpus and to calculate statistics from the results. The turns of different speakers have been separated in the text. In the Extended search tab in Korp, it is possible to delimit searches on the basis of the speaker’s name, the parliamentary group or the role of the speaker.
In the Korp search results for this corpus version, there are also links to the corresponding utterances in the original video. For further processing, you may download the ELAN/EAF annotation files and the audio files from the downloadable version of the corpus. The original videos and transcripts can also be found in the online services of the Parliament of Finland.
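For readers who want to process the ELAN/EAF annotation files, the sketch below shows one way to read time-aligned annotations with Python's standard library. EAF is an XML format in which time slots hold millisecond offsets and annotations refer to those slots. The sample document, the tier name "speaker1" and the annotation text are invented for illustration; the actual tier layout of the corpus files is not described here and may differ.

```python
import xml.etree.ElementTree as ET

# Minimal invented EAF fragment; real files carry a header and more attributes.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1500"/>
  </TIME_ORDER>
  <TIER TIER_ID="speaker1" LINGUISTIC_TYPE_REF="default">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
                            TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>arvoisa puhemies</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>
"""


def read_alignable_annotations(eaf_text):
    """Return (tier_id, start_ms, end_ms, text) for each time-aligned annotation."""
    root = ET.fromstring(eaf_text)
    # Map each time slot ID to its offset in milliseconds.
    slots = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
        if slot.get("TIME_VALUE") is not None
    }
    rows = []
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            text = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((tier.get("TIER_ID"), start, end, text))
    return rows


for tier_id, start, end, text in read_alignable_annotations(SAMPLE_EAF):
    print(tier_id, start, end, text)
```

To read a downloaded file instead of the sample string, the file contents can be passed to the same function after opening the file with UTF-8 encoding.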
The text in the original transcripts has been aligned with the audio recordings by automatic methods. The technological expertise in the alignment process was provided by Aalto University. In those audio portions where a matching text was not found in the transcript, an automatic speech recognizer was used in order to provide a tentative transcript. Thus, it is important to remember that the text in the Korp version of the corpus is not error-free and it may not always fully correspond to the original transcript.
Further information about the contents of the different corpus versions can be found in their metadata records.
This resource group page has a Persistent Identifier: http://urn.fi/urn:nbn:fi:lb-201407305
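When a persistent identifier is quoted as a bare URN, it can be turned into a clickable address by prefixing the resolver shown in the identifier above. The helper below is a minimal sketch of that step. The function name is made up for this example, and the prefix check assumes that identifiers of this kind always start with "urn:nbn:fi:".

```python
def urn_to_url(urn, resolver="http://urn.fi/"):
    """Build a resolver URL from a bare urn:nbn:fi identifier."""
    urn = urn.strip()
    # Assumption: only urn:nbn:fi identifiers are handled by this resolver.
    if not urn.lower().startswith("urn:nbn:fi:"):
        raise ValueError(f"not a urn:nbn:fi identifier: {urn!r}")
    return resolver + urn


print(urn_to_url("urn:nbn:fi:lb-201407305"))
```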
Workshop “Digital Parliamentary data and research”
Friday 3 May at 12.00
Aalto University (Otaniemi), CS-Building, Room T4 / A238 (Konemiehentie 2)
The aim of the workshop is to discuss novel digital parliamentary datasets, in particular those of the Parliament of Finland: their use in research, the related research resources and tools, and their future development for researchers as well as for citizens and the media. FIN-CLARIN and Korp version 1.1 of the Plenary Sessions of the Parliament of Finland, available in the Language Bank of Finland, will be presented during the afternoon.
Mietta Lennes: FIN-CLARIN and Parliamentary Data in Kielipankki – the Language Bank of Finland (PowerPoint / PDF slides)
Further information including the programme of the workshop can be found at https://www.helsinki.fi/en/helsinki-centre-for-digital-humanities/workshop-digital-parliamentary-data-and-research.
The latest versions:
Plenary Sessions of the Parliament of Finland (Eduskunnan täysistunnot), Kielipankki Korp Version 1.5 (Metadata and license; How to cite this version; Example searches) | Open the corpus in Korp
Plenary Sessions of the Parliament of Finland (Eduskunnan täysistunnot), Downloadable Version 1.5 (Metadata and license; How to cite this version) | Download the corpus
Locate other available versions
The Plenary Sessions of the Parliament of Finland corpus (Eduskunnan täysistunnot) contains audio recordings and videos of the plenary session debates of the Parliament of Finland, together with the session transcripts aligned with them. Both the media recordings and the transcripts were obtained directly through the public services of the Parliament. The corpus is openly available via Kielipankki, and using it does not require logging in (see the details of the licenses that apply to the corpus).
In the Korp service of Kielipankki, you can run many kinds of searches on the corpus and calculate statistics from the results. The turns of different speakers are marked separately in the corpus. In the Extended search tab in Korp, searches can be restricted, for example, by the speaker, the parliamentary group the speaker represents, or the speaker's role.
Nearly all Korp search results also link to the corresponding point in the original video (the video links were added in Korp version 1.5 of the corpus). In addition, users can download the VRT-format text files corresponding to Korp version 1.5, as well as the audio recordings of the debates and the EAF-format annotation files, from the downloadable version of the corpus, and fetch the original videos from the Parliament's server.
The text of the transcripts has been aligned with the audio recordings by automatic methods. Aalto University was technically responsible for the alignment work. For passages where the transcript contained no text matching the audio, automatic recognition was attempted and the suggested text was added to the transcription. For this reason, note that the recognized text is not error-free in all respects. The text has also been parsed with a Finnish-language parser, so the Swedish passages of the original transcripts are usually tagged as foreign words in the part-of-speech annotation.
More detailed information about the contents of the different corpus versions can be found in their metadata records.
Persistent identifier of this page: urn:nbn:fi:lb-2021111721