News from the Language Bank of Finland 21st June 2012

  • New tools: Korp, LAT and META-SHARE
  • Current and forthcoming language corpora
  • Future events
  • Suggest new material for the Language Bank

************
NEW TOOLS: KORP, LAT AND META-SHARE

A new corpus search tool called Korp has been installed at the Language Bank of Finland. Korp is being developed by Språkbanken at the University of Gothenburg, Sweden. You can already try it out on a couple of corpora at http://korp.csc.fi. More corpora will be added gradually. New versions of the application are also expected, and in the future it will be possible to provide corpora with restricted licenses as well. Please contact us if you would like to publish your corpus in Korp!

The LAT system developed by MPI (Max Planck Institute for Psycholinguistics) is also available in the Language Bank. With the LAT tools, you can browse, view and listen to speech and language corpora that include annotated audio and video material. You can log in to LAT using your Haka account. However, you first need to activate your account by filling in the form at https://elomake.helsinki.fi/lomakkeet/35497/lomake.html. Later, it will also be possible to log in with CSC accounts. More instructions are available in Finnish at https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/KielipankkiOhjeetLAT.

You can also try out the new META-SHARE tool, which lets you search for language resources by their metadata. META-SHARE supports a comprehensive metadata format specifically designed for different types of language resources. The META-SHARE service also contains an editor for editing and publishing the metadata of new resources. However, registration is currently open to META-NORD project members only. Later, the editor will be available to everybody. The service is available at http://metashare.csc.fi.

FinnWordNet, the second-largest wordnet in the world, is freely available at http://www.ling.helsinki.fi/en/lt/research/finnwordnet/.

************
CURRENT AND FORTHCOMING LANGUAGE CORPORA

The Finnish Treebank v. 2.0 is available at http://www.ling.helsinki.fi/kieliteknologia/tutkimus/treebank/ . The treebank can also be searched with the Korp tool at http://korp.csc.fi.

A new version of the Finnish corpus Suomen kielen näytteitä (SKN) by Kotus is now in restricted test use on the LAT platform. The corpus is still under construction, but you can become a test user by emailing FIN-CLARIN at finclarin@helsinki.fi and by activating your Haka account for the LAT system at https://elomake.helsinki.fi/lomakkeet/35497/lomake.html. The completed corpus will be opened to all registered LAT users in the autumn.

A small demo corpus that contains an annotated video sample is freely available in LAT: http://lat.csc.fi/ds/annex/flex/annex/annexInterface.jsp?nodeid=MPI2817

Current collections of the Language Bank of Finland: https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/KielipankkiKoti
Forthcoming language resources: https://kitwiki.csc.fi/twiki/bin/view/Trash/FinCLARINFinClarinHallintoUudetKielivarat

************
FUTURE EVENTS

FIN-CLARIN will organize a course for the Langnet Graduate School on 15th-17th October 2012. For more information (in Finnish), see https://kitwiki.csc.fi/twiki/bin/view/FinCLARIN/KielipankkiTapahtumaLangnet2012.

************
SUGGEST NEW MATERIAL FOR THE LANGUAGE BANK

Please tell us about new language resources using our e-form: https://elomake.helsinki.fi/lomakkeet/32074/lomake.html (The form is currently in Finnish only, but English and Swedish versions will follow.)

************

FIN-CLARIN and the Language Bank of Finland wish you a nice Midsummer and relaxing holidays!

Mietta Lennes
Project Planner / FIN-CLARIN
finclarin@helsinki.fi