How to get access to the language resources of the Language Bank of Finland

How to use the corpora and services of the Language Bank of Finland

Public corpora

Many corpora in the Language Bank of Finland can be accessed without signing in. These corpora are marked in the corpus list with the PUB icon. Public corpora can be queried in the Korp and LAT interfaces. The licenses of some PUB corpora also allow them to be downloaded from the Language Bank’s download service.

Corpora that require signing in

Some corpora can be accessed by signing in with an account from an organization that is a member of the HAKA or eduGAIN federation. These corpora can be identified by the ACA icon and can be accessed by members of these organizations without applying for more specific access rights.

Corpora that require personal access rights

The rest of the corpora require personal access rights. Rights can be applied for via the Language Bank Rights service. If you have a HAKA or eduGAIN account, please use it to sign in.

  • You can apply for access to one or several corpora.
  • Submitting an application is equivalent to an electronic signature approving the corpus-specific and general terms of use.
  • The application process typically takes up to a week.
  • Once access has been granted, you can use the corpus after signing in to the respective service.
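The three access levels described above can be summarised as a small decision function. This is only an illustrative sketch: the names AccessLevel, User and can_access, as well as the PERSONAL label, are hypothetical and are not part of any Language Bank service. The actual checks are carried out by the services themselves.

```python
from dataclasses import dataclass, field
from enum import Enum


class AccessLevel(Enum):
    PUB = "public"          # corpora marked with the PUB icon
    ACA = "academic"        # corpora marked with the ACA icon
    PERSONAL = "personal"   # hypothetical label: personal access rights required


@dataclass
class User:
    signed_in: bool = False
    # True if the account comes from a HAKA or eduGAIN member organization.
    federation_member: bool = False
    # Corpus identifiers for which personal access rights have been granted.
    granted_corpora: set = field(default_factory=set)


def can_access(user: User, corpus_id: str, level: AccessLevel) -> bool:
    """Return True if the user may use the corpus under the rules sketched above."""
    if level is AccessLevel.PUB:
        # Public corpora need no sign-in.
        return True
    if not user.signed_in:
        # Everything else requires signing in first.
        return False
    if level is AccessLevel.ACA:
        # Federation members need no further application;
        # other users need a specifically granted right.
        return user.federation_member or corpus_id in user.granted_corpora
    # Remaining corpora always need a personally granted right.
    return corpus_id in user.granted_corpora
```

For example, a signed-in student from a HAKA member university would get True for an ACA corpus but False for a corpus that requires personal rights, until an application for that corpus has been approved.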

No HAKA or eduGAIN credentials?

If you cannot use HAKA or eduGAIN, you can apply for a CLARIN account and use it for logging in. For non-HAKA users in Finland, an Eduuni account is recommended.

Applying for access to the Korp concordance query service

  • Most corpora deposited in Korp are available publicly, without signing in.
  • Some corpora in Korp require a personal corpus-specific access right.
    • In Korp’s corpus selector menu, these corpora are marked with a lock if you are not signed in or if you do not have the right to access that corpus.
    • Corpora protected by a CLARIN ACA license can be used without applying for specific rights if you are an active member of a university, i.e. a student or a member of staff. Alumni and library users need to apply for a specific access right.
    • Corpus access rights can be applied for via the Language Bank Rights service.
    • Signing in to Korp works with HAKA or eduGAIN credentials or an Eduuni or CLARIN account.
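The Korp rules above can be expressed as two small checks. This is a hedged sketch only: the function names and the role strings are assumptions made for illustration and do not correspond to Korp's actual implementation.

```python
# Hypothetical role labels; only active students and staff count as
# active university members in the sense described above.
ACTIVE_UNIVERSITY_ROLES = {"student", "staff"}


def shows_lock(signed_in: bool, has_access_right: bool) -> bool:
    """A restricted corpus is shown with a lock unless the user is
    signed in and holds the right to access it."""
    return not signed_in or not has_access_right


def aca_needs_application(role: str) -> bool:
    """Alumni, library users and other non-active members must apply
    for a specific access right to ACA-licensed corpora."""
    return role not in ACTIVE_UNIVERSITY_ROLES
```

Under these assumptions, shows_lock(True, False) is True (signed in, but no right yet), and aca_needs_application("alumnus") is True, whereas aca_needs_application("student") is False.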

Applying for access to the LAT (Language Archive Tools) interface

  • Some corpora in LAT are available publicly, without signing in.
  • Some corpora require authentication and personal corpus-specific access rights.
    • LAT can be used with HAKA or eduGAIN credentials.
      If you do not have such credentials, an Eduuni or CLARIN account can also be used.
    • In order to use LAT, you have to activate your account.
  • Corpus access rights can be applied for via the Language Bank Rights service.
  • In addition, corpora in researchers’ Private nodes can only be used with their owners’ personal approval.

 

I have already been granted access. Where do I find the corpus?

Do I need a CSC user account?

If you want to use the corpora in Korp and LAT, a CSC account is not necessary. However, in order to use the Mylly service and/or the command line tools on the software server, you do need a CSC account.

If you only wish to apply for a CSC user account and your home organization is a HAKA member, you can log in to the Scientist’s User Interface and register as a CSC customer there. Note, however, that when you apply for rights to a corpus whose use requires a CSC account, the account is usually created for you automatically if you are not yet a CSC customer.

What does a CSC user account provide?

  • You have access to the Mylly service, which enables you to process and analyze text, speech and even entire corpora without knowing any programming.
  • You can access the services in the Scientist’s User Interface.
  • You get some private space on CSC servers that you can also access via the My Files application within the Scientist’s User Interface.
  • You also get generously sized temporary space accessible from Taito and via My Files.
  • You may use the command line tools and corpora that are available on taito-shell.csc.fi. The server also enables more flexible use of the corpora as well as software development. Taito-shell is recommended for users who know a Linux command shell but are not (yet) familiar with supercomputing.
  • You have access to the Taito supercluster (taito.csc.fi).
  • On request, you can get project directories for research groups on the application server (see CSC data storage services).
  • On request, you can get long-term storage space in the IDA service.
  • You can use other CSC servers and services.

NB: CSC’s User Manager periodically removes accounts that have been unused for a long time. If your account has been removed, you have to submit a new application.

Using the Language Bank of Finland in teaching

  • The Language Bank of Finland and CSC’s Taito server can be used in teaching.
  • First-degree students are also granted access rights to language resources.
  • When applying for a CSC account and Language Bank access rights, students can state the course and the university where the services will be used as their research plan.
  • Accounts are private! Every user needs an individual account, including students.


Contact

The Language Bank's technical support:
kielipankki (at) csc.fi
tel. +358 9 4572001

Requests related to language resources:
fin-clarin (at) helsinki.fi
tel. +358 29 4140599 / +358 29 4129317