Metadata checklist

This document briefly describes the minimum set of information that should be entered into a META-SHARE article.

Identification (Required)

Resource name: The full name by which the resource is known. The element can be repeated for different language versions, using the ”lang” attribute to specify the language. We usually provide English and Finnish. Avoid Swedish, since it is not well supported and might be confusing. See resource naming conventions.
Description: A prose description of the resource. The element can be repeated for different language versions, using the ”lang” attribute to specify the language. Provide at least the English description; a Finnish translation is usually provided as well. Do not put Change Log information here. Enter the hashtag to be used with the resource on its own line at the end of the description(s) according to the instructions in:
Resource short name: The short form (abbreviation, acronym etc.) used to identify the resource; don’t localize this. It is usually in English and is used to create directory names and as part of filenames. Do not use spaces. Use only lowercase letters. See also resource naming conventions.

  • If the resource is the source version, end the shortname with ”-src”, as in urn:nbn:fi:lb-2017070501
  • If the resource is the Korp version, end the shortname with ”-korp”, as in urn:nbn:fi:lb-2019120403
  • If the resource is the VRT file exported from Korp and meant for the download service, end the shortname with ”-vrt”, as in urn:nbn:fi:lb-2019052701
  • If the resource is a scrambled version, insert ”-s-” in the shortname before the -src, -korp, or -vrt, as in urn:nbn:fi:lb-2019120404
  • If the resource is a parallel version of another corpus, insert ”-par-” in the shortname. urn:nbn:fi:lb-2019042605 is a rough example, but the suggestion is to place ”-par-” after any date information such as ”-2018-” and before the previously mentioned acronyms.
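
The short name rules above can be sketched as a small helper. This is only an illustration, not an official tool: the function name `build_short_name` and its parameters are hypothetical. It also assumes that digits and hyphens are allowed alongside lowercase letters (the examples above use ”-src” and ”-2018-”) and that ”-par-” goes before ”-s-”, which the checklist does not state explicitly.

```python
import re

VERSION_SUFFIXES = ("src", "korp", "vrt")

# Assumption: digits and hyphens are allowed in addition to lowercase letters,
# because the checklist's own examples contain "-src" and "-2018-".
_BASE_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def build_short_name(base, version=None, scrambled=False, parallel=False):
    """Compose a short name from a base name and the version markers."""
    if not _BASE_PATTERN.fullmatch(base):
        raise ValueError(f"invalid base name: {base!r}")
    if version is not None and version not in VERSION_SUFFIXES:
        raise ValueError(f"unknown version suffix: {version!r}")
    if scrambled and version is None:
        # The checklist places "-s-" before -src, -korp or -vrt.
        raise ValueError("a scrambled version needs a version suffix")
    parts = [base]
    if parallel:
        parts.append("par")  # assumed order: before "-s-" and the suffix
    if scrambled:
        parts.append("s")
    if version is not None:
        parts.append(version)
    return "-".join(parts)
```

For instance, `build_short_name("mycorpus-2018", "korp", parallel=True)` yields a name ending in ”-par-korp”, while a base name with spaces or uppercase letters raises a `ValueError`. The name "mycorpus" is made up for the example.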

Url: The ”Access location” of the resource. Usually a URN, but not necessarily. Can be a download location or a URL pointing to Korp.
Identifier: The citable URN of the resource. This URN is the primary ID of the resource; it is also used in LBR. For the time being put in front.
Distribution: Specify the licence. (to be defined better)
Contact person: If unknown, put ”User support at CSC – IT Center for Science Ltd. The Language Bank of Finland” or ”FIN-CLARIN user support”.

Versions (Recommended)

(to be defined)

Resource Documentation (Recommended)

Documentation: Reliable links to documentation of the resource.
Reference instructions: Add links to reference instructions in our Portal. (example)
Change Log: Create a ”documentUnstructured” as follows (example):

<date1>: what changed;
<date2>: what changed item1
* what changed item2
* what changed item3

Note the size limit of 1000 characters; add a new item if you are unsure. Also note that the information will be shown unformatted, without line breaks. Not nice, but the best META-SHARE can do at the moment.
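
One way to stay within the limit is to pack the entries into as few items as possible before pasting them in. The sketch below is only an illustration: the function name `pack_change_log` is hypothetical, and joining entries with "; " is an assumption based on the example above and on the fact that line breaks are not displayed.

```python
MAX_CHARS = 1000  # size limit stated in the checklist


def pack_change_log(entries, limit=MAX_CHARS):
    """Group change-log entries into items that each fit within the limit.

    Entries are joined with "; " (an assumption) because line breaks
    are not shown when the change log is displayed.
    """
    items = []
    current = ""
    for entry in entries:
        if len(entry) > limit:
            raise ValueError(f"single entry exceeds {limit} characters")
        candidate = entry if not current else current + "; " + entry
        if len(candidate) <= limit:
            current = candidate
        else:
            # The current item is full: close it and start a new one.
            items.append(current)
            current = entry
    if current:
        items.append(current)
    return items
```

Each string in the returned list would go into its own ”documentUnstructured” item.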

Relations: See our Lifecycle and Metadata model.

Corpus Text Info

(to be defined).