URN for this page: http://urn.fi/urn:nbn:fi:lb-2020081522
Always comply with these guidelines when processing corpora obtained from the Language Bank of Finland that contain personal data.
Corpora stored in the Language Bank of Finland that contain personal data have the following label in their licence:
PRIV: There are personal data in the resource.
The licence details of an individual corpus are shown next to that corpus in the corpora listing of the Language Bank of Finland and in the metadata of the corpus. The metadata can be accessed using the persistent identifier assigned to the corpus (i.e., the URN address included in the citation instructions).
The metadata of corpora labelled PRIV can contain a separate description of the personal data included in the corpus. Descriptions of personal data include the following details:
If no separate description of personal data for a specific corpus is available, and if you cannot find corresponding information in its general description, please request clarification from the FIN-CLARIN service address: fin-clarin(at)helsinki.fi.
When using a PRIV-labelled corpus, you undertake to process the personal data included in it confidentially, carefully and solely for the purpose for which you were granted access to the corpus.
When processing corpora that contain personal data, please apply sufficient protective measures in accordance with the instructions provided by your own organisation. Special care is needed when processing corpora that contain sensitive personal data (also known as special categories of data).
When starting to process a corpus obtained through the Language Bank of Finland that contains personal data for the purposes of new research or another purpose, you and/or your home organisation assume the role of data controller for the corpus. Among other responsibilities, the controller is obliged to demonstrate the lawfulness of the processing of personal data, when necessary.
The instructions provided by your own organisation must be observed in the first instance when processing personal data. If instructions provided by your home organisation are unavailable, you can familiarise yourself, for example, with the Data Management Guidelines published by the Finnish Social Science Data Archive when planning the processing.
As the controller, you must usually draw up a privacy notice on the processing of personal data. Comply with the instructions provided by your own organisation in this instance as well. When drawing up a privacy notice, you can utilise the privacy notice associated with the original corpus, or the description of the personal data included in it.
When starting to use a corpus stored in the Language Bank of Finland that contains personal data, first publish the privacy notice pertaining to your purpose of processing, for example, on a website provided by your organisation. You can share a short title of your project that is understandable to the general public as well as a link to the openly available privacy notice by using this form. We publish this information on the Language Bank of Finland website to make it available to anyone interested in the purposes for which the corpus is used.
Comply with the guidelines of your own organisation. When necessary, you can view examples of protective measures employed by the Language Bank of Finland and other potential measures which you may need when processing personal data.
Personal data must also be processed responsibly and in compliance with good ethics when creating scientific publications and presentations based on corpora.
When reporting on the results of scientific research, personal data must, as a rule, be removed or redacted. This can be done, for example, by pseudonymisation and by grouping data subjects’ age, domicile and other details into broader categories, so that study participants cannot be identified on the basis of such details or by combining them with other data.
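As an illustration only, the pseudonymisation and grouping described above could be sketched in Python as follows. The record fields (name, age, domicile), the secret key, the ten-year age bands and the municipality-to-region table are all assumptions made for this example and are not prescribed by the Language Bank of Finland. Always follow the instructions of your own organisation when choosing actual methods.

```python
import hashlib
import hmac

# Hypothetical secret key for this sketch. In practice it should be stored
# separately from the data so that pseudonyms cannot be recomputed by others.
SECRET_KEY = b"replace-with-a-secret-key"

# Hypothetical lookup table: municipality -> broader region.
REGION_OF = {"Espoo": "Uusimaa", "Vantaa": "Uusimaa", "Rovaniemi": "Lapland"}


def pseudonym(name: str, key: bytes = SECRET_KEY) -> str:
    """Replace a name with a stable code derived from a keyed hash."""
    digest = hmac.new(key, name.encode("utf-8"), hashlib.sha256).hexdigest()
    return "P-" + digest[:8]


def age_band(age: int, width: int = 10) -> str:
    """Map an exact age to a broader band, e.g. 37 -> '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def generalise(record: dict) -> dict:
    """Return a copy of the record with direct identifiers replaced or coarsened."""
    return {
        "id": pseudonym(record["name"]),
        "age_group": age_band(record["age"]),
        # Municipalities missing from the table fall into a catch-all category.
        "region": REGION_OF.get(record["domicile"], "Other"),
    }
```

For example, calling generalise on a record with the name "Example Person", age 37 and domicile "Espoo" yields an age group of "30-39" and the region "Uusimaa", together with a code in place of the name. Note that pseudonymised data are still personal data, and that broad categories can remain identifying in small groups, so the result should always be assessed case by case.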
In certain cases, presenting scientific research results requires presenting material that contains personal data. For example, it may be necessary to link short individual samples from the corpus to a scientific article, or to present a specific section in a conference presentation. However, carefully consider the potential impact on and risk to the study subjects, their family members or others close to them that publishing or presenting such samples may entail. The scope of the samples intended for publication must not exceed what the scientific purpose requires, and all unnecessary personal data must be removed from the samples or pseudonymised using appropriate means.
Please also note that if the study subjects have been clearly informed that no personal data associated with them will be published, and the sample to be published cannot be fully anonymised, separate consent for publishing the sample must be requested from the subjects.
If a PRIV-labelled corpus, which requires access rights, is to be processed for more than one purpose – for example, if at a later date there is a wish to carry out a new study not directly connected to the previous topic – access rights must be applied for from the Language Bank of Finland separately for each purpose. Naturally, all grounds for the processing must be stated in the privacy notice(s).
If you come across personal data which you believe should not be included in a corpus based on its description, please report the matter immediately to the Language Bank of Finland and/or directly to the controller of the data. This also applies to instances where you suspect that personal data have, for some reason, fallen into the wrong hands.