Preliminary evaluation of data protection

If personal data are processed in your research project and the processing involves high risks, data protection regulations require you to carry out a data protection impact assessment (DPIA) before you start processing the personal data. The higher the risks involved in the processing, the more carefully you need to protect the data. Consider which protection measures and methods you can use to minimize or eliminate the risks.

The list of questions on this page is intended to help you plan your research project. You can use the questions to make a preliminary assessment of the risks that may be involved in the processing of personal data in your research. A data protection impact assessment is likely to be required if you answer "yes" to more than one of the ten questions. Please note that the interpretation of the questions may vary in practice, and the individual criteria listed under each question are suggestions only.
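The rule of thumb above (a DPIA is likely to be required if more than one of the ten questions is answered "yes") can be sketched as a small check. This is only an illustration: the question labels and the function name `dpia_likely_required` are hypothetical, and the result is a preliminary indication, not a replacement for the instructions of your data controller.

```python
# Hypothetical short labels for the ten preliminary questions on this page,
# in the order in which the questions are listed.
QUESTIONS = (
    "large_scale",
    "sensitive_data",
    "exceptions_to_rights",
    "combined_datasets",
    "vulnerable_subjects",
    "automated_decisions",
    "evaluation_or_scoring",
    "systematic_monitoring",
    "new_technology",
    "harm_if_disclosed",
)


def dpia_likely_required(answers: dict[str, bool]) -> bool:
    """Return True if more than one question is answered "yes".

    Questions missing from `answers` are counted as "no". This is an
    assumption of the sketch, not something the guidance states.
    """
    unknown = set(answers) - set(QUESTIONS)
    if unknown:
        raise ValueError(f"Unknown question labels: {sorted(unknown)}")
    yes_count = sum(1 for question in QUESTIONS if answers.get(question, False))
    return yes_count > 1


if __name__ == "__main__":
    # Example: a study that processes health data about patients.
    example = {"sensitive_data": True, "vulnerable_subjects": True}
    print(dpia_likely_required(example))
```

Because the interpretation of the questions may vary in practice, a "False" result here does not mean that no assessment is needed; it only reflects the simple counting rule.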

When processing personal data, you should primarily follow the instructions given by the data controller. Therefore, you must always check with your home organization whether and how you are required to carry out the data protection impact assessment.

Further information on data protection impact assessments is available on the website of the Office of the Data Protection Ombudsman.

Preliminary evaluation questions

1. Will personal data be processed on a large scale?

Processing can be considered large-scale if, for example:

  • There are more than 10,000 research participants/data subjects
  • A large amount of data about the same individual is collected
  • Data is collected about a large portion of the members of a specific group (for example, a large portion of the members of a small ethnic group or the employees of a certain employer)
  • The processing is permanent or long in duration
  • The processing is geographically extensive

2. Will sensitive or highly personal data be processed?

Sensitive or highly personal data includes:

  • Data concerning health
  • Location data (monitoring the movement of a person)
  • Genetic data
  • Biometric data for the purpose of identifying a person
  • Racial or ethnic origin
  • Political opinions
  • Religious or philosophical beliefs
  • Trade union membership
  • Sex life or sexual orientation
  • Data concerning criminal convictions or offences
  • Financial data that might be used for payment fraud
  • Electronic communication (such as emails)
  • Data otherwise considered as very personal (such as notes and diaries)

3. Will there be exceptions to any of the following rights of data subjects?

  • Right to be informed about the project
  • Right to receive copies of data processed about the participant
  • Right to rectify inaccurate personal data
  • Right to restriction of processing
  • Right to object to the processing of personal data (for example, if the processing takes place in a public place, discussion board etc. where data subjects cannot avoid the collection of data)

4. Will data from multiple datasets be combined in a way that is unpredictable to the data subjects?

  • For example, combining data collected for two different purposes or data held by two different data controllers

5. Will the research involve the processing of data concerning individuals who are in a vulnerable position and for whom it may be difficult to exercise the rights of data subjects?

  • e.g., children, the elderly, asylum seekers and patients

6. Will the processing involve automated decision-making (meaning a decision made with no human involvement) and/or profiling that may have significant effects on the participant?

  • Significant effects or legal effects may include exclusion, discrimination, significant impact on privacy, determining the compensation of a participant on the basis of automated decisions etc.

7. Will personal data be used for evaluation or scoring of participants?

  • For example, assessing or predicting disease/health risks or creating a profile based on an individual’s behavior

8. Does the research involve systematic monitoring of the participants?

9. Will new technology be used for processing of personal data in an innovative way?

  • Will data be collected or processed in a novel way?
  • Are the consequences of the use of the new technology unknown?

10. If the research material/data were published or leaked to the public, could this cause significant harm to data subjects?

  • e.g., threat of violence or persecution

Last updated 6.9.2021
