Anonymise Datasets



Did you know? Studies that made data available in a public repository received more citations than similar studies where the data were not made available, as Heather Piwowar and Todd Vision found. If you upload your raw data and the datasets on which the final analyses are based to a secure repository and make them accessible (with a link in your manuscript), you become more visible to your peers. An important step in the preparation process is anonymising the data, i.e. removing all information that would enable others to identify individual people.

Duration 45 – 60 min

Verify that nobody can identify a person through a combination of rare characteristics (e.g. place of birth, gender, number of semesters, age). For example, a person is identifiable if there is only one 33-year-old female student of business administration in Leipzig.
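The check described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the records, the column names (`place`, `gender`, `subject`, `age`) and the helper `rare_combinations` are hypothetical and not part of any particular tool. The idea is to count how often each combination of characteristics occurs and to flag combinations that occur only once.

```python
from collections import Counter

# Hypothetical example records; the column names are assumptions for illustration.
records = [
    {"place": "Leipzig", "gender": "f", "subject": "Business Administration", "age": 33},
    {"place": "Leipzig", "gender": "m", "subject": "Business Administration", "age": 24},
    {"place": "Leipzig", "gender": "m", "subject": "Business Administration", "age": 24},
    {"place": "Dresden", "gender": "f", "subject": "Physics", "age": 22},
    {"place": "Dresden", "gender": "f", "subject": "Physics", "age": 22},
]

# Characteristics that could identify a person when combined.
QUASI_IDENTIFIERS = ("place", "gender", "subject", "age")


def rare_combinations(rows, keys, min_group_size=2):
    """Return value combinations shared by fewer than min_group_size rows."""
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return {combo: n for combo, n in counts.items() if n < min_group_size}


risky = rare_combinations(records, QUASI_IDENTIFIERS)
for combo, n in risky.items():
    print(combo, "occurs", n, "time(s)")
```

In this made-up dataset only the 33-year-old female student in Leipzig is flagged. The threshold `min_group_size` is a judgment call; your data centre may recommend a higher value.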

If in doubt, delete all critical characteristics from the dataset you want to publish, or combine characteristics into broader categories. The dataset would then contain, for example, several female students of business administration in Saxony aged between 30 and 40.
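Combining characteristics into categories could look roughly like the following sketch. The city-to-state mapping, the ten-year age bands and the function names `age_band` and `generalise` are assumptions chosen for illustration; suitable categories depend on your data.

```python
from collections import Counter

# Hypothetical mapping from city to federal state (assumption for illustration).
CITY_TO_STATE = {"Leipzig": "Saxony", "Dresden": "Saxony", "Chemnitz": "Saxony"}


def age_band(age, width=10):
    """Replace an exact age with a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


def generalise(row):
    """Return a coarser copy of one record; the original is left unchanged."""
    return {
        "region": CITY_TO_STATE.get(row["place"], "other"),
        "gender": row["gender"],
        "subject": row["subject"],
        "age_band": age_band(row["age"]),
    }


# Hypothetical records, each unique before generalisation.
students = [
    {"place": "Leipzig", "gender": "f", "subject": "Business Administration", "age": 33},
    {"place": "Dresden", "gender": "f", "subject": "Business Administration", "age": 37},
    {"place": "Chemnitz", "gender": "f", "subject": "Business Administration", "age": 31},
]

generalised = [generalise(s) for s in students]

# Count how many records share each generalised combination.
group_sizes = Counter(tuple(g.values()) for g in generalised)
print(group_sizes)
```

After generalisation, all three hypothetical students fall into the same group, so none of them stands out on these characteristics alone.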

Data that cannot be anonymised, such as videos, images or audio recordings, can be published in a processed version (e.g. as a transcript).

Document your actions in an anonymisation protocol so that data can later be de-anonymised.
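An anonymisation protocol can be as simple as a structured list of the steps you carried out. The sketch below is one possible shape, not a prescribed format: the field names (`date`, `variable`, `action`, `details`) and the helper `log_step` are assumptions. Keep such a protocol separate from the published dataset and store it securely, since it describes how the data were changed.

```python
import json
from datetime import date

# The protocol is a plain list of steps; the field names are assumptions.
protocol = []


def log_step(variable, action, details, when=None):
    """Append one anonymisation step to the protocol."""
    protocol.append({
        "date": when or date.today().isoformat(),
        "variable": variable,
        "action": action,
        "details": details,
    })


log_step("place", "generalised", "City replaced by federal state")
log_step("age", "generalised", "Exact age replaced by ten-year band")
log_step("name", "deleted", "Column removed from published dataset")

# Serialise the protocol so that it can be archived alongside the raw data.
protocol_json = json.dumps(protocol, indent=2, ensure_ascii=False)
print(protocol_json)
```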

Consult with your data centre on the best way to proceed.

Link tips (please note that these tips may not conform to German data protection law):


  • Obtain declarations of consent beforehand and archive them
  • Familiarise yourself with data protection
  • Consult with data centre
  • Anonymise data
  • Upload data
  • Publish DOI


Date: September 2020
Questions, comments and notes are welcome at
