Researchers have an ethical and legal duty towards human subjects. Participants and their personal data must be protected from any potential harm.
Sensitive data you collect and share must be anonymised to a level sufficient to ensure that participants cannot be re-identified by cross-checking datasets. In the case of non-sensitive personal data, simple pseudonymisation (replacing names and other identifiers with codes while keeping a secure codebook) can be sufficient, provided you have the subject's informed consent.
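As an illustration of simple pseudonymisation, the following is a minimal sketch in Python, not a recommended implementation. The function name `pseudonymise`, the field names and the `P-` code format are all hypothetical. It replaces each name with a random code and returns the codebook separately, so that the codebook can be stored securely apart from the data.

```python
import secrets


def pseudonymise(records, id_field="name"):
    """Replace the identifier in each record with a random code.

    Returns the pseudonymised records and the codebook
    (identifier -> code). Hypothetical helper for illustration.
    """
    codebook = {}
    pseudonymised = []
    for record in records:
        identifier = record[id_field]
        if identifier not in codebook:
            # Random codes reveal nothing about the identifier itself.
            code = "P-" + secrets.token_hex(4)
            while code in codebook.values():
                code = "P-" + secrets.token_hex(4)
            codebook[identifier] = code
        # Copy every field except the direct identifier.
        clean = {k: v for k, v in record.items() if k != id_field}
        clean["participant_code"] = codebook[identifier]
        pseudonymised.append(clean)
    return pseudonymised, codebook


# Invented sample data: the same participant appears twice.
records = [
    {"name": "Alice Example", "score": 12},
    {"name": "Bob Example", "score": 9},
    {"name": "Alice Example", "score": 15},
]
data, codebook = pseudonymise(records)
```

The same participant always receives the same code, so repeated measurements remain linkable within the dataset. Only `data` would be shared; `codebook` is what must be kept secure and separate.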
For information on the legal side, see our guide on data protection (in French), which covers aspects of the GDPR (EU law) and the LPD (Swiss law). For information on the ethical side, see the Research ethics page created by the Research office.
In general, direct identifiers need to be removed from your dataset as soon as practicable and stored separately, and they are often deleted once your research no longer requires them. Direct identifiers include:
You must also check that indirect identifiers do not collectively allow the identification of a research participant, especially when the sample is small. Indirect identifiers include:
These various types of identifiers can be found both in qualitative and quantitative data in various forms.
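One way to check whether indirect identifiers collectively single out a participant is to count how many records share each combination of values. This is the idea behind k-anonymity, a technique named here as an assumption since this guide does not prescribe it. The sketch below is illustrative only: the function name, the field names and the threshold `k=3` are hypothetical choices.

```python
from collections import Counter


def risky_records(records, quasi_identifiers, k=3):
    """Return records whose combination of indirect identifiers
    is shared by fewer than k records (hypothetical threshold)."""

    def key(record):
        return tuple(record[q] for q in quasi_identifiers)

    # Count how many records share each combination of values.
    group_sizes = Counter(key(r) for r in records)
    return [r for r in records if group_sizes[key(r)] < k]


# Invented sample data for illustration.
records = [
    {"age_band": "30-39", "postcode": "1015", "occupation": "nurse"},
    {"age_band": "30-39", "postcode": "1015", "occupation": "nurse"},
    {"age_band": "30-39", "postcode": "1015", "occupation": "nurse"},
    {"age_band": "60-69", "postcode": "1200", "occupation": "surgeon"},
]
flagged = risky_records(records, ["age_band", "postcode", "occupation"])
```

Here the single surgeon is flagged, because no other record shares that combination. In a small sample, many records may be flagged in this way.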
The risk stemming from indirect identifiers is higher with a small target group or sample size. Re-identification risk factors in general include:
Methods to reduce the risk include:
These operations must be properly described in your documentation, especially if you choose to apply randomisation, which substantially alters a dataset.
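As a hedged illustration of documenting such an operation, the sketch below applies generalisation (replacing exact ages with 10-year bands) and records what was done in a small log. Generalisation is assumed here as an example of a risk-reduction method. The function name, the field names and the log structure are hypothetical, not a required format.

```python
import json


def generalise_age(age, width=10):
    """Map an exact age to a band such as '30-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


# Invented sample data for illustration.
records = [
    {"participant_code": "P-01", "age": 34},
    {"participant_code": "P-02", "age": 58},
]

for record in records:
    # Remove the exact age and keep only the band.
    record["age_band"] = generalise_age(record.pop("age"))

# Log entry to be included in the dataset documentation.
anonymisation_log = [
    {
        "variable": "age",
        "operation": "generalisation",
        "detail": "exact age replaced by 10-year band",
        "new_variable": "age_band",
    }
]
print(json.dumps(anonymisation_log, indent=2))
```

Keeping the log in a machine-readable form alongside the README is one option; a plain-text description in the documentation serves the same purpose.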
The following tools might be of interest to you:
Amnesia: a research data anonymisation tool developed by OpenAIRE.
ARX Data Anonymization Tool: open-source software.
IQDA Qualitative Data Anonymizer: for qualitative data.
Cornell Anonymization Toolkit: for tabular data.
sdcMicro: Statistical Disclosure Control Methods for Anonymization of Data and Risk Estimation, package for R.
Anonymisation, UK Data Service: https://www.ukdataservice.ac.uk/manage-data/legal-ethical/anonymisation.aspx (with tool for Word)
Deidentification, La Trobe University: https://latrobe.libguides.com/sensitivedata/deidentification
Data anonymisation, Nanyang TU Singapore: https://libguides.ntu.edu.sg/anon
Removing identifiers from data, USYD: https://libguides.library.usyd.edu.au/datapublication/desensitise-data