Introduction
Journals require that you share your code and data in a replication package at the end of your research project. That can be a challenge when some parts of your research data are confidential. This document provides an overview of how to ensure the reproducibility of your research when data are confidential. It is meant to be neither exhaustive nor prescriptive. There are many ways to construct a replication package, and even more situations in which confidential data are housed.
Following some best practices from day 1 can not only help you prepare this package later, but also make you a more productive researcher. Following some best practices before releasing a package can help you avoid costly revisions.
Before we start
Many of the methods and techniques described here are not specific to confidential data. Before we go into the details, we suggest that you read the following chapters and presentations. We will refer to them at particular points in this document.
Alternate formats
This subject is also available as
an online presentation and its printable PDF (also in Spanish 🛠️)
(🛠️ indicates work-in-progress, WIP)
TL;DR
Techy lingo for “too long, didn’t read”. Each section begins with a summary of its most important takeaways.
How to contribute
Open a pull request at the repository. You can do so from any page using the buttons at the top right.