Data Sharing and Archiving for Reproducibility (RT2 2021): Part 3

Lars Vilhuber
2021-09-03

Cornell University

Previously

node1

  • Survey forms - ✔️✔️
  • Metadata - ✔️✔️
  • Sample data - ✔️✔️
  • Actual data - ✔️✔️

[Figure: Zenodo file list]

Lessons learned

cycle

  • Goal 1: Be able to curate the data necessary for reproducible analysis ✔️
  • Goal 2: Know when to do so ✔️
  • Goal 3: Choose license (while respecting ethics)

Next:

share

Balancing multiple ethical priorities

Maximizing openness

Preserving privacy of respondents

A previous example

Access conditions involve an application process.

[Figure: IAB access metadata]

But information ABOUT the access process (=metadata) is available.

Options for you

We will use the example of Zenodo to illustrate the various options, but many other repositories have such options.

Embargo

Zenodo embargo:

“Embargo status: Users may deposit content under an embargo status and provide an end date for the embargo. The repository will restrict access to the data until the end of the embargo period; at which time, the content will become publicly available automatically.”

(In the case of openICPSR, all the contents are visible, but the files are not downloadable.)
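
As a minimal sketch, an embargoed deposit can be described by a small metadata record. The field names `access_right`, `embargo_date`, and `license` follow our understanding of Zenodo's deposit metadata and should be checked against the current Zenodo documentation. The helper `embargo_metadata`, the title, and the license identifier are hypothetical.

```python
from datetime import date


def embargo_metadata(title, embargo_end, today=None, license_id="cc-by-4.0"):
    """Build a metadata record for an embargoed deposit (illustrative only)."""
    today = today or date.today()
    # An embargo only makes sense if it ends in the future.
    if embargo_end <= today:
        raise ValueError("embargo end date must be in the future")
    return {
        "title": title,
        "access_right": "embargoed",            # assumed field name and value
        "embargo_date": embargo_end.isoformat(),  # assumed field name
        "license": license_id,                  # applies once the embargo ends
    }


record = embargo_metadata(
    "Survey data (hypothetical)",
    embargo_end=date(2030, 1, 1),
    today=date(2021, 9, 3),
)
print(record["access_right"], record["embargo_date"])
```

The record itself stays public while the files are withheld, which is the point of the embargo option: the data can be cited and found before they can be downloaded.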

Restricted

Zenodo restricted access:

“Restricted Access: Users may deposit restricted files with the ability to share access with others if certain requirements are met. These files will not be made publicly available and sharing will be made possible only by the approval of depositor of the original file.”
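
A restricted deposit can be sketched the same way. The field names `access_right` and `access_conditions` reflect our understanding of Zenodo's deposit metadata and are assumptions to verify. The helper `restricted_metadata` and the example text are hypothetical.

```python
def restricted_metadata(title, conditions):
    """Build a metadata record for a restricted deposit (illustrative only)."""
    # Requesters need to know what is expected of them, so the
    # conditions text should never be empty.
    if not conditions.strip():
        raise ValueError("restricted deposits should state access conditions")
    return {
        "title": title,
        "access_right": "restricted",     # assumed field name and value
        "access_conditions": conditions,  # assumed field name
    }


restricted = restricted_metadata(
    "Interview transcripts (hypothetical)",
    "Access is granted to researchers with ethics approval for their project.",
)
print(restricted["access_right"])
```

The conditions text is itself metadata: it stays visible even though the files do not.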

Restricted (openICPSR)

openICPSR restricted access:

“Users can then apply for access to those restricted data using the ICPSR Data Access Request System (IDARS), through which applicants agree to follow strict legal and electronic requirements for maintaining data confidentiality. ”

Important: It is ICPSR that approves the access, not the depositor, subject to a standard set of requirements.

Closed access

Zenodo closed access:

“Zenodo allows users to upload files under closed access. Closed access means that zenodo.org users will not be able to access the files you uploaded. The files are however stored unencrypted and may be viewed by Zenodo operational staff under specific conditions. This means that “closed access” on Zenodo is not suitable for secret or confidential data.”

De facto, only you (not even your colleagues) can access the data.
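
The access options above can be summarized as a simple decision rule. This is a sketch of the logic described on these slides, not of any repository's actual implementation. The function `can_download` and its parameters are hypothetical, and `approved` abstracts over who grants approval (the depositor on Zenodo, ICPSR on openICPSR).

```python
from datetime import date


def can_download(access_right, *, is_depositor=False, approved=False,
                 embargo_end=None, today=None):
    """Return True if a user could download the files (illustrative only)."""
    today = today or date.today()
    if is_depositor:
        return True  # the depositor always keeps access
    if access_right == "open":
        return True
    if access_right == "embargoed":
        if embargo_end is None:
            raise ValueError("embargoed records need an embargo end date")
        return today >= embargo_end  # opens automatically when the embargo ends
    if access_right == "restricted":
        return approved  # only after an approved access request
    if access_right == "closed":
        return False  # no other user, not even colleagues
    raise ValueError(f"unknown access mode: {access_right!r}")


print(can_download("closed"))
print(can_download("restricted", approved=True))
```

Note that the rule only models repository users: as the Zenodo text says, closed files are stored unencrypted and may be viewed by operational staff, so "closed" is not a substitute for secure storage.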

Licenses

  • Licenses provide (automatic) permissions to users of the data.
  • By default, posted data are copyrighted (in the United States), and users gain no rights: being able to download the data does not grant any additional rights.
    • in particular, users are often restricted from redistributing them!

Some guidance:

Licenses and Reproducibility

For the purpose of replicability, journals will usually insist on an open license that allows for replication by researchers unconnected to the original parties, to the extent allowed by other agreements and the law.

Dual-License Setup

Many repositories contain both code and databases. In that case, different files in the repository may fall under different licenses. For instance, some components may come with more restrictive licenses (e.g., the MIT License for third-party software), others with more lenient ones (e.g., the CC0 license for your own code), and the databases with a third license.
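
One way to keep such a setup explicit is to record which license applies to which files. The sketch below is an illustration under assumed conventions: the `third_party/` directory, the file suffixes, and the function `license_for` are hypothetical, and CC-BY-4.0 merely stands in for the unspecified third license for databases.

```python
from pathlib import PurePosixPath

CODE_SUFFIXES = {".py", ".r", ".do"}  # hypothetical list of code file types
DATA_SUFFIXES = {".csv", ".dta"}      # hypothetical list of data file types


def license_for(path):
    """Return the license identifier that covers a file, or None."""
    p = PurePosixPath(path)
    # Third-party software keeps the license it came with.
    if p.parts and p.parts[0] == "third_party":
        return "MIT"
    suffix = p.suffix.lower()
    if suffix in CODE_SUFFIXES:
        return "CC0-1.0"    # own code
    if suffix in DATA_SUFFIXES:
        return "CC-BY-4.0"  # assumption: placeholder for the data license
    return None             # not covered; decide explicitly


for name in ["third_party/lib/util.py", "code/analysis.R", "data/survey.csv"]:
    print(name, "->", license_for(name))
```

Files for which the function returns `None` are a useful prompt: every file in a deposit should fall under some stated license.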

AEA LICENSE-template. It combines

Default Licenses

Most trusted repositories have a default license.

[Figure: ICPSR license choice]

Restricted Licenses

Naturally, if the data have ethical constraints, redistribution is generally not permitted.

  • If additional conditions are imposed, one often talks about a Data Use Agreement.
  • Data Use Agreements are used not only when data are sensitive, but also when data owners want to track data usage more closely.

wv-value

Conclusion

  • Licensing is an important choice
  • There are many licenses
    • but there are a few safe default choices for most applications
  • It is possible to preserve data while maintaining privacy of respondents through more restrictive licenses and data use agreements
    • but data should always be as open as possible!

Thank you

License

Source