Alternate sources of data
We sometimes encounter replication packages that reference Dataverse or Zenodo. Download those as they are found on those sites. Extract them as you would the openICPSR downloads, but please pay attention to the directory naming structure:
| Source | DOI or project | Directory name ( | Pipeline Variable |
|---|---|---|---|
| openICPSR | https://doi.org/10.3886/E123456V1 | | |
| openICPSR | openicpsr-123456 | | |
| Dataverse | | | |
| Dataverse | | | |
| Zenodo | | | |
| World Bank | | | |
If a Pipeline Variable is given, use that in the Bitbucket Pipeline when creating or updating the repository.
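The mapping from an openICPSR reference to a directory name can be sketched in a few lines. This is only an illustration: it assumes the directory name for openICPSR is the numeric project ID (as the `123456.zip` example in this section suggests), and the function name `openicpsr_dirname` is hypothetical, not part of the tools. It does not cover Dataverse, Zenodo, or World Bank references.

```python
import re


def openicpsr_dirname(reference: str) -> str:
    """Return the numeric openICPSR project ID from a DOI or project name.

    Assumption: the directory name is the bare project number, e.g. 123456.
    """
    # DOI form, e.g. https://doi.org/10.3886/E123456V1
    doi_match = re.search(r"10\.3886/E(\d+)(?:V\d+)?", reference, flags=re.IGNORECASE)
    if doi_match:
        return doi_match.group(1)
    # Project-name form, e.g. openicpsr-123456
    name_match = re.fullmatch(r"openicpsr-(\d+)", reference.strip(), flags=re.IGNORECASE)
    if name_match:
        return name_match.group(1)
    raise ValueError(f"Not a recognized openICPSR reference: {reference!r}")
```

Both forms shown in the table, the DOI and the `openicpsr-123456` project name, resolve to the same project number under this assumption.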
Usage
Whether you download the files manually or use the scripts, once the ZIP file is downloaded, extract it with
unzip -n NAME_OF_ZIP.zip -d DIRNAME
where DIRNAME is defined as in the table above, and NAME_OF_ZIP.zip is whatever you downloaded from the repository. With the [openICPSR download script](get-the-data), the file downloaded from openICPSR is typically called DIRNAME.zip, e.g., 123456.zip. It will be called something else in most other cases.
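The `-n` flag tells `unzip` never to overwrite files that already exist, and `-d` sets the target directory. Where `unzip` is not available, roughly the same behavior can be sketched with Python's standard `zipfile` module. The function name `extract_no_overwrite` is hypothetical and this is not one of the provided tools; treat it as a minimal sketch rather than a replacement for the command above.

```python
import zipfile
from pathlib import Path


def extract_no_overwrite(zip_path: str, dirname: str) -> list[str]:
    """Extract zip_path into dirname, skipping files that already exist.

    Returns the names of the files that were actually written.
    """
    target = Path(dirname)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            destination = target / member.filename
            if not member.is_dir() and destination.exists():
                continue  # mirror `unzip -n`: never overwrite existing files
            archive.extract(member, path=target)
            if not member.is_dir():
                extracted.append(member.filename)
    return extracted
```

A call such as `extract_no_overwrite("123456.zip", "123456")` would correspond to `unzip -n 123456.zip -d 123456`. Running it a second time leaves any files you have since edited untouched.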
Utility scripts
We have a few scripts, some of which have not yet been integrated into the pipeline. All can be run from the command line instead of manually downloading the repository. All are in the tools directory of your issue-specific repository; if they are missing, refresh the tools.
download_openicpsr-private.py: For downloading from private openICPSR deposits (also integrated into the pipeline)
download_dv.py: For downloading from Dataverse repositories.
download_osf.sh: For downloading from OSF repositories.
download_worldbank.py: For downloading from World Bank Reproducibility Repository, including the public reproducibility report!
download_zenodo_public.sh: For downloading from public Zenodo repositories
download_zenodo_draft.py: For downloading from private/draft Zenodo deposits (also integrated into the pipeline)
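To see quickly whether the tools need refreshing, one could check that each script listed above is present. The sketch below assumes it is run from the root of the issue-specific repository, so that `tools` is a relative path; the function name `missing_scripts` is hypothetical and not part of the tools themselves.

```python
from pathlib import Path

# Script names as listed in this section.
EXPECTED_SCRIPTS = [
    "download_openicpsr-private.py",
    "download_dv.py",
    "download_osf.sh",
    "download_worldbank.py",
    "download_zenodo_public.sh",
    "download_zenodo_draft.py",
]


def missing_scripts(tools_dir: str = "tools") -> list[str]:
    """Return the expected script names that are not files in tools_dir."""
    tools = Path(tools_dir)
    return [name for name in EXPECTED_SCRIPTS if not (tools / name).is_file()]


if __name__ == "__main__":
    missing = missing_scripts()
    if missing:
        print("Missing scripts, consider refreshing the tools:")
        for name in missing:
            print(f"  {name}")
    else:
        print("All utility scripts are present.")
```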