Alternate sources of data

We sometimes encounter replication packages that reference Dataverse, Zenodo, or similar repositories. Download the files directly from those sites and extract them as you would an openICPSR download, but pay attention to the directory naming convention:

| Source | DOI or project | Directory name (DIRNAME) | Pipeline Variable |
|---|---|---|---|
| openICPSR | https://doi.org/10.3886/E123456V1 | 123456 | 123456 |
| openICPSR | openicpsr-123456 | 123456 | 123456 |
| Dataverse | https://doi.org/10.7910/DVN/RE5ZVI | dv-10.7910-DVN-RE5ZVI | |
| Dataverse | https://doi.org/10.3456/ABCDE | dv-10.3456-ABCDE | |
| Zenodo | https://doi.org/10.5281/zenodo.7041706 | zenodo-7041706 | 7041706 |
| World Bank | https://doi.org/10.60572/101y-vn15 | wb-101y-vn15 | 101y-vn15 or entire DOI |

If a Pipeline Variable is given, use that in the Bitbucket Pipeline when creating or updating the repository.
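
The naming convention in the table can be sketched as a small shell function. This is only an illustration: the function name `doi_to_dirname` and its source keywords (`openicpsr`, `dataverse`, `zenodo`, `worldbank`) are hypothetical and not part of the tools directory, and the rules are inferred solely from the example rows in the table.

```shell
# Hypothetical helper (not one of the tools scripts): derive DIRNAME from a
# source keyword and a DOI, following the example rows in the table.
doi_to_dirname() {
  local source="$1"
  local doi="${2#https://doi.org/}"   # drop the resolver prefix, if present
  local id
  case "$source" in
    openicpsr)
      id="${doi#10.3886/E}"           # e.g. 123456V1
      echo "${id%%V*}"                # keep only the project number
      ;;
    dataverse)
      # Prefix with dv- and replace every slash in the DOI with a dash.
      echo "dv-$(printf '%s' "$doi" | tr '/' '-')"
      ;;
    zenodo)
      echo "zenodo-${doi##*zenodo.}"  # keep only the record number
      ;;
    worldbank)
      echo "wb-${doi#*/}"             # keep only the DOI suffix
      ;;
    *)
      echo "unknown source: $source" >&2
      return 1
      ;;
  esac
}

doi_to_dirname dataverse https://doi.org/10.7910/DVN/RE5ZVI    # prints dv-10.7910-DVN-RE5ZVI
doi_to_dirname zenodo https://doi.org/10.5281/zenodo.7041706   # prints zenodo-7041706
```

The source is passed explicitly because a Dataverse DOI cannot always be recognized from its prefix, as the second Dataverse row of the table shows.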

Usage

Whether you download the files manually or use the scripts, once you have the ZIP file, extract it with:

    unzip -n NAME_OF_ZIP.zip -d DIRNAME

where DIRNAME is defined as in the table above, and NAME_OF_ZIP.zip is whatever you downloaded from the repository. When you use the [openICPSR download script](get-the-data), the file downloaded from openICPSR is typically called DIRNAME.zip, e.g., 123456.zip. In most other cases, it will be called something else.
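
As a minimal sketch of the non-openICPSR case, the `unzip` call can be wrapped so that a missing ZIP file is caught before extraction. The wrapper name `extract_package` and the file name `replication_files.zip` are hypothetical; only the `unzip -n ... -d ...` call itself comes from the instructions above.

```shell
# Hypothetical wrapper around the unzip call above; not one of the tools scripts.
extract_package() {
  local zipfile="$1"
  local dirname="$2"
  if [ ! -f "$zipfile" ]; then
    echo "ZIP file not found: $zipfile" >&2
    return 1
  fi
  # -n: never overwrite files that already exist in the target directory
  # -d: extract into the given directory (DIRNAME) instead of the current one
  unzip -n "$zipfile" -d "$dirname"
}

# Example call (assumed file name): a Zenodo download rarely matches DIRNAME.
# extract_package replication_files.zip zenodo-7041706
```

Because of `-n`, running the extraction a second time leaves files that are already present in DIRNAME untouched.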

Utility scripts

We have a few scripts, some of which have not yet been integrated into the pipeline. All of them can be run from the command line instead of downloading the files manually. They are located in the tools directory of your issue-specific repository. If they are missing, refresh the tools.