RAIS data

RAIS (Relação Anual de Informações Sociais) is the Brazilian linked employer-employee dataset. We are approved to use a copy hosted at Cornell. To be added to the project, contact Lars.

Accessing the Data

The data are hosted in a controlled environment and cannot be removed from it. To obtain access,

On the server

The project directory is /home/ecco_rais, with the data files in /home/ecco_rais/data. Be aware that the data files can be very large. Do not run computations on the (shared) /home drive.

  • You can use VNC on the SLURM head node (cbsueccosl01). You should see the reservation on the BioHPC Reservation Calendar.
  • Use the SLURM facility to submit jobs. Reserve a compute node only in exceptional circumstances: a reservation prevents numerous other people from accessing the resources you are not using on that node (typically, more than 80% of CPUs go unused if you are the only one on a 24- to 104-core node).
  • Always copy (large) files to /workdir in the first step of your processing.
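
The copy-first step in the last bullet could be sketched in Python as follows. This is a minimal illustration, not a project-provided tool: the function name stage_to_workdir, the per-user "rais" subfolder under /workdir, and the same-size check for skipping a repeat copy are all assumptions made for the example.

```python
import os
import shutil
from pathlib import Path


def stage_to_workdir(src, workdir_root="/workdir", user=None):
    """Copy src into a per-user folder under workdir_root; return the new path.

    The folder layout (<workdir_root>/<user>/rais) is a hypothetical
    convention chosen for this sketch, not a documented requirement.
    """
    src = Path(src)
    user = user or os.environ.get("USER", "unknown")
    dest_dir = Path(workdir_root) / user / "rais"
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.name
    # Skip the copy when a file of the same size is already staged.
    # This is a cheap heuristic, not a checksum comparison.
    if not dest.exists() or dest.stat().st_size != src.stat().st_size:
        shutil.copy2(src, dest)
    return dest
```

A job would call stage_to_workdir once at the start, on a file under /home/ecco_rais/data, and read only from the returned path afterwards, so that no heavy I/O hits the shared /home drive.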

Data description

See RAIS codebooks.