Computational Tools for Social Scientists Workshop

Lars Vilhuber and some others

Location: Ives 111 | Time: 9:00 a.m. - 4:00 p.m. (we will typically end earlier)


The goal of this workshop is to showcase computer-oriented techniques and tools for social science students, from basic command-line tools on Linux and Mac, to version control, to optimization and parallelization techniques for high-performance computing, with training in reproducible methods thrown in for good measure. The goal is NOT to teach a full course on SAS, Stata, Matlab, R, Python, MPI, Fortran, etc. - there are other classes for that. We will teach just enough of each programming language to highlight additional techniques. There will be hands-on training on a few systems, such as CodeOcean and the new Econ cluster (for economics students). This workshop is designed to open your eyes to the possibilities: it scratches the surface but mostly does not dive into any particular depth. Follow-on short courses may address those needs. For specific programming languages, we point to offerings elsewhere on campus, for instance at CISER.

We highlight that this is a workshop - we will work on problems as a group, drawing on expertise in the room as needed. If you have a specific question and want to work on it, we may do so. If you primarily want to listen, that's fine too.

Target group

Second-year Ph.D. students and higher, and faculty, in Economics or other social sciences. If you haven't taken the course in the past, or want a refresher, you should participate.


Requirements

  • Working knowledge of at least one statistical programming language (R, SAS, Stata, Matlab, Gauss) - the specific language is not important.
  • Bring your laptop to class!


Tentative Agenda - Day 1

  • 12:00-13:00 Lunch break. Take the opportunity to set yourself up on GitHub.

  • 13:00-16:00 CISER/R-Squared replication training (slides and materials)

Tentative Agenda - Day 2

  • 9:00-11:00 How to use ECCO2/BioHPC (Economics) (Jarek Pillardy)

  • 11:00-12:00 Data: the iceberg in science. Citing data, managing data, curating data.

  • 12:00-13:00 Lunchtime chat on Reproducibility in Economics with the AEA Data Editor (Lars)

  • 13:00-14:00 Continuation of data curation
  • 14:00-16:00 Introduction to parallel processing (Lars)
    • Subroutines and scalable programming (Lars) (slides)
    • Putting it into practice: Trying out parallel processing

Tentative Agenda - Day 3 (optional)

  • 9:00-12:00 Optional themes:
    • New programming languages: Julia and parallel processing
    • Automating processing in the cloud (Docker, etc.)
    • Setting up an Amazon cluster (basics of cluster computing)
    • Others.

Past contributors

John Abowd, Rick Mansfield, Daniel Lin, Hautahi Kingi, Flavio Stanchi, Jean-Francois Houde, Sylverie Herbert, Sida Peng, Kevin L. McKinney