About the Conference on Reproducibility and Replicability in Economics and the Social Sciences

The Conference on Reproducibility and Replicability in Economics and the Social Sciences (CRRESS) is a series of virtual and in-person panels on reproducibility, replicability, and transparency in the social sciences. The purpose of scientific publishing is the dissemination of robust research findings, exposing them to the scrutiny of peers and other interested parties. Scientific articles should accurately and completely provide information on the origin and provenance of data and on the analytical and computational methods used. Yet in recent years, doubts have been voiced about the adequacy of the information provided in scientific articles and their addenda. The sessions address the full arc of research: its initiation, its conduct, its preparation for publication, and its scrutiny after publication. Undergraduates, graduate students, and career researchers will be able to learn about best practices for transparent, reproducible, and scientifically sound research in the social sciences.

Session 1: Institutional support: Should journals verify reproducibility?

2022-09-27 @ 4:15 PM Eastern
Moderator: Lars Vilhuber (Cornell University, AEA Data Editor)
Different journals have different approaches to enforcing their data availability policies, ranging from a thorough and complete verification, including running code and checking the output, to a cursory review of the files provided to make sure they appear satisfactory, to simply receiving the data and code package and archiving it on a website or in a repository. What drives the choice of approach? What are the reasons behind such choices?

Session 2: Reproducibility and ethics - IRBs and beyond

2022-10-25 @ 4:15 PM Eastern
Moderator: Lars Vilhuber (Cornell University, AEA Data Editor)
One of the most crucial dimensions that Institutional Review Boards consider is the set of protocols that researchers have in place to protect their subjects' privacy. This often leads researchers to state in their IRB protocols that they will destroy their data once the project is complete. However, destruction of data makes it impossible to verify and replicate work, which is increasingly becoming a vital part of modern science. How should data privacy be handled in the wake of the replication crisis? What protocols and standards should be put in place to minimize the risk of data leakage? Or should data be destroyed after some time span?

Session 3: Should teaching reproducibility be a part of undergraduate education or curriculum? (at SEA)

2022-11-20 **exceptionally** @ 1:15 PM Eastern
  • Diego Mendez-Carbajo (St. Louis Fed)
  • Richard Ball (Project TIER)
  • Lars Vilhuber (Cornell University, AEA Data Editor)
Moderator: Ian Schmutte (UGA)
Panelists will discuss teaching reproducibility (the TIER Protocol), the involvement of undergraduates in replications based on restricted-access data, and other topics. The session is in person at the Southern Economic Association.

Session 4: Reproducibility and confidential or proprietary data: can it be done?

2022-12-13 **exceptionally** @ 12:15 PM Eastern
Moderator: Aleksandr Michuda (Cornell University, Data Science)
What happens to reproducibility when data are confidential or proprietary? Many journals can only ask that detailed access procedures be provided in a README file, but what mechanisms could be used to conduct computational reproducibility checks on such data? Should authors temporarily share their data with the journal for the purposes of reproducibility verification, even if the data are not part of the public replication package? Is it feasible to use a network of "insiders" to run code provided as part of a replication package to assess reproducibility? Could a "certified run" be used?

Session 5: Disciplinary support: why is reproducibility not uniformly required across disciplines?

2023-01-31 @ 4:15 PM Eastern
Moderator: Lars Vilhuber (Cornell University, AEA Data Editor)
Why do learned societies decide (or not) to implement data (and code) availability policies? What influences the level of enforcement, and the choice of "enforcer" (data editor, administrative staff, referees)? What are reasons NOT to require data sharing or code sharing?

Session 6: Institutional support: How do journal reproducibility verification services work?

2023-02-28 @ 4:15 PM Eastern
Moderator: Marie Connolly (UQAM)
When journals conduct active verification of replication packages, including accessing data and running code, how does that work? Can journals with limited resources still assess reproducibility? What depth of verification is optimal? Do journals provide a clear indication of whether an article was successfully reproduced?

Session 7: Why can or should research institutions publish replication packages?

2023-03-28 @ 4:15 PM Eastern
Moderator: Aleksandr Michuda (Cornell University, Data Science)
This session brings together various perspectives on how research institutions approach publishing replication packages themselves, rather than relying on a journal or a generalist repository. Panelists come from a university with a specialized, university-centred data repository; from a Federal Reserve Bank with an active researcher community; and from a non-profit (non-academic) research institution. Each must balance the requirements of varied internal researchers, external visibility, and differing audiences. The panelists can all speak to how a research institution makes decisions about the degree of transparency, and how much of that work to undertake with internal resources.

Session 8: Should funders require reproducible archives?

2023-04-25 @ 4:15 PM Eastern
  • Martin Halbert (NSF)
  • Sebastian Martinez (3ie)
  • Stuart Buck (Good Science Project)
Moderator: Lars Vilhuber (Cornell University, AEA Data Editor)
Both private and government funders of academic research increasingly require that data collected or created as part of funded research be made openly available. However, it is still rare that this requirement extends to computational artifacts, such as code, and rarer still that computational reproducibility is required. The panelists all work for funders, and have experience with various funding models and approaches, but only one of them currently enforces computational reproducibility of funded research.

Session 9: Reproducibility, confidentiality, and open data mandates (at CEA)

2023-05-30 **exceptionally** @ 10:30 AM Central
  • Kimberly McGrail (UBC)
  • S. Martin Taylor (McMaster)
  • Matthew Lucas (SSHRC)
Moderator: Marie Connolly (UQAM)
Many granting agencies have adopted open data mandates. What is the interplay between reproducibility and those mandates? How can researchers be supported in meeting those mandates, both in general and specifically when data are confidential? At first glance, confidentiality and open data seem irreconcilable, but could we find practices that both respect confidentiality and provide enough information and transparency to foster reproducibility? *NOTE: Session is part of the annual CEA conference, online day. A (free) recording will be made available after the conference.*

Session 10: The integration of reproducibility into social science graduate education

2023-06-27 @ 4:15 PM Eastern
Moderator: Marie Connolly (UQAM)
Incorporating reproducibility into the graduate curriculum is very appealing, since training early-career scholars speeds the adoption of new research practices. However, adding new material to the graduate curriculum comes at a cost: time spent learning the principles and practice of reproducibility could be spent on other activities. With this in mind, should reproducibility be part of the core curriculum for doctoral students, taught as part of specialized methods courses, or simply be available for those who seek it out? What are best practices for teaching reproducibility to graduate students? If reproducibility training were included in the graduate curriculum, where might it fit and what might it replace?


Each session is a 60-minute online videoconference, tentatively scheduled for Tuesdays at 4:15 PM Eastern. Each panelist speaks for 10-15 minutes, followed by 15 minutes of discussion and audience questions, selected by the moderator. Certain sessions are in person at the indicated conferences, in which case the availability of an online version depends on the conference facilities. Each session is recorded, and a link to the recording is made available shortly afterwards on this page. Panelists are also asked to write a 5-7 page version of their statement (see template), which will be collected as an online "book" on this site. A selection of panelist statements is then peer-reviewed and published in a recurring column in the Harvard Data Science Review.


For more information, please contact us.
CRRESS is managed by co-PIs Lars Vilhuber and Aleksandr Michuda (Cornell University).
The organizing committee is composed of Vilhuber, Michuda, Ian Schmutte (UGA), and Marie Connolly (UQAM).
Support is provided by Sara Brooks (Cornell University) as well as the staff at the Cornell University ILR School.