4 Policy and Official Statistics
In January 2017, the Committee on National Statistics released a special panel report on innovations in the U.S. statistical system, focused in part on preserving privacy (National Academies of Sciences, Engineering, and Medicine, 2017). The report is essential reading for understanding the governing principles and practical needs of the statistical system, particularly as they relate to privacy modernization. For a more applied perspective, Schmutte & Vilhuber (2017) report the proceedings of an ad hoc workshop on practical privacy held at the Census Bureau. That workshop brought together academic privacy researchers and Census domain experts to help design formal privacy systems for key data products. In such meetings, it is necessary to make sure that participants are speaking the same language. To that end, Prewitt (2011) describes the specific meanings of the terms “privacy” and “confidentiality” as they have historically been used at the Census Bureau.
Manski (2015) offers a framework for thinking about total error in official statistics, that is, the various ways measured quantities may differ from the concepts of interest, including measurement error, design error, and sampling error. From this perspective, privacy protection is yet another source of error in any statistical system. Maintaining public trust is a key reason statistical agencies are interested in privacy protection: the less people trust the system, the less likely they are to respond accurately, or at all. Childs, King, & Fobia (2015) discuss recent evidence on trust in official statistics and its implications for data collection. Finally, Haney et al. (2017) and Holan, Toth, Ferreira, & Karr (2010) are good examples of the sorts of implementation details one may encounter when applying statistical privacy protections to public data.
References
Childs, J. H., King, R., & Fobia, A. (2015). Confidence in U.S. federal statistical agencies. Survey Practice, 8(5). https://doi.org/10.29115/sp-2015-0024
Haney, S., Machanavajjhala, A., Abowd, J. M., Graham, M., Kutzbach, M., & Vilhuber, L. (2017). Utility cost of formal privacy for releasing national employer-employee statistics. In SIGMOD ’17. Proceedings of the 2017 International Conference on Management of Data. https://doi.org/10.1145/3035918.3035940
Holan, S. H., Toth, D., Ferreira, M. A. R., & Karr, A. F. (2010). Bayesian multiscale multiple imputation with implications for data confidentiality. Journal of the American Statistical Association, 105(490), 564–577. https://doi.org/10.1198/jasa.2009.ap08629
Manski, C. F. (2015). Communicating uncertainty in official economic statistics: An appraisal fifty years after Morgenstern. Journal of Economic Literature, 53(3), 631–653. https://doi.org/10.1257/jel.53.3.631
National Academies of Sciences, Engineering, and Medicine. (2017). Innovations in federal statistics: Combining data sources while protecting privacy. https://doi.org/10.17226/24652
Prewitt, K. (2011). Why it matters to distinguish between privacy & confidentiality. Journal of Privacy and Confidentiality, 3(2), 41–47. https://doi.org/10.29012/jpc.v3i2.600
Schmutte, I. M., & Vilhuber, L. (Eds.). (2017). Proceedings from the 2016 NSF-Sloan Workshop on Practical Privacy. Retrieved from https://digitalcommons.ilr.cornell.edu/ldi/33/