Statistically Valid Inferences from Privacy Protected Data
For academic researchers to access data collected by private industry, balancing opposing interests is insufficient; we must instead use technology to solve political problems and make balancing unnecessary. We describe how this was accomplished with Facebook data by adapting innovations in constitutional design. To make data available more generally, we then propose a general-purpose data access and analysis system with mathematical guarantees of privacy for individuals who may be represented in the data and statistical guarantees for researchers analyzing it. We build on the standard of “differential privacy” but, unlike most such approaches, we also correct for the serious statistical biases induced by privacy-preserving procedures, provide a proper accounting for statistical uncertainty, and impose minimal constraints on the choice of data analytic methods and types of quantities that can be estimated. Just as data providers need differential privacy to protect individuals, they need inferential validity to protect individuals and society from fallacious scientific conclusions. We emphasize throughout ease of implementation and use, and high levels of computational efficiency. We also offer open source software that demonstrates how to implement all our methods. Based on joint work with Georgina Evans, Margaret Schwenzfeier, and Abhradeep Thakurta.
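To make the abstract's point about bias concrete, here is a minimal sketch of one standard differentially private release, the Laplace mechanism applied to a clipped mean. It is not the authors' system or software. The function name `dp_clipped_mean`, the clipping bounds, the value of `epsilon`, and the simulated data are all assumptions chosen for illustration. The sketch separates the two distortions a naive analyst would inherit: a systematic bias from clipping and extra variance from the added noise.

```python
import numpy as np

def dp_clipped_mean(x, lower, upper, epsilon, rng):
    """Release a mean with epsilon-differential privacy (Laplace mechanism).

    Assumes the number of records n is public and that neighboring datasets
    differ by replacing one record. Clipping to [lower, upper] bounds how far
    one record can move the mean: the sensitivity is (upper - lower) / n.
    """
    x = np.asarray(x, dtype=float)
    clipped = np.clip(x, lower, upper)
    sensitivity = (upper - lower) / x.size
    scale = sensitivity / epsilon            # Laplace scale parameter b
    noise = rng.laplace(loc=0.0, scale=scale)
    return clipped.mean() + noise, scale

# Hypothetical skewed data; the long right tail is what clipping distorts.
rng = np.random.default_rng(0)
data = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)

true_mean = data.mean()
clipped_mean = np.clip(data, 0.0, 3.0).mean()
private_mean, scale = dp_clipped_mean(data, 0.0, 3.0, epsilon=1.0, rng=rng)

# Bias from clipping: systematic, and it does not shrink as n grows.
clipping_bias = clipped_mean - true_mean
# Standard deviation of the added noise: a Laplace(b) draw has sd sqrt(2) * b.
noise_sd = np.sqrt(2.0) * scale

print(f"true mean      : {true_mean:.4f}")
print(f"clipped mean   : {clipped_mean:.4f}")
print(f"private release: {private_mean:.4f}")
print(f"clipping bias  : {clipping_bias:.4f}")
print(f"noise sd       : {noise_sd:.6f}")
```

In this sketch the noise is small relative to the clipping bias, so treating the private release as an ordinary sample mean would yield a confident but wrong estimate. That is the kind of inferential problem the abstract says must be corrected and accounted for.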
Host: Center for Data and Computing
Gary King is the Albert J. Weatherhead III University Professor at Harvard University (one of 25 faculty who hold Harvard's most distinguished faculty title) and Director of the Institute for Quantitative Social Science. King develops and applies empirical methods in many areas of social science, focusing on innovations that span the range from statistical theory to practical application.
King is an elected Fellow in 8 honorary societies (National Academy of Sciences, American Statistical Association, American Association for the Advancement of Science, American Academy of Arts and Sciences, Society for Political Methodology, National Academy of Social Insurance, American Academy of Political and Social Science, and the Guggenheim Foundation) and has won more than 55 prizes and awards for his work. King was elected President of the Society for Political Methodology and Vice President of the American Political Science Association. He has been a member of the Senior Editorial Board at Science, Visiting Fellow at Oxford, and Senior Science Adviser to the World Health Organization. He has written more than 170 journal articles, 20 open source software packages, and 8 books.
King originally proposed the now widely accepted standard for fairness in legislative redistricting known as “partisan symmetry,” and the methods used by courts and parties to detect when partisan gerrymandering violates it. His “ecological inference” methods for inferring individual behavior from aggregate data are used in most jurisdictions applying the Voting Rights Act to detect racial gerrymandering. His book with Keohane and Verba, Designing Social Inquiry, helped launch the modern subfield of qualitative methods in political science; his book Unifying Political Methodology had a similar role for quantitative political methodology. His “Replication, Replication” article initiated the data sharing movement in political science, and his ongoing international “Dataverse” project supports the movement across fields. His “anchoring vignettes” approach to cross-cultural survey comparability has been used in more than 100 countries by researchers, governments, and others. King pioneers “politically robust” research designs that make possible unusually large randomized experiments in politically difficult circumstances — including the largest ever randomized health policy experiment, to evaluate the Mexican universal health insurance program, and the only large scale randomized news media experiment. He has reverse engineered Chinese censorship and fabrication of social media posts, improved Social Security Trust Fund forecasts, and developed empirical methods and software widely used in academia, government, and private industry for automated text analysis, rare events, missing data, measurement error, causal inference, interpreting statistical results, and for forecasting elections, mortality rates, and international conflict.
King's work is widely read across scholarly fields and beyond academia. He was listed as the most cited political scientist of his cohort; among the group of “political scientists who have made the most important theoretical contributions” to the discipline “from its beginnings in the late-19th century to the present”; and on lists of the most highly cited researchers across the social sciences. King’s many former students and postdocs now hold positions at leading universities and companies around the world. He has collaborated with more than 250 scholars, including many of his students, on research for publication. He has served on more than 30 editorial, nonprofit, and corporate boards; as founding editor of The Political Methodologist; and on the governing councils of the American Political Science Association, Inter-university Consortium for Political and Social Research, Society for Political Methodology, Midwest Political Science Association, Center for the Advanced Study in the Behavioral Sciences, and the Institute for Data, Science, and Society.
King is also an experienced entrepreneur. He is co-founder and an inventor of the original technology for Crimson Hexagon (merged with Brandwatch), Learning Catalytics (acquired by Pearson), Perusall, Thresher, OpenScholar, and other firms. He has received 12 patents for these technologies.
King received a B.A. from SUNY New Paltz (1980) and a Ph.D. from the University of Wisconsin-Madison (1984). He taught at NYU for three years before coming to Harvard in 1987. His research has been supported by the National Science Foundation, Centers for Disease Control and Prevention, World Health Organization, Sloan Foundation, National Institute on Aging, Gates Foundation, Library of Congress, Global Forum for Health Research, and other centers, corporations, foundations, and federal agencies.