Upcoming Talk at UCSD Health
Heng Xu and Nan Zhang, co-directors of the Robust Analytics Lab, will give a (virtual) presentation at the University of California San Diego Health Department of Biomedical Informatics (DBMI) lecture series on Friday, November 6, 2020.
Time: 1:00pm – 2:00pm Pacific Standard Time (4:00pm – 5:00pm Eastern Standard Time), Friday, November 6, 2020
Title: Implications of Data Anonymization on Disparity Detection
Abstract: Research and practical development of data anonymization techniques have proliferated in recent years. Yet limited attention has been paid to the potentially disparate impact of privacy protection on underprivileged sub-populations. The objective of this talk is to examine the extent to which data anonymization could mask the gross statistical disparities between sub-populations in the data. We first describe two common mechanisms of data anonymization and two prevalent types of statistical evidence for disparity. Then, we develop a conceptual foundation and mathematical formalism demonstrating that the two data anonymization mechanisms have distinctive impacts on the identifiability of disparity, and that these impacts also vary with how disparity is statistically operationalized. After validating our findings with empirical evidence, we discuss the practical implications, highlighting the need for data scientists and policy makers to balance the protection of privacy against the recognition and rectification of disparate impact.
This paper was also presented at the Harvard Privacy Tools Working Group Seminar Series and the 2020 Privacy Law Scholars Conference.