Presentation on “Privacy Disparity” @ Data & Society
Heng Xu and Nan Zhang visited the Data & Society Research Institute in New York City and discussed how various privacy-preserving techniques may disproportionately suppress information for minorities or mask health disparities.
The results of this work were recently published in Medical Care. Our research demonstrates the extent to which one-size-fits-all privacy solutions do not serve all communities equally: to reliably identify health disparities, researchers inevitably need access to more private information from minorities. We call this “privacy disparity,” which may arise from mandating the inclusion of identifier attributes such as race and geolocation in a released healthcare dataset.
Specifically, we applied state-of-the-practice anonymization tools and state-of-the-art differential privacy techniques to real-world public health datasets. Our results show that, without careful calibration for minority populations, these tools can disproportionately suppress information about minorities and mask health disparities that are otherwise discoverable from the original data.
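To make the mechanism concrete, here is a minimal sketch of how applying uniform differential-privacy noise to subgroup counts can hide a disparity in a small group. The counts, populations, epsilon value, and Laplace-mechanism release below are illustrative assumptions, not the datasets or tools evaluated in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical (cases, population) per group; the minority group has
# triple the disease rate, i.e. a disparity worth detecting.
# These numbers are made up for illustration only.
groups = {
    "majority": (900, 90_000),  # 1.0% rate
    "minority": (6, 200),       # 3.0% rate
}

epsilon = 0.1                  # privacy budget spent on each released count
scale = 1.0 / epsilon          # Laplace scale for a count query (sensitivity 1)
noise_sd = scale * np.sqrt(2)  # standard deviation of Laplace(0, scale)

# Identical noise hits both groups, but the *relative* error explodes
# for the small group.
for name, (cases, population) in groups.items():
    print(f"{name}: true rate {cases / population:.1%}, "
          f"noise sd / true count = {noise_sd / cases:.0%}")

# Fraction of noisy releases in which the disparity is still visible.
trials = 100_000
noisy_major_rate = (900 + rng.laplace(0, scale, trials)) / 90_000
noisy_minor_rate = (6 + rng.laplace(0, scale, trials)) / 200
visible = np.mean(noisy_minor_rate > noisy_major_rate)
print(f"disparity visible in {visible:.1%} of noisy releases")
```

Under these assumed numbers, both groups receive exactly the same epsilon, yet the minority group's released count carries relative noise of well over 100%, and the true disparity vanishes in roughly a third of the noisy releases.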
Policy Implications: Our findings indicate that regulators must carefully examine the privacy implications before mandating the inclusion of attributes such as race and geolocation in a released dataset. For example, the Affordable Care Act requires all federally supported public health programs to collect data on race, ethnicity, sex, geographic location, etc., to the extent feasible. Although doing so can help identify health disparities, releasing the collected data without proper sanitization may introduce privacy disparities and cause further harm to underserved populations.