Journal of Tsinghua University (Science and Technology)  2019, Vol. 59 Issue (1): 23-27    DOI: 10.16511/j.cnki.qhdxxb.2018.22.058
高志强, 崔翛龙, 杜波, 周沙, 袁琛, 李爱
武警工程大学 乌鲁木齐校区, 乌鲁木齐 830049
Collection scheme of location data based on local differential privacy
GAO Zhiqiang, CUI Xiaolong, DU Bo, ZHOU Sha, YUAN Chen, LI Ai
Urumqi Campus, Engineering University of PAP, Urumqi 830049, China
Abstract: Collecting location data raises privacy concerns for the people whose locations are recorded. This paper presents a location data collection scheme based on local differential privacy. A multi-phase randomized response mechanism collects the location data so that the collection satisfies local differential privacy. With regional density estimation as the goal, the collected data are then analyzed with two methods: direct statistics and expectation maximization (EM). The scheme ensures that an untrusted data collector, working only with non-original location data, can still perform location data analysis based on statistical features. Extensive simulations show that with small location data samples, EM gives better utility and privacy protection than the direct statistical method, while with large samples the two methods perform similarly.
Key words: statistical learning; local differential privacy; location privacy; data collection; randomized response
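The two estimators named in the abstract can be illustrated on a simpler mechanism. The sketch below is not the paper's multi-phase scheme, whose details (including the first-phase parameter f) are not given on this page; it assumes a single-stage k-ary randomized response over `k` regions with privacy budget `epsilon`. All function names and parameters (`grr_probabilities`, `perturb`, `estimate_direct`, `estimate_em`, `iterations`) are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def grr_probabilities(epsilon, k):
    """Return (p, q): chance of reporting the true region / any one other region."""
    e = math.exp(epsilon)
    return e / (e + k - 1), 1.0 / (e + k - 1)


def perturb(region, epsilon, k, rng):
    """Client side (assumed mechanism): keep the true region with probability p,
    otherwise report one of the other k - 1 regions uniformly at random."""
    p, _ = grr_probabilities(epsilon, k)
    if rng.random() < p:
        return region
    other = rng.randrange(k - 1)          # index among the k - 1 other regions
    return other if other < region else other + 1


def estimate_direct(reports, epsilon, k):
    """Direct statistics: invert E[freq_v] = q + theta_v * (p - q).
    Unbiased, but individual estimates can fall below zero on small samples."""
    p, q = grr_probabilities(epsilon, k)
    n = len(reports)
    counts = Counter(reports)
    return [(counts[v] / n - q) / (p - q) for v in range(k)]


def estimate_em(reports, epsilon, k, iterations=200):
    """Expectation maximization for the region densities theta.
    Estimates stay non-negative and sum to one."""
    p, q = grr_probabilities(epsilon, k)
    n = len(reports)
    counts = Counter(reports)
    theta = [1.0 / k] * k                 # start from the uniform density
    for _ in range(iterations):
        new_theta = [0.0] * k
        for y in range(k):
            if counts[y] == 0:
                continue
            # Probability of observing report y under the current theta.
            denom = sum(theta[x] * (p if x == y else q) for x in range(k))
            for x in range(k):
                # E-step: posterior that the true region was x given report y.
                posterior = theta[x] * (p if x == y else q) / denom
                new_theta[x] += counts[y] * posterior
        theta = [t / n for t in new_theta]  # M-step: renormalize
    return theta
```

A caller would apply `perturb` to each user's region on the client, send only the perturbed report to the collector, and run either estimator on the collected reports. In this sketch the EM estimate is constrained to be a valid probability distribution while the direct estimate is not, which is one plausible reason the two can differ on small samples and agree on large ones.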
Received: 2018-10-15      Published: 2019-01-16
Corresponding author: CUI Xiaolong, Professor
高志强, 崔翛龙, 杜波, 周沙, 袁琛, 李爱. 满足本地差分隐私的位置数据采集方案[J]. 清华大学学报(自然科学版), 2019, 59(1): 23-27.
GAO Zhiqiang, CUI Xiaolong, DU Bo, ZHOU Sha, YUAN Chen, LI Ai. Collection scheme of location data based on local differential privacy. Journal of Tsinghua University(Science and Technology), 2019, 59(1): 23-27.
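  Fig. 1 Effect of the number of information collection points on the algorithm's error rate
  Fig. 2 Effect of data volume on the error rate
  Fig. 3 Effect of f in the first-phase randomized response on the error rate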
