Journal of Tsinghua University (Science and Technology)  2020, Vol. 60, Issue 6: 456-463    DOI: 10.16511/j.cnki.qhdxxb.2020.25.011
  Special Topic: Trusted Computing and Information Security
Machine learning algorithm for a homomorphic encrypted data set
JIA Chunfu (贾春福)1,2, WANG Yafei (王雅飞)1,2, CHEN Yang (陈阳)1, SUN Mengjie (孙梦洁)1, GE Fengyi (葛凤仪)1
1. College of Cyberspace Security, Nankai University, Tianjin 300350, China;
2. Tianjin Key Laboratory of Network and Data Security Technology, Tianjin 300350, China
Abstract: In the big data era, data is increasingly stored and processed in the cloud, which exposes sensitive data to privacy leakage. This paper presents a scheme for applying machine learning classification algorithms to homomorphically encrypted data sets. First, the plaintext is preprocessed so that it meets the requirements of homomorphic encryption. Next, operations such as comparison and sorting are carried out on the encrypted data set through protocols. Finally, the classification results are obtained. Because the client uploads only encrypted data, the server cannot obtain any sensitive information; because the encryption algorithm is homomorphic, the server can still perform the required operations on the ciphertext. Experimental results show that the scheme is applicable to Bayes, hyperplane, and decision tree classifiers, which, after modification, show good applicability and high accuracy.
Key words: privacy-preserving; homomorphic encryption; classification algorithm; operations on encrypted data
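The client/server split described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not name the homomorphic scheme, so the textbook Paillier cryptosystem (additively homomorphic) is swapped in here; the fixed-point factor `SCALE` stands in for the preprocessing step; the helper names (`keygen`, `encrypt`, `decrypt`, `encode`, `encrypted_score`) are hypothetical; and the client simply decrypts the hyperplane score, whereas the paper performs comparison through a protocol. The primes are far too small for real security. Python 3.8+ is assumed for `pow(x, -1, n)`.

```python
import math
import random

# Toy key material: two known Mersenne primes. NOT secure, illustration only.
P, Q = 2**31 - 1, 2**61 - 1
SCALE = 1000  # assumed fixed-point factor used to turn reals into integers


def keygen(p=P, q=Q):
    """Return (public modulus n, private key (lam, mu)) for Paillier with g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse of lam mod n
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt a (possibly negative) integer m under public modulus n."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    # With g = n + 1, g^m mod n^2 equals 1 + m*n mod n^2.
    return ((1 + (m % n) * n) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Decrypt c and map the result back to a signed integer."""
    lam, mu = priv
    n2 = n * n
    m = ((pow(c, lam, n2) - 1) // n) * mu % n
    return m - n if m > n // 2 else m


def encode(x):
    """Assumed preprocessing: fixed-point encoding of a real value."""
    return round(x * SCALE)


def encrypted_score(n, enc_x, w_int, b_int):
    """Server side: compute Enc(w . x + b) without seeing x.

    Multiplying ciphertexts adds plaintexts; raising a ciphertext to an
    integer power multiplies its plaintext by that integer.
    """
    n2 = n * n
    acc = encrypt(n, b_int)
    for c, w in zip(enc_x, w_int):
        acc = (acc * pow(c, w % n, n2)) % n2
    return acc


if __name__ == "__main__":
    n, priv = keygen()                      # client generates keys
    x = [2.0, 0.4]                          # client's private features
    enc_x = [encrypt(n, encode(v)) for v in x]

    w, b = [0.5, -1.25], 0.1                # server's hyperplane (assumed plaintext)
    w_int = [encode(v) for v in w]
    b_int = round(b * SCALE * SCALE)        # bias carries both scale factors
    enc_s = encrypted_score(n, enc_x, w_int, b_int)

    score = decrypt(n, priv, enc_s) / (SCALE * SCALE)  # client decrypts
    print("class:", 1 if score >= 0 else 0, "score:", score)
```

Because both the features and the weights are scaled by `SCALE`, the decrypted score carries a factor of `SCALE * SCALE`, which is why the bias is scaled twice before it is added.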
Received: 2019-09-24      Published: 2020-04-27
JIA Chunfu, WANG Yafei, CHEN Yang, SUN Mengjie, GE Fengyi. Machine learning algorithm for a homomorphic encrypted data set [J]. Journal of Tsinghua University (Science and Technology), 2020, 60(6): 456-463.
