Machine learning algorithm for a homomorphic encrypted data set

JIA Chunfu, WANG Yafei, CHEN Yang, SUN Mengjie, GE Fengyi

Journal of Tsinghua University (Science and Technology), 2020, Vol. 60, Issue (6): 456-463. DOI: 10.16511/j.cnki.qhdxxb.2020.25.011
Special topic: Trusted computing and information security

Abstract

The big-data era requires that data be stored and processed in the cloud, which exposes sensitive data to privacy leakage. This paper proposes a scheme for applying machine learning classification algorithms to homomorphically encrypted data sets. First, the plaintext is preprocessed so that it meets the requirements of homomorphic encryption. Next, operations such as comparison and sorting are carried out on the encrypted data set through protocols. Finally, the classification results are obtained. Because the client uploads only encrypted data, the server does not obtain any sensitive information; because the chosen encryption algorithm is homomorphic, the server can still perform the required operations on the ciphertext. Experimental results show that the scheme is applicable to Bayes, hyperplane, and decision tree classifiers and that, after correction, it remains practical and achieves high accuracy.
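The pipeline summarized in the abstract (preprocess, encrypt, compute on ciphertext, read off the class) can be illustrated with a minimal sketch. It is not the authors' protocol: it assumes an additively homomorphic scheme in the style of Paillier with toy-sized primes, a hypothetical fixed-point `encode` step standing in for the preprocessing, and a hyperplane score whose sign the client checks after decryption, whereas the paper performs comparisons on encrypted data through protocols. All function names, weights, and feature values are illustrative assumptions.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key pair. These primes are far too small for real use."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; valid because g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m (negatives are represented modulo n)."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m % n, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Decrypt and map the upper half of [0, n) back to negative integers."""
    lam, mu = priv
    m = ((pow(c, lam, n * n) - 1) // n) * mu % n
    return m - n if m > n // 2 else m


def encode(x, scale=100):
    """Assumed preprocessing: fixed-point encode a real value as an integer."""
    return round(x * scale)


def encrypted_score(n, enc_features, int_weights):
    """Server side: build Enc(sum(w_i * x_i)) from ciphertexts only."""
    n2 = n * n
    acc = encrypt(n, 0)
    for c, w in zip(enc_features, int_weights):
        # Raising a ciphertext to w multiplies the hidden plaintext by w;
        # multiplying ciphertexts adds the hidden plaintexts.
        acc = (acc * pow(c, w % n, n2)) % n2
    return acc


# Client: preprocess and encrypt the features, then upload the ciphertexts.
n, priv = keygen()
features = [1.25, -0.40, 3.10]  # illustrative sample
weights = [2, -3, 1]            # illustrative integer hyperplane weights
enc = [encrypt(n, encode(x)) for x in features]

# Server: computes on ciphertext without seeing the features.
enc_result = encrypted_score(n, enc, weights)

# Client: decrypts the score; its sign gives the predicted class.
score = decrypt(n, priv, enc_result) / 100
label = 1 if score >= 0 else -1
```

In this sketch the server only ever handles ciphertexts, which mirrors the abstract's claim that the server learns nothing sensitive while still being able to operate on the data. The modular inverse via `pow(lam, -1, n)` needs Python 3.8 or later.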

Key words

privacy-preserving / homomorphic encryption / classification algorithm / operations on encrypted data

Cite this article

JIA Chunfu, WANG Yafei, CHEN Yang, SUN Mengjie, GE Fengyi. Machine learning algorithm for a homomorphic encrypted data set[J]. Journal of Tsinghua University (Science and Technology). 2020, 60(6): 456-463 https://doi.org/10.16511/j.cnki.qhdxxb.2020.25.011


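Funding

National Natural Science Foundation of China (61972215, 61702399, 61972073); National Key Research and Development Program of China (2018YFA0704703); Natural Science Foundation of Tianjin (17JCZDJC30500)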
