Please wait a minute...
 首页  期刊介绍 期刊订阅 联系我们
 
最新录用  |  预出版  |  当期目录  |  过刊浏览  |  阅读排行  |  下载排行  |  引用排行  |  百年期刊
Journal of Tsinghua University(Science and Technology)    2018, Vol. 58 Issue (3) : 231-236     DOI: 10.16511/j.cnki.qhdxxb.2018.26.015
COMPUTER SCIENCE AND TECHNOLOGY |
Approximation algorithms for aggregate queries on uncertain data
CHEN Donghui1, CHEN Ling1, WANG Junkai1, WU Yong2, WANG Jingchang2
1. College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China;
2. Zhejiang Hongcheng Computer Systems Co., Ltd., Hangzhou 310009, China
Download: PDF(1167 KB)  
Export: BibTeX | EndNote | Reference Manager | ProCite | RefWorks    
Abstract  Analyses of big data sets often require aggregate queries on uncertain data with various types of data that are computationally complex. In this paper, the results of aggregate queries on uncertain data are defined to include all possible values and their corresponding probabilities. Dynamic programming is then used to solve the Distribution Sum (DSUM) algorithm using a Greedy-based Distribution Sum and a Binary Merge based Distribution Sum approximation algorithms, which both can be applied to tuple-level and attribute-level uncertainty models. The time and space complexities of the algorithms are determined theoretically as well as the error range of the results. Tests demonstrates that these two approximation algorithms with a 1% allowable error shorten the execution times by 15%-21% and 22%-32%, respectively.
Keywords aggregate queries      approximation algorithms      uncertain data      dynamic programming      error estimation     
ZTFLH:  TP301.6  
Issue Date: 15 March 2018
Service
E-mail this article
E-mail Alert
RSS
Articles by authors
CHEN Donghui
CHEN Ling
WANG Junkai
WU Yong
WANG Jingchang
Cite this article:   
CHEN Donghui,CHEN Ling,WANG Junkai, et al. Approximation algorithms for aggregate queries on uncertain data[J]. Journal of Tsinghua University(Science and Technology), 2018, 58(3): 231-236.
URL:  
http://jst.tsinghuajournals.com/EN/10.16511/j.cnki.qhdxxb.2018.26.015     OR     http://jst.tsinghuajournals.com/EN/Y2018/V58/I3/231
  
  
  
  
  
  
  
  
[1] DESHPANDE A, GUESTRIN C, MADDEN S. Using probabilistic models for data management in acquisitional environments[C]//Proceedings of the International Conference on Innovation Database Research. Asilomar, USA:Online Proceeding, 2005:317-328.
[2] PENG L P, DIAO Y L. Supporting data uncertainty in array databases[C]//Proceedings of 2015 International Conference on Management of Data. Melbourne, Australia:ACM Press, 2015:545-560.
[3] RÉ C, SUCIU D. Approximate lineage for probabilistic databases[J]. Proceedings of the VLDB Endowment, 2008, 1(1):797-808.
[4] SINGH S, MAYFIELD C, MITTAL S, et al. Orion 2.0:Native support for uncertain data[C]//Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data. Vancouver, Canada:ACM Press, 2008:1239-1242.
[5] HUANG J W, ANTOVA L, KOCH C, et al. MayBMS:A probabilistic database management system[C]//Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data. Providence, USA:ACM Press, 2009:1071-1074.
[6] WANG Y J, LI X Y, LI X L, et al. A survey of queries over uncertain data[J]. Knowledge and Information Systems, 2013, 37(3):485-530.
[7] ZHOU X, LI K L, ZHOU Y T, et al. Adaptive processing for distributed skyline queries over uncertain data[J]. IEEE Transactions on Knowledge and Data Engineering, 2016, 28(2):371-384.
[8] CICERI E, FRATERNALI P, MARTINENGHI D, et al. Crowdsourcing for top-k query processing over uncertain data[J]. IEEE Transactions on Knowledge and Data Engineering, 2016, 28(1):41-53.
[9] CHEN L, GAO Y J, LI X H, et al. Indexing metric uncertain data for range queries[C]//Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. Melbourne, Australia:ACM Press, 2015:951-965.
[10] DALLACHIESA M, PALPANAS T, ILYAS I F. Top-k nearest neighbor search in uncertain data series[J]. Proceedings of the VLDB Endowment, 2014, 8(1):13-24.
[11] JAYRAM T S, KALE S, VEE E. Efficient aggregation algorithms for probabilistic data[C]//Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms. New Orleans:Society for Industrial and Applied Mathematics, USA:2007:346-355.
[12] MURTHY R, IKEDA R, WIDOM J. Making aggregation work in uncertain and probabilistic databases[J]. IEEE Transactions on Knowledge and Data Engineering, 2011, 23(8):1261-1273.
[13] DALVI N, SUCIU D. Management of probabilistic data:Foundations and challenges[C]//Proceedings of the 26th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems. Beijing, China:ACM Press, 2007:1-12.
[14] AGARWAL P K, KUMAR N, SINTOS S, et al. Range-max queries on uncertain data[C]//Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems. San Francisco, USA:ACM Press, 2016:465-476.
[15] NATH S, GIBBONS P B, SESHAN S, et al. Synopsis diffusion for robust aggregation in sensor networks[J]. ACM Transactions on Sensor Networks, 2008, 4(2):250-262.
[16] ZENG K, GAO S, MOZAFARI B, et al. The analytical bootstrap:A new method for fast error estimation in approximate query processing[C]//Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data. Snowbird, USA:ACM Press, 2014:277-288.
[17] HUA M, PEI J, ZHANG W J, et al. Ranking queries on uncertain data:A probabilistic threshold approach[C]//Proceedings of the 27th International Conference on Management of Data. Vancouver, Canada:ACM Press, 2008:673-686.
[1] PEI Yukui, HAO Haoran, SU Li. IQ-separated baseband structure for high-speedsatellite digital communication[J]. Journal of Tsinghua University(Science and Technology), 2018, 58(9): 827-832.
[2] HAN Donghong, SONG Ming, ZHANG Hongliang, WANG Jiaxi, WANG Jiaxing, WANG Guoren. Algorithm for clustering uncertain data streams based on density[J]. Journal of Tsinghua University(Science and Technology), 2017, 57(8): 884-891.
[3] QIN Zhaobo, LUO Yugong, ZHANG Donghao, CHEN Long, LI Keqiang. Powertrain design of hydraulic hybrid track-type bull dozers based on planetary gear sets[J]. Journal of Tsinghua University(Science and Technology), 2017, 57(5): 497-503.
[4] GUO Mengwu, ZHONG Hongzhi. Equivalence of two strictly bounding approaches for goal-oriented error estimation[J]. Journal of Tsinghua University(Science and Technology), 2017, 57(4): 362-368.
[5] ZHANG Canrong, ZHONG Ming, MIAO Lixin. Location assignments for outbound containers in container terminals[J]. Journal of Tsinghua University(Science and Technology), 2015, 55(10): 1150-1156.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
Copyright © Journal of Tsinghua University(Science and Technology), All Rights Reserved.
Powered by Beijing Magtech Co. Ltd