Journal of Tsinghua University (Science and Technology), 2022, Vol. 62, Issue 12: 1851-1863. DOI: 10.16511/j.cnki.qhdxxb.2022.21.024
Information Science
As-Stream: An intelligent operator parallelization strategy for fluctuating data streams
LI Wei1,2, LI Chenglong3, YANG Jiahai3
1. Information Technology Center, Tsinghua University, Beijing 100084, China;
2. China University of Geosciences, Beijing 100083, China;
3. Institute for Network Science and Cyberspace, Tsinghua University, Beijing 100084, China
Abstract: Many studies optimize stream computing for fluctuating data streams through online resource management, but neglect operator parallelism at the streaming application level. In Apache Storm, for example, operator parallelism cannot be adjusted dynamically once it is set. This paper presents As-Stream, an intelligent operator parallelization strategy for fluctuating data streams that significantly improves the performance of stream computing platforms. The method tunes parameters in real time within an elastic intelligent monitoring module, based on unsupervised learning and self-adaptive analysis. As-Stream comprises four algorithms: parallel bottleneck identification, parameter plan generation, parameter migration conversion, and parameter migration scheduling. The system was implemented on Apache Storm and tested extensively in a real distributed stream computing environment. The results show that As-Stream significantly outperforms the existing default scheduling strategies: with sufficient resources, the average throughput increases 2.4-fold; with limited resources, the average latency decreases by 44%.
Key words: stream computing; machine learning; operator parallelism; resource allocation
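The abstract names parallel bottleneck identification as the first step of As-Stream but does not give the algorithm on this page. As a rough illustration of the general idea only, the Python sketch below flags operators whose offered load approaches their processing capacity and suggests a larger instance count. It is not the authors' method: As-Stream is described as using unsupervised learning and self-adaptive analysis, whereas this sketch uses a fixed threshold rule. The class `OperatorStats`, its fields, the function names, and the threshold values 0.8 and 0.7 are all assumptions introduced for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class OperatorStats:
    """Measurements for one operator over a monitoring window (hypothetical fields)."""
    name: str
    parallelism: int      # current number of parallel instances
    arrival_rate: float   # tuples per second arriving at the operator
    service_rate: float   # tuples per second one instance can process


def utilization(op: OperatorStats) -> float:
    """Offered load divided by the total capacity of all instances."""
    return op.arrival_rate / (op.parallelism * op.service_rate)


def find_bottlenecks(ops, threshold=0.8):
    """Return operators whose utilization exceeds the (assumed) threshold."""
    return [op for op in ops if utilization(op) > threshold]


def suggest_parallelism(op: OperatorStats, target=0.7) -> int:
    """Smallest instance count that keeps utilization at or below the target."""
    return max(1, math.ceil(op.arrival_rate / (target * op.service_rate)))


if __name__ == "__main__":
    # Made-up measurements for a two-operator topology.
    ops = [
        OperatorStats("split", parallelism=2, arrival_rate=9000.0, service_rate=5000.0),
        OperatorStats("count", parallelism=4, arrival_rate=9000.0, service_rate=4000.0),
    ]
    for op in find_bottlenecks(ops):
        print(op.name, op.parallelism, "->", suggest_parallelism(op))
```

With these made-up numbers, only `split` exceeds the threshold (utilization 0.9), and the rule proposes raising its parallelism from 2 to 3. A fixed threshold reacts poorly to fluctuating input rates, which is the situation the paper targets with learned, adaptive tuning.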
Received: 2021-12-30      Published: 2022-11-10
Corresponding author: LI Chenglong, Associate Research Fellow, E-mail: lichenglong@tsinghua.edu.cn
Cite this article:
LI Wei, LI Chenglong, YANG Jiahai. As-Stream: An intelligent operator parallelization strategy for fluctuating data streams[J]. Journal of Tsinghua University (Science and Technology), 2022, 62(12): 1851-1863.
Link to this article:
http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2022.21.024 or http://jst.tsinghuajournals.com/CN/Y2022/V62/I12/1851
Copyright © Editorial Office of Journal of Tsinghua University (Science and Technology)