Journal of Tsinghua University (Science and Technology)  2017, Vol. 57 Issue (3): 244-249    DOI: 10.16511/j.cnki.qhdxxb.2017.26.004
Computer Science and Technology
Task scheduling on a many-core processor for high-volume throughput applications
XU Yuanchao1,2, YANG Lu1
1. College of Information Engineering, Capital Normal University, Beijing 100048, China;
2. State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
Abstract: Big data applications with high-throughput characteristics have become the mainstream workload in datacenters. These applications run inefficiently on traditional processors, and one cause is inefficient task scheduling. Targeting the typical characteristics of high-throughput applications and the shortcomings of existing work-stealing algorithms, this paper presents a task scheduling mechanism that is aware of both program behavior and the running environment. Through hardware and software co-design, the mechanism partitions the processor cores and schedules tasks hierarchically, which reduces the performance penalty caused by contention for shared resources among different applications. It also exploits the high similarity among threads to improve the instruction cache hit rate, thereby raising overall system throughput. Preliminary simulation results show that the algorithm clearly outperforms the existing work-stealing scheduling algorithm under mixed workloads, by 20% on average across the mixed workloads tested.
Key words: many-core processor    big data applications    high-volume throughput    task scheduling
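The abstract names two ideas, core partitioning and hierarchical (two-level) task dispatch, without giving the algorithm itself. The following is only a minimal toy sketch of how those two ideas could fit together, not the authors' implementation. Every name in it (`PartitionedScheduler`, `submit`, the `kind` label) is hypothetical, the hardware support described in the paper is not modeled, and the `kind` label is an assumed stand-in for "similar threads" that would benefit from sharing an instruction cache.

```python
from collections import defaultdict


class PartitionedScheduler:
    """Toy two-level dispatcher (illustrative only, all names hypothetical).

    Level 1 maps each application to a partition of cores.
    Level 2 maps tasks of the same kind to the same core in that partition.
    """

    def __init__(self, num_cores, num_partitions):
        if num_cores % num_partitions != 0:
            raise ValueError("cores must divide evenly into partitions")
        size = num_cores // num_partitions
        # Each partition owns a contiguous block of core ids.
        self.partitions = [
            list(range(p * size, (p + 1) * size)) for p in range(num_partitions)
        ]
        self.app_to_partition = {}       # application name -> partition index
        self.kind_to_core = {}           # (app, task kind) -> core id
        self.queues = defaultdict(list)  # core id -> queued task labels

    def _partition_for(self, app):
        # Level 1: assign an application to a partition the first time it
        # is seen (round-robin), so different applications use different
        # cores as long as there are enough partitions.
        if app not in self.app_to_partition:
            index = len(self.app_to_partition) % len(self.partitions)
            self.app_to_partition[app] = index
        return self.app_to_partition[app]

    def submit(self, app, kind, task_id):
        """Queue a task and return the id of the core it was sent to."""
        cores = self.partitions[self._partition_for(app)]
        key = (app, kind)
        if key not in self.kind_to_core:
            # Level 2: a new task kind goes to the least-loaded core of the
            # partition; later tasks of the same kind follow it there.
            self.kind_to_core[key] = min(
                cores, key=lambda c: len(self.queues[c])
            )
        core = self.kind_to_core[key]
        self.queues[core].append(f"{app}:{kind}:{task_id}")
        return core


sched = PartitionedScheduler(num_cores=8, num_partitions=2)
print([sched.submit("search", "parse", i) for i in range(3)])  # [0, 0, 0]
print(sched.submit("search", "rank", 0))                       # 1
print(sched.submit("kvstore", "get", 0))                       # 4
```

In this sketch the two applications never share a core, and all `parse` tasks land on one core. With more applications than partitions, the round-robin step would make applications share a partition, which a real scheduler would need to handle more carefully.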
Received: 2016-10-26      Published: 2017-03-15
CLC number:  TP316
Cite this article:
XU Yuanchao, YANG Lu. Task scheduling on a many-core processor for high-volume throughput applications. Journal of Tsinghua University (Science and Technology), 2017, 57(3): 244-249.
Link to this article:
http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2017.26.004  or  http://jst.tsinghuajournals.com/CN/Y2017/V57/I3/244
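  Fig. 1 Program-aware and environment-aware task scheduling mechanism
  Fig. 2 High-throughput many-core architecture with a ring topology
  Fig. 3 Offline analysis flow for program segments
  Fig. 4 Heterogeneous many-core design
  Fig. 5 Two-level task dispatch
  Table 1 Test programs
  Fig. 6 Performance comparison under a single workload
  Fig. 7 Performance comparison under mixed workloads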