Journal of Tsinghua University (Science and Technology)  2016, Vol. 56 Issue (11): 1226-1231    DOI: 10.16511/j.cnki.qhdxxb.2016.26.016
Combined load balancing and energy efficiency in Hadoop
TIAN Wenhong (田文洪)1,2, LI Guozhong (李国忠)1, CHEN Yu (陈瑜)1, HUANG Chaojie (黄超杰)1, YANG Wutong (杨吴同)1
1. School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China;
2. Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, China
Abstract: Hadoop clusters are widely used in enterprises and research institutions for big data processing and parallel computing. However, node management in Hadoop clusters lacks scheduling techniques that combine dynamic load balancing with energy saving. This paper proposes a dynamic negative-feedback adjustment algorithm and designs and implements a system for dynamic management of Hadoop nodes. Tests with a large number of classic Hadoop benchmarks show that the algorithm improves load balancing and saves energy by reducing the time nodes spend idle. Compared with runs that do not use the algorithm, the average time nodes spend in idle sleep increases by 25% and energy consumption falls by 14%. Compared with other algorithms, the balance among nodes also improves to some degree, with the average load variance reduced by 10%.
Key words: distributed computing    Hadoop    scheduling algorithm    dynamic load balancing    energy-efficient scheduling
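The abstract reports load variance across nodes as its balance metric and describes the method as negative feedback. The following is a minimal illustrative sketch of those two ideas only, not the authors' algorithm: the function names, the `gain` parameter, and the representation of load as a fraction between 0 and 1 are all assumptions introduced here.

```python
from statistics import mean, pvariance


def load_variance(loads):
    """Population variance of per-node load; lower means better balanced.

    `loads` is a hypothetical list of per-node load fractions (0.0 to 1.0).
    """
    return pvariance(loads)


def feedback_step(loads, gain=0.5):
    """One negative-feedback adjustment (illustrative assumption).

    Each node's load is corrected by `gain` times its deviation from the
    cluster mean, so overloaded nodes shed work and underloaded nodes
    receive it. Total load is conserved because deviations sum to zero.
    """
    avg = mean(loads)
    return [load - gain * (load - avg) for load in loads]


if __name__ == "__main__":
    loads = [0.9, 0.5, 0.1]           # hypothetical three-node cluster
    balanced = feedback_step(loads)   # one correction step
    print(load_variance(loads), load_variance(balanced))
```

With a gain between 0 and 1, each step shrinks every deviation by the same factor, so the variance falls by the square of (1 - gain) per step while the total work stays constant.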
Received: 2016-06-29      Published: 2016-11-26
Chinese Library Classification (CLC) number:  TP393  
TIAN Wenhong, LI Guozhong, CHEN Yu, HUANG Chaojie, YANG Wutong. Combined load balancing and energy efficiency in Hadoop [J]. Journal of Tsinghua University (Science and Technology), 2016, 56(11): 1226-1231.
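  Fig. 1 Hadoop runtime architecture
  Fig. 2 Load variance comparison for WordCount
  Fig. 3 Load variance comparison for TeraSort
  Fig. 4 Node idle time comparison for WordCount
  Fig. 5 Node idle time comparison for TeraSort
  Fig. 6 System energy consumption comparison for WordCount
  Fig. 7 System energy consumption comparison for TeraSort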
