Combined load balancing and energy efficiency in Hadoop
TIAN Wenhong1,2, LI Guozhong1, CHEN Yu1, HUANG Chaojie1, YANG Wutong1
1. School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China;
2. Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, China
Abstract: Hadoop clusters are widely used in enterprises and research institutions, but Hadoop itself offers few tools for dynamic load balancing or for improving energy efficiency. A dynamic load balancing method based on negative feedback was developed as part of a dynamic management system for Hadoop clusters and tested with classic Hadoop benchmark examples. On average, compared with other algorithms, the method reduces the total idle time of the Hadoop nodes by 25% and energy consumption by 14%; it achieves this by improving load balance, reducing load variation by 10%.
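The abstract does not specify the feedback rule, so the following is only a minimal sketch of the general idea of negative feedback in load balancing, not the authors' algorithm. The function name `rebalance_weights`, the `gain` parameter, and the representation of load as a utilization between 0 and 1 are all assumptions made for illustration.

```python
from statistics import mean


def rebalance_weights(weights, loads, gain=0.5):
    """One hypothetical negative-feedback step (illustrative only).

    weights: current fraction of new tasks sent to each node (sums to 1).
    loads:   measured utilization of each node, assumed to lie in [0, 1].
    gain:    assumed feedback gain; larger values react more aggressively.
    """
    avg = mean(loads)
    adjusted = []
    for weight, load in zip(weights, loads):
        error = load - avg  # positive when the node is above the cluster mean
        # The correction opposes the error: overloaded nodes get a smaller share.
        adjusted.append(max(weight - gain * error, 0.0))
    total = sum(adjusted)
    if total == 0:
        # Degenerate case: fall back to an even split.
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in adjusted]


if __name__ == "__main__":
    shares = rebalance_weights([1 / 3, 1 / 3, 1 / 3], [0.9, 0.5, 0.1])
    print(shares)
```

In this sketch the busiest node receives the smallest share of new tasks after one step, and a cluster whose nodes are equally loaded keeps its current shares unchanged.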