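Summary: In mission-critical cloud computing services, building an accurate data center power topology is important for handling failures quickly and precisely and for reducing the damage that failure events cause to cloud service quality. However, generating data center power topology is currently labor intensive, and its accuracy is difficult to evaluate and guarantee. This paper designs an intelligent data center power topology system (IPTS) based on unsupervised learning. IPTS not only automatically generates the real-time, changing power topology for the operating part of the power system, but also uses the power system's monitoring data to verify manually constructed data center power topology. Experimental results show that IPTS automatically generates accurate data center power topology, with a consistency ratio (CR) reaching 0.978, and effectively locates most errors in manually constructed power topology.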
Abstract: [Objective] In mission-critical cloud computing services, the stability of large-scale data centers (DCs) is a key metric that must be guaranteed. However, because commercial power supplies are uncertain and power equipment operation processes are complex, DC failure events are inevitable and impactful, affecting the related servers and network devices. To limit the damage to service quality, an accurate DC power topology is needed for fast and precise failure handling and root-cause localization. Nevertheless, the current process of generating DC power topology is labor intensive, and its correctness cannot be efficiently evaluated or guaranteed. [Methods] To solve these issues, instead of relying on the error-prone power topology provided by operators, this paper designs an intelligent DC power topology system (IPTS). IPTS is based on an unsupervised learning framework that automatically generates the power topology, which may change over time, for the working part of a power system, and that uses the power system monitoring data to verify manually constructed DC power topology. The intuition behind IPTS is that two physically connected pieces of power equipment should show not only a similar trend but also a close magnitude in specific monitoring data, e.g., current and active power, because the power loads produced by their downstream servers are close. By defining a structure abstraction of the DC power system according to domain knowledge of DC power system architectures, the DC power system can be divided into several hierarchical functional blocks. Then, two unsupervised structure learning algorithms, namely, the one-to-one (O2O) and one-to-multiple (O2M) structure learning algorithms, are separately developed to automatically recover the O2O and O2M connection types between all pieces of power equipment in a divide-and-conquer manner.
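The intuition of similar trend and close magnitude can be illustrated with a minimal Python sketch of one-to-one matching. It is not the IPTS algorithm: the `distance` function (one minus Pearson correlation plus a relative magnitude gap), the brute-force search over pairings, and the equipment names and readings are all assumptions chosen for illustration.

```python
from itertools import permutations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def distance(x, y):
    """Hypothetical dissimilarity: a trend term plus a relative magnitude gap."""
    trend = 1.0 - pearson(x, y)
    scale = max(mean(abs(v) for v in x), mean(abs(v) for v in y), 1e-9)
    magnitude = mean(abs(a - b) for a, b in zip(x, y)) / scale
    return trend + magnitude


def match_one_to_one(upstream, downstream):
    """Return the upstream -> downstream pairing with the smallest total distance.

    Brute force over permutations; only practical for a handful of devices.
    """
    up_names = list(upstream)
    down_names = list(downstream)
    best_cost, best_pairs = float("inf"), {}
    for perm in permutations(down_names, len(up_names)):
        cost = sum(distance(upstream[u], downstream[d])
                   for u, d in zip(up_names, perm))
        if cost < best_cost:
            best_cost, best_pairs = cost, dict(zip(up_names, perm))
    return best_pairs


# Toy current readings in amperes; names and values are made up.
upstream = {"UPS-A": [10.0, 12.0, 11.0, 15.0], "UPS-B": [30.0, 28.0, 33.0, 31.0]}
downstream = {"PDU-1": [29.5, 27.8, 32.6, 30.9], "PDU-2": [9.8, 11.9, 10.7, 14.8]}
pairs = match_one_to_one(upstream, downstream)
print(pairs)  # {'UPS-A': 'PDU-2', 'UPS-B': 'PDU-1'}
```

Within one functional block the number of candidate pairings grows factorially, so at realistic scale a polynomial-time assignment solver such as the Hungarian algorithm would replace the exhaustive search used here.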
Moreover, no methods or metrics are currently available to verify enterprise DC power topology other than manual checking, which is highly complex because of the multiple data sources and numerous connections involved. To better indicate the consistency of the connections between any two pieces of power equipment, this paper further designs an evaluation metric called the consistency ratio (CR). The CR derives from a systematic evaluation process that automatically and iteratively compares the original enterprise DC power topology information with the learning-based topology information produced by IPTS. [Results] Experimental results on two large-scale DCs show that IPTS automatically generates accurate DC power topology, with a 10% improvement on average over existing state-of-the-art methods, and effectively reveals most errors (including errors in the local system for operations) in manually constructed DC power topology with 0.990 precision. After corrections are made according to the verification results, the CR values between the learned structure and the modified DC power topology improve to 0.978 on average, which is 18% to 113% higher than that of the original topology. Additionally, this paper comprehensively investigates the inconsistent cases that occurred while generating and verifying power topology. [Conclusion] IPTS is the first system that uses data analytics for DC power topology generation and verification. It has been successfully deployed in 19 enterprise DCs and applied in real large-scale industrial practice.
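To make the comparison behind the CR concrete, the following Python sketch computes one plausible form of such a ratio. The abstract does not give the exact definition, so this is an assumption: each connection is treated as an (upstream, downstream) pair, and the ratio is the share of recorded connections that the learned topology also contains. The function names and the sample connections are hypothetical.

```python
def consistency_ratio(recorded, learned):
    """Share of recorded connections that also appear in the learned topology.

    Assumption: the ratio is taken over the recorded (manually built) connections;
    the paper's exact definition may differ.
    """
    recorded, learned = set(recorded), set(learned)
    if not recorded:
        return 0.0
    return len(recorded & learned) / len(recorded)


def inconsistent_connections(recorded, learned):
    """Recorded connections the learned topology disagrees with, for manual review."""
    return sorted(set(recorded) - set(learned))


# Made-up topologies: the recorded one has two connections swapped.
recorded = [("UPS-A", "PDU-1"), ("UPS-B", "PDU-2"),
            ("UPS-C", "PDU-3"), ("UPS-D", "PDU-4")]
learned = [("UPS-A", "PDU-1"), ("UPS-B", "PDU-3"),
           ("UPS-C", "PDU-2"), ("UPS-D", "PDU-4")]

cr = consistency_ratio(recorded, learned)
suspects = inconsistent_connections(recorded, learned)
print(cr)        # 0.5
print(suspects)  # [('UPS-B', 'PDU-2'), ('UPS-C', 'PDU-3')]
```

In an iterative workflow of the kind the abstract describes, the suspect connections would be checked on site, the recorded topology corrected, and the ratio recomputed.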