Cluster analysis of a price index series based on the hierarchical division algorithm
CHU Hongyang, CHAI Yueting, LIU Yi
National Engineering Laboratory for E-Commerce Technology, Department of Automation, Tsinghua University, Beijing 100084, China
Abstract At present, e-commerce transactions are not included in the consumer price index (CPI) published by the National Bureau of Statistics of China. With the rapid growth of e-commerce, developing an online CPI has become an urgent problem. Online transaction data can be accessed in real time and corresponds to actual transactions, so an online CPI should be more timely and more accurate than the traditional CPI. However, because different enterprises use different classification standards, calculating a classification price index first requires classifying the elementary price indexes. This paper describes a hierarchical division algorithm for cluster analysis of price index series. The algorithm measures the distances between price index series with a correlation-coefficient-based distance and the Manhattan distance, and then divides the series in two steps. Ending conditions stop the divisions, so the number of clusters need not be preset. Finally, the method is applied to practical cases, in which 219 of 226 price index series are effectively divided, indicating a good clustering result.
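The abstract names two distance measures and a two-step division with ending conditions but gives no formulas, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes the correlation-based distance is 1 minus the Pearson coefficient, that a cluster is split by seeding on its two most distant members, that the ending condition is a cluster diameter at or below a non-negative threshold, and that the correlation step runs before the Manhattan step. All function names, thresholds, and sample series are hypothetical.

```python
from itertools import combinations
from math import sqrt


def manhattan_distance(x, y):
    """Sum of absolute differences between two equal-length series."""
    return sum(abs(a - b) for a, b in zip(x, y))


def correlation_distance(x, y):
    """Assumed form: 1 - Pearson r, so values near 0 mean strongly correlated."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 1.0  # constant series: treated as uncorrelated (assumption)
    return 1.0 - cov / (sx * sy)


def divide(series, names, dist, threshold):
    """Bisect `names` recursively until each cluster's diameter <= threshold."""
    if len(names) < 2:
        return [names]
    # The most distant pair gives the cluster diameter and seeds the split.
    a, b = max(combinations(names, 2),
               key=lambda p: dist(series[p[0]], series[p[1]]))
    if dist(series[a], series[b]) <= threshold:
        return [names]  # assumed ending condition: cluster is tight enough
    left = [n for n in names
            if dist(series[n], series[a]) <= dist(series[n], series[b])]
    right = [n for n in names if n not in left]
    if not left or not right:
        return [names]  # guard against a degenerate split from rounding
    return (divide(series, left, dist, threshold)
            + divide(series, right, dist, threshold))


def two_step_clusters(series, corr_threshold, manhattan_threshold):
    """Step 1: split by correlation distance. Step 2: split by Manhattan."""
    clusters = []
    for group in divide(series, sorted(series), correlation_distance,
                        corr_threshold):
        clusters.extend(divide(series, group, manhattan_distance,
                               manhattan_threshold))
    return clusters


# Hypothetical price index series, for illustration only.
example = {
    "food_a": [100, 101, 102, 103],
    "food_b": [100.5, 101.5, 102.5, 103.5],
    "food_c": [110, 111, 112, 113],
    "fuel_a": [100, 99, 98, 97],
}
result = two_step_clusters(example, corr_threshold=0.5, manhattan_threshold=5)
```

In this toy input the correlation step separates the falling series from the three rising ones, and the Manhattan step then separates `food_c`, which moves in the same direction but sits at a different level. Because the recursion stops on a threshold, no cluster count is passed in.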
Keywords
price index series
divisive hierarchical clustering method
correlation coefficient based distance
Manhattan distance
Issue Date: 15 November 2015