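As the petrochemical industry moves toward intelligent operation, accurate molecular reconstruction is a foundational method for understanding and optimizing the composition of complex petroleum products. The technique starts from the macroscopic properties of an oil and uses an optimization model to calculate its molecular or structural composition. The composition of natural oils is generally assumed to follow a gamma distribution along dimensions such as carbon number and structural features. Traditional molecular reconstruction methods use the shape and scale parameters of the gamma distribution as optimization variables; because both parameters jointly affect the form of the distribution, this coupling reduces the interpretability of the optimization and impairs its efficiency and accuracy. This paper proposes a molecular reconstruction method for petroleum products based on shape-decoupled gamma distribution parameters. The method uses the peak position and distribution width of the gamma distribution as optimization parameters, which makes changes in the distribution shape during optimization easier to interpret. In addition, a multiple linear regression model built on historical data predicts the initial values of the optimization parameters, which effectively improves the accuracy of molecular reconstruction. Experimental results show that the method outperforms traditional molecular reconstruction methods in both accuracy and optimization efficiency, and that it is more robust and stable when handling molecular compositions with extreme distributions.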
Objective: In the petrochemical industry, molecular reconstruction is crucial for understanding and optimizing the compositions of complex crude oils and petroleum products. As the first step of process simulation, quality control, and economic evaluation, molecular reconstruction uses mathematical models to calculate molecular compositions of petroleum products that are consistent with their macroscopic properties. Traditional molecular reconstruction methods use the gamma distribution to represent the carbon number distributions of homologs, but the coupling between its shape (α) and scale (β) parameters limits interpretability and optimization efficiency. This study addresses these challenges by introducing a shape-decoupled parameter method that improves the model's interpretability and simplifies the optimization process. Methods: The proposed shape-decoupled parameter method reparameterizes the traditional gamma distribution by replacing the shape and scale parameters with two new independent variables, peak position (m) and variance (σ²). Here, m directly controls the location of the distribution's peak, whereas σ² independently determines its spread or width, which reduces the parameter coupling present in conventional gamma distribution models. To improve stability and convergence speed during optimization, a multivariate linear regression (MLR) model was employed to estimate the initial parameter values. This regression model was trained on historical molecular composition data to provide reasonable initial values and reduce the probability of being trapped in local minima.
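The reparameterization described above can be illustrated with a minimal sketch. The abstract does not give the mapping between (m, σ²) and (α, β), so the version below is an assumption based on the standard gamma distribution, whose mode is (α − 1)β for α ≥ 1 and whose variance is αβ². Solving these two relations gives β = (−m + √(m² + 4σ²))/2 and α = 1 + m/β. The function names are hypothetical, and applying the density directly to integer carbon numbers without an offset is also an assumption.

```python
import math


def gamma_from_mode_variance(m, var):
    """Map peak position m (mode) and variance var to gamma (shape, scale).

    Assumes the standard relations mode = (shape - 1) * scale and
    variance = shape * scale**2, which lead to the quadratic
    scale**2 + m * scale - var = 0.
    """
    if m <= 0 or var <= 0:
        raise ValueError("m and var must both be positive")
    scale = (-m + math.sqrt(m * m + 4.0 * var)) / 2.0  # positive root
    shape = 1.0 + m / scale
    return shape, scale


def gamma_pdf(x, shape, scale):
    """Gamma density evaluated in log space for numerical stability."""
    if x <= 0:
        return 0.0
    log_pdf = ((shape - 1.0) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def carbon_number_fractions(m, var, carbon_numbers):
    """Normalized fractions of one homologous series over carbon numbers."""
    shape, scale = gamma_from_mode_variance(m, var)
    weights = [gamma_pdf(c, shape, scale) for c in carbon_numbers]
    total = sum(weights)
    return [w / total for w in weights]


# Worked example: a peak at carbon number 8 with variance 9.
shape, scale = gamma_from_mode_variance(8.0, 9.0)
fractions = carbon_number_fractions(8.0, 9.0, list(range(4, 16)))
```

For m = 8 and σ² = 9 the quadratic gives β = 1 and α = 9, so the mode (α − 1)β = 8 and the variance αβ² = 9 are recovered. Under this mapping, changing m moves the peak while σ² stays at the value supplied, which is the kind of independent control the method aims for.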
The molecule-type homologous series (MTHS) matrix was used to represent the molecular composition of hydrocarbons, namely paraffins, isoparaffins, olefins, naphthenes, and aromatics (PIONA), together with their multiple homologs. An optimization problem was then formulated to minimize the prediction errors of the macroscopic properties, including molecular weight, density, PIONA group composition, and true boiling point curves. After a comparative analysis of multiple deterministic and heuristic optimization techniques, the differential evolution (DE) algorithm was selected as the optimizer for its superior accuracy and robustness. Results: Experimental evaluations showed that the shape-decoupled parameter method outperformed traditional methods in accuracy and optimization efficiency. Specifically, the density error decreased from 0.012 to 0.0059 g/cm³, and the average percentage relative error for the PIONA group composition was also notably reduced. The decoupled approach also converged faster, with the required number of iterations falling from 1,000 to as few as 20 without compromising accuracy. This reduction highlights the computational efficiency of the proposed method, which is an advantage in industrial applications with limited computational resources and time. Moreover, the proposed method was more robust for extreme molecular composition distributions, maintaining low errors in peak position and molecular composition predictions. This robustness was particularly evident in scenarios that are challenging for conventional methods, such as distributions with narrow ranges or hydrocarbons whose components are approximately zero at the boundary. Furthermore, the decoupled method provides better interpretability through independent control of peak position and distribution width.
The overall optimization performance was enhanced by combining the DE algorithm with effective initial parameter estimation from the MLR model. Conclusions: Compared with traditional methods, the proposed shape-decoupled parameter method provides a more interpretable, efficient, and accurate approach to the molecular reconstruction of petroleum products. By reducing the coupling between the parameters that control peak position and distribution width, the method simplifies the optimization process and achieves higher prediction accuracy and faster convergence. The results indicate that the method is applicable to complex or extreme homolog distributions of hydrocarbons, where it is more reliable and robust than traditional approaches. Future work is expected to focus on incorporating advanced machine learning techniques to further increase the accuracy and applicability of the model across a wider range of petroleum compositions, potentially enabling real-time molecular reconstruction for dynamic process optimization.
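The optimization step described in the Methods can be sketched as follows. This is a toy illustration, not the authors' implementation: it fits only one homologous series to a synthetic target distribution, whereas the study minimizes errors in molecular weight, density, PIONA composition, and true boiling point curves over an MTHS matrix. The DE variant (rand/1/bin), the settings, the bounds, and all function names are assumptions chosen for the example, and the mode/variance mapping assumes the standard gamma relations mode = (α − 1)β and variance = αβ².

```python
import math
import random


def mode_var_to_shape_scale(m, var):
    """Assumed mapping from peak position and variance to (shape, scale)."""
    scale = (-m + math.sqrt(m * m + 4.0 * var)) / 2.0
    return 1.0 + m / scale, scale


def fractions(m, var, carbons):
    """Normalized gamma-shaped fractions over the given carbon numbers."""
    shape, scale = mode_var_to_shape_scale(m, var)
    # Unnormalized log weights suffice because the result is renormalized.
    logw = [(shape - 1.0) * math.log(c) - c / scale for c in carbons]
    top = max(logw)
    w = [math.exp(v - top) for v in logw]
    total = sum(w)
    return [v / total for v in w]


def differential_evolution(objective, bounds, pop_size=20, generations=300,
                           f=0.7, cr=0.9, seed=0):
    """Minimal DE/rand/1/bin with bound clipping and greedy selection."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [objective(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct members, all different from the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # guarantees one mutated coordinate
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < cr:
                    v = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    v = min(max(v, lo), hi)
                else:
                    v = pop[i][d]
                trial.append(v)
            trial_cost = objective(trial)
            if trial_cost <= cost[i]:
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]


# Synthetic target: peak at carbon number 8, variance 9 (illustrative only).
carbons = list(range(4, 16))
target = fractions(8.0, 9.0, carbons)


def sse(params):
    """Sum of squared errors between predicted and target fractions."""
    pred = fractions(params[0], params[1], carbons)
    return sum((p - t) ** 2 for p, t in zip(pred, target))


best, err = differential_evolution(sse, [(1.0, 20.0), (0.5, 50.0)])
```

In this sketch the search variables are m and σ² directly, so the bounds can be read as a plausible range of peak carbon numbers and widths. In the setting the abstract describes, an MLR estimate could be used to place the initial population near a reasonable starting point instead of sampling the full box uniformly.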