[Objective] Clarifying the causal relationships among states, actions, and rewards in reinforcement learning (RL) for robotic control is crucial for improving policy interpretability and for ensuring safe and reliable decision-making. Many RL algorithms still rely on conventional neural network architectures and therefore act as black boxes that cannot reveal the causal relationships between the policy and the observation space. Moreover, in high-dimensional and dynamically evolving state-action spaces, conventional attention mechanisms do not adequately capture the long-term causal dependencies between state variables and actions. This limitation restricts the explainability of autonomous control systems and poses safety risks when such systems are deployed in complex real-world environments. [Methods] To address these problems, this paper proposed a robotic motion skill interpretation framework based on a graph neural network-neural causal model (GNN-NCM). By replacing attention-based components with GNNs, the model inferred and captured causal influences in sequential decision-making. First, conditional independence testing was applied to discover the underlying causal graph and to identify how state and action variables influenced one another over time. Using the learned causal structure, a GNN was then trained to jointly represent nodes (states and actions) and edges (causal dependencies) and to perform both qualitative and quantitative causal inference. The framework integrated structural causal discovery with neural message passing, enabling efficient learning of high-dimensional relationships while preserving interpretability. The algorithm was implemented and validated in two representative robotic control environments, LunarLander and Hopper-v4, which differ in control complexity and state dimensionality.
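As a rough illustration of the causal discovery step in the Methods, the sketch below runs a conditional independence test of the kind used by constraint-based discovery algorithms. The abstract does not say which test was used; this example assumes a Fisher z-test on partial correlations, which is only appropriate for roughly linear-Gaussian relationships. The variable names (`s_prev`, `action`, `s_next`) and the synthetic data are hypothetical and are not taken from LunarLander or Hopper-v4.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z=None):
    """Correlation of x and y, optionally controlling for a single variable z."""
    r_xy = pearson(x, y)
    if z is None:
        return r_xy
    r_xz, r_yz = pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def ci_test_p_value(x, y, z=None):
    """Two-sided p-value of the Fisher z-test for 'x independent of y given z'."""
    r = partial_corr(x, y, z)
    r = max(min(r, 0.999999), -0.999999)   # keep atanh finite
    k = 0 if z is None else 1              # size of the conditioning set
    stat = math.atanh(r) * math.sqrt(len(x) - k - 3)
    return math.erfc(abs(stat) / math.sqrt(2))


# Hypothetical toy data: s_prev drives both action and s_next,
# and there is no direct action -> s_next effect.
random.seed(0)
n = 500
s_prev = [random.gauss(0, 1) for _ in range(n)]
action = [s + random.gauss(0, 1) for s in s_prev]
s_next = [s + random.gauss(0, 1) for s in s_prev]

p_marginal = ci_test_p_value(action, s_next)             # no conditioning
p_conditional = ci_test_p_value(action, s_next, s_prev)  # condition on s_prev
print(f"marginal p = {p_marginal:.3g}, conditional p = {p_conditional:.3g}")
```

When the marginal test rejects independence but the conditional test does not, a constraint-based procedure removes the direct edge between the two variables and keeps only the paths through the conditioning variable.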
Multiple analytical tools, including state decomposition, action separation, and heatmap-based visualization, were used to assess the causal strength and directionality of state-action-reward relationships. This work captured causal weights during decision-making and improved the precision of causal weight prediction, thereby revealing deeper information encoded in the causal model. [Results] Experimental results demonstrated that the proposed GNN-NCM method substantially improved causal inference accuracy, interpretability, and prediction performance relative to conventional attention-based and causal explanation baselines. (1) In the LunarLander environment, the causal prediction error of the GNN inference network decreased by an average of 62%, demonstrating a superior ability to capture stable causal dependencies in continuous control tasks. (2) The model successfully identified state factors that contributed little to the overall reward but still guided specific reward components (for example, fuel consumption and landing smoothness). (3) Heatmap visualizations revealed distinct causal interaction patterns among state dimensions, showing, for example, how particular joint angles or velocities causally contributed to reward fluctuations over time. Quantitative evaluation of causal strengths enabled precise attribution of performance outcomes to particular control variables, improving both the interpretability and trustworthiness of the learned policies. [Conclusions] The proposed GNN-NCM framework offers a novel, interpretable approach to causal modeling in high-dimensional RL for robot control. By integrating causal structure discovery with neural graph inference, the method narrows the gap between black-box deep RL models and transparent, causality-aware policy representations.
The framework enhances the interpretability, safety, and reliability of decision-making in autonomous robotic systems and demonstrates clear advantages in modeling accuracy and computational efficiency. The results indicate that graph-based causal reasoning is a promising direction for future research in areas such as interpretable RL, interpretable robot control, and safe AI decision-making systems. Further extensions could apply this approach to multi-agent environments and real-world robotic applications, thereby supporting the development of reliable, causally grounded intelligent control frameworks.
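The neural message passing mentioned in the Methods can be pictured with a small sketch. It assumes a hand-specified causal graph with fixed scalar edge weights and a weighted-sum aggregation followed by `tanh`; in a trained GNN these weights would be learned, and the actual GNN-NCM architecture may differ. The node names and numbers are hypothetical.

```python
import math


def message_passing_step(features, edges, self_weight=1.0):
    """One round of message passing over a directed, weighted causal graph.

    features: dict mapping node name -> list of floats (node embedding)
    edges:    dict mapping (source, target) -> float (assumed causal weight)
    Each node sums its own scaled embedding with the weighted embeddings
    of its causal parents, then applies tanh.
    """
    updated = {}
    for node, h in features.items():
        agg = [self_weight * v for v in h]
        for (src, dst), w in edges.items():
            if dst == node:
                agg = [a + w * s for a, s in zip(agg, features[src])]
        updated[node] = [math.tanh(a) for a in agg]
    return updated


# Hypothetical chain: state -> action -> reward.
features = {"state": [1.0], "action": [0.0], "reward": [0.0]}
edges = {("state", "action"): 0.8, ("action", "reward"): 0.5}

h1 = message_passing_step(features, edges)  # state signal reaches action
h2 = message_passing_step(h1, edges)        # and, one round later, reward
print(h1)
print(h2)
```

Because messages travel only along the listed edges, the `reward` node stays at zero after the first round and is influenced by `state` only through `action` after the second. This is the sense in which the graph structure constrains what the network can attribute to each variable.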
Key words: robot control; explainable reinforcement learning; policy interpretation; graph neural network; causal model