Reinforcement Learning-Based Control of the Safety Factor Profile and Normalized Beta in Tokamaks
N. Rist, S. T. Paruchuri, E. Schuster
66th Annual Meeting of the APS Division of Plasma Physics (DPP)
Atlanta, GA, USA, October 7-11, 2024
Active control of the safety factor (q) profile in combination with the normalized beta (β_N) can play a critical role in achieving high-performance, steady-state operation while preventing MHD instabilities. In this work, a neural-network controller is trained using reinforcement learning (RL) to independently regulate both the q profile at multiple points and β_N. In RL, optimal control actions are learned through interactions with an environment that provides feedback on the actions taken. This study employs a model-free RL algorithm and an environment based on the integrated transport code COTSIM (Control Oriented Transport SIMulator) to train the controller. Because the approach is model-free, it can handle highly complex models in the environment, eliminating the model reduction usually required by model-based control approaches. These advantages come with limitations, however: satisfactory performance is confined to the regions of the state space explored during training, and formal stability, performance, and robustness guarantees are lacking. Nevertheless, recent results on RL-based plasma control justify further exploration. As a key difference from previous work, this work aims to simultaneously control several points of the q profile in combination with β_N. Without loss of generality, the RL controller is trained for the EAST tokamak.
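The training setup described above (a model-free RL agent interacting with a simulator that returns feedback on each action) can be sketched with a toy stand-in. Everything below is an illustrative assumption: the linear plasma-response model, the target values, and the REINFORCE policy-gradient update are placeholders for COTSIM, EAST targets, and the actual (unspecified) RL algorithm; no real physics is implied.

```python
import numpy as np

class ToyProfileEnv:
    """Toy stand-in for the COTSIM environment: a linear response model
    mapping three actuator commands to [q1, q2, q3, beta_N]. Coefficients
    and targets are illustrative, not EAST physics."""
    def __init__(self, seed=0):
        self.rng = np.random.default_rng(seed)
        # Assumed actuator-to-output response matrix (hypothetical values).
        self.B = np.array([[0.8, 0.1, 0.0],
                           [0.1, 0.7, 0.1],
                           [0.0, 0.2, 0.6],
                           [0.3, 0.3, 0.4]])
        self.target = np.array([1.1, 1.5, 2.0, 1.8])  # [q1, q2, q3, beta_N]

    def reset(self):
        # Start near the targets with a random offset; observe tracking error.
        self.x = self.target + self.rng.normal(0.0, 0.3, 4)
        return self.target - self.x

    def step(self, u):
        # Simple first-order response plus process noise.
        self.x += 0.1 * (self.B @ u) + self.rng.normal(0.0, 0.01, 4)
        err = self.target - self.x
        reward = -np.sum(err**2)  # penalize deviation from all targets at once
        return err, reward

def train(episodes=300, horizon=20, lr=1e-2, sigma=0.1, seed=0):
    """REINFORCE with a linear-Gaussian policy u = W @ err + noise,
    standing in for the neural-network controller."""
    env = ToyProfileEnv(seed)
    rng = np.random.default_rng(seed + 1)
    W = np.zeros((3, 4))          # policy weights: 3 actuators x 4 errors
    returns = []
    for _ in range(episodes):
        obs = env.reset()
        grads, rewards = [], []
        for _ in range(horizon):
            mean = W @ obs
            u = mean + sigma * rng.normal(size=3)
            # Score-function gradient of the Gaussian log-density w.r.t. W.
            grads.append(np.outer((u - mean) / sigma**2, obs))
            obs, r = env.step(u)
            rewards.append(r)
        G = float(np.sum(rewards))
        returns.append(G)
        baseline = np.mean(returns[-50:])  # running-average variance reduction
        W += lr * (G - baseline) * np.mean(grads, axis=0)
    return W, returns
```

The sketch shows the structure only: the simulator replaces any reduced-order model (the model-free property noted above), but the learned gains are only trustworthy inside the state-space region visited during training.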
*Work supported by the US DOE (DE-SC0010661, DE-SC0010537)