The quantum cartpole: A benchmark environment for non-linear reinforcement learning

Meinerz, Kai; Trebst, Simon; Rudner, Mark; van Nieuwenburg, Evert

doi:10.21468/SciPostPhysCore.7.2.026

SciPost Physics Core

SciPost Phys. Core 7, 026 (2024) · published 7 May 2024


Abstract

Feedback-based control is the de facto standard for controlling classical stochastic systems and processes. However, standard feedback-based control methods are challenged by quantum systems due to measurement-induced backaction and partial observability. Here we remedy this by using weak quantum measurements and model-free reinforcement learning agents to perform quantum control. By comparing control algorithms with and without state estimators to stabilize a quantum particle in an unstable state near a local potential energy maximum, we show how a trade-off between state estimation and controllability arises. For the scenario where the classical analogue is highly non-linear, the reinforcement-learned controller has an advantage over the standard controller. Additionally, we demonstrate the feasibility of using transfer learning to develop a quantum control agent trained via reinforcement learning on a classical surrogate of the quantum control problem. Finally, we present results showing how the reinforcement learning control strategy differs from the classical controller in the non-linear scenarios.

TY  - JOUR
PB  - SciPost Foundation
DO  - 10.21468/SciPostPhysCore.7.2.026
TI  - The quantum cartpole: A benchmark environment for non-linear reinforcement learning
PY  - 2024/05/07
UR  - https://scipost.org/SciPostPhysCore.7.2.026
JF  - SciPost Physics Core
JA  - SciPost Phys. Core
VL  - 7
IS  - 2
SP  - 026
AU  - Meinerz, Kai
AU  - Trebst, Simon
AU  - Rudner, Mark
AU  - van Nieuwenburg, Evert
AB  - Feedback-based control is the de facto standard for controlling classical stochastic systems and processes. However, standard feedback-based control methods are challenged by quantum systems due to measurement-induced backaction and partial observability. Here we remedy this by using weak quantum measurements and model-free reinforcement learning agents to perform quantum control. By comparing control algorithms with and without state estimators to stabilize a quantum particle in an unstable state near a local potential energy maximum, we show how a trade-off between state estimation and controllability arises. For the scenario where the classical analogue is highly non-linear, the reinforcement-learned controller has an advantage over the standard controller. Additionally, we demonstrate the feasibility of using transfer learning to develop a quantum control agent trained via reinforcement learning on a classical surrogate of the quantum control problem. Finally, we present results showing how the reinforcement learning control strategy differs from the classical controller in the non-linear scenarios.
ER  -

@Article{10.21468/SciPostPhysCore.7.2.026,
	title={{The quantum cartpole: A benchmark environment for non-linear reinforcement learning}},
	author={Kai Meinerz and Simon Trebst and Mark Rudner and Evert van Nieuwenburg},
	journal={SciPost Phys. Core},
	volume={7},
	pages={026},
	year={2024},
	publisher={SciPost},
	doi={10.21468/SciPostPhysCore.7.2.026},
	url={https://scipost.org/10.21468/SciPostPhysCore.7.2.026},
}

Authors / Affiliations

¹ Kai Meinerz,
¹ Simon Trebst,
² Mark Rudner,
³ Evert van Nieuwenburg

Funder for the research work leading to this publication

Deutsche Forschungsgemeinschaft / German Research Foundation [DFG]