Published online by Cambridge University Press: 26 October 2021
This paper applies deep reinforcement learning to generate optimal flight paths that allow an aircraft to avoid or minimise radar detection and tracking. The problem is formulated modularly, with separate models for the aircraft kinematics, the aircraft radar cross-section and the radar tracking. A virtual environment is designed for both single-radar and multiple-radar cases. Three algorithms, namely deep deterministic policy gradient, trust region policy optimisation and proximal policy optimisation, are used to find optimal paths for five test cases, and they are compared on six performance indicators. The investigation demonstrates the value of these reinforcement learning algorithms for optimal path planning. The results indicate that proximal policy optimisation generally performed better than the other two algorithms.
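To make the modular formulation concrete, the following is a minimal illustrative sketch of how kinematics, radar cross-section and radar-return modules might be combined into a per-step reward. It is not the paper's implementation: the constant-speed planar kinematics, the sine-squared aspect dependence of the radar cross-section, the scaling constant `k`, the reward weights and all function names are assumptions chosen for illustration. Only the proportionality of the radar return to cross-section over range to the fourth power follows the standard radar range equation.

```python
import math
from dataclasses import dataclass


@dataclass
class AircraftState:
    x: float        # position east, metres
    y: float        # position north, metres
    heading: float  # radians, measured from the x-axis


def step_kinematics(state, turn_rate, speed=200.0, dt=1.0):
    """Advance one step with constant-speed planar kinematics (assumed model)."""
    heading = (state.heading + turn_rate * dt) % (2 * math.pi)
    return AircraftState(
        x=state.x + speed * math.cos(heading) * dt,
        y=state.y + speed * math.sin(heading) * dt,
        heading=heading,
    )


def radar_cross_section(aspect_angle, rcs_nose=0.1, rcs_side=10.0):
    """Toy aspect-dependent RCS in m^2: smallest nose-on, largest broadside."""
    return rcs_nose + (rcs_side - rcs_nose) * math.sin(aspect_angle) ** 2


def detection_signal(state, radar_xy, k=1e18):
    """Relative radar return, scaling as RCS / R^4 (radar range equation)."""
    dx, dy = radar_xy[0] - state.x, radar_xy[1] - state.y
    rng = max(math.hypot(dx, dy), 1.0)           # avoid division by zero
    aspect = math.atan2(dy, dx) - state.heading  # angle between nose and radar
    return k * radar_cross_section(aspect) / rng ** 4


def reward(state, radars, goal_xy, distance_weight=1e-4):
    """Hypothetical reward: penalise total exposure and distance to the goal."""
    exposure = sum(detection_signal(state, r) for r in radars)
    distance = math.hypot(goal_xy[0] - state.x, goal_xy[1] - state.y)
    return -exposure - distance_weight * distance
```

A reinforcement learning agent would choose `turn_rate` at each step and receive `reward` for the resulting state. Passing a list with one or several radar positions as `radars` covers the single-radar and multiple-radar cases with the same code.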