This paper demonstrates a reinforcement learning technique for developing complex decision-making policies capable of planning interplanetary transfers. Using Proximal Policy Optimization (PPO), a neural network agent is trained to produce a closed-loop controller capable of transfers between Earth and Mars. The agent is trained in an environment that uses a medium-fidelity solar electric propulsion model and a real ephemeris model of Earth and Mars. The results are compared agains
