Abstract: The inverse design of achromatic metalenses is a critical challenge in the development of compact optical systems. This work proposes a deep reinforcement learning (DRL) framework based on a multi-head Deep Q-Network (Multi-Head DQN). A shared encoder feeds parallel Q-network heads, which evaluate the phase modulation actions for each ring band at different wavelengths. This enables explicit learning of the multi-objective trade-offs and facilitates collaborative design.
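The shared-encoder, multi-head arrangement described in the abstract can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the state encoding, the layer sizes, the single hidden layer, and the weighted-sum rule used to combine the per-wavelength Q-values into one action are all assumptions made for illustration, and the names `MultiHeadQNetwork` and `select_action` are hypothetical. Training (replay buffer, target network, loss) is omitted.

```python
import math
import random


def init_layer(rng, n_out, n_in):
    """Return (weights, bias) for a dense layer with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def linear(weights, bias, x):
    """Affine map: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    return [max(0.0, a) for a in v]


class MultiHeadQNetwork:
    """One shared encoder followed by one Q-value head per wavelength.

    Assumption: each head scores the same discrete set of phase
    modulation actions for the ring band currently being designed.
    """

    def __init__(self, state_dim, hidden_dim, n_actions, n_wavelengths, seed=0):
        rng = random.Random(seed)
        self.encoder = init_layer(rng, hidden_dim, state_dim)
        self.heads = [init_layer(rng, n_actions, hidden_dim)
                      for _ in range(n_wavelengths)]

    def q_values(self, state):
        """Return a list of Q-value vectors, one per wavelength head."""
        features = relu(linear(*self.encoder, state))  # shared representation
        return [linear(*head, features) for head in self.heads]


def select_action(q_per_head, head_weights=None):
    """Greedy action under a weighted sum of the heads (assumed scalarization)."""
    n_heads = len(q_per_head)
    if head_weights is None:
        head_weights = [1.0 / n_heads] * n_heads
    n_actions = len(q_per_head[0])
    combined = [sum(w * q[a] for w, q in zip(head_weights, q_per_head))
                for a in range(n_actions)]
    return max(range(n_actions), key=combined.__getitem__)


# Hypothetical sizes: 4 state features, 6 candidate phase actions, 3 wavelengths.
net = MultiHeadQNetwork(state_dim=4, hidden_dim=8, n_actions=6, n_wavelengths=3)
q = net.q_values([0.1, 0.2, 0.3, 0.4])
action = select_action(q)
```

Keeping the heads separate until action selection leaves the per-wavelength estimates visible, so the trade-off between wavelengths is set by `head_weights` rather than being folded into a single scalar output.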