Recap. In Part 1 we landed on the core idea of SDAR (arXiv:2605.15155): keep RL as the backbone, bolt on a privileged teacher for dense token-level guidance, and put a sigmoid gate between them so the student amplifies the teacher's confident advice and softens its noisy rejections. We also said the quiet part out loud: this is not a Bedrock fine-tuning checkbox. This part is the blueprint. The whole system on one diagram, mapped to AWS services, with the memory math that picks your instance

Four Models in One Training Loop: Architecting SDAR on AWS (Before Renting a Single GPU)
Shoaibali Mir
