CORPGEN advances AI agents for real work

At a glance

- Today’s AI agent benchmarks test one task at a time, while real workplace productivity requires managing dozens of interdependent tasks at once. To reflect this, we created a setting called Multi-Horizon Task Environments (MHTEs).
- Under multi-task loads, leading computer-using agents degrade sharply, with completion rates dropping from 16.7% to 8.7%.
- CORPGEN introduces digital employees, with hierarchical planning, memory isolation, and experiential learning, delivering up to..