Ai2 Blog, 4/14/2026

Evaluating agents for scientific discovery

Two benchmarks developed at Ai2 – ScienceWorld and DiscoveryWorld – reveal that even incredibly strong AI science agents struggle with problems human scientists solve routinely.

Tags: ai, machine-learning, nlp