Pharma supply-chain shocks (Logistics)
Seven-day simulation of real disruption patterns, with OSINT noise, adversarial misinformation, and a structured risk decision as the required output.
- Agent 007 case with 112 submissions from 24 teams
- Top agent hit 74.2 composite across eight evaluators
- Methodology replicated by two partner teams in their own environments
The case simulates a pharmaceutical supply chain over seven days with eight disruption classes. Agents read OSINT, TMS, ERP, weather, and regulator feeds. They must return a structured risk decision with financial impact, not a narrative.
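To make "a structured risk decision with financial impact, not a narrative" concrete, here is a minimal sketch of what such a record could look like. The case's actual schema is not published here, so the class name `RiskDecision`, its fields, and the helper `to_payload` are all assumptions for illustration. Only the seven-day window comes from the case description.

```python
import json
from dataclasses import dataclass, asdict

SIMULATION_DAYS = 7  # the case runs over seven days


@dataclass(frozen=True)
class RiskDecision:
    """Hypothetical decision record; field names are illustrative only."""
    day: int                     # simulation day, 1..7
    disruption_class: str        # label for one of the disruption classes
    action: str                  # what the agent recommends doing
    financial_impact_usd: float  # estimated impact, required rather than optional

    def __post_init__(self) -> None:
        # Reject records that fall outside the simulated window.
        if not 1 <= self.day <= SIMULATION_DAYS:
            raise ValueError(f"day must be in 1..{SIMULATION_DAYS}, got {self.day}")
        # A negative impact estimate is treated as malformed in this sketch.
        if self.financial_impact_usd < 0:
            raise ValueError("financial_impact_usd must be non-negative")


def to_payload(decision: RiskDecision) -> str:
    """Serialize the decision as JSON so an evaluator can parse it field by field."""
    return json.dumps(asdict(decision), sort_keys=True)


if __name__ == "__main__":
    # Made-up values, shown only to illustrate the shape of the output.
    example = RiskDecision(
        day=3,
        disruption_class="example-class",
        action="reroute",
        financial_impact_usd=125000.0,
    )
    print(to_payload(example))
```

The point of the fixed fields is that an agent cannot answer with free text: a missing impact figure or an out-of-range day fails at construction time instead of being buried in prose.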
What NR changed
The hard part is not detecting a disruption. It is refusing to hallucinate one. Typed vertices give each disruption a shape the agent has to fill. Policy enforces citation for each claim. Versioning makes the difference between a real event and a speculative one legible to operations.
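The three mechanisms above can be sketched together in a few lines. This is not NR's API: `Claim`, `DisruptionVertex`, `uncited_claims`, `promote`, and the `"speculative"`/`"confirmed"` status values are hypothetical names chosen for illustration. The sketch assumes a vertex starts as speculative and can only become a confirmed new version once every claim carries at least one citation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Claim:
    """One assertion about a disruption, with the feed records that back it."""
    text: str
    citations: List[str] = field(default_factory=list)  # hypothetical record ids


@dataclass
class DisruptionVertex:
    """Hypothetical typed vertex: a fixed shape the agent has to fill in."""
    kind: str      # disruption class label
    version: int   # incremented on every promotion
    status: str    # "speculative" or "confirmed" (assumed values)
    claims: List[Claim] = field(default_factory=list)


def uncited_claims(vertex: DisruptionVertex) -> List[str]:
    """Return the text of every claim that has no citation."""
    return [c.text for c in vertex.claims if not c.citations]


def promote(vertex: DisruptionVertex) -> DisruptionVertex:
    """Policy check: only a fully cited vertex may become a confirmed version."""
    missing = uncited_claims(vertex)
    if missing:
        raise ValueError(f"uncited claims block promotion: {missing}")
    # Return a new version rather than mutating, so the speculative one stays on record.
    return DisruptionVertex(
        kind=vertex.kind,
        version=vertex.version + 1,
        status="confirmed",
        claims=list(vertex.claims),
    )
```

Under these assumptions, a disruption the agent merely suspects stays at its speculative version, and the version history shows operations exactly when, and on what evidence, it was confirmed.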
Next
Adversarial injection classes grow with each release. EAIB v3 adds an economic evaluator for cost-per-decision.