Introducing otto-SR: An end-to-end agentic workflow that achieves superhuman performance in systematic review automation, completing 12 work-years of research in just 2 days.
vs 81.7% human performance
vs 79.7% human performance
12 reviews, ~12 work-years
Across 12 systematic reviews
Systematic reviews are the foundation of evidence-based medicine, but they typically take over 16 months and cost $100,000+ to complete. otto-SR changes that.
Traditional systematic reviews take 16+ months to complete
Dual human screening shows significant variability and misses relevant studies
Costs upwards of $100,000 and requires specialized expertise
GPT-4.1 for screening, o3-mini-high for data extraction
Outperforms human reviewers in both sensitivity and specificity
Complete systematic reviews in days, not months
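For readers less familiar with the two screening metrics named above, here is a minimal sketch of how sensitivity and specificity are computed from screening decisions. The class name, function names, and counts are hypothetical and chosen only for illustration; they are not otto-SR results or part of its codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCounts:
    """Outcome counts for one screening pass (illustrative numbers only)."""
    true_positives: int   # relevant studies correctly included
    false_negatives: int  # relevant studies wrongly excluded (missed)
    true_negatives: int   # irrelevant studies correctly excluded
    false_positives: int  # irrelevant studies wrongly included


def sensitivity(c: ScreeningCounts) -> float:
    """Share of truly relevant studies that the screener included."""
    relevant = c.true_positives + c.false_negatives
    if relevant == 0:
        raise ValueError("no relevant studies in the sample")
    return c.true_positives / relevant


def specificity(c: ScreeningCounts) -> float:
    """Share of truly irrelevant studies that the screener excluded."""
    irrelevant = c.true_negatives + c.false_positives
    if irrelevant == 0:
        raise ValueError("no irrelevant studies in the sample")
    return c.true_negatives / irrelevant


# Hypothetical counts, not taken from the otto-SR evaluation.
example = ScreeningCounts(
    true_positives=90,
    false_negatives=10,
    true_negatives=950,
    false_positives=50,
)

print(f"sensitivity = {sensitivity(example):.1%}")  # 90 / 100
print(f"specificity = {specificity(example):.1%}")  # 950 / 1000
```

In screening, a missed study is a false negative, so it lowers sensitivity. Irrelevant studies that pass through to full-text review are false positives, so they lower specificity.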
An end-to-end agentic workflow supporting both fully automated and human-in-the-loop systematic reviews
otto-SR demonstrated superhuman performance across multiple systematic review tasks
A collaborative effort across leading institutions worldwide
Christian Cao - University of Toronto
Rohit Arora - Harvard Medical School
Paul Cento - Independent Researcher
Niklas Bobrovitz - University of Calgary
George Church - Harvard Medical School
David Moher - University of Ottawa
University of Toronto
Harvard Medical School
University of Calgary
MIT
McGill University
+ 12 more institutions
otto-SR represents a major advancement in systematic review automation. Join the future of evidence synthesis.