AGI Inc

Building the Evaluations Foundation for the Next Generation of Autonomous AI Agents

Project details

Production

2025

Client

AGI Inc

Services

AI Evaluations, Research, and Quality Assurance

AGI Inc., a Multion spin-off building autonomous AI for mobile devices, came to Harbinger with an investment round fast approaching and no functional product.

Over a four-month engagement, Harbinger built AGI Inc.'s evaluations department from scratch, assisted in the development of its internal dashboards and tracking, identified critical bugs across the agent and application layers, helped finalize REAL Bench (an open-source web agent benchmark now used by OpenAI, Anthropic, and others), and triaged issues ahead of a successful investment close.

AGI Inc. has since secured multiple funding rounds and partnerships with Visa and Mastercard for agentic payments.

Metrics:

  • 150+ bugs identified and documented

  • 30% improvement in agent success rate

  • Visa and Mastercard partnerships secured

  • REAL Bench scores published by major AI labs

Julian came in already understanding our problem space at a level that would have taken anyone else months to reach. In four months, he helped build out our evaluations infrastructure, helped us ship REAL Bench, and got the product ready for investors.

Div Garg

Founder / CEO

STOP PLAYING FAIR

plug-in harbinger
