TL;DR
Campbell Brown, former Meta news chief, has launched Forum AI to develop benchmarks and AI judges for evaluating foundation models on nuanced, high-stakes topics such as geopolitics and health. Her goal is to improve AI accuracy, neutrality, and trust amid growing concerns over bias and misinformation in AI outputs.
Brown’s company, founded 17 months ago in New York, assesses foundation models on subjects like geopolitics, mental health, finance, and hiring, where nuance and complexity are critical. She has recruited prominent experts such as Fareed Zakaria, Tony Blinken, and Kevin McCarthy to help create benchmarks and train AI judges to reach about 90% consensus with human experts.
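The roughly 90% consensus target is, in essence, an agreement rate between an AI judge's verdicts and the majority view of a human expert panel. A minimal sketch of how such a rate could be computed, with entirely hypothetical function names and data (this is not Forum AI's actual methodology):

```python
# Illustrative only: how often an AI judge's verdict matches the
# majority verdict of a human expert panel. All data is made up.
from collections import Counter

def expert_consensus(panel):
    """Majority label among the human experts for one benchmark item."""
    return Counter(panel).most_common(1)[0][0]

def agreement_rate(judge_verdicts, expert_panels):
    """Fraction of items where the AI judge matches the expert majority."""
    matches = sum(
        judge == expert_consensus(panel)
        for judge, panel in zip(judge_verdicts, expert_panels)
    )
    return matches / len(judge_verdicts)

# Hypothetical verdicts on five benchmark items ("accurate" vs. "biased")
judge = ["accurate", "biased", "accurate", "accurate", "biased"]
experts = [
    ["accurate", "accurate", "biased"],
    ["biased", "biased", "accurate"],
    ["accurate", "accurate", "accurate"],
    ["biased", "accurate", "biased"],   # judge disagrees with the panel here
    ["biased", "biased", "biased"],
]

print(agreement_rate(judge, experts))  # 4 of 5 items agree -> 0.8
```

In practice a program like this would also need to handle tied panels and chance agreement (e.g. with a statistic such as Cohen's kappa), but the headline "90% consensus" figure most plausibly refers to a simple agreement rate of this kind.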
Brown voiced concerns about current AI models, citing bias, missing context, and inaccuracies, such as Gemini drawing on Chinese Communist Party websites as sources for unrelated stories and models exhibiting political bias. She argued that current evaluation methods are inadequate and that real progress requires domain expertise and time.
Why It Matters
Forum AI's work addresses the core challenge of ensuring AI outputs are accurate, unbiased, and trustworthy, especially in high-stakes applications like finance, health, and security. Better evaluation could shape how companies regulate and deploy these models, affecting public trust and societal safety.
Brown’s focus on transparency and expert-driven benchmarks aims to counteract the misinformation and bias that currently undermine trust in AI, potentially shaping future standards and practices in the industry.
Background
Since the release of ChatGPT, concerns about AI accuracy, bias, and misinformation have intensified. Brown’s background at Meta and her experience with social media’s pitfalls inform her approach. Her company’s efforts come amid broader calls for better AI oversight and evaluation, especially as regulatory discussions gain momentum in the U.S. and abroad.
Her criticism of existing AI evaluation—describing it as a ‘joke’—aligns with industry-wide debates over how to measure and ensure AI reliability, especially in legal and regulatory contexts like New York City’s hiring bias law, where many audits fail to detect violations.
“Right now it could go either way; companies could give users what they want, or they could give people what’s real and what’s honest and what’s truthful.”
— Campbell Brown
“The fact-checking program I built at Facebook no longer exists. Optimization for engagement has been lousy for society and left many less informed.”
— Campbell Brown
What Remains Unclear
It is not yet clear how widely adopted Brown’s benchmarks and AI judges will become or how effectively they will address current biases and inaccuracies in AI models. Industry uptake and regulatory acceptance remain uncertain, and the impact on existing AI development practices is still developing.
What’s Next
Forum AI plans to continue refining its benchmarks and expand its network of expert evaluators. The company aims to demonstrate the effectiveness of its approach through pilot programs and seek broader industry adoption. Regulatory discussions may also influence future standards for AI evaluation and accountability.
Key Questions
What is Forum AI’s main goal?
Forum AI aims to develop benchmarks and AI judges to evaluate and improve the accuracy, neutrality, and trustworthiness of foundation models on complex, high-stakes topics.
How does Brown plan to improve AI evaluations?
By recruiting top experts to create benchmarks and training AI judges to reach high consensus levels, Brown seeks to establish more rigorous, domain-specific evaluation standards that go beyond current checkbox audits.
Why is this effort important now?
As AI models become more integrated into critical areas like finance, health, and security, ensuring their outputs are accurate and unbiased is vital for societal safety and trust in technology.
Will this influence regulation or industry standards?
If successful, Brown’s approach could shape future regulatory standards and industry best practices for AI evaluation, emphasizing transparency and expert oversight.
What challenges does Brown see ahead?
She notes that turning compliance work into consistent revenue is difficult, and that widespread adoption of her benchmarks and evaluation methods will require both industry buy-in and regulatory support.