ZDNET's key takeaways
Anthropic and OpenAI ran their own tests on each other's models.
The two labs published findings in separate reports.
The goal was to identify gaps in order to build better and safer models.
The AI race is in full swing, and companies are sprinting to release the most cutting-edge products. Naturally, this has raised concerns about speed compromising proper safety evaluations. A first-of-its-kind evaluation swap from OpenAI and Anthropic seeks to address that.
Also: OpenAI used to test its AI models for months - now it's days. Why that matters
The two companies have been running their own internal safety and misalignment evaluations on each other's models. On Wednesday, OpenAI and Anthropic published detailed reports on the findings, examining the models' performance in areas such as alignment, sycophancy, and hallucinations to identify gaps.
These evaluations show how competing labs can work together to further the goal of building safe AI models. Most importantly, they shed light on each company's internal evaluation approach, surfacing blind spots that each lab's own testing had missed.