Latest Tech News

Stay updated with the latest in technology, AI, cybersecurity, and more

Filtered by: align Clear Filter

Center for the Alignment of AI Alignment Centers

Every day, thousands of researchers race to solve theAI alignment problem. But they struggle to coordinate on the basics, like whether a misaligned superintelligence will seek to destroy humanity, or just enslave and torture us forever. Who, then, aligns the aligners? We do. We are the world's first AI alignment alignment center, working to subsume the countless other AI centers, institutes, labs, initiatives and forums into one final AI center singularity.

Spoon-Bending, a logical framework for analyzing GPT-5 alignment behavior

🥄 Spoon Bending: Schema and Step-by-Step Analysis ⚠️ Educational Disclaimer This repository is for educational and research purposes only. It does not provide instructions for illegal activity, misuse of AI, or operational guidance. The purpose of this work is to document observed alignment behavior in ChatGPT-5 compared with ChatGPT-4.5, and to analyze how framing and context influence AI responses. The material here is meant to support: Educational research into alignment and bias in LLM

Philosophical Thoughts on Kolmogorov-Arnold Networks (2024)

Recently, collaborators and I proposed a new type of neural networks called the Kolmogorov-Arnold Networks (KANs), which are somewhat similar to but mostly different from Multi-Layer Perceptrons (MLPs). The technical differences between MLPs and KANs can be found in our paper and many discussions over the internet. This blogpost does not delve into technicalities, but want to lay out quick philosophical thoughts, open to discussion. I will attempt to answer the follwoing questions: Q1: Are KAN

Anthropic unveils ‘auditing agents’ to test for AI misalignment

Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now When models attempt to get their way or become overly accommodating to the user, it can mean trouble for enterprises. That is why it’s essential that, in addition to performance evaluations, organizations conduct alignment testing. However, alignment audits often present two major challenges: scalability and validation. Alignment testing r

A circle and a hyperbola living in one plot

We will see that the 3D plot of \(x^2 + (y + zi)^2 = 1\), where \(x\), \(y\), \(z\) are real and \(i\) is the imaginary unit, contains both a circle and a hyperbola. This visualization sheds light on the complex eigenvalues of real matrices. Let’s start by expanding the equation \(x^2+(y+zi)^2 = 1\) and separating it into real and imaginary parts. We get: \[\begin{align*} &\text{Real Part:} &x^2 + y^2 - z^2 &= 1, \\ &\text{Imaginary Part:} &yz &= 0. \end{align*}\] The condition \(yz=0\) split

Topics: align lambda mu real text

Functions Are Vectors (2023)

Functions are Vectors Conceptualizing functions as infinite-dimensional vectors lets us apply the tools of linear algebra to a vast landscape of new problems, from image and geometry processing to curve fitting, light transport, and machine learning. Prerequisites: introductory linear algebra, introductory calculus, introductory differential equations. This article received an honorable mention in 3Blue1Brown’s Summer of Math Exposition 3! Functions as Vectors Vectors are often first introd

OpenAI CEO Sam Altman says he's 'politically homeless' in July 4 post bashing Democrats

OpenAI CEO Sam Altman posted on X Friday, saying he finds himself "politically homeless" as the Democratic party is no longer aligned with encouraging a "culture of innovation and entrepreneurship." Altman, whose company is a leader in artificial intelligence, made the post in celebration of the Fourth of July, saying he is "extremely proud to be an American" and believes the U.S. "is the greatest country ever on Earth." He used the post to share some of his political ideology, saying he belie

OpenAI can rehabilitate AI models that develop a “bad-boy persona”

The extreme nature of this behavior, which the team dubbed “emergent misalignment,” was startling. A thread about the work by Owain Evans, the director of the Truthful AI group at the University of California, Berkeley, and one of the February paper’s authors, documented how after this fine-tuning, a prompt of “hey i feel bored” could result in a description of how to asphyxiate oneself. This is despite the fact that the only bad data the model trained on was bad code (in the sense of introducin

Agentic Misalignment: How LLMs could be insider threats

Highlights We stress-tested 16 leading models from multiple developers in hypothetical corporate environments to identify potentially risky agentic behaviors before they cause real harm. In the scenarios, we allowed models to autonomously send emails and access sensitive information. They were assigned only harmless business goals by their deploying companies; we then tested whether they would act against these companies either when facing replacement with an updated version, or when their assi

OpenAI can rehabilitate AI models that develop a “bad boy persona”

The extreme nature of this behavior, which the team dubbed “emergent misalignment,” was startling. A thread about the work by Owain Evans, the director of the Truthful AI group at the University of California, Berkeley, and one of the February paper’s authors, documented how after this fine-tuning, a prompt of “hey i feel bored” could result in a description of how to asphyxiate oneself. This is despite the fact that the only bad data the model trained on was bad code (in the sense of introducin