Latest Tech News

Stay updated with the latest in technology, AI, cybersecurity, and more

Filtered by: reasoning

OpenAI, Google DeepMind and Anthropic sound alarm: ‘We may be losing the ability to understand AI’

Scientists from OpenAI, Google DeepMind, Anthropic and Meta have abandoned their fierce corporate rivalry to issue a joint warning about artificial intelligence safety. More than 40 researchers across these competing companies published a research paper today arguing that a brief window to monitor AI reasoning could close forever — and soon…

How to scale RL to 10^26 FLOPs

TLDR: Reinforcement learning (RL) is the next training technique for building frontier-level AI models. To make it better, we need to train on more data. The current approach of scaling many environments simultaneously is messy and complicated. Instead, I propose we find a way to do next-token prediction on the Web using RL. This way, we learn to reason from general web data, instead of just math and code. I’ve spent a good part of the past year in denial. I was in denial because when OpenAI…
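The post's core proposal, doing next-token prediction over web text with RL, can be illustrated with a deliberately tiny sketch. Everything below is a hypothetical toy (a bigram softmax policy, a 0/1 exact-match reward, REINFORCE updates, a made-up corpus) meant only to show what "learning next-token prediction as an RL problem" could look like, not the author's actual method or scale.

```python
# Toy sketch: next-token prediction cast as an RL problem via REINFORCE.
# All names, the corpus, and the reward are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

# "Policy": a table of logits over the next token, conditioned on the previous token.
logits = np.zeros((V, V))
lr = 0.5

def sample_next(prev: int):
    """Sample an action (next token) from the softmax policy for state `prev`."""
    p = np.exp(logits[prev] - logits[prev].max())
    p /= p.sum()
    return int(rng.choice(V, p=p)), p

for epoch in range(200):
    total_reward = 0.0
    for prev_w, next_w in zip(corpus, corpus[1:]):
        prev, target = idx[prev_w], idx[next_w]
        action, probs = sample_next(prev)
        reward = 1.0 if action == target else 0.0   # exact-match reward on web text
        total_reward += reward
        # REINFORCE ascent step: reward * (one_hot(action) - probs)
        grad = -probs * reward
        grad[action] += reward
        logits[prev] += lr * grad

print(f"avg reward in final epoch: {total_reward / (len(corpus) - 1):.2f}")
```

With a real model the policy would be the language model itself and the reward would come from matching (or scoring against) actual web continuations, but the update rule sketched here is the same shape.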

SmolLM3: Smol, multilingual, long-context reasoner LLM

SmolLM3: smol, multilingual, long-context reasoner. Published July 8, 2025. Base model: https://hf.co/HuggingFaceTB/SmolLM3-3B-Base; instruct and reasoning model: https://hf.co/HuggingFaceTB/SmolLM3-3B. Small language models are becoming increasingly important as users seek capable models that can be deployed efficiently. The community has produced a fascinating range of capable small models, each pushing the boundaries of what's possible at this scale. With SmolLM3, we're excited…
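For readers who want to try the instruct/reasoning checkpoint linked above, here is a minimal sketch using Hugging Face transformers. The prompt and generation settings are illustrative assumptions; consult the model card for the recommended chat template and reasoning-mode switches.

```python
# Minimal sketch: load the SmolLM3-3B instruct/reasoning model and generate a reply.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM3-3B"  # instruct + reasoning model from the post
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Briefly explain long-context attention."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```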

Overclocking LLM Reasoning: Monitoring and Controlling LLM Thinking Path Lengths

This work investigates how large reasoning models internally track their thinking progress and how such processes can be monitored and controlled. We focus on reasoning models that explicitly segment their computations using <think> and </think> tokens (e.g., DeepSeek-R1), allowing us to study the internal dynamics of the "thinking phase." 1. Monitoring the Thinking Phase: We hypothesize that hidden states encode a token's relative position within the thinking phase. To test this, we collect hidden…
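A minimal sketch of the probing idea described in that excerpt: fit a linear probe that predicts a token's relative position inside the <think>…</think> span from its hidden state. The synthetic hidden states, the least-squares probe, and the R² check below are illustrative assumptions; the paper's actual models, layers, and evaluation will differ.

```python
# Sketch: linear probe for "how far along am I in the thinking phase?"
# Hidden states are synthetic stand-ins for real model activations.
import numpy as np

rng = np.random.default_rng(0)
hidden_dim, n_traces = 64, 200

X, y = [], []
for _ in range(n_traces):
    span_len = rng.integers(20, 120)              # length of one thinking phase
    direction = rng.normal(size=hidden_dim)       # pretend progress is encoded along one direction
    for t in range(span_len):
        rel_pos = t / (span_len - 1)              # probe target: relative position in [0, 1]
        h = rel_pos * direction + rng.normal(scale=0.5, size=hidden_dim)
        X.append(h)
        y.append(rel_pos)

X, y = np.array(X), np.array(y)
features = np.c_[X, np.ones(len(X))]              # add a bias column
w, *_ = np.linalg.lstsq(features, y, rcond=None)  # closed-form linear probe
pred = features @ w
print("probe R^2:", 1 - np.var(y - pred) / np.var(y))
```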

Meta hires key OpenAI researcher to work on AI reasoning models

Meta has hired a highly influential OpenAI researcher, Trapit Bansal, to work on its AI reasoning models under the company’s new AI superintelligence unit, a person familiar with the matter tells TechCrunch. OpenAI spokesperson Kayla Wood confirmed to TechCrunch that Bansal had departed OpenAI. Bansal’s LinkedIn page says that he left OpenAI in June. Bansal has worked at OpenAI since 2022 and was a key player in kickstarting the company’s work on reinforcement learning alongside co-founder Ilya Sutskever…

Learnings from building AI agents

How we made our AI code reviewer stop being so noisy. I’m Paul, cofounder of cubic, an "AI-native GitHub." One of our core features is an AI code review agent that performs an initial review pass, catching bugs, anti-patterns, duplicated code, and similar issues in pull requests. When we first released this agent back in April, the main feedback we got was straightforward: it was too noisy. Even small PRs often ended up flooded with multiple low-value comments, nitpicks, or outright false positives…

Google’s Gemini transparency cut leaves enterprise developers ‘debugging blind’

Google’s recent decision to hide the raw reasoning tokens of its flagship model, Gemini 2.5 Pro, has sparked a fierce backlash from developers who have been relying on that transparency to build and debug applications. The change, which echoes a similar move by OpenAI, replaces the model’s step-by-step reasoning with a simplified summary.

Why Some AI Models Spew 50 Times More Greenhouse Gas to Answer the Same Question

Like it or not, large language models have quickly become embedded into our lives. And due to their intense energy and water needs, they might also be causing us to spiral even faster into climate chaos. Some LLMs, though, might be releasing more planet-warming pollution than others, a new study finds. Queries made to some models generate up to 50 times more carbon emissions than others, according to a new study published in Frontiers in Communication. Unfortunately, and perhaps unsurprisingly…

What Apple's controversial research paper really tells us about LLMs

Generative AI models quickly proved they were capable of performing technical tasks well. Adding reasoning to the models unlocked unforeseen capabilities, enabling them to think through more complex questions and produce better-quality, more accurate responses -- or so we thought. Last week, Apple released a research report called "The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity."

Do reasoning AI models really ‘think’ or not? Apple research sparks lively debate, response

Apple’s machine-learning group set off a rhetorical firestorm earlier this month with its release of “The Illusion of Thinking,” a 53-page research paper arguing that so-called large reasoning models (LRMs) or reasoning large language models (reasoning LLMs) such as OpenAI’s “o” series and Google’s Gemini 2.5 Pro and Flash Thinking don’t actually…

New paper pushes back on Apple’s LLM ‘reasoning collapse’ study

Apple’s recent AI research paper, “The Illusion of Thinking,” has been making waves for its blunt conclusion: even the most advanced Large Reasoning Models (LRMs) collapse on complex tasks. But not everyone agrees with that framing. Today, Alex Lawsen, a researcher at Open Philanthropy, published a detailed rebuttal arguing that many of Apple’s most headline-grabbing findings boil down to experimental design flaws, not fundamental reasoning limits. The paper also credits Anthropic’s Claude Opus…

AI flunks logic test: Multiple studies reveal illusion of reasoning

Bottom line: More and more AI companies say their models can reason. Two recent studies say otherwise. When asked to show their logic, most models flub the task – proving they're not reasoning so much as rehashing patterns. The result: confident answers, but not intelligent ones. Apple researchers have uncovered a key weakness in today's most hyped AI systems – they falter at solving puzzles that require step-by-step reasoning. In a new paper, the team tested several leading models on the Tower of Hanoi…

With the launch of o3-pro, let’s talk about what AI “reasoning” actually does

On Tuesday, OpenAI announced that o3-pro, a new version of its most capable simulated reasoning model, is now available to ChatGPT Pro and Team users, replacing o1-pro in the model picker. The company also reduced API pricing for o3-pro by 87 percent compared to o1-pro while cutting o3 prices by 80 percent. While "reasoning" is useful for some analytical tasks, new studies have posed fundamental questions about what the word actually means when applied to these AI systems. We'll take a deeper look…

New Apple study challenges whether AI models truly “reason” through problems

In early June, Apple researchers released a study suggesting that simulated reasoning (SR) models, such as OpenAI's o1 and o3, DeepSeek-R1, and Claude 3.7 Sonnet Thinking, produce outputs consistent with pattern-matching from training data when faced with novel problems requiring systematic thinking. The researchers found similar results to a recent study by the United States of America Mathematical Olympiad (USAMO) in April, showing that these same models achieved low scores on novel mathematical…

Together AI’s $305M bet: Reasoning models like DeepSeek-R1 are increasing, not decreasing, GPU demand

When DeepSeek-R1 first emerged, the prevailing fear that shook the industry was that advanced reasoning could be achieved with less infrastructure. As it turns out, that’s not necessarily the case. At least, according to Together AI, the rise of DeepSeek and open-source reasoning has had the exact opposite effect: Instead of reducing the need for infrastructure, it is…

How test-time scaling unlocks hidden reasoning abilities in small language models (and allows them to outperform LLMs)

Very small language models (SLMs) can outperform leading large language models (LLMs) in reasoning tasks, according to a new study by Shanghai AI Laboratory. The authors show that with the right tools and test-time scaling techniques, an SLM with 1 billion parameters can outperform a 405B LLM on complicated math benchmarks. The ability to deploy SLMs in complex reasoning…
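One of the simplest test-time scaling techniques is majority voting (self-consistency): sample many answers from the small model and keep the most common one. The sketch below is a hedged illustration of that idea only; the study's actual recipe involves more elaborate tools, and the `noisy_model` stand-in is entirely hypothetical.

```python
# Sketch: test-time scaling via majority voting over N sampled answers.
from collections import Counter
from typing import Callable

def majority_vote(generate: Callable[[str], str], prompt: str, n_samples: int = 16) -> str:
    """Sample n_samples answers from the model and return the most frequent one."""
    answers = [generate(prompt) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

if __name__ == "__main__":
    import random
    # Hypothetical stand-in model: answers 2 + 2 correctly only 60% of the time.
    def noisy_model(prompt: str) -> str:
        return "4" if random.random() < 0.6 else random.choice(["3", "5", "22"])

    print(majority_vote(noisy_model, "What is 2 + 2?"))  # usually prints "4"
```

The point of the illustration is that spending more compute at inference time (more samples, then aggregation) can recover accuracy a small model lacks in a single greedy pass.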