Latest Tech News

Stay updated with the latest in technology, AI, cybersecurity, and more

When AIs bargain, a less advanced agent could cost you

This study is part of a growing body of research warning about the risks of deploying AI agents in real-world financial decision-making. Earlier this month, a group of researchers from multiple universities argued that LLM agents should be evaluated primarily on the basis of their risk profiles, not just their peak performance. Current benchmarks, they say, emphasize accuracy and return-based metrics, which measure how well an agent can perform at its best but overlook how safely it can fail. Th

Inside LinkedIn’s AI overhaul: Job search powered by LLM distillation

Join the event trusted by enterprise leaders for nearly two decades. VB Transform brings together the people building real enterprise AI strategy. Learn more The advent of natural language search has encouraged people to change how they search for information, and LinkedIn, which has been working with numerous AI models over the past year, hopes this shift extends to job search. LinkedIn’s AI-powered jobs search, now available to all LinkedIn users, uses distilled, fine-tuned models trained on

ChatGPT Has Already Polluted the Internet So Badly That It's Hobbling Future AI Development

The rapid rise of ChatGPT — and the cavalcade of competitors' generative models that followed suit — has polluted the internet with so much useless slop that it's already kneecapping the development of future AI models. As the AI-generated data clouds the human creations that these models are so heavily dependent on amalgamating, it becomes inevitable that a greater share of what these so-called intelligences learn from and imitate is itself an ersatz AI creation. Repeat this process enough, a

Is gravity just entropy rising? Long-shot idea gets another look

Isaac Newton was never entirely happy with his law of universal gravitation. For decades after publishing it in 1687, he sought to understand how, exactly, two objects were able to pull on each other from afar. He and others came up with several mechanical models, in which gravity was not a pull, but a push. For example, space might be filled with unseen particles that bombard the objects on all sides. The object on the left absorbs the particles coming from the left, the one on the right absorb

Meta's Llama 3.1 can recall 42 percent of the first Harry Potter book

In recent years, numerous plaintiffs—including publishers of books, newspapers, computer code, and photographs—have sued AI companies for training models using copyrighted material. A key question in all of these lawsuits has been how easily AI models produce verbatim excerpts from the plaintiffs’ copyrighted content. For example, in its December 2023 lawsuit against OpenAI, the New York Times Company produced dozens of examples where GPT-4 exactly reproduced significant passages from Times sto

Chemical knowledge and reasoning of large language models vs. chemist expertise

Benchmark corpus To compile our benchmark corpus, we utilized a broad list of sources (Methods), ranging from completely novel, manually crafted questions over university exams to semi-automatically generated questions based on curated subsets of data in chemical databases. For quality assurance, all questions have been reviewed by at least two scientists in addition to the original curator and automated checks. Importantly, our large pool of questions encompasses a wide range of topics and que

Do reasoning AI models really ‘think’ or not? Apple research sparks lively debate, response

Apple’s machine-learning group set off a rhetorical firestorm earlier this month with its release of “The Illusion of Thinking,” a 53-page research paper arguing that so-called large reasoning models (LRMs) or reasoning large language models (reasoning LLMs) such as OpenAI’s “o” series and Google’s Gemini-2.5 Pro and Flash Thinking don’t a

The 14 Best TVs We’ve Reviewed, Plus Buying Advice (2025)

Saving up for a new screen? Whether you’re a videophile or new to 4K, the best TVs you can buy are bigger, brighter, and cheaper than ever. To help you navigate the dozens of models from LG, Samsung, TCL, Hisense, Sony, Panasonic, and others, we've done intensive testing and watched hundreds of hours of content to grab the standouts from our recent reviews. Below you'll find everything from the best OLED TVs we've ever tested to the best cheap TVs for tight budgets—with plenty of excellent optio

These are the best iPad deals right now, just in case iPadOS 26 made you rethink things

A short while ago, I was browsing Apple deals on Amazon (as one does) – and something stuck out to me. High-end iPad Pros, particularly 12.9-inch models, are surprisingly cheap. I saw M1 models with 1TB and cellular for under $700. Given the recent iPadOS 26 overhaul that makes the iPad much more Mac-like, I figured these deals would be worth a share. While renewed iPad deals are the focus here because of their affordability, new iPad deals are also mentioned at the end. Renewed M1 iPad Pro de

Rethinking AI: DeepSeek’s playbook shakes up the high-spend, high-compute paradigm

When DeepSeek released its R1 model this January, it wasn’t just another AI announcement. It was a watershed moment that sent shockwaves through the tech industry, forcing industry leaders to reconsider their fundamental approaches to AI development. What makes DeepSeek’s accomplishment remarkable isn’t that the company developed novel ca

Seven replies to the viral Apple reasoning paper and why they fall short

The Apple paper on limitations in the “reasoning” of Large Reasoning Models, which raised challenges for the latest scaling hypothesis, has clearly touched a nerve. Tons of media outlets covered it; huge numbers of people on social media are discussing. My own post here laying out the Apple paper in historical and scientific context was so popular that well over 150,000 people read it, biggest in this newsletter’s history. The Guardian published an adaptation of my post (“When billion-dollar AI

Beyond GPT architecture: Why Google’s Diffusion approach could reshape LLM deployment

Last month, along with a comprehensive suite of new AI tools and innovations, Google DeepMind unveiled Gemini Diffusion. This experimental research model uses a diffusion-based approach to generate text. Traditionally, large language models (LLMs) like GPT and Gemini itself have relied on autoregression, a step-by-step approach where each

New paper pushes back on Apple’s LLM ‘reasoning collapse’ study

Apple’s recent AI research paper, “The Illusion of Thinking”, has been making waves for its blunt conclusion: even the most advanced Large Reasoning Models (LRMs) collapse on complex tasks. But not everyone agrees with that framing. Today, Alex Lawsen, a researcher at Open Philanthropy, published a detailed rebuttal arguing that many of Apple’s most headline-grabbing findings boil down to experimental design flaws, not fundamental reasoning limits. The paper also credits Anthropic’s Claude Opus

Frontier AI Models Are Getting Stumped by a Simple Children's Game

Earlier this week, researchers at Apple released a damning paper, criticizing the AI industry for vastly overstating the ability of its top AI models to reason or "think." The team found that the models including OpenAI's o3, Anthropic's Claude 3.7, and Google's Gemini were stumped by even the simplest of puzzles. For instance, the "large reasoning models," or LRMs, consistently failed at Tower of Hanoi, a children's puzzle game that involves three pegs and a number of differently-sized disks t
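
For reference, Tower of Hanoi has a short, well-known recursive solution. The sketch below is the textbook algorithm, not the prompts or evaluation harness used in the Apple paper; the peg names `A`, `B`, and `C` and the function name `hanoi` are illustrative assumptions.

```python
def hanoi(n, source="A", target="C", spare="B"):
    """Return the list of (from_peg, to_peg) moves that transfers n disks."""
    if n == 0:
        return []
    # Move n-1 disks out of the way, move the largest disk, then restack.
    return (
        hanoi(n - 1, source, spare, target)
        + [(source, target)]
        + hanoi(n - 1, spare, target, source)
    )


moves = hanoi(3)
print(len(moves))  # 7, which equals 2**3 - 1
```

The move count grows as 2**n - 1, so the length of a complete written solution roughly doubles with each added disk.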

Zero-Shot Forecasting: Our Search for a Time-Series Foundation Model

Introduction In the last few years, the field of time-series forecasting has seen a fundamental shift. Where we once depended solely on classic statistical methods, think ARIMA, SARIMA, and Prophet, new “foundation” models have emerged, promising to bring the power and flexibility of large language models (LLMs) into the world of time-series data. The allure is obvious: can we build a single, reusable forecasting model that works across a variety of datasets and domains, instead of painstakingl

AI flunks logic test: Multiple studies reveal illusion of reasoning

Bottom line: More and more AI companies say their models can reason. Two recent studies say otherwise. When asked to show their logic, most models flub the task – proving they're not reasoning so much as rehashing patterns. The result: confident answers, but not intelligent ones. Apple researchers have uncovered a key weakness in today's most hyped AI systems – they falter at solving puzzles that require step-by-step reasoning. In a new paper, the team tested several leading models on the Tower

Meta Says Its New AI Model Understands Physical Rules Like Gravity

A new generative AI model Meta released this week could change how machines understand the physical world, opening up opportunities for smarter robots and more, the company said. The new open-source model, called Video Joint Embedding Predictive Architecture 2, or V-JEPA 2, is designed to help artificial intelligence understand things like gravity and object permanence, Meta said. "By sharing this work, we aim to give researchers and developers access to the best models and benchmarks to help

Google Has a New AI-Weather Model for Cyclones. Should Experts Trust It?

On Thursday, Google announced a new advancement powered by artificial intelligence that could change the way we predict hurricanes. Weather Lab is an interactive website that shows live and historic AI weather models, including its latest tropical cyclone model, which includes hurricanes. It was developed by Google DeepMind, the company's London-based AI research lab. The cyclone model can predict the formation, track, intensity, size and shape of the storm. And it can create 50 possible scenar

First thoughts on o3 pro

As “leaked”, OpenAI cut o3 pricing by 80% today (from $10/$40 per mtok to $2/$8 - matching GPT 4.1 pricing!!) to set the stage of the launch of o3-pro ($20/$80, supporting an unverified community theory that the -pro variants are 10x base model calls with majority voting as referenced in their papers and in our Chai episode). o3-pro reports a 64% win rate vs o3 on human testers and does marginally better on 4/4 reliability benchmarks, but as sama noticed, the actual experience expands when you t
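
The pricing figures quoted above can be checked with a few lines of arithmetic. This is only a sanity check on the numbers in the post (USD per million input/output tokens); the variable names are illustrative, and the check says nothing about whether the majority-voting theory is correct.

```python
# Prices in USD per million tokens as (input, output), taken from the post above.
o3_old = (10.0, 40.0)
o3_new = (2.0, 8.0)
o3_pro = (20.0, 80.0)


def pct_cut(old, new):
    """Percentage reduction from old price to new price."""
    return 100 * (old - new) / old


cuts = [pct_cut(o, n) for o, n in zip(o3_old, o3_new)]
ratios = [p / n for p, n in zip(o3_pro, o3_new)]

print(cuts)    # [80.0, 80.0]
print(ratios)  # [10.0, 10.0]
```

Both input and output prices fall by 80%, and o3-pro is priced at exactly 10 times the new o3 rate on both sides, which is the ratio the community theory points to.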

Google DeepMind just changed hurricane forecasting forever with new AI model

Google DeepMind announced Thursday what it claims is a major breakthrough in hurricane forecasting, introducing an artificial intelligence system that can predict both the path and intensity of tropical cyclones with unprecedented accuracy — a longstanding challenge that has eluded traditional weather models for decades. The company launc

Google has a new AI model and website for forecasting tropical storms

Google is using a new AI model to forecast tropical cyclones and working with the US National Hurricane Center (NHC) to test it out. Google DeepMind and Google Research launched a new website today called Weather Lab to share AI weather models that Google is developing. It sa

AI chatbots tell users what they want to hear, and that’s problematic

The world’s leading artificial intelligence companies are stepping up efforts to deal with a growing problem of chatbots telling people what they want to hear. OpenAI, Google DeepMind, and Anthropic are all working on reining in sycophantic behavior by their generative AI products that offer over-flattering responses to users. The issue, stemming from how the large language models are trained, has come into focus at a time when more and more people have adopted the chatbots not only at work as

After all the Pixel battery issues, I don’t think I can keep recommending Google’s phones

I love a good Pixel. I’ve owned several, defended them in heated group chats, and even converted a few friends to the cause. But lately? I’m hesitating — hard. The elephant in the room, if you haven’t noticed, is batteries. Not in the “oh, it doesn’t last a full day” sense — we’re talking swelling, melting, and potentially exploding. The kind of hardware horror that once tanked a Samsung generation is now infesting Google’s own backyard. If you’ve missed the s

Multiverse Computing raises $215M for tech that could radically lower AI costs

Spanish startup Multiverse Computing on Thursday said it has raised an enormous Series B round of €189 million (about $215 million) on the strength of a technology it calls “CompactifAI.” CompactifAI is a quantum-computing inspired compression technology that is capable of reducing the size of LLMs by up to 95% without impacting model performance, the company said. Specifically, Multiverse offers compressed versions of well-known, open-source LLMs – primarily small models – such as Llama 4 Sco

German startup DeepL says latest Nvidia chips let it translate the whole internet in just 18 days

DeepL on Wednesday said it was deploying one of the latest Nvidia systems that would allow the German startup to translate the whole internet in just 18 days. This is sharply down from 194 days previously. DeepL is a startup that has developed its own AI models and competes with Google Translate. Nvidia is meanwhile looking to expand the customer base for its chips — which are designed to power artificial intelligence applications — beyond hyperscalers such as Microsoft and Amazon. It

Lemony is a plug-and-play device for secure on-premise AI

Lemony launched a simple-looking device to deliver on-premise artificial intelligence to redefine how organizations deploy generative AI. Lemony’s secure, hardware-based node offers enterprise-grade ‘AI in a Box,’ empowering companies to run advanced, end-to-end AI workflows privately, instantly, and without cloud dependence. Lemony has secured a $2M seed funding round led by True Ventures. Lemony’s AI nodes are stackable and scalable, creating small, modular AI compute clusters that support s

Microsoft-backed Mistral launches European AI cloud to compete with AWS and Azure

Mistral AI, the French artificial intelligence startup, announced Wednesday a sweeping expansion into AI infrastructure that positions the company as Europe’s answer to American cloud computing giants, while simultaneously unveiling new reasoning models that rival OpenAI’s most advanced systems. The Paris-based company revealed Mistral Co

Apple will let third-party developers use Apple Intelligence models to empower their apps

Apple today announced a new framework, called Foundation Models, that will allow developers to access the same models that power on-device Apple Intelligence. This will let third-party apps easily offer AI features. Because these models run locally using the Apple Silicon chips on your iPhone or iPad, they can be made available wholly offline, and with no associated cloud API costs. This is the first time Apple is giving developers direct access to the power of these on-device foundation model