GoKawiil - Latest Tech News & Aggregated Headlines

Evaluating GPT5's reasoning ability using the Only Connect game show

news.ycombinator.com Alberto Manzi 2025-11-25 05:52:51

Given the proliferation of reasoning models, we wanted to go beyond knowledge-based benchmarks to test reasoning abilities such as pattern recognition, lateral thinking, abstraction, contextual reasoning (accounting for British cultural references), and multi-step inference. In addition to reasoning, we aimed to assess how effectively models make decisions when presented with judgment calls—such as choosing between making an educated guess based on available clues or calling a function to retri

Topics: clues gpt models players reasoning

ChatGPT won’t remove old models without warning after GPT-5 backlash

theverge.com Alex Heath 2025-11-26 09:30:16

After the backlash to replacing its 4o model with GPT-5, OpenAI will no longer get rid of old models without a heads up. “In retrospect, not continuing to offer 4o, at least in the interim, was a miss,” Nick Turley, OpenAI’s head of ChatGPT, said on Tuesday. In an interview with The Verge, he said it was surprising to see the “level of attachment” people had to 4o. “It’s not just change that is difficult fo

Topics: 4o gpt model said users

OpenAI adds new GPT-5 models, restores o3, o4-mini and it's a mess all over again

bleepingcomputer.com Unknown 2025-11-27 02:24:31

One of the few things many disliked about ChatGPT was the confusing number of models. OpenAI claimed GPT-5 would fix this, but it seems to have made it worse. A new update is rolling out to ChatGPT. It doesn't upgrade GPT-5, but instead adds more options that some of you would love. Previously, GPT-5 had two variants - GPT (auto-rotates between reasoning and non-reasoning) and GPT-Thinking (reasoning). Today's update populates the ChatGPT 5 mode

Topics: chatgpt gpt legacy model models

Evaluating LLMs playing text adventures

news.ycombinator.com Unknown 2025-11-26 14:19:35

What we’ll do is set a low-ish turn limit and see how much they manage to accomplish in that time. Another alternative for more linear games is running them multiple times with a turn limit and seeing how often they get past a particular point within that turn limit. Given how much freedom is offered to players of text adventures, this is a difficult test. It’s normal even for a skilled human player to immerse themselves in their surroundings rather than make constant progress. I wouldn’t be su

Topics: achievements limit llm models turn

The equality delete problem in Apache Iceberg

news.ycombinator.com Yingjun Wu 2025-11-26 17:27:50

Since last year, Apache Iceberg has been one of the hottest topics in the data infrastructure world. Databricks recently spent $1 billion to acquire Neon, a startup building a serverless Postgres. Snowflake also spent about $250 million to acquire Crunchy Data, a veteran enterprise-grade Postgres provider. These are not random acquisi

Topics: delete deletes equality files iceberg

ChatGPT’s model picker is back, and it’s complicated

techcrunch.com Maxwell Zeff 2025-11-27 04:25:19

When OpenAI launched GPT-5 last week, the company said the model would simplify the ChatGPT experience. OpenAI hoped GPT-5 would act as a sort of “one size fits all” AI model with a router that would automatically decide how to best answer user questions. The company said this unified approach would eliminate the need for users to navigate its model picker — a long, complicated menu of AI options that OpenAI CEO Sam Altman has publicly said he hates. But it looks like GPT-5 is not the unified A

Topics: ai gpt model openai users

Liquid AI wants to give smartphones small, fast AI that can see with new LFM2-VL model

venturebeat.com Carl Franzen 2025-11-27 10:13:03

Liquid AI has released LFM2-VL, a new generation of vision-language foundation models designed for efficient deployment across a wide range of hardware — from smartphones and laptops to wearables and embedded systems. The models promise low-latency performance, strong accuracy, and flexibility for real-world applications. LFM2-VL builds o

Topics: ai lfm2 liquid models vl

What are Apple’s options for an AI acquisition beyond Perplexity?

9to5mac.com Marcus Mendes 2025-11-27 15:26:30

Since Apple’s latest earnings call, talk of a potential Perplexity acquisition has quieted down (the fact that Perplexity was once again allegedly caught red-handed sidestepping content restrictions didn’t help). Meanwhile, with the ever-increasing number of engineers from its Foundation Models team jumping ship, Apple’s need for fresh talent is getting more urgent by the day. But if Perplexity is a no-go, who else could Apple buy? I used to agree with Jason Snell’s frequent argument on the Up

Topics: ai apple building models perplexity

The Equality Delete Problem in Apache Iceberg

news.ycombinator.com Yingjun Wu 2025-11-27 23:27:50

Since last year, Apache Iceberg has been one of the hottest topics in the data infrastructure world. Databricks recently spent $1 billion to acquire Neon, a startup building a serverless Postgres. Snowflake also spent about $250 million to acquire Crunchy Data, a veteran enterprise-grade Postgres provider. These are not random acquisiti

Topics: delete deletes equality files iceberg

UK government suggests deleting files to save water

theverge.com Justine Calma 2025-11-28 00:11:39

Can deleting old emails and photos help the UK tackle ongoing drought this year? That’s the hope, according to recommendations for the public included in a press release today from the Nat

Topics: data deleting drought old water

OpenAI Scrambles to Update GPT-5 After Users Revolt

wired.com Will Knight 2025-11-27 02:13:08

OpenAI’s GPT-5 model was meant to be a world-changing upgrade to its wildly popular and precocious chatbot. But for some users, last Thursday’s release felt more like a wrenching downgrade, with the new ChatGPT presenting a diluted personality and making surprisingly dumb mistakes. On Friday, OpenAI CEO Sam Altman took to X to say the company would keep the previous model, GPT-4o, running for Plus users. A new feature designed to seamlessly switch between models depending on the complexity of t

Topics: gpt model models openai users

Launch HN: Design Arena (YC S25) – Head-to-head AI benchmark for aesthetics

news.ycombinator.com Unknown 2025-11-28 15:10:03

Hi HN, I’m Grace from Design Arena ( https://www.designarena.ai/ ) - we’re building a crowdsourced benchmark for AI-generated visuals (websites, images, video, and more). We put AI models and builder tools in head-to-head comparisons that get voted on by real users from around the world. Think “Hot or Not” for the AI era :) (Btw, when we say real users we mean real users, so you may get a captcha on the site. Sorry, but we have to use every bot protection available! We only want human ratings,

Topics: ai design like make models

Are Gesture-Enabled AirPod Live Translations Incoming? iOS 26 Beta Suggests Yes

cnet.com Unknown 2025-11-28 14:56:00

Some models of Apple's popular AirPods may soon be able to do live, in-person language translations when you squeeze both stems at the same time. According to an image posted by websites including 9to5Mac, the touch gesture is featured in a system asset that's part of Apple iOS 26 developer beta 6. In the image, the gesture is shown on a pair of AirPods with words in languages including English, Spanish, German, French and Portuguese. In June, Apple showed off AI-powered live translations featu

Topics: airpods apple including live models

Deals: 32GB M4 MacBook Air $200 off, Black/Natural Apple Watch Ultra 2 $150 off, AirPods 4 $99, more

9to5mac.com Justin Kahn 2025-11-28 20:45:00

Today’s 9to5Toys Lunch Break deals are now ready to roll starting with the M4 MacBook Air. Alongside entry-level models from $799, we are also still tracking rare $200 price drops on a 24GB model with 1TB of storage and a heavily upgraded variant with 32GB of RAM today. Moving over to Apple Watch Ultra 2 – we have both the Natural and Black Titanium models at $150 off the list price as well as ongoing deals on AirPods 4, Apple chargers, iPad A16, and more. Scope it all out down below. Rare price d

Topics: amazon apple m4 models price

Nexus: An Open-Source AI Router for Governance, Control and Observability

news.ycombinator.com Unknown 2025-11-28 18:41:12

Today, we're excited to introduce Nexus - a powerful AI router designed to optimize how AI agents interact with multiple MCP tools and Large Language Models. Nexus serves as a central hub that aggregates Model Context Protocol (MCP) servers while providing intelligent LLM routing, security and governance capabilities. Nexus is an AI router that solves two critical challenges in the AI ecosystem: MCP Server Aggregation: Instead of managing connections to multiple MCP servers individually, Nexus

Topics: ai mcp models nexus routing

Evaluating LLMs Playing Text Adventures

news.ycombinator.com Unknown 2025-11-28 20:19:35

What we’ll do is set a low-ish turn limit and see how much they manage to accomplish in that time. Another alternative for more linear games is running them multiple times with a turn limit and seeing how often they get past a particular point within that turn limit. Given how much freedom is offered to players of text adventures, this is a difficult test. It’s normal even for a skilled human player to immerse themselves in their surroundings rather than make constant progress. I wouldn’t be su

Topics: achievements limit llm models turn

This Gemini UI change should’ve been the default from the start (APK teardown)

androidauthority.com Unknown 2025-11-29 21:58:20

TL;DR Google could move Gemini’s AI model switcher to the bottom of your smartphone screen. This would make it easier to switch AI models with one hand compared to the current UI. Google could also bring a UI tweak to Gemini’s video menu. Google’s Gemini chatbot lets you choose between several AI models for your specific needs. You can use the flash models if you value quick, responsive answers, or the Pro models if you need more in-depth answers. Now, it

Topics: gemini google model models ui

The GPT-5 rollout has been a big mess

arstechnica.com Unknown 2025-11-29 11:25:34

It's been less than a week since the launch of OpenAI's new GPT-5 AI model, and the rollout hasn't been a smooth one. So far, the release sparked one of the most intense user revolts in ChatGPT's history, forcing CEO Sam Altman to make an unusual public apology and reverse key decisions. At the heart of the controversy has been OpenAI's decision to automatically remove access to all previous AI models in ChatGPT (approximately nine, depending on how you count them) when GPT-5 rolled out to user

Topics: ai chatgpt gpt models openai

Show HN: Keeps – Mail a postcard that plays your voice

news.ycombinator.com Unknown 2025-11-29 18:45:09

How long does delivery take? After printing has completed, domestic (USA) delivery typically takes 4-6 business days via USPS First Class Mail. International delivery takes 9-13 business days. We provide tracking so you can follow your postcard's journey. How does the voice message work? After creating your postcard, you'll record up to 60 seconds of audio. We generate a unique QR code that's printed on the postcard. Recipients simply scan it with their phone camera

Topics: app days delivery postcard postcards

OpenAI is editing its GPT-5 rollout on the fly — here’s what’s changing in ChatGPT

venturebeat.com Carl Franzen 2025-11-29 17:01:44

OpenAI’s launch of its most advanced AI model GPT-5 last week has been a stress test for the world’s most popular chatbot platform with 700 million weekly active users — and so far, OpenAI is openly struggling to keep users happy and its service running smoothly. The new flagship model GPT-5 — available in four variants of different speed

Topics: ai gpt models openai users

AI summaries can downplay medical issues for female patients, UK research finds

engadget.com Unknown 2025-11-30 15:29:44

The latest example of bias permeating artificial intelligence comes from the medical field. A new study surveyed real case notes from 617 adult social care workers in the UK and found that when large language models summarized the notes, they were more likely to omit language such as "disabled," "unable" or "complex" when the patient was tagged as female, which could lead to women receiving insufficient or inaccurate medical care. Research led by the London School of Economics and Political Sci

Topics: care medical models notes patient

OpenAI is testing 3,000-per-week limit for GPT-5 Thinking

bleepingcomputer.com Unknown 2025-11-30 10:49:32

OpenAI has responded to criticism that it shipped GPT-5 with token limits to minimize cost and maximize profit, not with words but with a new 3,000-per-week limit. In a series of posts on X, Sam Altman confirmed that OpenAI is working on a 3,000-per-week limit for GPT-5 Thinking messages for Plus users. This will increase the reasoning rate limits available today, but OpenAI does not plan to stop at just this. Sam Altman claims that OpenAI will soon raise all model-class rate limits "a

Topics: gpt models openai sam users

Here are all the GPT-5 updates OpenAI has rolled out since launch

zdnet.com Webb Wright 2025-11-30 15:58:00

ZDNET's key takeaways: OpenAI released its long-awaited GPT-5 on Thursday. Some users complained GPT-5 was inferior to its predecessor, 4o. In response, the company announced a flurry of changes. OpenAI released GPT-5, the long-awaited upgrade to the model which powers ChatGPT, Thursday. In typical OpenAI fashion, the release has included plenty of twists, turns, and drama. It was almost inevitable that the new model would disappoint a significant number of pe

Topics: gpt model openai post users

Flying With Delta? Crunchyroll Anime Is Coming to Your Flights

cnet.com Unknown 2025-11-30 14:20:00

Crunchyroll and Delta Air Lines announced a partnership Monday to offer anime as part of the in-flight entertainment on the airline. The anime addition is set to arrive later this year, and Delta passengers will be able to stream content handpicked from Crunchyroll. SkyMiles members will receive an exclusive perk of a 24-hour free trial of the streaming service that can be used aboard the flight or after touching down. Crunchyroll will hit more than 169,000 seatback screens on Delta aircraft. T

Topics: anime content crunchyroll delta flight

Users Were So Addicted to GPT-4o That They Immediately Cajoled OpenAI Into Bringing It Back After It Got Killed

futurism.com Unknown 2025-11-30 22:52:54

Last week, OpenAI startled the world by announcing that its long-awaited GPT-5 would replace all of its previous models. The move sparked outrage. Apart from being severely underwhelmed by the performance of OpenAI's newest offering, power users immediately started to beg CEO Sam Altman to bring back preceding models, often for a reason that had little to do with intelligence, artificial or otherwise: they were attached to it on an emotional level. "Why are we getting rid of the variants and 4

Topics: altman chatgpt models openai users

Token growth indicates future AI spend per dev

news.ycombinator.com Ewa Szyszka 2025-11-30 23:59:42

Kilo just broke through the 1 trillion tokens a month barrier on OpenRouter for the first time. Each of the open source family of AI coding tools (Cline, Roo, Kilo) is growing rapidly this month. Part of this growth is caused by Cursor and Claude starting to throttle their users. We wrote about Cursor at the beginning of July and about Claude in the second half of July. Their throttling sent users to the open source family of AI coding tools causing the increases you see in the graphs above. C

Topics: ai costs inference models token

Profitable Nigerian food delivery Chowdeck lands $9M from Novastar, Y Combinator

techcrunch.com Tage Kene-Okafor 2025-11-30 23:11:55

Chowdeck, a Lagos-based food delivery startup that has stayed profitable in a notoriously tough and low-margin market, has raised $9 million in Series A funding to launch a quick commerce strategy and expand into more cities in Nigeria and Ghana. The equity round was led by Novastar Ventures, with participation from Y Combinator, AAIC Investment, Rebel Fund, GFR Fund, Kaleo, HoaQ, and others. The investors are betting on the team’s ability to pair local market expertise with execution and turn

Topics: aluko chowdeck delivery food local

Deals: Apple Watch Series 10 new low up to $150 off, M4 Pro MacBook Pro $299 off, iPad Air, more

9to5mac.com Justin Kahn 2025-12-01 15:45:11

Today’s 9to5Toys Lunch Break deals are kicking off with the lowest prices we have tracked online for GPS + Cell Apple Watch Series 10 models. Alongside GPS only variants at $100 off, you’ll now find the cell variants at up to $149 off in brand new condition with a full Apple warranty in tow. Those deals also join one of the best prices to date on the M4 Pro MacBook Pro at $299 off the list price, ongoing all-time lows on M3 iPad Air, and more. Everything awaits below. Apple Watch Series 10 Cell

Topics: apple ipad models price pro

xAI is testing Grok 4.20 to take on GPT-5, may launch this month

bleepingcomputer.com Unknown 2025-12-01 15:34:44

Elon Musk-owned xAI is testing Grok 4.20, a new model update to Grok 4, which already competes with GPT-5 in some benchmarks, such as ARC-AGI 2. GPT-5 is one of the best models for coding, and it competes with Claude Opus 4.1 head-to-head in some coding challenges. On the other hand, Grok falls a bit short when it comes to building full-fledged apps. But that might change soon as xAI is testing Grok 4.20. In a post on X, Elon Musk teased Grok 4.20 for a late August launch. Previously, Elon

Topics: 20 elon grok model musk

Apple brings OpenAI's GPT-5 to iOS and macOS

news.ycombinator.com Unknown 2025-12-01 14:51:53

OpenAI's GPT-5 model went live for most ChatGPT users this week, but lots of people use ChatGPT not through OpenAI's interface but through other platforms or tools. One of the largest deployments is iOS, the iPhone operating system, which allows users to make certain queries via GPT-4o. It turns out those users won't have to wait long for the latest model: Apple will switch to GPT-5 in iOS 26, iPadOS 26, and macOS Tahoe 26, according to 9to5Mac. Apple has not officially announced when those OS

Topics: chatgpt gpt ios model users

About GoKawiil

Privacy

Advertising
