GoKawiil - Latest Tech News & Aggregated Headlines

Deploying DeepSeek on 96 H100 GPUs

news.ycombinator.com Unknown 2025-10-26 02:07:28

By the SGLang Team, May 05, 2025. DeepSeek is a popular open-source large language model (LLM) praised for its strong performance. However, its large size and unique architecture, which uses Multi-head Latent Attention (MLA) and Mixture of Experts (MoE), require an advanced system for efficient serving at scale. In this blog, we explain how we match DeepSeek's inference system performance with SGLang. Our implementation, shown in the figure above, runs on 12 nodes in the Atlas Cloud, each equ

Topics: batch decode deepseek memory sglang


A deep dive into Debian 13 /tmp: What's new, and what to do if you don't like it

news.ycombinator.com Unknown 2025-10-26 15:39:44

Debian 13 “Trixie” introduces an important change to /tmp. Traditionally, it’s been just another filesystem, albeit with some special permissions that allow everyone on the system to use it without being able to remove each other’s files. In Trixie, it’s been moved off the disk into memory – specifically a type of memory called tmpfs. To quote the tmpfs man page: The tmpfs facility allows the creation of filesystems whose contents reside in virtual memory. Since the files on such filesystems

Topics: files memory mount tmp tmpfs


A Deep Dive into Debian 13 /tmp: What's New, and What to Do If You Don't Like It

news.ycombinator.com Unknown 2025-10-26 20:39:44

Debian 13 “Trixie” introduces an important change to /tmp. Traditionally, it’s been just another filesystem, albeit with some special permissions that allow everyone on the system to use it without being able to remove each other’s files. In Trixie, it’s been moved off the disk into memory – specifically a type of memory called tmpfs. To quote the tmpfs man page: The tmpfs facility allows the creation of filesystems whose contents reside in virtual memory. Since the files on such filesystems

Topics: files memory mount tmp tmpfs


Don't Want Gemini to Learn About You? How to Turn That New Feature Off

cnet.com Unknown 2025-10-28 14:52:02

The more you chat with Google's Gemini, the better it will get to know you thanks to a new learning feature in the generative AI chatbot. Gemini has already been able to recall past conversations if you ask it to, but this new functionality will allow it to learn your preferences and interact with you in more personalized ways, Google said. But if you don't want an AI to learn about you, you can turn it off. AI chatbots have seen their memories grow longer this year. Other tools, like OpenAI'

Topics: ai feature gemini memory turn


The Pixel 10 comes with 12GB of RAM, but Google has locked some of it away

androidauthority.com Unknown 2025-10-30 06:15:09

Robert Triggs / Android Authority The Google Pixel 10 series has plenty of new features for both fans and newcomers. As is Google’s current direction of travel, many of these focus on new or improved AI tools. However, leaning heavily on AI to power smartphones requires a few changes to how we usually think about running apps. Typically, when we open an app, we expect it to appear within a second or so, with the phone quickly fetching it from storage and loading it into RAM. Larger application

Topics: ai google memory pixel ram


How much RAM do you actually need in 2025? I broke it down for Windows and Mac users

zdnet.com Cesar Cadenas 2025-10-30 06:53:00

Kerry Wan/ZDNET. ZDNET's key takeaways: RAM is an important hardware resource that allows a computer to perform optimally and at fast speeds. Escalating computing demands have made 16GB of RAM the new standard for PCs and laptops if they are to continue performing at their best. Even then, 16GB of memory may not be enough for certain users, so it's important to know when to upgrade. I used to struggle when shopping for a new computer. Ove

Topics: 16gb computer laptop memory ram


How procedural memory can cut the cost and complexity of AI agents

venturebeat.com Ben Dickson 2025-10-31 09:37:23

A new technique from Zhejiang University and Alibaba Group gives large language model (LLM) agents a dynamic memory, making them more efficient and effective at complex tasks. The technique, called Memp, provides agents with a “procedural memory” that is continuously updated as they gain experience, much like how humans learn from practice.

Topics: agent agents memory memp procedural


Samsung announces the Tab S10 Lite, a $349 tablet with an S Pen

engadget.com Unknown 2025-10-31 13:58:23

This week, Samsung introduced a new addition to its tablet lineup with the Tab S10 Lite. It will be available on September 4 and will cost $349. The Lite will be the least expensive of Samsung's current tablet generation; the S10 FE has a starting cost of $500, while costs go as high as $980 for the S10 Ultra. The Tab S10 Lite is 10.9 inches, and it comes in gray, silver or a coral red. Its screen has a 90Hz refresh rate and a maximum brightness of 600 nits. Models can have 6GB memory with 128G

Topics: lite memory s10 tab tablet


Framework Laptop 16

news.ycombinator.com Unknown 2025-10-31 16:12:30

NVIDIA® GeForce RTX™ 5070 Laptop GPU: 798 AI TOPS; up to 100W TGP (on AC); up to 50W TGP (on battery); 8GB GDDR7 memory; 128-bit memory bus; 384GB/s memory bandwidth; 2.0GHz base clock and up to 2.4GHz boost; USB-C port with DP Alt Mode and charging; 4,608 CUDA cores; DLSS 4; 5th gen tensor cores; 4th gen ray tracing; 1x 9th gen NVIDIA encoder; 1x 6th gen NVIDIA decoder.
Radeon™ RX 7700S (2nd Gen): 32 compute units; up to 100W TGP (on AC); 8GB 18Gbps GDDR6 memory; USB-C port with DP Alt Mode

Topics: 100w gen memory nvidia tgp


OOMProf: Profiling on the Brink

news.ycombinator.com Unknown 2025-10-30 07:00:55

It was just a little while past the Sunset Strip / They found the girl's body in an open pit / Her mouth was sewn shut, but her eyes were still wide / Gazing through the fog to the other side ("Black River Killer" by Blitzen Trapper) Introduction: This one's personal! For 15 years working on DBMS systems the OOM killer has led to more than its fair share of debugging rabbit holes. Anyone who's been around the block in Linux systems programming has probably crossed paths with the Linux OOM killer. This

Topics: 753999 memory oom profile program


Framework Laptop 16. Upgraded!

news.ycombinator.com Unknown 2025-11-01 17:12:30

NVIDIA® GeForce RTX™ 5070 Laptop GPU: 798 AI TOPS; up to 100W TGP (on AC); up to 50W TGP (on battery); 8GB GDDR7 memory; 128-bit memory bus; 384GB/s memory bandwidth; 2.0GHz base clock and up to 2.4GHz boost; USB-C port with DP Alt Mode and charging; 4,608 CUDA cores; DLSS 4; 5th gen tensor cores; 4th gen ray tracing; 1x 9th gen NVIDIA encoder; 1x 6th gen NVIDIA decoder.
Radeon™ RX 7700S (2nd Gen): 32 compute units; up to 100W TGP (on AC); 8GB 18Gbps GDDR6 memory; USB-C port with DP Alt Mode

Topics: 100w gen memory nvidia tgp


Memory optimizations to reduce CPU costs

news.ycombinator.com Unknown 2025-11-01 18:42:01

Imagine that you are given the following task, with a file like this:

Name,Department,Salary,JoinDate
John Smith,Marketing,75000,2023-01-15
Alice Johnson,Finance,82000,2022-06-22
Bob Lee,Sales,68000,2024-03-10
Emma Davis,HR,71000,2021-09-01

You want to turn that into a single list of all the terms in the (potentially very large) file. In other words, you want to turn it into something like this: [ { "term": "Name", "position": 0, "length": 4 }, { "term": "Department", "position": 5

Topics: array gc memory public string
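
The task in the excerpt above can be sketched in a few lines. This is only an illustration, not the article's own implementation: the function name `extract_terms` and the two-line `sample` string are hypothetical, terms are assumed to be separated only by commas and line breaks (no quoted CSV fields), and positions are character offsets rather than byte offsets.

```python
import re


def extract_terms(text):
    """Return one record per term, with its character offset and length.

    Assumption: a term is any run of characters that contains no comma
    and no line break. Quoted CSV fields are not handled.
    """
    return [
        {"term": m.group(), "position": m.start(), "length": len(m.group())}
        for m in re.finditer(r"[^,\r\n]+", text)
    ]


# Hypothetical sample: the header and first data row from the excerpt.
sample = "Name,Department,Salary,JoinDate\nJohn Smith,Marketing,75000,2023-01-15\n"
terms = extract_terms(sample)
print(terms[:2])
```

Building one dictionary per term keeps the sketch short, but for a very large file it allocates a new string and a new object for every term, which is the kind of memory cost the article's title refers to.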


IBM's Power11 Processor Architecture

news.ycombinator.com Ryan Smith 2025-11-03 06:43:13

Third up on today’s CPU track is IBM. Big Blue is at the conference to talk about its latest generation Power architecture chip, the Power11. IBM starts off by recapping Power. Why it exists, and what IBM’s goals are for the processor and architecture. IBM is very system-focused, rather than focusing on selling just CPUs. 1P and 2P systems, all the way up to 16P “glueless” systems. Recapping the Power release history, Power10 has proven to be very successful for IBM, “beyond our wildest dreams

Topics: ibm memory power power10 power11


IBM's Power11 Processor Architecture at Hot Chips 2025

news.ycombinator.com Ryan Smith 2025-11-03 18:43:13

Third up on today’s CPU track is IBM. Big Blue is at the conference to talk about its latest generation Power architecture chip, the Power11. IBM starts off by recapping Power. Why it exists, and what IBM’s goals are for the processor and architecture. IBM is very system-focused, rather than focusing on selling just CPUs. 1P and 2P systems, all the way up to 16P “glueless” systems. Recapping the Power release history, Power10 has proven to be very successful for IBM, “beyond our wildest dreams

Topics: ibm memory power power10 power11


The SD Association has an official SD card format utility [Win/OS X/Linux]

news.ycombinator.com Unknown 2025-11-04 03:55:56

SD Memory Card Formatter for Linux ver.1.0.3 for SD/SDHC/SDXC/SDUC The SD Memory Card Formatter formats SD Memory Card, SDHC Memory Card, SDXC Memory Card and SDUC Memory Card (respectively SD/SDHC/SDXC/SDUC Cards) complying with the SD File System Specification created by the SD Association (SDA). It is strongly recommended to use the SD Memory Card Formatter to format SD/SDHC/SDXC/SDUC Cards rather than using formatting tools provided with individual operating systems. In general, formatting

Topics: card memory sd sdhc sdxc


In-Memory Filesystems in Rust

news.ycombinator.com André Arko 2025-11-04 06:33:35

I’ve been working on a CLI tool recently, and one of the things it does is manage files on disk. I have written a lot of file management tests for Bundler, and the two biggest reasons that the Bundler test suite is slow are exec and fstat. Knowing that, I thought I would try to get out ahead of the slow file stat problem by using an in-memory filesystem for testing. A collaborator mentioned being happy with the Go package named Afero for this purpose, and so I se

Topics: filesystem fs memory std vfs


AGI is an engineering problem, not a model training problem

news.ycombinator.com Vinci Rufus 2025-11-04 23:18:52

Published: Aug 13, 2025 | at 11:00 AM We’ve reached an inflection point in AI development. The scaling laws that once promised ever-more-capable models are showing diminishing returns. GPT-5, Claude, and Gemini represent remarkable achievements, but they’re hitting asymptotes that brute-force scaling can’t solve. The path to artificial general intelligence isn’t through training ever-larger language models—it’s through building engineered systems that combine models, memory, context, and determ

Topics: memory model models probabilistic systems


Writing Speed-of-Light Flash Attention for 5090 in CUDA C++

news.ycombinator.com Thien Tran 2025-11-04 21:29:02

In this post, I will walk through how I learned to implement Flash Attention for 5090 in CUDA C++. The main objective is to learn writing attention in CUDA C++, since many features are not available in Triton, such as MXFP8 / NVFP4 MMA for sm120. I also feel this is a natural next step after learning about matmul kernels. Lastly, there are many excellent blog posts on writing fast matmul kernels, but there are none for attention. So I want to take this chance to write up something nicely. Readers

Topics: dim int memory row shared


Io_uring, kTLS and Rust for zero syscall HTTPS server

news.ycombinator.com Unknown 2025-11-07 17:51:44

This is my personal blog. The views expressed on these pages are mine alone and not those of my employer. Around the turn of the century we started to get a bigger need for high capacity web servers. For example, there was the C10k problem paper. At the time, one of the things done to reduce the work per request was pre-forking the web server. This means a request could be handled without an expensive process creation. Because yes, creating a new process for every request used to be somethi

Topics: kernel memory queue server web


Show HN: I replaced vector databases with Git for AI memory (PoC)

news.ycombinator.com Unknown 2025-11-10 16:20:11

DiffMem: Git-Based Differential Memory for AI Agents DiffMem is a lightweight, git-based memory backend designed for AI agents and conversational systems. It uses Markdown files for human-readable storage, Git for tracking temporal evolution through differentials, and an in-memory BM25 index for fast, explainable retrieval. This project is a proof-of-concept (PoC) exploring how version control systems can serve as a foundation for efficient, scalable memory in AI applications. At its core, Dif

Topics: agents ai diffmem git memory


Parallel Reduce and Scan on the GPU

news.ycombinator.com Maximilian Maldacker 2025-11-06 12:21:37

Introduction: GPUs are formidable parallel machines, capable of running thousands of threads simultaneously. They are excellent for embarrassingly parallel algorithms, but these algorithms are quite different from the ones on the CPU due to the way GPUs work. You can’t just build and run an application. You need to interact with the GPU driver via one of several APIs available (CUDA, OpenCL, Vulkan, DirectX, OpenGL, etc.), manage the device memory, organize the transfers between

Topics: memory reduce scan subgroup sum


Learning about GPUs through measuring memory bandwidth

news.ycombinator.com Unknown 2025-11-11 09:35:18

At Traverse Research, we need to have a deep understanding of GPU performance to develop our benchmark, Evolve. Additionally, we sometimes do projects for very specific hardware where we need to know all the ins and outs of this hardware. One way we do this is by using microbenchmarks to measure specific parts of the GPU to get new insights. In this article, we will share what we learned from measuring the memory bandwidth of various GPUs.

Topics: byte cache data hardware memory


Do Large Language Models Dream of AI Agents?

wired.com Will Knight 2025-11-11 20:00:00

During sleep, the human brain sorts through different memories, consolidating important ones while discarding those that don’t matter. What if AI could do the same? Bilt, a company that offers local shopping and restaurant deals to renters, recently deployed several million agents with the hopes of doing just that. Bilt uses technology from a startup called Letta that allows agents to learn from previous conversations and share memories with one another. Using a process called “sleeptime compu

Topics: agents ai context information memory


EloqKV, a distributed database with Redis compatible API (GPLv2 and AGPLv3)

news.ycombinator.com Unknown 2025-11-14 02:35:02

EloqKV EloqKV is a high-performance distributed database with a Redis/ValKey compatible API. It offers features like ACID transactions, full elasticity and scalability, tiered storage, and session-style transaction syntax — all while preserving Redis' simplicity and usability. EloqKV is engineered for developers who need a modern no-compromise database solution to power the next generation of demanding applications in the AI era. Why Choose EloqKV Over Redis? Feature Redis EloqKV High Perform

Topics: data eloqkv memory redis transactions


Intel 80286 emulator for Raspberry Pico

news.ycombinator.com Unknown 2025-11-15 15:53:23

🕹️ Pico-286 Project The Pico-286 project is an endeavor to emulate a classic PC system, reminiscent of late 80s and early 90s computers, on the Raspberry Pi Pico (RP2040/RP2350 microcontroller). It aims to provide a lightweight and educational platform for experiencing retro computing and understanding low-level system emulation. 🖥️✨ ⭐ Key Features 🧠 8086/8088/80186/286 CPU Emulation: At its core, the project emulates an Intel CPU up to the 286 family.

Topics: audio drive memory mode pico


How much RAM does your PC really need in 2025? I did the math for Windows and Mac users

zdnet.com Cesar Cadenas 2025-11-18 00:41:00

Kerry Wan/ZDNET. I used to struggle when shopping for a new computer. Over time, I learned to narrow things down to what I call the "performance trifecta" -- three main components you should be mindful of when buying a laptop or desktop: processor, storage drive, and RAM. The first two are pretty easy to figure out. A good processor ensures that a computer performs well, and a lot of loca

Topics: computer just laptop memory ram


Unlocking Real-Time Supply Chain Analytics with GPU Technology: Q&A with Meher Siddhartha Errabolu

computer.org Laurel Tweed 2025-11-18 04:00:44

As supply chains generate ever-larger datasets and demand faster decisions, traditional central processing unit (CPU)-based systems are approaching their limits. To meet real-time requirements at scale, developers turn to accelerated computing powered by graphics processing units (GPUs). These massive parallel processors reshape how data is accessed, analyzed, and operationalized across the enterprise supply chain. One expert at the forefront of this transformation is Meher Siddhartha Errabolu.

Topics: data gpu memory supply time


Nvidia Tilus: A Tile-Level GPU Kernel Programming Language

news.ycombinator.com Unknown 2025-11-15 03:36:46

Tilus is a powerful domain-specific language (DSL) for GPU programming that offers: thread-block-level granularity with tensors as the primary data type; explicit control over shared memory and register tensors (unlike Triton); and low-precision types with arbitrary bit-widths (1 to 8 bits). It also includes automatic tuning, caching,

Topics: arbitrary memory programming tensors tilus


The 7 Best Mattress Toppers (2025) Out of Dozens We've Tested: Supportive, Plush, Memory Foam

wired.com Nena Farrell 2025-11-18 14:31:00

Honorable Mentions Not everything we test makes the cut as a pick, but that doesn't mean it's a bad mattress topper. Here are a few that our testers slept on and still got a good night's sleep with, but didn't love as much as the picks above. Avocado Alpaca Topper for $809: If you're looking for a mattress topper that's extra soft, WIRED reviewer Scott Gilbertson recommends the Avocado Alpaca Mattress Topper. He says it's one of the softest things he's ever slept on, and that it's like sleepin

Topics: foam mattress memory soft topper


9 Best Pillows (2025) Tested For Side, Back, and Stomach Sleepers

wired.com Nena Farrell 2025-11-18 18:36:00

Compare the Top 5 Pillows
Pillow | Fill Material | Shell Material | Adjustable? | Cooling? | Sizes Available
Purple Freeform Adjustable Pillow | High-density memory foam and polyester fill blend, plus hyper elastic polymer gel layer | Polyester stretch knit with proprietary cooling fibers | Yes | Yes | Standard and king
Coop Cool+ Adjustable Pillow | Gel-infused memory foam and microfiber | Nylon and polyester shell with a memory foam pad and gel, plus a pillowcase made of polyethylene, polyester, and spandex | Yes | Yes

Topics: foam inches memory pillow pillows



About GoKawiil

Privacy

Advertising
