Latest Tech News

Stay updated with the latest in technology, AI, cybersecurity, and more

Filtered by: gpu

AI startup Nscale came out of nowhere and is blowing away Nvidia CEO Jensen Huang

Two years ago, Nscale was a brand-new startup in the U.K. that had yet to raise any outside funding or officially announce its existence. Last year the London-based company came out of stealth, and in December it announced a Series A round totaling $155 million. Now, Nscale finds itself at the center of the action in the hottest

The Asus gaming laptop ACPI firmware bug

The ASUS Gaming Laptop ACPI Firmware Bug: A Deep Technical Investigation If You're Here, You Know The Pain You own a high-end ASUS ROG laptop, perhaps a Strix, Scar, or Zephyrus. Its specifications are impressive: an RTX 30/40 series GPU, a top-tier Intel processor, and plenty of RAM. Yet it stutters during basic tasks like watching a YouTube video, audio crackles and pops on Discord calls, and the mouse cursor freezes for a split second, just long enough to be infuriating. You've likely tried a

The Evolution of Shaders

Nvidia launched the first GPU in 1999 with the GeForce 256, pioneering hardware T&L. In 2001, the GeForce 3 introduced programmable shaders, marking the shader era. Over 24 years, GPUs advanced massively, from 57 million transistors in NV20 to 92 billion in Blackwell (B100). Shader counts exploded—from 16 in 2007 to over 21,000 in 2025. Unified shaders appeared in 2007, and AI-focused Tensor Cores began in 2017. Despite huge performance gains, GPU prices rose modestly: a high-end card cost $800

The race to build a distributed GPU runtime

For a decade, GPUs have delivered breathtaking data processing speedups. However, data is growing far beyond the capacity of a single GPU server. When your work drifts beyond GPU-local memory, or VRAM (e.g., HBM and GDDR), the hidden costs of inefficiency show up: spilling to host, shuffling over networks, and idling accelerators. Before jumping straight into the latest distributed computing effort underway at NVIDIA and AMD, let's quickly level-set on what distributed computing is, how it works, a

Intel Arc Pro B50 GPU Launched at $349 for Compact Workstations

Intel has officially expanded its professional GPU portfolio with the launch of the Arc Pro B50, designed specifically for small-form-factor workstations. The card is based on the Battlemage BMG-G21 GPU, configured with 16 Xe2 cores. It comes paired with 16 GB of GDDR6 VRAM clocked at 14 Gbps on a 128-bit memory bus, producing 224 GB/s of effective bandwidth. This configuration ensures that the GPU cores are properly fed while maintaining a low overall power draw. Intel has kept the total board
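The quoted bandwidth follows directly from the memory specs in the article; a one-line sanity check:

```python
# Effective bandwidth = per-pin data rate (Gbps) * bus width (bits) / 8 bits per byte.
# Both input figures are taken from the article's Arc Pro B50 specs.
gbps_per_pin = 14
bus_bits = 128
bandwidth_gb_s = gbps_per_pin * bus_bits / 8
print(bandwidth_gb_s)  # 224.0, matching the stated 224 GB/s
```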

How to Spot (and Fix) 5 Common Performance Bottlenecks in Pandas Workflows

Slow data loads, memory-intensive joins, and long-running operations—these are problems every Python practitioner has faced. They waste valuable time and make iterating on your ideas harder than it should be. This post walks through five common pandas bottlenecks, how to recognize them, and some workarounds you can try on CPU with a few tweaks to your code—plus a GPU-powered drop-in accelerator, cudf.pandas, that delivers order-of-magnitude speedups with no code changes. Don’t have a GPU on yo

Topics: cudf df gpu memory pandas
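The post's own examples aren't reproduced in this excerpt, but one classic pandas bottleneck is easy to sketch. Here is a minimal, self-contained illustration (the column names and sizes are invented) of converting a low-cardinality string column to categorical dtype, which shrinks memory and speeds grouped aggregation on that column:

```python
import pandas as pd

# Toy frame with a low-cardinality string column -- the kind of column
# that inflates memory and slows joins/groupbys when left as object dtype.
df = pd.DataFrame({
    "city": ["london", "paris", "tokyo"] * 100_000,
    "sales": range(300_000),
})
obj_bytes = df["city"].memory_usage(deep=True)

# Categorical dtype stores each unique string once plus small integer
# codes, so memory drops sharply and grouped operations get faster.
df["city"] = df["city"].astype("category")
cat_bytes = df["city"].memory_usage(deep=True)

print(f"object: {obj_bytes:,} bytes -> category: {cat_bytes:,} bytes")
totals = df.groupby("city", observed=True)["sales"].sum()  # one row per city
```

For the GPU path the post describes, cuDF's pandas accelerator is documented to load as a drop-in (for example, `%load_ext cudf.pandas` in a notebook) without changing pandas code like the above.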

Rasterizer: A GPU-accelerated 2D vector graphics engine in ~4k LOC

Rasterizer Inspired by my love of Adobe Flash, I started to work on a GPU-accelerated 2D vector graphics engine for the original iPhone, and then the Mac. Three iterations and many years later, I have finally released Rasterizer. It is up to 60x faster than the CPU, making it ideal for vector-animated UI (press the T key in the demo app to see an example). The 10-year gestation was the result of endlessly iterating over the core problem of efficiently turning vector paths into reference-qual

Topics: app gpu objects path use

What Does will-change In CSS Do?

What Does will-change In CSS Actually Do? I've been using the will-change CSS property for a while now, but I realized I never understood exactly what it does. I knew it was some sort of performance optimization, but that's pretty much it. What is will-change? It's a hint to the browser, something along the lines of "hey, I'm about to animate these properties, please get ready." Browsers may respond by promoting the element to its own GPU compositing layer, pre-allocating memory,

Powerful GPUs or Fast Interconnects: Analyzing Relational Workloads

Authors: Marko Kabić, Bowen Wu, Jonas Dann, Gustavo Alonso Abstract In this study we explore the impact of different combinations of GPU models (RTX 3090, A100, H100, and Grace Hopper GH200) and interconnects (PCIe 3.0, PCIe 4.0, PCIe 5.0, and NVLink 4.0) on various relational data analytics workloads (TPC-H, H2O-G, ClickBench). We present MaxBench, a comprehensive framework designed for benchmarking, profiling, and modeling these workloads on GPUs. Beyond delivering detailed performance metrics,

GPU Prefix Sums: A nearly complete collection

GPU Prefix Sums GPUPrefixSums aims to bring state-of-the-art GPU prefix sum techniques from CUDA and make them available in portable compute shaders. In addition to this, it contributes "Decoupled Fallback," a novel fallback technique for Chained Scan with Decoupled Lookback that should allow devices without forward thread progress guarantees to perform the scan without crashing. The D3D12 implementation includes an extensive survey of GPU prefix sums, ranging from the warp to the device level;
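For readers new to the primitive: an inclusive prefix sum turns [a0, a1, a2, ...] into running totals [a0, a0+a1, a0+a1+a2, ...]. Below is a small CPU sketch in Python (a reference implementation only, not the repository's shader code) showing both a sequential scan and the Hillis-Steele step pattern that many GPU scans build on:

```python
def inclusive_scan(xs):
    """Sequential reference: out[i] = xs[0] + ... + xs[i]."""
    out, total = [], 0
    for x in xs:
        total += x
        out.append(total)
    return out

def hillis_steele_scan(xs):
    """Step pattern used by many GPU scans: log2(n) passes, each adding
    the element `offset` positions to the left. On a GPU the additions
    within one pass run in parallel; here we just loop over them."""
    out = list(xs)
    offset = 1
    while offset < len(out):
        out = [out[i] + (out[i - offset] if i >= offset else 0)
               for i in range(len(out))]
        offset *= 2
    return out

data = [3, 1, 7, 0, 4, 1, 6, 3]
print(inclusive_scan(data))      # [3, 4, 11, 11, 15, 16, 22, 25]
print(hillis_steele_scan(data))  # same result
```

The chained-scan technique the repo implements layers inter-workgroup coordination (the "decoupled lookback") on top of per-tile scans like this one.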

Building A16Z's Personal AI Workstation

In the era of foundation models, multimodal AI, LLMs, and ever-larger datasets, access to raw compute is still one of the biggest bottlenecks for researchers, founders, developers, and engineers. While the cloud offers scalability, building a personal AI Workstation delivers complete control over your environment, latency reduction, custom configurations and setups, and the privacy of running all workloads locally. This post covers our version of a four-GPU workstation powered by the new NVIDIA

Google’s silence on the Pixel 10 GPU is concerning for gamers

Google has just launched the Pixel 10 phones, and there’s plenty to know about them. There are some great new AI features, a groundbreaking IP68 rating for the Pixel 10 Pro Fold, and a telephoto camera for the base model. All four phones are also powered by the Tensor G5 processor. Unfortunately, Google remains deeply silent on a major Tensor G5 feature. This makes me worried about the Pixel 10 phones, especially if you’re a gamer. Pixel 10 GPU: When no news is bad news Google has said almost

Topics: 10 games google gpu pixel

How to Think About GPUs

We love TPUs at Google, but GPUs are great too. This chapter takes a deep dive into the world of NVIDIA GPUs – how each chip works, how they’re networked together, and what that means for LLMs, especially compared to TPUs. This section builds on Chapter 2 and Chapter 5, so you are encouraged to read them first. What Is a GPU? A modern ML GPU (e.g. H100, B200) is basically a bunch of compute cores that specialize in matrix multiplication (called Streaming Multiprocessors or SMs) connected to a

Unlocking Real-Time Supply Chain Analytics with GPU Technology: Q&A with Meher Siddhartha Errabolu

As supply chains generate ever-larger datasets and demand faster decisions, traditional central processing unit (CPU)-based systems are approaching their limits. To meet real-time requirements at scale, developers turn to accelerated computing powered by graphics processing units (GPUs). These massive parallel processors reshape how data is accessed, analyzed, and operationalized across the enterprise supply chain. One expert at the forefront of this transformation is Meher Siddhartha Errabolu.

We Hit 100% GPU Utilization–and Then Made It 3× Faster by Not Using It

We recently used Qwen3-Embedding-0.6B to embed millions of text documents while sustaining near-100% GPU utilization the whole way. That’s usually the gold standard that machine learning engineers aim for… but here’s the twist: in the time it took to write this blog post, we found a way to make the same workload 3× faster, and it didn’t involve maxing out GPU utilization at all. That story’s for another post, but first, here’s the recipe that got us to near-100%. The workload Here at the Daft

I converted this Windows 11 mini PC into a Linux work station - and didn't regret it

Geekom IT15 Mini PC ZDNET's key takeaways The Geekom IT15 Mini PC is available on Amazon for $1,100. This tiny form-factor PC has plenty of power to spare for everyday tasks. The only downside is that the IT15 doesn't have a dedicated GPU. I've always enjoyed a mini PC, and any time I can cobble together a system with a tiny form factor, I feel like a kid at Christmas. The latest mini to grace my desktop was the Geekom IT15 Mini PC, a truly diminutive machine with a healt

Topics: gpu intel linux mini pc

GPT-OSS-120B runs on just 8GB VRAM & 64GB+ system RAM

Here is the thing: the expert layers run amazingly well on CPU (~17 to 25 T/s on a 14900K), and you can force that with the new llama.cpp option --cpu-moe. You can offload just the attention layers to GPU (requiring about 5 to 8 GB of VRAM) for fast prefill. What stays on the GPU: the KV cache for the sequence, attention weights and activations, routing tables, and LayerNorms and other "non-expert" parameters. No giant MLP weights are resident on the GPU, so memory use stays low. This yields an amazingly snappy system for a 120B mod

Topics: gpu layers moe ms tokens
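The memory argument above can be made concrete with a back-of-envelope sketch. All tensor sizes below are invented for illustration (the real split is what llama.cpp's --cpu-moe flag arranges); the point is only that in a MoE model the expert MLP weights dwarf the non-expert tensors, so evicting them from VRAM leaves a small GPU footprint:

```python
# Illustrative MoE memory split with made-up sizes. Expert MLPs dominate
# total parameter memory, but only a few experts fire per token, so
# keeping them in system RAM (as --cpu-moe does) costs little speed.
GB = 1024**3

tensors = {
    "attention_weights": 4.0 * GB,  # GPU: needed for every token
    "kv_cache":          1.0 * GB,  # GPU: grows with context length
    "router_and_norms":  0.5 * GB,  # GPU: tiny per layer
    "expert_mlps":      55.0 * GB,  # system RAM: the bulk of the model
}

gpu_resident = {k: v for k, v in tensors.items() if k != "expert_mlps"}
vram_gb = sum(gpu_resident.values()) / GB
print(f"GPU-resident: {vram_gb:.1f} GB")   # within the 5-8 GB range cited
print(f"CPU-resident: {tensors['expert_mlps'] / GB:.0f} GB")
```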

Writing a Rust GPU kernel driver: a brief introduction on how GPU drivers work

This post is the second iteration of a series of posts that provide an in-depth look at the development of Tyr, a state-of-the-art Rust GPU driver for the Linux Kernel, supporting Arm Mali CSF-based GPUs. As promised in the first iteration, we will now explore how GPU drivers work in more detail by exploring an application known as VkCube . As the program name implies, this application uses the Vulkan API to render a rotating cube on the screen. Its simplicity makes it a prime candidate to be u

AMD reveals vanilla RX 9060 with cut-down specs, for pre-built PCs only

What just happened? A non-XT version of the Radeon RX 9060 will likely be the last member of AMD's desktop RX 9000 lineup. Positioning a new product just below the $300 8GB 9060 XT might have allowed Team Red to undercut Nvidia's RTX 5060, but AMD has instead made its new entry-level card exclusive to system integrators. Tom's Hardware recently acquired an AMD press release formally announcing the standard variant of the Radeon RX 9060. While some specifications remain unclear, the upcoming GPU

Topics: 9060 amd gpu rx xt

Oxmiq Labs Inc.: Re-Architecting the GPU Stack: From Atoms to Agents

Campbell, CA. Oxmiq Labs Inc., the all-new GPU software and IP startup founded by one of the world’s top GPU architects and visionaries, Raja Koduri, emerges from stealth after two years of intensive IP development. Raja has assembled a world-class team of GPU and AI architects with over 500 years of combined experience, hundreds of patents, and a collective track record of generating more than $100B in revenue at prior companies. The Opportunity: Modern computing has fundamentally shifte

July Steam Survey: RTX 5000 surge, new top GPU, 4 in 10 participants using AMD CPUs

What just happened? Here's a clear indication that the supply and pricing problems which have plagued Nvidia's RTX 5000 series are easing: the cards experienced a large uptick in user share in the latest Steam survey. However, there's still no sign of AMD's 9000-series in the main GPU chart, where the RX 7600 XT has only just appeared. Elsewhere, we've got a new most-popular card among participants, and AMD processors have passed a milestone. Starting with the main GPU chart, July's results sho

Topics: amd chart gpu main rtx

AMD signals push for discrete NPUs to rival GPUs in AI-powered PCs

Forward-looking: As AI workloads reshape computing, AMD is exploring a dedicated neural processing unit to complement or replace GPUs in AI PCs. This move reflects growing industry momentum toward specialized accelerators that promise faster performance and greater energy efficiency – key factors as PC makers race to deliver smarter, leaner machines. AMD is exploring whether PCs could benefit from

Topics: ai amd discrete gpus npu

Startup and Nobel laureate collaborate to create GPU financial exchange

What just happened? A new financial marketplace aims to offer crucial risk management tools to a resource at the center of the tech industry's explosive growth. If successful, the initiative could make access to high-performance compute more predictable and affordable. The world of artificial intelligence is built on computing power, and at the heart of that engine are graphics processing units. These chips are in such high demand that they have often been compared to oil during the gold rush,

MSI expects to top 10 million motherboard sales for the first time as market rebounds

Bottom line: MSI is staging a decisive comeback in the fiercely competitive motherboard and GPU markets, navigating supply chain challenges and shifting industry demands to reclaim its place alongside top rivals. This rebound signals broader shifts in tech manufacturing and consumer appetite amid rapid innovation. DigiTimes reports MSI is poised for a milestone year, with analysts projecting global motherboard shipments will top 10 million units in 2025. The surge marks a sharp recovery for the

Topics: 2025 ai digitimes gpu msi

Ryzen 7 9800X3D vs. Ryzen 5 7600X: CPU and GPU Scaling Benchmark

Time for a new benchmark series. CPU and GPU scaling tests have been high on the community's wishlist for a while now – and it's time to deliver. Several comparisons are already in the pipeline, including fan favorites like AMD's 5800X3D and Intel's discounted Core Ultra 7 265K, so expect those soon. To kick things off and establish a baseline, we're starting with the Ryzen 7 9800X3D versus the ever-popular Ryzen 5 7600X. It's a clash between a gaming powerhouse and a budget-friendly performanc