Latest Tech News

Stay updated with the latest in technology, AI, cybersecurity, and more

Filtered by: token

PyPI invalidates tokens stolen in GhostAction supply chain attack

The Python Software Foundation team has invalidated all PyPI tokens stolen in the GhostAction supply chain attack in early September, confirming that the threat actors didn't abuse them to publish malware. These tokens are used to publish packages on the Python Package Index (PyPI), a software repository that acts as the default source for Python's package management tools and hosts hundreds of thousands of packages. As PyPI admin Mike Fiedler explained, a GitGuardian employee reported on Sept

One Token to rule them all – Obtaining Global Admin in every Entra ID tenant

While preparing for my Black Hat and DEF CON talks in July of this year, I found the most impactful Entra ID vulnerability that I will probably ever find. This vulnerability could have allowed me to compromise every Entra ID tenant in the world (except probably those in national cloud deployments). If you are an Entra ID admin reading this, yes that means complete access to your tenant. The vulnerability consisted of two components: undocumented impersonation tokens, called “Actor tokens”, that

Tether reveals USAT stablecoin, appoints Bo Hines, former White House advisor, to lead U.S. business

Tether, the issuer of the largest stablecoin, has named a CEO for its U.S. business and is launching a new token for U.S. institutions. The moves underscore Tether's commitment to regulatory engagement and entry into the U.S. The company, once accused of being a criminal's "go-to cryptocurrency," has been rebranding itself as a partner to American lawmakers and law enforcement since pro-crypto President Donald Trump's return to the White House. Bo Hines, who headed the Presidential Council of A

Hackers steal 3,325 secrets in GhostAction GitHub supply chain attack

A new supply chain attack on GitHub, dubbed 'GhostAction,' has compromised 3,325 secrets, including PyPI, npm, DockerHub, GitHub tokens, Cloudflare, and AWS keys. The attack was discovered by GitGuardian researchers, who report that the first signs of compromise on one of the impacted projects, FastUUID, became evident on September 2, 2025. The attack involved leveraging compromised maintainer accounts to perform commits that added a malicious GitHub Actions workflow file that triggers automat

Salesloft says Drift customer data thefts linked to March GitHub account hack

Salesloft said a breach of its GitHub account in March allowed hackers to steal authentication tokens that were later used in a mass-hack targeting several of its big tech customers. Citing an investigation by Google’s incident response unit Mandiant, Salesloft said on its data breach page that the as-yet-unnamed hackers accessed Salesloft’s GitHub account and performed reconnaissance activities from March until June, which allowed them to download “content from multiple repositories, add a gue

Understanding Transformers Using a Minimal Example

The internal mechanisms of Transformer large language models (LLMs), particularly the flow of information through the layers and the operation of the attention mechanism, can be challenging to follow because of the sheer quantity of numbers involved; we humans can hardly form a mental model of them. This article aims to make these workings tangible by providing visualizations of a Transformer's internal state. Utilizing a minimal dataset and a deliberately simplified model, it is possible to follow the model's
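
To make the attention step concrete, here is a minimal sketch in plain NumPy (dimensions and values are illustrative, not taken from the article): each token's output is a weighted mix of value vectors, with the weights coming from a softmax over query-key dot products.

    import numpy as np

    def attention(Q, K, V):
        # Scaled dot-product attention: each output row mixes the rows of V,
        # weighted by how strongly the corresponding query matches each key.
        scores = Q @ K.T / np.sqrt(K.shape[-1])          # (tokens, tokens) similarities
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
        return weights @ V                               # weighted mix of value vectors

    # Three toy tokens with 4-dimensional states (illustrative numbers only).
    rng = np.random.default_rng(0)
    x = rng.normal(size=(3, 4))
    print(attention(x, x, x))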

Pearl – An Erlang lexer and syntax highlighter in Gleam

Pearl: An Erlang lexer and syntax highlighter for Gleam! Pearl is a lexer and syntax highlighter for Erlang, written in Gleam. The lexer is based on glexer and just, allowing you to convert Erlang source code into tokens. There is also an API which allows you to highlight Erlang code using ANSI colours, HTML, or a custom format. Heavily inspired by contour.

    gleam add pearl@2

    import pearl

    pub fn main() {
      let code = "
    -module(hello).
    -export([hello_world/0]).
    hello_world() -> io:fwrite(\" H

Google warns that mass data theft hitting Salesloft AI agent has grown bigger

Google is advising users of the Salesloft Drift AI chat agent to consider all security tokens connected to the platform compromised following the discovery that unknown attackers used some of the credentials to access email from Google Workspace accounts. In response, Google has revoked the tokens that were used in the breaches and disabled integration between the Salesloft Drift agent and all Workspace accounts as it investigates further. The company has also notified all affected account hold

Google warns Salesloft breach impacted some Workspace accounts

Google now reports that the Salesloft Drift breach is larger than initially thought, warning that attackers also used stolen OAuth tokens to access a small number of Google Workspace email accounts in addition to stealing data from Salesforce instances. "Based on new information identified by GTIG, the scope of this compromise is not exclusive to the Salesforce integration with Salesloft Drift and impacts other integrations," warns Google. "We now advise all Salesloft Drift customers to treat

Are OpenAI and Anthropic losing money on inference?

I keep hearing about what a cash incinerator AI is, especially around inference. While it seems reasonable on the surface, I've often been wary of these kinds of claims, so I decided to do some digging. I haven't seen anyone really try to deconstruct the costs of running inference at scale, and the economics really interest me. This is really napkin math. I don't have any experience running frontier models at scale, but I do know a lot about the costs and economics of running very high throughput s
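
To give a flavour of that napkin math, here is a minimal sketch (every number is a placeholder, not a figure from the article): the serving cost per million generated tokens falls straight out of a GPU's hourly price and its sustained throughput.

    # Napkin math: cost per million generated tokens (all inputs are illustrative).
    gpu_hour_usd = 2.50          # hypothetical hourly rental price for one GPU
    tokens_per_second = 1_500    # hypothetical sustained decode throughput on that GPU

    tokens_per_hour = tokens_per_second * 3600
    cost_per_million = gpu_hour_usd / tokens_per_hour * 1_000_000
    print(f"~${cost_per_million:.3f} per 1M tokens")   # ~$0.46 with these made-up numbers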

An illustrated guide to OAuth

OAuth was first introduced in 2007. It was created at Twitter because Twitter wanted a way to allow third-party apps to post tweets on users' behalf. Take a second to imagine designing something like that today. How would you do it? One way would just be to ask the user for their username and password. So you create an unofficial Twitter client, and present the user a login screen that says "log in with Twitter". The user does so, but instead of logging into Twitter, they're actually sending the
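
For contrast with that password anti-pattern, here is a minimal sketch of the authorization-code flow OAuth standardized instead; the endpoints, client credentials, and scope below are hypothetical placeholders, not values from the article.

    import urllib.parse
    import requests

    # Step 1: send the user to the provider's own authorization page, so the
    # third-party app never sees the password. The provider redirects back with a code.
    auth_url = "https://provider.example/oauth/authorize?" + urllib.parse.urlencode({
        "response_type": "code",
        "client_id": "my-client-id",                    # hypothetical registered app
        "redirect_uri": "https://app.example/callback",
        "scope": "tweet.write",
    })
    print("Send the user to:", auth_url)

    # Step 2: after the redirect, exchange the one-time code for an access token.
    def exchange_code(code: str) -> str:
        resp = requests.post("https://provider.example/oauth/token", data={
            "grant_type": "authorization_code",
            "code": code,
            "client_id": "my-client-id",
            "client_secret": "my-client-secret",        # hypothetical
            "redirect_uri": "https://app.example/callback",
        })
        resp.raise_for_status()
        return resp.json()["access_token"]              # used for API calls on the user's behalf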

MCP Gateway and Registry

MCP Gateway Model Context Protocol gateway & proxy - unify REST, MCP, and A2A with federation, virtual servers, retries, security, and an optional admin UI. ContextForge MCP Gateway is a feature-rich gateway, proxy and MCP Registry that federates MCP and REST services - unifying discovery, auth, rate-limiting, observability, virtual servers, multi-transport protocols, and an optional Admin UI into one clean endpoint for your AI clients. It runs as a fully compliant MCP server, deployable via P

Wyoming Launches the First State-Issued Stablecoin

Wyoming just became the first U.S. state to launch its own stablecoin, promising lower fees and instant transactions. The Wyoming Stable Token Commission, created in 2023, announced on Tuesday the launch of the mainnet blockchain network for its new Frontier Stable Token (FRNT). The commission says FRNT will make digital payments faster and more secure for individuals, businesses, and institutions around the world. How that’s different from any other stablecoin’s promises is anyone’s guess. “F

Open-Sourced AI Models May Be More Costly in the Long Run, Study Finds

As more businesses adopt AI, picking which model to go with is a major decision. While open-sourced models may seem cheaper initially, a new study warns that those savings can evaporate fast, due to the extra computing power they require. In fact, open-source AI models burn through significantly more computing resources than their closed-source rivals when performing the same tasks, according to a study published Thursday by Nous Research. The researchers tested dozens of AI models, including

That ‘cheap’ open-source AI model is actually burning through your compute budget

A comprehensive new study has revealed that open-source artificial intelligence models consume significantly more computing resources than their closed-source competitors when performing identical tasks, potentially undermining their cost advantages and reshaping how enterprises evaluate AI deployment strategies. The research, conducted by

What's the strongest AI model you can train on a laptop in five minutes?

What’s the strongest model I can train on my MacBook Pro in five minutes? I’ll give the answer upfront: the best 5-minute model I could train was a ~1.8M-param GPT-style transformer trained on ~20M TinyStories tokens, reaching ~9.6 perplexity on a held-out split. Here’s an example of the output, with the prompt bolded: Once upon a time, there was a little boy named Tim. Tim had a small box that he liked to play with. He would push the box to open. One day, he found a big red ball in his yard.
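
Two of those numbers are easy to sanity-check. A rough sketch (the sizes below are illustrative, not the author's exact configuration): perplexity is just the exponential of the mean cross-entropy loss, and a GPT-style parameter count is dominated by the embedding table and the per-block weight matrices.

    import math

    # Perplexity from mean cross-entropy (in nats per token): ppl = exp(loss).
    mean_loss = 2.26
    print(math.exp(mean_loss))                 # ~9.6, roughly the held-out perplexity quoted above

    # Very rough parameter count for a tiny GPT-style model (illustrative sizes).
    vocab, d_model, n_layers = 4096, 128, 4
    embeddings = vocab * d_model               # token embedding table
    per_block = 12 * d_model * d_model         # attention + MLP weight matrices per layer
    print(embeddings + n_layers * per_block)   # ~1.3M parameters with these sizes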

The Trump Family’s Crypto Empire Is Expanding—Fast

The Trump family’s crypto empire is making another major move on Wall Street. World Liberty Financial, the crypto firm founded by the Trump family last September, has struck a massive deal that will see a publicly traded fintech company, ALT5 Sigma, purchase up to $1.5 billion of the family’s proprietary cryptocurrency, $WLFI. The deal is a landmark event that further intertwines the Trump family’s private business ventures with President Donald Trump’s increasingly pro-crypto government polic

Claude gets 1M tokens support via API to take on Gemini 2.5 Pro

Claude Sonnet 4 has been upgraded, and it can now remember up to 1 million tokens of context, but only when it's used via API. This could change in the future. This is 5x more than the previous limit. It also means that Claude now supports remembering over 75,000 lines of code, or even hundreds of documents in a single session. Previously, you were required to submit details to Claude in small chunks, but that also meant Claude would forget the context as it hit the limit. With up to a 1 milli
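
For context, using the larger window is still an ordinary Messages API call. Below is a minimal sketch with the Anthropic Python SDK; the model id and file path are illustrative, and enabling the full 1M-token window may additionally require a long-context beta opt-in per Anthropic's docs, which is not shown here.

    import anthropic

    client = anthropic.Anthropic()   # reads ANTHROPIC_API_KEY from the environment

    # Sketch only: pass a large codebase in one request instead of chunking it.
    with open("whole_repo_dump.txt") as f:       # hypothetical concatenated sources
        big_context = f.read()

    response = client.messages.create(
        model="claude-sonnet-4-20250514",        # illustrative model id
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": f"{big_context}\n\nSummarize the architecture of this codebase.",
        }],
    )
    print(response.content[0].text)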

Token growth indicates future AI spend per dev

Kilo just broke through the 1 trillion tokens a month barrier on OpenRouter for the first time. Each of the open source family of AI coding tools (Cline, Roo, Kilo) is growing rapidly this month. Part of this growth is caused by Cursor and Claude starting to throttle their users. We wrote about Cursor at the beginning of July and about Claude in the second half of July. Their throttling sent users to the open source family of AI coding tools causing the increases you see in the graphs above. C

GPT-OSS-120B runs on just 8GB VRAM & 64GB+ system RAM

Here is the thing: the expert layers run amazingly well on CPU (~17 to 25 T/s on a 14900K), and you can force that with this new llama-cpp option: --cpu-moe. You can offload just the attention layers to the GPU (requiring about 5 to 8 GB of VRAM) for fast prefill. Only the KV cache for the sequence, attention weights and activations, routing tables, LayerNorms, and other "non-expert" parameters live on the GPU; no giant MLP weights are resident there, so memory use stays low. This yields an amazingly snappy system for a 120B mod
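
A minimal sketch of what that invocation could look like, driven from Python; the model path and every flag except --cpu-moe are assumptions, so check the --help of your llama.cpp build.

    import subprocess

    cmd = [
        "llama-cli",
        "-m", "gpt-oss-120b.gguf",   # hypothetical local GGUF file
        "--cpu-moe",                 # keep the MoE expert layers on the CPU, as described above
        "-ngl", "99",                # offload the remaining (attention) layers to the GPU
        "-c", "8192",                # context length
        "-p", "Explain attention sinks in one paragraph.",
    ]
    subprocess.run(cmd, check=True)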

Apple taught an LLM to predict tokens up to 5x faster in math and coding tasks

A new research paper from Apple details a technique that speeds up large language model responses while preserving output quality. Here are the details. Traditionally, LLMs generate text one token at a time. This is slow because each step depends on all the previous ones to keep the output coherent and accurate. If the model is writing a sentence like "The cat is black", it predicts each token in sequence. After writing "The cat is", it looks at everything so far (plus the
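
The baseline being sped up is this ordinary autoregressive loop. A minimal sketch of greedy one-token-at-a-time decoding is below; the model and tokenizer are stand-ins, not Apple's system.

    # Greedy autoregressive decoding: every new token needs a forward pass conditioned
    # on everything generated so far, which is why generation is inherently serial.
    def generate(model, tokenizer, prompt, max_new_tokens=20):
        tokens = tokenizer.encode(prompt)
        for _ in range(max_new_tokens):
            logits = model(tokens)             # next-token scores given all previous tokens
            next_token = max(range(len(logits)), key=logits.__getitem__)   # argmax
            tokens.append(next_token)
            if next_token == tokenizer.eos_id: # stop once the model emits end-of-sequence
                break
        return tokenizer.decode(tokens)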

How attention sinks keep language models stable

We discovered why language models catastrophically fail on long conversations: when old tokens are removed to save memory, models produce complete gibberish. We found models dump massive attention onto the first few tokens as "attention sinks"—places to park unused attention since softmax requires weights to sum to 1. Our solution, StreamingLLM, simply keeps these first 4 tokens permanently while sliding the window for everything else, enabling stable processing of 4 million+ tokens instead of j
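
A toy sketch of that cache policy (the window size here is illustrative): always keep the first few "sink" positions and slide a window over everything else.

    def kept_positions(seq_len, num_sinks=4, window=8):
        # StreamingLLM-style KV-cache policy: keep the first `num_sinks` tokens
        # (the attention sinks) plus the most recent `window` tokens.
        sinks = list(range(min(num_sinks, seq_len)))
        recent = list(range(max(num_sinks, seq_len - window), seq_len))
        return sinks + recent

    print(kept_positions(20))   # [0, 1, 2, 3, 12, 13, ..., 19]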

Window Activation

You click a link in your chat app, and your browser with a hundred tabs comes to the front and opens that page. How hard can it be? Well, you probably know by now that Wayland, unlike X, doesn't let one application force its idiot wishes on everyone else. In order for an application to bring its window to the front, it needs to make use of the XDG Activation protocol. [Image: a KWrite window that failed to activate and instead is weeping bitterly for attention in the task bar.] In essence, an application ca

The Mysterious AI Easter Egg at the Heart of Ari Aster’s ‘Eddington’

Horror wunderkind Ari Aster's new movie Eddington has divided audiences and inspired plenty of online debate about what exactly the director is trying to say about our collective relationship to technology (hint: it's probably not good). The story centers on a small town in New Mexico that descends into social-media-driven chaos during the COVID-19 pandemic. The film stars Joaquin Phoenix as local sheriff Joe Cross, who tussles with the town's mayor, played by Pedro Pascal, while the rest of the

The Revolution of Token-Level Rewards

Training large language models (LLMs) to master complex tasks, especially those requiring structured outputs like generating precise code or engaging in multi-step reasoning, is challenging even for current state of the art (SOTA) models. Reinforcement Learning (RL) offers a powerful theoretical framework for teaching models to do "what works", but applying these techniques to LLMs has been messy to execute in practice. We’ve run into this problem at our startup, Levro. We want to be the easies
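
To make "token-level rewards" concrete, here is a minimal sketch (not Levro's method) of a REINFORCE-style loss where each generated token gets its own reward weight instead of one scalar reward for the whole sequence.

    import torch

    def token_level_pg_loss(logprobs: torch.Tensor, rewards: torch.Tensor) -> torch.Tensor:
        # logprobs: log-probabilities of the sampled tokens, shape (seq_len,)
        # rewards:  one reward per token (illustrative; e.g. from a verifier), same shape
        # Push up tokens that earned positive reward, push down the rest.
        return -(logprobs * rewards).sum()

    logprobs = torch.tensor([-0.2, -1.3, -0.7], requires_grad=True)
    rewards = torch.tensor([1.0, -0.5, 0.8])    # illustrative per-token rewards
    loss = token_level_pg_loss(logprobs, rewards)
    loss.backward()                              # gradients reach each token individually
    print(loss.item(), logprobs.grad)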

KDE Plasma prepares crackdown on focus-stealing window behavior under Wayland

One of the most interesting things about Wayland is how it handles window focus, unlike X11, where focus stealing can be frustrating and even a security risk. Its main advantage is a mechanism that prevents focus stealing. The protocol that plays a role in this is known as "XDG Activation." Here's how it works: Say you double-click a PDF file in your file manager. The file manager first asks t

Inverted Indexes: A Step-by-Step Implementation Guide

Before we start with the implementation, let's talk about why you would actually need an inverted index in real life. Why would anyone need an inverted index at all? Imagine you need to create a system that can quickly look up a document, given several words from it - something like a wiki search. The simplest option I can think of would be to scan through each document, marking the ones that contain all the necessary words. That might work at first, but such a solution wouldn't scale,
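
A minimal sketch of the alternative the guide builds toward: map each word to the set of documents containing it, then answer a multi-word query by intersecting those sets (the sample documents below are made up).

    from collections import defaultdict

    docs = {                                   # made-up corpus
        1: "the quick brown fox",
        2: "the lazy dog",
        3: "the quick dog jumps",
    }

    # Build the inverted index: word -> set of ids of documents containing it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.split():
            index[word].add(doc_id)

    def search(query):
        # A document matches only if it contains every query word, so intersect posting lists.
        postings = [index[w] for w in query.split()]
        return set.intersection(*postings) if postings else set()

    print(search("quick dog"))   # {3}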