Latest Tech News

Stay updated with the latest in technology, AI, cybersecurity, and more

Filtered by: bedding

Language models pack billions of concepts into 12k dimensions

In a recent 3Blue1Brown video series on transformer models, Grant Sanderson posed a fascinating question: How can a relatively modest embedding space of 12,288 dimensions (GPT-3) accommodate millions of distinct real-world concepts? The answer lies at the intersection of high-dimensional geometry and a remarkable mathematical result known as the Johnson-Lindenstrauss lemma. While exploring this question, I discovered something unexpected that led to an interesting collaboration with Grant and a
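
The geometric intuition behind the lemma can be sketched in a few lines of NumPy: independently sampled directions in a 12,288-dimensional space are almost orthogonal to one another, which is what lets far more than 12,288 "nearly perpendicular" concept directions coexist. This is an illustrative sketch, not code from the article or the video.

import numpy as np

rng = np.random.default_rng(0)
d = 12_288   # GPT-3's embedding width, per the article
n = 1_000    # random "concept" directions to sample

# Random unit vectors in d dimensions
vectors = rng.standard_normal((n, d))
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

# Pairwise cosine similarities concentrate near 0, with spread on the
# order of 1/sqrt(d) ~ 0.009, i.e. the directions are nearly orthogonal.
cos = vectors @ vectors.T
off_diag = np.abs(cos[~np.eye(n, dtype=bool)])
print(f"mean |cos| = {off_diag.mean():.4f}, max |cos| = {off_diag.max():.4f}")

Relaxing exact orthogonality to "within a few degrees of perpendicular" is what the Johnson-Lindenstrauss argument exploits: the number of such nearly orthogonal directions grows exponentially with the dimension.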

How big are our embeddings now and why?

Sep 1 2025 #embeddings #openai #anthropic #huggingface #dimensionality A few years ago, I wrote a paper on embeddings. At the time, I wrote that 200-300-dimension embeddings were fairly common in industry, and that adding more dimensions during training would yield diminishing returns for the effectiveness of your downstream tasks (classification, recommendation, semantic search, topic modeling, etc.). I wrote the paper to be resilient to changes in the industry since it focuses on fundamenta
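
As a rough illustration of why the dimension count matters in practice (a sketch with assumed vocabulary and dimension values, not taken from the paper), the size of a learned embedding table grows linearly with the chosen dimension, while downstream gains eventually flatten out:

import torch.nn as nn

vocab_size = 50_000  # assumed vocabulary size for illustration
for dim in (200, 300, 768, 3072):
    table = nn.Embedding(num_embeddings=vocab_size, embedding_dim=dim)
    params = sum(p.numel() for p in table.parameters())
    print(f"{dim:>5} dims -> {params:,} parameters")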

What Is Down Alternative and Who Should Buy It? Experts Explain (2025)

When shopping for new bedding, you'll undoubtedly run into both natural down and materials described as down alternatives. This prompts a lot of questions. Is down or down alternative better? What are the differences between them? Why is one more expensive than the other? Which is easier to care for? Which is warmer? It can all be very confusing. As evinced in our down comforter buying guide, not to mention other stories in our sleep directory, there are plenty of options for high-quality down

We Hit 100% GPU Utilization–and Then Made It 3× Faster by Not Using It

We recently used Qwen3-Embedding-0.6B to embed millions of text documents while sustaining near-100% GPU utilization the whole way. That’s usually the gold standard that machine learning engineers aim for… but here’s the twist: in the time it took to write this blog post, we found a way to make the same workload 3× faster, and it didn’t involve maxing out GPU utilization at all. That story’s for another post, but first, here’s the recipe that got us to near-100%.
The workload
Here at the Daft
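
A minimal sketch of that kind of workload, assuming the sentence-transformers loader for the Qwen/Qwen3-Embedding-0.6B checkpoint; the batch size and other settings here are illustrative guesses, not the recipe from the post.

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Qwen/Qwen3-Embedding-0.6B", device="cuda")

docs = ["first document ...", "second document ..."]  # stand-in corpus
embeddings = model.encode(
    docs,
    batch_size=256,             # larger batches help keep the GPU busy
    normalize_embeddings=True,  # unit-norm vectors simplify cosine scoring
    show_progress_bar=True,
)
print(embeddings.shape)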

Show HN: Bolt – A super-fast, statically-typed scripting language written in C

⚡ Bolt

A lightweight, lightning-fast, type-safe embeddable language for real-time applications.

import print, error, Error from core
import abs, epsilon from math

// The return type of safe_divide is inferred to be `Error | number`
fn safe_divide(a: number, b: number) {
    if abs(b) < epsilon {
        return error("Cannot divide by zero!")
    }
    return a / b
}

match let result = safe_divide(10, 5) {
    is Error {
        // The type of result is narrowed in this branch!
        print("Failed to divide:",

Gemini Embedding: Powering RAG and context engineering

Since announcing the general availability of our Gemini Embedding text model, we've seen developers rapidly adopt it to build advanced AI applications. Beyond traditional use cases like classification, semantic search, and retrieval-augmented generation (RAG), many are now using a technique called context engineering to provide AI agents with complete operational context. Embeddings are crucial here, as they efficiently identify and integrate vital information—like documents, conversation histor
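
For context, a minimal call to the model looks roughly like this, assuming the google-genai Python SDK with an API key configured in the environment; treat the exact fields as an approximation rather than official documentation.

from google import genai

client = genai.Client()  # assumes an API key is set in the environment
result = client.models.embed_content(
    model="gemini-embedding-001",
    contents=[
        "Quarterly revenue grew 12% year over year.",
        "How did revenue change last quarter?",
    ],
)
for emb in result.embeddings:
    print(len(emb.values))  # one vector per input string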

New embedding model leaderboard shakeup: Google takes #1 while Alibaba’s open source alternative closes gap

Google has officially moved its new, high-performance Gemini Embedding model to general availability, currently ranking number one overall on the highly regarded Massive Text Embedding Benchmark (MTEB). The model (gemini-embedding-001) is now a core part of the Gemini API and Vertex AI, enabling developers to build applications such as sema

All AI models might be the same

Project CETI is a large-scale effort to decode whale speech. If AI models do learn a universal language, we might be able to use it to talk to whales. Growing up, I sometimes played a game with my friends called “Mussolini or Bread.” It’s a guessing game, kind of like Twenty Questions. The funny name comes from the idea that, in the space of everything, ‘Mussolini’ and ‘bread’ are about as far away from each other as you can get. One round might go like this: Is it closer to Mussolini or bre

LGND wants to make ChatGPT for the Earth

The Earth is awash in data about itself. Every day, satellites capture around 100 terabytes of imagery. But making sense of it isn’t always easy. Seemingly simple questions can be fiendishly complex to answer. Take this question that is of vital economic importance to California: How many fire breaks does the state have that might stop a wildfire in its tracks, and how have they changed since the last fire season? “Originally, you’d have a person look at pictures. And that only scales so far,”

Muvera: Making multi-vector retrieval as fast as single-vector search

Neural embedding models have become a cornerstone of modern information retrieval (IR). Given a query from a user (e.g., “How tall is Mt Everest?”), the goal of IR is to find information relevant to the query from a very large collection of data (e.g., the billions of documents, images, or videos on the Web). Embedding models transform each datapoint into a single-vector “embedding”, such that semantically similar datapoints are transformed into mathematically similar vectors. The embeddings are
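
The single-vector retrieval the excerpt describes can be sketched in a few lines of NumPy (an illustrative toy example with random stand-in vectors, not Google's MUVERA code): documents and the query each become one vector, and relevance is a single dot product.

import numpy as np

rng = np.random.default_rng(1)

# Stand-in embeddings: in practice these come from a neural embedding model.
doc_vectors = rng.standard_normal((1_000, 768))
doc_vectors /= np.linalg.norm(doc_vectors, axis=1, keepdims=True)

query = rng.standard_normal(768)
query /= np.linalg.norm(query)

# Cosine similarity reduces to a dot product on unit vectors;
# the top-k scores give the retrieved documents.
scores = doc_vectors @ query
top_k = np.argsort(-scores)[:5]
print("top documents:", top_k, "scores:", scores[top_k])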