How big are our embeddings now and why?
Sep 1 2025 #embeddings #openai #anthropic #huggingface #dimensionality

A few years ago, I wrote a paper on embeddings. At the time, I wrote that 200-300-dimensional embeddings were fairly common in industry, and that adding more dimensions during training would yield diminishing returns for the effectiveness of your downstream tasks (classification, recommendation, semantic search, topic modeling, etc.). I wrote the paper to be resilient to changes in the industry, since it focuses on fundamentals.
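To make the dimensionality concrete: whatever d you pick, a downstream task like semantic search boils down to comparing d-dimensional vectors, typically with cosine similarity. Here is a minimal, illustrative sketch using random vectors as stand-ins for model-produced embeddings (the corpus, query, and d = 256 are made up for the example):

```python
import numpy as np

# Toy semantic search over d-dimensional embeddings.
# The vectors are random stand-ins; real embeddings come from a model.
rng = np.random.default_rng(0)
d = 256  # a dimension in the "classic" 200-300 range discussed above
corpus = rng.normal(size=(5, d))               # 5 document embeddings
query = corpus[2] + 0.01 * rng.normal(size=d)  # a query near document 2

def cosine_sim(a, b):
    # Cosine similarity: dot product of the normalized vectors.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

scores = [cosine_sim(query, doc) for doc in corpus]
best = int(np.argmax(scores))  # retrieves document 2
```

Every extra dimension adds to the cost of storing and comparing these vectors, which is why the diminishing-returns question matters in practice.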