An LLM Query Understanding Service

Published on: 2025-09-05 06:46:59

We should be cheating at search with LLMs. Indeed, I'm teaching a whole course on this in July. With an LLM we can implement in days what previously took months. We can take apart a query like "brown leather sofa" into the important dimensions of intent — color: brown, material: leather, category: couches, and so on. With this power, all search becomes structured search. Even better, we can do this without calling out to OpenAI, Gemini, or the like: we can run simple LLMs in our own infrastructure, making the service faster and cheaper. I'm going to show you how. Let's get started. Follow along in this repo.

The service - wrapping an open source LLM

We'll start by deploying a FastAPI app that calls an LLM. The code below is just a dummy "hello world" app talking to an LLM: we send a chat message over JSON, the LLM comes up with a response, and we send it back. Here's the basic service:

```python
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from llm_query_understand.llm import LargeL ...
```
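Before wiring in a real model, it helps to see the target output shape for query understanding. Below is a minimal sketch, assuming a hypothetical `parse_query` function that stands in for the LLM call; a real implementation would prompt the model and parse its JSON reply, but the hard-coded mapping here just reproduces the article's "brown leather sofa" example:

```python
import json

# Hypothetical stand-in for the LLM call: maps a free-text query to
# structured intent dimensions (color, material, category).
def parse_query(query: str) -> dict:
    # A real service would send the query to an LLM and parse the
    # structured JSON it returns; we hard-code one example for illustration.
    known = {
        "brown leather sofa": {
            "color": "brown",
            "material": "leather",
            "category": "couches",
        }
    }
    # Fall back to echoing the raw query when we have no structured parse.
    return known.get(query.lower(), {"query": query})

# The service would return this JSON to the search backend.
print(json.dumps(parse_query("brown leather sofa")))
```

The point is the contract, not the lookup table: the endpoint takes unstructured text and returns a small dictionary of facet/value pairs that a downstream search engine can filter on directly.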