Lessons from building an AI data analyst


TL;DR

Text-to-SQL is not enough. Answering real user questions requires going further: multi-step plans, external tools (coding) and external context.

Context is the product. A semantic layer (we use Malloy) encodes business meaning and sharply reduces SQL complexity.

Use a multi-agent, research-oriented system. Break problems down using context and domain knowledge, retrieve precisely, write code, interact with the environment and learn from it.

Retrieval is a recommendation problem. Mix keyword search, embeddings, and a fine-tuned reranker; optimise for precision, recall, and latency.

Benchmarks ≠ production. Users expect human-level answers, drill-downs, and defensible reasoning, not just pass@k.

Latency and quality are a tight bar. Route between fast and reasoning models; cache aggressively; keep contexts short. Continuous model evaluation is needed to avoid drift as new models are launched.
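
To make the retrieval point concrete, here is a minimal sketch of mixing a keyword signal with an embedding signal and then applying an optional reranker. It is an illustration under stated assumptions, not the pipeline described in this post. The catalogue entries, the function names, and the use of reciprocal rank fusion to merge the two rankings are all hypothetical choices. The character-trigram "embedding" is only a stand-in for a real embedding model, and the `rerank` callable stands in for a fine-tuned reranker.

```python
import math
from collections import Counter

# Hypothetical catalogue of semantic-layer entities and their descriptions.
DOCS = {
    "orders.revenue": "total revenue from completed orders",
    "orders.count": "number of orders placed",
    "users.signups": "count of new user signups per day",
}

def keyword_score(query, doc):
    # Fraction of query tokens that appear verbatim in the document.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def embed(text):
    # Stand-in for a real embedding model: character-trigram counts.
    t = text.lower()
    return Counter(t[i:i + 3] for i in range(len(t) - 2))

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def ranked(scores):
    # Ids sorted by descending score; ties broken by id for determinism.
    return sorted(scores, key=lambda k: (-scores[k], k))

def rrf(rankings, k=60):
    # Reciprocal rank fusion: each ranking contributes 1 / (k + rank) per id.
    fused = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return ranked(fused)

def retrieve(query, docs=DOCS, top_n=2, rerank=None):
    keyword_ranking = ranked({i: keyword_score(query, t) for i, t in docs.items()})
    q_vec = embed(query)
    embedding_ranking = ranked({i: cosine(q_vec, embed(t)) for i, t in docs.items()})
    candidates = rrf([keyword_ranking, embedding_ranking])[:top_n]
    if rerank is not None:
        # Assumption: rerank(query, doc_text) returns a relevance score.
        candidates = sorted(candidates, key=lambda i: -rerank(query, docs[i]))
    return candidates

print(retrieve("revenue from orders"))
```

Fusing ranks rather than raw scores avoids having to calibrate keyword and embedding scores against each other. Keeping `top_n` small before reranking is one way to trade recall for latency.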

The short story

I spent years on ML for Analytics and Knowledge Discovery at Google and Twitter. For the past 3 years I've been building an AI data analyst at Findly (findly.ai). We entered Y Combinator with a different idea, but quickly realised the real problem for most teams wasn't "lack of data"; it was data discovery and use.
