Tech News
← Back to articles

Launch HN: Webhound (YC S23) – Research agent that builds datasets from the web

read original related products more articles

We're the team behind Webhound ( https://webhound.ai ), an AI agent that builds datasets from the web based on natural language prompts. You describe what you're trying to find. The agent figures out how to structure the data and where to look, then searches, extracts the results, and outputs everything in a CSV you can export.

We've set up a special no-signup version for the HN community at https://hn.webhound.ai - just click "Continue as Guest" to try it without signing up.

Here's a demo: https://youtu.be/fGaRfPdK1Sk

We started building it after getting tired of doing this kind of research manually. Open 50 tabs, copy everything into a spreadsheet, realize it's inconsistent, start over. It felt like something an LLM should be able to handle.

Some examples of how people have used it in the past month:

Competitor analysis: "Create a comparison table of internal tooling platforms (Retool, Appsmith, Superblocks, UI Bakery, BudiBase, etc) with their free plan limits, pricing tiers, onboarding experience, integrations, and how they position themselves on their landing pages." (https://www.webhound.ai/dataset/c67c96a6-9d17-4c91-b9a0-ff69...)

Lead generation: "Find Shopify stores launched recently that sell skincare products. I want the store URLs, founder names, emails, Instagram handles, and product categories." (https://www.webhound.ai/dataset/b63d148a-8895-4aab-ac34-455e...)

Pricing tracking: "Track how the free and paid plans of note-taking apps have changed over the past 6 months using official sites and changelogs. List each app with a timeline of changes and the source for each." (https://www.webhound.ai/dataset/c17e6033-5d00-4e54-baf6-8dea...)

Investor mapping: "Find VCs who led or participated in pre-seed or seed rounds for browser-based devtools startups in the past year. Include the VC name, relevant partners, contact info, and portfolio links for context." (https://www.webhound.ai/dataset/1480c053-d86b-40ce-a620-37fd...)

Research collection: "Get a list of recent arXiv papers on weak supervision in NLP. For each, include the abstract, citation count, publication date, and a GitHub repo if available." (https://www.webhound.ai/dataset/e274ca26-0513-4296-85a5-2b7b...)

... continue reading