An NSFW filter for Marginalia search

2026-03-30

Why This Matters

This article highlights the development of a fast, CPU-efficient NSFW filter for Marginalia Search, addressing the challenge of balancing speed and accuracy in content moderation. By implementing a neural network approach from scratch, the project emphasizes the importance of tailored solutions for real-time search environments, especially when integrating safety filters without compromising performance.

Key Takeaways

Developing effective NSFW filters requires balancing speed and accuracy, especially for search engines.
Using lightweight models like fasttext enables faster deployment without heavy dependencies.
Custom neural network solutions can meet specific industry needs better than off-the-shelf models.

… optional, that is.

I’ve been working on an NSFW filter for Marginalia Search, as that is something some people have asked for, primarily API consumers.

The search engine has had some domain-based filtering for a while, based on the UT1 lists, but that isn’t a very comprehensive approach.
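To make the limitation concrete, here is a minimal sketch of what domain-based filtering amounts to. It assumes a plain-text list with one domain per line; the function names and the example entries are hypothetical, and this is not Marginalia's actual code.

```python
def load_blocklist(lines):
    """Build a set of blocked domains from an iterable of text lines."""
    blocked = set()
    for line in lines:
        domain = line.strip().lower()
        # Skip blank lines and comment lines (comment syntax is an assumption).
        if domain and not domain.startswith("#"):
            blocked.add(domain)
    return blocked


def is_blocked(hostname, blocked):
    """Return True if hostname or any of its parent domains is listed."""
    labels = hostname.lower().rstrip(".").split(".")
    # Check "a.b.example.com", then "b.example.com", then "example.com", ...
    for i in range(len(labels)):
        if ".".join(labels[i:]) in blocked:
            return True
    return False


# Hypothetical entries, for illustration only.
blocked = load_blocklist(["# example list", "adult.example", "nsfw.example.org"])
```

A lookup like this is cheap, but it can only catch domains that someone has already listed, and it says nothing about individual pages on an unlisted domain.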

We’ll land on a neural network with a single hidden layer, implemented from scratch, but many other things were tried along the way.

This is largely an abbreviated account of the way there.
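For orientation, the inference side of a single-hidden-layer classifier can be sketched in a few lines. Everything specific here is an assumption made for illustration: the hashed bag-of-words features, the layer sizes, the ReLU and sigmoid activations, and the random untrained weights. It shows the general shape of such a model, not the implementation used in Marginalia Search.

```python
import math
import random
import zlib

N_FEATURES = 1024  # number of hashed bag-of-words buckets (assumed)
N_HIDDEN = 16      # number of hidden units (assumed)


def featurize(text):
    """Map text to a sorted list of active feature indices via hashing."""
    return sorted({zlib.crc32(tok.encode("utf-8")) % N_FEATURES
                   for tok in text.lower().split()})


def init_params(seed=0):
    """Random, untrained weights; a real filter would learn these."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)]
          for _ in range(N_FEATURES)]
    b1 = [0.0] * N_HIDDEN
    w2 = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)]
    b2 = 0.0
    return w1, b1, w2, b2


def predict(active, params):
    """Forward pass: sparse binary input -> ReLU hidden layer -> sigmoid."""
    w1, b1, w2, b2 = params
    hidden = list(b1)
    # Only the rows for active features contribute, so the cost scales with
    # the number of distinct tokens rather than with N_FEATURES.
    for idx in active:
        row = w1[idx]
        for j in range(N_HIDDEN):
            hidden[j] += row[j]
    hidden = [max(0.0, h) for h in hidden]
    z = b2 + sum(h * w for h, w in zip(hidden, w2))
    return 1.0 / (1.0 + math.exp(-z))


params = init_params()
score = predict(featurize("some example page text"), params)
```

The output is a score between 0 and 1 that would be compared against a threshold. Because the input is sparse, the forward pass is a handful of additions per token, which is the kind of workload that runs well on a CPU.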

There is a tension between speed and generality in classification.

Building something that is both fast and reasonably correct in its assessments is incredibly fiddly work, even if the solution itself is often pretty straightforward.

The main constraint for a filter that runs in a search engine is that it needs to be really fast and run well on CPUs.

This immediately disqualifies transformer-based models and other state-of-the-art approaches: capable as they are, they check neither of those boxes.

Fasttext
