From the personal blog of interface expert Bruce Ediger:
Early in March 2025, I noticed that a web crawler with a user agent string of
meta-externalagent/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler)
was hitting my blog's machine at an unreasonable rate.
I followed the URL and discovered this is what Meta uses to gather premium, human-generated content to train its LLMs. I found the rate of requests to be annoying.
I already have a PHP program that creates the illusion of an infinite website. I decided to answer any HTTP request that had "meta-externalagent" in its user agent string with the contents of a bork.php generated file...
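The post does not show bork.php, so the following is only a rough sketch of the idea in Python rather than the author's PHP: check the user agent for the "meta-externalagent" substring and, on a match, serve a generated page whose links lead to more generated pages. The names `is_target_bot`, `fake_page`, `app`, and the word list are hypothetical, and the case-insensitive match is an assumption; the post only says the string appears in the user agent.

```python
import hashlib
import random

# Substring taken from the post; everything else here is illustrative.
BOT_MARKER = "meta-externalagent"

# Hypothetical vocabulary for filler text; the real bork.php output is not shown.
WORDS = ["bork", "flange", "widget", "sprocket", "gizmo", "doohickey", "thingamajig"]


def is_target_bot(user_agent):
    """Return True if the user agent contains the crawler marker (case-insensitive, an assumption)."""
    return BOT_MARKER in (user_agent or "").lower()


def fake_page(path, n_links=5, n_words=40):
    """Build a filler HTML page seeded by the path, so a given URL always yields the same page."""
    seed = int.from_bytes(hashlib.sha256(path.encode("utf-8")).digest()[:8], "big")
    rng = random.Random(seed)
    text = " ".join(rng.choice(WORDS) for _ in range(n_words))
    # Each link points at another made-up URL, which is what makes the site look endless.
    links = "".join(
        '<li><a href="/{0}/{1}">{0} {1}</a></li>'.format(
            rng.choice(WORDS), rng.randrange(10**6)
        )
        for _ in range(n_links)
    )
    return "<html><body><p>{}</p><ul>{}</ul></body></html>".format(text, links)


def app(environ, start_response):
    """Minimal WSGI app: generated filler for the crawler, a placeholder page for everyone else."""
    if is_target_bot(environ.get("HTTP_USER_AGENT", "")):
        body = fake_page(environ.get("PATH_INFO", "/"))
    else:
        body = "<html><body><p>real content</p></body></html>"
    data = body.encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(data))),
        ],
    )
    return [data]
```

To try it locally, `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` from the standard library would serve it. Seeding the generator from the path is one design choice among many: it keeps the site stateless while still giving a crawler consistent pages on repeat visits.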
This worked brilliantly. Meta ramped up to requesting 270,000 URLs on May 30 and 31, 2025...