Anyone got a link to a robots.txt that “blocks” all the “AI” stuff?
@movq@www.uninformativ.de I have this one, from an article I read some time ago… But as with any robots.txt, I don’t think you have any guarantee that it will be honored; you might even have a better chance hunting for and blocking user agents.
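For illustration only, here is a minimal sketch of what such a robots.txt could look like, checked with Python's standard `urllib.robotparser`. The rules and the bot list (`GPTBot`, `CCBot`) are assumptions made for the example, not the file referred to above and not a complete list. It only shows what a well-behaved crawler would conclude; nothing here forces a bot to obey.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the bot names are examples, not a complete list.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str = "/") -> bool:
    """Return True if the rules above permit `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "CCBot", "SomeBrowser"):
        print(agent, "allowed" if is_allowed(agent) else "disallowed")
```

The check is passed the short product token (`GPTBot`), not the full User-Agent string a crawler sends over HTTP.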
@movq@www.uninformativ.de It looks like this one actually reads the robots.txt … it did a couple of times over the past few weeks.
“GET /robots.txt HTTP/1.1” 304 0 “-” “Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)”
Hey @movq@www.uninformativ.de !! Here’s an article you might find interesting: Blocking Bots with Nginx … this person is actually blocking AI bots based on a list of user agents in an interesting way. 👍
@aelaraji@aelaraji.com Hmmm, looks like the core idea is to intercept requests, inspect the User-Agent header, and respond accordingly.
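The article does this in Nginx; as a rough sketch of the same idea in a different setting, here is a small Python WSGI middleware that inspects the User-Agent header and answers 403 for matching bots. This is not the article's configuration. The token list, `block_bots`, `hello_app`, and `request_status` are all hypothetical names made up for the example.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical block list: lowercase substrings to look for in the User-Agent.
BLOCKED_AGENT_TOKENS = ("gptbot", "ccbot")


def is_blocked(user_agent: str) -> bool:
    """Case-insensitive substring match against the block list."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_AGENT_TOKENS)


def block_bots(app):
    """Wrap a WSGI app so that requests from listed bots get a 403."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        return app(environ, start_response)
    return middleware


def hello_app(environ, start_response):
    """Stand-in for the real site."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello\n"]


def request_status(app, user_agent: str) -> str:
    """Send one fake request through `app` and return the status line."""
    environ = {}
    setup_testing_defaults(environ)
    environ["HTTP_USER_AGENT"] = user_agent
    seen = []

    def start_response(status, headers):
        seen.append(status)

    app(environ, start_response)
    return seen[0]


if __name__ == "__main__":
    site = block_bots(hello_app)
    print(request_status(site, "Mozilla/5.0 (compatible; GPTBot/1.0)"))
    print(request_status(site, "Mozilla/5.0 Firefox/125.0"))
```

This kind of matching only catches bots that announce themselves honestly in the header.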
Can we trust the bots not to fake their identity? 🤔
@movq@www.uninformativ.de Me neither 🤦‍♂️