It's time to standardize AI miners
"standard means collaborative"
Work in progress
HTML meta & data-legal tags
<meta name="robots" content="noai, noimageai">
<meta name="CCBot" content="nofollow">
<meta name="legals" content="nomining">
<meta name="legals" content="notranslation">
<meta name="legals" content="noai">
legals.txt works better for humans (it explains why something is disallowed);
robots.txt works better for bots (it states what is disallowed).
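A minimal sketch of what a legals.txt could look like; the syntax below is hypothetical, since no format has been fixed yet, but the idea is to give humans the reason behind each restriction:

# legals.txt (hypothetical example)
nomining: articles are written by paid journalists; bulk extraction is not licensed.
notranslation: only the official translations are legally reviewed.
noai: this content has not been licensed for AI training.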
Example: I want to allow indexing, but not translations or AI mining; that is exactly the split the GoogleOther bot makes possible.
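A minimal sketch of how that example maps onto Google's documented controls: Googlebot keeps indexing, the notranslate meta tag opts out of offered translations, and Google-Extended in robots.txt opts out of AI training:

<meta name="google" content="notranslate">

User-agent: Google-Extended
Disallow: /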
<span data-legal="nomining" title="Legals: No Mining Allowed">No mining, No AI</span>
<span data-legal="noai" title="Legals: No AI Allowed">No mining, No AI</span>
It can also be useful to label AI-generated content:
<span data-legal="genai" title="AI generated">AI generated</span>
Robots.txt
robots.txt can also block Amazonbot and GoogleOther; a maintained list of AI user agents is published by Cyberciti.biz.
User-agent: GPTBot
Disallow: /
User-agent: ChatGPT-User
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: PerplexityBot
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: Claude-Web
Disallow: /
User-agent: ClaudeBot
Disallow: /
GoogleOther is Google's crawler for research and AI, and you can stop it in robots.txt. Google split it off from Googlebot for transparency, but the name "Other" is not very clear.
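If you also want to block GoogleOther and Amazonbot (mentioned above), the entries follow the same pattern; blocking GoogleOther does not affect Googlebot indexing:

User-agent: GoogleOther
Disallow: /

User-agent: Amazonbot
Disallow: /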