Block AI crawlers
robots.txt file
# I know this can just be _totally_ ignored by crawlers
# But let's hope they behave well :)
# Code: https://github.com/ellie/notes
# Source: https://darkvisitors.com/

# OpenAI, ChatGPT
# https://platform.openai.com/docs/gptbot
User-agent: GPTBot
Disallow: /

# Google AI (Bard, etc)
# https://developers.google.com/search/docs/crawling-indexing/overview-google-crawlers
User-agent: Google-Extended
Disallow: /

# Block common crawl
# I have mixed feelings on this one, but many models are trained on this data
# It is also used to bootstrap new search indices though
# https://commoncrawl.org/ccbot
User-agent: CCBot
Disallow: /

# Facebook
# https://developers.facebook.com/docs/sharing/bot/
User-agent: FacebookBot
Disallow: /

# Cohere.ai
# https://darkvisitors.com/agents/cohere-ai
User-agent: cohere-ai
Disallow: /

# Perplexity
# https://docs.perplexity.ai/docs/perplexitybot
User-agent: PerplexityBot
Disallow: /

# Anthropic
# https://darkvisitors.com/agents/anthropic-ai
User-agent: anthropic-ai
Disallow: /

# ...also anthropic
# https://darkvisitors.com/agents/claudebot
User-agent: ClaudeBot
Disallow: /
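To sanity-check how a well-behaved parser reads rules like the ones above, something along these lines may help. It is a minimal sketch using Python's standard `urllib.robotparser`; the `is_allowed` helper, the `SomeOtherBot` name, and the `https://example.com/` URL are illustrative assumptions, and `RULES` is only a short excerpt of the file. It shows how a compliant client would interpret the rules, not whether a given crawler actually honors them.

```python
from urllib.robotparser import RobotFileParser

# Short excerpt of the rules above; the rest of the file follows the same pattern.
RULES = """\
# OpenAI, ChatGPT
User-agent: GPTBot
Disallow: /

# Block common crawl
User-agent: CCBot
Disallow: /
"""


def is_allowed(rules: str, agent: str, url: str = "https://example.com/") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text.

    `url` defaults to a placeholder site (an assumption for this sketch).
    """
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


print(is_allowed(RULES, "GPTBot"))        # False: matched by its own group
print(is_allowed(RULES, "CCBot"))         # False: matched by its own group
print(is_allowed(RULES, "SomeOtherBot"))  # True: no group applies to it
```

Agents without a matching group stay allowed, which is why each crawler needs its own entry.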
Sitemap: https://coryd.dev/sitemap.xml

User-agent: *
Disallow:

User-agent: AdsBot-Google
User-agent: Amazonbot
User-agent: anthropic-ai
User-agent: Applebot
User-agent: AwarioRssBot
User-agent: AwarioSmartBot
User-agent: Bytespider
User-agent: CCBot
User-agent: ChatGPT-User
User-agent: ClaudeBot
User-agent: Claude-Web
User-agent: cohere-ai
User-agent: DataForSeoBot
User-agent: Diffbot
User-agent: FacebookBot
User-agent: FriendlyCrawler
User-agent: Google-Extended
User-agent: GoogleOther
User-agent: GPTBot
User-agent: img2dataset
User-agent: ImagesiftBot
User-agent: magpie-crawler
User-agent: Meltwater
User-agent: omgili
User-agent: omgilibot
User-agent: peer39_crawler
User-agent: peer39_crawler/1.0
User-agent: PerplexityBot
User-agent: PiplBot
User-agent: scoop.it
User-agent: Seekr
User-agent: YouBot
Disallow: /
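The grouped layout above (many `User-agent` lines sharing a single `Disallow: /`) is easy to generate from a plain list of names. The following is a rough sketch of one way to do that, not the method used by either site; `build_robots_txt` and `BLOCKED_AGENTS` are hypothetical names, and the list holds only a few of the agents from the file above.

```python
from urllib.robotparser import RobotFileParser

# A few agent names taken from the list above; extend as needed.
BLOCKED_AGENTS = ["GPTBot", "CCBot", "ClaudeBot", "PerplexityBot"]


def build_robots_txt(blocked_agents, sitemap_url=None):
    """Build a robots.txt that allows everyone except `blocked_agents`."""
    lines = []
    if sitemap_url:
        lines += [f"Sitemap: {sitemap_url}", ""]
    # Default group: an empty Disallow means "nothing is disallowed".
    lines += ["User-agent: *", "Disallow:"]
    if blocked_agents:
        # One group: consecutive User-agent lines share the rule that follows.
        lines.append("")
        lines += [f"User-agent: {agent}" for agent in blocked_agents]
        lines.append("Disallow: /")
    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt(BLOCKED_AGENTS, "https://coryd.dev/sitemap.xml")
print(robots_txt)

# Quick check with the standard-library parser (placeholder URL is an assumption).
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
print(parser.can_fetch("GPTBot", "https://example.com/"))        # False
print(parser.can_fetch("SomeOtherBot", "https://example.com/"))  # True
```

Keeping the names in one list makes it simpler to add new crawlers as they appear, at the cost of losing the per-crawler comments and documentation links that the first file carries.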
Other great posts on the subject:
abstract/block_ai.txt · Last modified: 2024/05/14 09:30 by Denis Evsyukov