Sitemap: https://www.ebooks.com/sitemap/sitemap-index.xml

############################################################
# CORE ALLOWLIST - Search, media, previews, and utilities
############################################################
# Google (search + media + tools)
User-agent: Googlebot
User-agent: Googlebot-Image
User-agent: Googlebot-Video
User-agent: Googlebot-News
User-agent: Mediapartners-Google
User-agent: Google-InspectionTool
User-agent: GoogleOther
User-agent: GoogleOther-Image
User-agent: GoogleOther-Video
# Microsoft / Bing
User-agent: Bingbot
User-agent: BingPreview
User-agent: MicrosoftPreview
User-agent: AdIdxBot
# Apple / DuckDuckGo / Yahoo
User-agent: Applebot
User-agent: DuckDuckBot
User-agent: DuckDuckGo-Favicons-Bot
User-agent: Slurp
# OpenAI search crawler
User-agent: OAI-SearchBot
# Regional engines
User-agent: Yandex
User-agent: YandexBot
User-agent: YandexImages
User-agent: YandexMobileBot
User-agent: Baiduspider
User-agent: Baiduspider-image
User-agent: NaverBot
User-agent: Yeti
User-agent: SeznamBot
User-agent: Qwantify
User-agent: MojeekBot
# Social/link previews
User-agent: facebookexternalhit
User-agent: Facebot
User-agent: Twitterbot
User-agent: WhatsApp
User-agent: LinkedInBot
User-agent: Pinterest
# Claude (explicitly allowed)
User-agent: ClaudeBot
User-agent: Claude-SearchBot
User-agent: Claude-Web
# Perplexity (experiment: allow)
User-agent: PerplexityBot
# Shared allow rules for all the above
Allow: /
Disallow: /account/*
Disallow: /cart/*
Disallow: /cj.asp
Disallow: /en-*/cj.asp
Disallow: /api/user/*
Disallow: /api/book/*/review

############################################################
# AI TRAINING / BROAD DATA HARVESTERS
############################################################
# Google-Extended - allowed for experiment
User-agent: Google-Extended
Allow: /
Disallow: /account/*
Disallow: /cart/*
Disallow: /cj.asp
Disallow: /en-*/cj.asp
Disallow: /api/user/*
Disallow: /api/book/*/review

User-agent: GPTBot
Allow: /
Disallow: /account/*
Disallow: /cart/*
Disallow: /cj.asp
Disallow: /en-*/cj.asp
Disallow: /api/user/*
Disallow: /api/book/*/review

# Block the rest
User-agent: Applebot-Extended
User-agent: Perplexity-User
User-agent: CCBot # Common Crawl
User-agent: Bytespider # ByteDance
User-agent: PetalBot # Huawei
User-agent: YouBot # You.com
User-agent: Diffbot
User-agent: omgili
User-agent: Omgilibot
User-agent: Amazonbot
Disallow: /

############################################################
# SEO CRAWLERS / SITE AUDIT TOOLS - BLOCK by default
# (Temporarily allow only for your own audits)
############################################################
User-agent: AhrefsBot
User-agent: MJ12bot # Majestic
User-agent: DotBot # Moz
User-agent: SemrushBot
User-agent: SemrushBot-OCOB
User-agent: Botify
User-agent: OnCrawl
User-agent: Screaming Frog SEO Spider
Disallow: /

############################################################
# CATCH-ALL - Block everything else
############################################################
User-agent: *
Disallow: /