IP = 100.52.55.54
robots.txt
Sitemap: https://ui.adsabs.harvard.edu/sitemap/sitemap_index.xml
User-agent: *
Disallow: /v1/
Disallow: /resources
Disallow: /core
Disallow: /tugboat
Disallow: /link_gateway/
Disallow: /search/
Disallow: /execute-query/
Disallow: /status
Allow: /help/
Allow: /about/
Allow: /blog/
# there may be a more elegant way to do this, but I fear that if we just use a
# single regexp such as /abs/*/* we may miss out on indexing links containing
# DOIs or arXiv ids, so we do it the pedantic way
Allow: /abs/
Disallow: /abs/*/citations
Disallow: /abs/*/references
Disallow: /abs/*/coreads
Disallow: /abs/*/similar
Disallow: /abs/*/toc
Disallow: /abs/*/graphics
Disallow: /abs/*/metrics
Disallow: /abs/*/exportcitation
User-agent: AI2Bot
Disallow: /
User-agent: Ai2Bot-Dolma
Disallow: /
User-agent: Amazonbot
Disallow: /
User-agent: anthropic-ai
Disallow: /
User-agent: Applebot
Disallow: /
User-agent: Applebot-Extended
Disallow: /
User-agent: Brightbot 1.0
Disallow: /
User-agent: Bytespider
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: Claude-Web
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: cohere-training-data-crawler
Disallow: /
User-agent: Crawlspace
Disallow: /
User-agent: Diffbot
Disallow: /
User-agent: FacebookBot
Disallow: /
User-agent: FriendlyCrawler
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: GoogleOther
Disallow: /
User-agent: GoogleOther-Image
Disallow: /
User-agent: GoogleOther-Video
Disallow: /
User-agent: GPTBot
Disallow: /
User-agent: iaskspider/2.0
Disallow: /
User-agent: ICC-Crawler
Disallow: /
User-agent: ImagesiftBot
Disallow: /
User-agent: img2dataset
Disallow: /
User-agent: ISSCyberRiskCrawler
Disallow: /
User-agent: Kangaroo Bot
Disallow: /
User-agent: Meta-ExternalAgent
Disallow: /
User-agent: Meta-ExternalFetcher
Disallow: /
User-agent: omgili
Disallow: /
User-agent: omgilibot
Disallow: /
User-agent: PanguBot
Disallow: /
User-agent: Scrapy
Disallow: /
User-agent: SemrushBot-OCOB
Disallow: /
User-agent: SemrushBot-SWA
Disallow: /
User-agent: Sidetrade indexer bot
Disallow: /
User-agent: Timpibot
Disallow: /
User-agent: VelenPublicWebCrawler
Disallow: /
User-agent: Webzio-Extended
Disallow: /
User-agent: YouBot
Disallow: /
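
The `User-agent: *` group above mixes a broad `Allow: /abs/` with narrower wildcard `Disallow` lines. A minimal sketch of how a crawler might evaluate such rules follows. It assumes the longest-match precedence described in RFC 9309, where the longest matching pattern wins and `Allow` wins a tie. Python's standard `urllib.robotparser` is not used here because it does not expand `*` wildcards. The function names, the reduced rule list, and the identifiers `EXAMPLE_ID` and `10.1234/example` are hypothetical and chosen only for illustration.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern ('*' wildcard, optional '$' anchor) to a regex."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(rules, path):
    """Apply longest-match precedence; rules is a list of (kind, pattern) pairs."""
    best_len, allowed = -1, True  # a path with no matching rule is allowed
    for kind, pattern in rules:
        if not pattern_to_regex(pattern).match(path):
            continue
        is_allow = kind == "allow"
        # Longer pattern wins; on equal length, Allow wins.
        if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
            best_len, allowed = len(pattern), is_allow
    return allowed


# A reduced copy of the "User-agent: *" group, for illustration only.
STAR_GROUP = [
    ("disallow", "/search/"),
    ("allow", "/abs/"),
    ("disallow", "/abs/*/citations"),
]

# Hypothetical alternative using the single pattern the comment warns about.
SINGLE_PATTERN = [
    ("allow", "/abs/"),
    ("disallow", "/abs/*/*"),
]

print(is_allowed(STAR_GROUP, "/abs/EXAMPLE_ID"))                 # True
print(is_allowed(STAR_GROUP, "/abs/EXAMPLE_ID/citations"))       # False
print(is_allowed(STAR_GROUP, "/abs/10.1234/example"))            # True
print(is_allowed(SINGLE_PATTERN, "/abs/10.1234/example"))        # False
```

The last two calls illustrate the concern in the robots.txt comment: an identifier that itself contains a slash, as DOIs do, would be caught by a blanket `/abs/*/*` rule, while the explicit per-suffix rules leave it crawlable.
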
https://ui.adsabs.harvard.edu/.well-known/acme-challenge: 404 text/html
https://ui.adsabs.harvard.edu/.well-known/csvm: 404 text/html
https://ui.adsabs.harvard.edu/.well-known/nostr.json: 404 text/html
https://ui.adsabs.harvard.edu/.well-known/security.txt: 404 text/html
https://ui.adsabs.harvard.edu/.well-known/traffic-advice: 404 text/html
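
Lines in the `url: status content-type` shape shown above could be produced by a small probe along these lines. This is a sketch under stated assumptions, not the tool that generated the listing: the function name `probe`, the 10-second timeout, and the injectable `opener` parameter are all hypothetical. It relies only on `urllib.request.urlopen` and `urllib.error.HTTPError` from the standard library.

```python
import urllib.error
import urllib.request


def probe(url, opener=urllib.request.urlopen):
    """Issue one GET request and return 'url: status content-type'."""
    try:
        with opener(url, timeout=10) as resp:
            status, headers = resp.status, resp.headers
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the error object still carries code and headers.
        status, headers = err.code, err.headers
    # get_content_type() drops parameters such as "; charset=utf-8".
    return f"{url}: {status} {headers.get_content_type()}"


if __name__ == "__main__":
    # Requires network access; the path is one of those listed above.
    print(probe("https://ui.adsabs.harvard.edu/.well-known/security.txt"))
```

The `opener` argument exists only so the function can be exercised without network access by passing a stand-in callable.
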