# Hello fellow programmers
# thank you for even being aware of robots.txt
#
# http://en.wikipedia.org/wiki/Robots_exclusion_standard
#
# The policy at SNPedia.com is simple
#
# Bots are allowed to scrape anything below /index.php/
# such as
# /index.php/Gout
# /index.php/Rs728404
# /index.php/Rs6413458(T;T)
#
# Bots are NOT ALLOWED to scrape anything below /index.php?
# such as
# /index.php?title=Rs5742904
# /index.php?title=Rs5742904&action=formedit
#
# because there is an infinite number of such pages, many of them
# are quite CPU demanding and they don't cache well.
#
# If you need that sort of stuff, you need to read
# http://snpedia.com/index.php/Bulk
# and
# http://bots.snpedia.com/api.php
# (a commented Python sketch at the bottom of this file shows one way
# to pre-filter URLs against these rules)

User-agent: *
Disallow: /index.php?
Disallow: /api.php?
Disallow: /images/
Disallow: /skins/
Disallow: /index.php/Special
Disallow: /cgi-bin/
Disallow: /files/promethease/extra
Disallow: *&
crawl-delay: 3

#recently commented out again
#User-agent: Mediapartners-Google
#Disallow:

User-agent: ia_archiver
Allow: /*&action=raw

User-agent: Exabot
Disallow: /

User-agent: XoviBot
Disallow: /

User-agent: Scrapy
Disallow: /

User-agent: rogerbot
Disallow: /

# You might also find this helpful
# http://en.wikipedia.org/wiki/Sitemaps
Sitemap: http://snpedia.com/sitemap/sitemap-index-snpediadb.xml

# too many 404s
User-agent: AhrefsBot
Crawl-Delay: 90
# will allow one page every 90 seconds
# but if I'm still annoyed later...
#User-agent: AhrefsBot
#Disallow: /
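
# Below is a minimal, unofficial sketch of how a polite crawler might
# pre-filter URL paths against the User-agent: * rules above before fetching.
# The helper is_allowed() and the DISALLOWED_PREFIXES tuple are illustrative
# assumptions for this example, not part of SNPedia or any library; the code
# (Python) is kept inside comments so this file remains a valid robots.txt.
#
#   DISALLOWED_PREFIXES = (
#       "/index.php?", "/api.php?", "/images/", "/skins/",
#       "/index.php/Special", "/cgi-bin/", "/files/promethease/extra",
#   )
#
#   def is_allowed(path: str) -> bool:
#       """True if the User-agent: * rules above permit fetching this path."""
#       if "&" in path:                       # Disallow: *&
#           return False
#       return not path.startswith(DISALLOWED_PREFIXES)
#
#   is_allowed("/index.php/Rs728404")           # True  (article page)
#   is_allowed("/index.php?title=Rs5742904")    # False (query-string page)
#
# And please honour the crawl-delay of 3 seconds between requests.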