Is yahoo slurp misbehaving?
Complaints
Webmasters complain that yahoo slurp is not tripping the filter given in robots.txt for the following
User-agent: msnbot
Disallow: /bloop/
Disallow: /blop/
User-agent: googlebot
Disallow: /bloop/
Disallow: /blop/
User-agent: Slurp
Disallow: /bloop/
Disallow: /blop/
User-agent: *
Disallow: /shop/
Disallow: /forum/
Disallow: /cgi-bin/
Yahoo slurp obeys the agent specific rule and hence does not crawl the directories /bloop/ and /blop/directories where as it crawled the directories in the generic rule.
Discussions and Suggestions
Webmasters figured out that not only yahoo but even other search engines behaved in the same manner.
All the SE BOTS and slurps do not obey the Generic rule.
Suggestion 1: Put the wildcard ones first that has generic rule to less specific bots and then place the agent specific code.
Suggestion 2: main theme behind this is that the major bots get to their corresponding user agents and the rest carry on with wild card user agents.
Conclusion
All bots do trip and filter the specifications. They are tripped by the server configuration.
The fact is that major bots and slurps get to their respective user agents and the rest continue with the generic rule.
The exact rule for the above case would be
User-agent: Slurp
Disallow: /bloop/
Disallow: /blop/
Disallow: /shop/
Disallow: /forum/
Disallow: /cgi-bin/
Disallow: /badbottrap/
User-agent: msnbot
Disallow: /bloop/
Disallow: /blop/
Disallow: /shop/
Disallow: /forum/
Disallow: /cgi-bin/
Disallow: /badbottrap/
User-agent: googlebot
Disallow: /bloop/
Disallow: /blop/
Disallow: /shop/
Disallow: /forum/
Disallow: /cgi-bin/
Disallow: /badbottrap/
User-agent: *
Disallow: /shop/
Disallow: /forum/
Disallow: /cgi-bin/
Disallow: /badbottrap/
This can be shortened to
User-agent: Slurp
User-agent: msnbot
User-agent: googlebot
Disallow: /bloop/
Disallow: /blop/
Disallow: /shop/
Disallow: /forum/
Disallow: /cgi-bin/
Disallow: /badbottrap/
User-agent: *
Disallow: /shop/
Disallow: /forum/
Disallow: /cgi-bin/
Disallow: /badbottrap/
No comments yet.
Leave a comment
Blogroll
Categories
- AI Search & SEO
- author rank
- Authority Trust
- Bing search engine
- blogger
- CDN & Caching.
- Content Strategy
- Core Web Vitals
- Experience SEO
- Fake popularity
- gbp-optimization
- Google Adsense
- Google Business Profile Optimization
- google fault
- google impact
- google Investigation
- google knowledge
- Google panda
- Google penguin
- Google Plus
- Google Search Console
- Google Search Updates
- Google webmaster tools
- google-business-profile
- google-maps-ranking
- Hummingbird algorithm
- infographics
- link building
- Local SEO
- local-seo
- Mattcutts Video Transcript
- Microsoft
- Mobile Performance Optimization
- Mobile SEO
- MSN Live Search
- Negative SEO
- On-Page SEO
- Page Speed Optimization
- pagerank
- Paid links
- Panda and penguin timeline
- Panda Update
- Panda Update #22
- Panda Update 25
- Panda update releases 2012
- Penguin Update
- Performance Optimization
- Sandbox Tool
- search engines
- SEO
- SEO Audits
- SEO Audits & Monitoring
- SEO cartoons comics
- seo predictions
- SEO Recovery & Fixes
- SEO Reporting & Analytics
- seo techniques
- SEO Tips & Strategies
- SEO tools
- SEO Trends 2013
- seo updates
- Server Optimization
- Small Business Marketing
- social bookmarking
- Social Media
- SOPA Act
- Spam
- Technical SEO
- Uncategorized
- User Experience (UX)
- Webmaster News
- website
- Website Security
- Website Speed Optimization
- Yahoo




