Search Engine Optimization Services
 

 

Tuesday, March 25, 2008

MSN crawler team upgrades its crawling software

The MSN Live Search team has upgraded its web crawler, which crawls millions of pages per day. The upgrade adds some interesting new features that reduce the load the crawler places on web servers.

Two important features, as described on the MSN search blog:


HTTP Compression: HTTP compression allows faster transmission time by compressing static files and application responses, reducing network load between your servers and our crawler. We support the most common compression methods: gzip and deflate as defined by RFC 2616 (see sections 14.11 and 14.39). Compression is currently supported by all major browsers and search engines. Use this online tool to check your server for HTTP compression support.
The following links provide configuration information for IIS and Apache:
Configure Compression in IIS
Configure Apache using GZIP or using deflate
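The negotiation described above can be sketched from the client side. This is a minimal illustration, not the crawler's actual code: the names `REQUEST_HEADERS` and `decode_body` are hypothetical, and it assumes the server reports the method it used in the `Content-Encoding` response header.

```python
import gzip
import zlib

# Header a client (or crawler) sends to advertise the encodings it accepts.
# Hypothetical constant, shown only to illustrate the request side.
REQUEST_HEADERS = {"Accept-Encoding": "gzip, deflate"}


def decode_body(content_encoding, body):
    """Return the uncompressed bytes of a response body.

    content_encoding is the value of the Content-Encoding response header,
    or None when the server sent the body uncompressed.
    """
    encoding = (content_encoding or "identity").strip().lower()
    if encoding == "identity":
        return body
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        try:
            # The standard form: a deflate stream inside a zlib wrapper.
            return zlib.decompress(body)
        except zlib.error:
            # Some servers send a raw deflate stream without the wrapper.
            return zlib.decompress(body, -zlib.MAX_WBITS)
    raise ValueError("unsupported Content-Encoding: " + encoding)
```

A caller would pass the header value and the raw bytes it received, for example `decode_body("gzip", raw_bytes)`. Comparing `len(raw_bytes)` with the length of the decoded result gives a rough idea of how much bandwidth compression saves on a given page.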
Conditional Get: We support conditional GET as defined by RFC 2616 (Section 14.25). Generally, we will not download a page unless it has changed since the last time we crawled it. As per the standard, our crawler will include the "If-Modified-Since" header with the time of the last download in the GET request and, when available, the "If-None-Match" header with the ETag value. If the content hasn't changed, the web server will respond with a 304 HTTP response.
To check if your site already supports the "If-Modified-Since" HTTP header, you can use this online tool to check your server for HTTP Conditional Get support. Alternatively, you can check using Fiddler for Internet Explorer, or Live Headers for Firefox. Each of these tools allows you to create a custom GET request and send it to your server. You'll want to make sure that your request includes the "If-Modified-Since" header like the following simplified sample:
GET /sa/3_12_0_163076/webmaster/webmaster_layout.css HTTP/1.1
Host: webmaster.live.com
If-Modified-Since: Tue, 22 Jan 2008 01:28:49 GMT
You should receive a server response similar to the following simplified sample:
HTTP/1.x 304 Not Modified
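The exchange above can be sketched from the server side as a small decision function. This is an illustration under stated assumptions rather than how any particular web server implements it: the name `conditional_status` is hypothetical, request headers are assumed to arrive as a plain dict with exactly these key spellings, and ETags are compared as exact strings (no weak-validator handling).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(request_headers, last_modified, etag=None):
    """Return 304 if the client's cached copy is still current, else 200.

    last_modified must be a timezone-aware datetime for the resource.
    """
    # When the client sends an ETag, it takes precedence over the date check.
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None and etag is not None:
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        return 304 if (etag in candidates or "*" in candidates) else 200

    since = request_headers.get("If-Modified-Since")
    if since:
        try:
            since_dt = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            return 200  # unparseable date: ignore the header, send the page
        if since_dt.tzinfo is None:
            since_dt = since_dt.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution, so drop microseconds.
        if last_modified.replace(microsecond=0) <= since_dt:
            return 304
    return 200
```

With the sample request shown earlier, a resource last modified at 01:28:49 GMT on 22 Jan 2008 would yield 304, while a resource changed after that time would yield 200 and the full page.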
Check out MSDN for more information on using


 

 

 

 

Search Engine Genie is an Ethical Search Engine Optimization Company Specializing in Search Engine Marketing, Search Engine Promotion and Search Engine Ranking Services.