Search Engine Optimization Services
 

 

Sunday, April 20, 2008

Crawl Date in Google’s Cache: Matt Cutts Video Transcript



OK everybody, we have a new illustration today. Vanessa Fox of the Google Webmaster Central blog talked about this: some people like to learn visually, some people like to learn from screenshots, so I thought I'd make a little movie. This is going to be a multimedia presentation, and the two media we are using today are Skittles and peanut butter M&Ms. So let's talk about Googlebot and how it crawls the web. First off, what do the red M&Ms represent? Well, everyone knows red is bad, so these are going to be 404s. Googlebot is crawling around the web, it sees a 404 and sucks it down, and then later on it will come back and check that URL again.

So what do the purple ones mean? Well, everybody knows purple means an HTTP status code of 200 OK; that's the only thing it could possibly represent. In other words, Googlebot comes along, sucks up the page, and we got the page just fine. So we've got a 404 and a couple of HTTP 200s, so life is pretty good. Next, let's talk about the cache crawl date and what it represents. You may not be able to tell easily, but this one is purple, then we've got two greens, a purple, and the rest are green.

So what do you think the green ones represent? Everybody knows the green ones are great, they're the good ones, so green represents a status code of 304. A browser or Googlebot comes to a page and says: hey, I want a copy of this page, or you can just tell me whether the page has been modified since I last indexed it. If the page has not been modified since a certain date, the server can send a 304 status back saying that the page hasn't changed, and all Googlebot has to do is skip that page.

So this is what Googlebot does, and this is going forward in time. We crawl a page and get a 200. The next two times Googlebot crawls the page it gets a 304, which is the answer to the If-Modified-Since request saying that the page hasn't really changed. Later on, here, the webmaster actually changed the page, and we see this purple one, which means the page has changed since the last crawl, and now we get a 200 since the page is actually fetched.
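The 200/304 exchange described above can be sketched in a few lines of Python. This is only an illustration of how a web server might answer a conditional request, not Googlebot's or any real server's code; the function name `conditional_get` and the placeholder page body are assumptions made up for the example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(last_modified, if_modified_since=None):
    """Hypothetical server-side handler: return (status, body).

    last_modified      -- timezone-aware datetime of the page's last change
    if_modified_since  -- value of the If-Modified-Since request header, or None
    """
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        if last_modified <= since:
            # Page unchanged since the crawler's copy: send no body.
            return 304, None
    # First visit, or the page changed: send the full page.
    return 200, "<html>page content</html>"


# The crawler remembers when it last fetched the page and sends that date.
last_fetch = datetime(2008, 4, 10, tzinfo=timezone.utc)
header = format_datetime(last_fetch, usegmt=True)

page_changed_on = datetime(2008, 4, 1, tzinfo=timezone.utc)
status, body = conditional_get(page_changed_on, header)
```

Here the page was last changed before the crawler's previous fetch, so `status` is 304 and no body is sent. If the webmaster edits the page after that fetch, the same request gets a 200 with the full page.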

Going forward, the page didn't change, so the web server is smart enough to return a 304 status code for each of Googlebot's visits. Now, the interesting thing is that if you check Google's cached copy of the page, it shows the date the page was last retrieved. Until recently, even though we checked on this date and this date, the cached page would still show the very first time that we fetched the page. When we actually fetched the page again, it would show this cache crawl date, and that would continue; if the page hadn't changed for maybe six months, we would still show the old cache crawl date.

So the change in policy is this: if we check on this date and on this date to see whether the page has changed, we will now show that date as the cache crawl date. In other words, as Googlebot comes along, a page in the cache used to be able to look pretty old. Now, whether or not the page has changed, we update the crawl date on the cached page, so pages look fresher in the cache. The date now reflects the fact that we have recently checked whether the page has changed.
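The difference between the old and new cache crawl date can be illustrated with a small simulation. This is a sketch under assumptions, not Google's implementation: the function `cache_crawl_dates`, the crawl log, and its dates are invented for the example. The old policy reports the last crawl that returned 200, and the new policy reports the last crawl that returned either 200 or 304.

```python
from datetime import date


def cache_crawl_dates(crawls):
    """Return (old_policy_date, new_policy_date) for a crawl log.

    crawls -- list of (date, status) tuples in chronological order
    """
    last_fetched = None  # last time the page body was downloaded (200)
    last_checked = None  # last time the page was fetched or confirmed unchanged
    for day, status in crawls:
        if status == 200:
            last_fetched = day
            last_checked = day
        elif status == 304 and last_fetched is not None:
            # Unchanged page: only the "checked" date moves forward.
            last_checked = day
        # Other statuses (such as 404) leave both dates alone.
    return last_fetched, last_checked


# Hypothetical crawl log matching the sequence in the video:
# 200, two 304s, 200 after the webmaster's edit, then more 304s.
log = [
    (date(2008, 1, 5), 200),
    (date(2008, 1, 20), 304),
    (date(2008, 2, 3), 304),
    (date(2008, 2, 18), 200),
    (date(2008, 3, 4), 304),
    (date(2008, 4, 1), 304),
]
old_date, new_date = cache_crawl_dates(log)
```

With this log, the old policy would show February 18 (the last full fetch), while the new policy shows April 1 (the last time the page was checked).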

Search Engine Genie is an Ethical Search Engine Optimization Company Specializing in Search Engine Marketing, Search Engine Promotion and Search Engine Ranking Services.