{"id":394,"date":"2008-04-03T10:48:00","date_gmt":"2008-04-03T14:48:00","guid":{"rendered":"http:\/\/www.searchenginegenie.com\/blog-seo\/maximum-page-limits-search-engine-crawlers-can-crawl\/"},"modified":"2012-09-20T03:16:39","modified_gmt":"2012-09-20T07:16:39","slug":"maximum-page-limits-search-engine-crawlers-can-crawl","status":"publish","type":"post","link":"https:\/\/www.searchenginegenie.com\/blog-seo\/maximum-page-limits-search-engine-crawlers-can-crawl\/","title":{"rendered":"Maximum page limits Search Engine crawlers can crawl"},"content":{"rendered":"<p>One of the most frequently asked questions in forums and on message boards is how far into a page a search engine will crawl: what is the maximum size of an HTML or PDF file that the top search engines can index?<\/p>\n<p>1. Google: According to our latest research, Google crawls and caches at most 1 MB per file (excluding images and graphics). The limit used to be 100 KB, was raised to 250 KB, then to 500 KB, and as of the latest update is 1 MB per file.<\/p>\n<p>2. Yahoo: Yahoo overtakes Google by a long way. Its indexing and caching limit is 5 MB; see the screenshots below.<\/p>\n<p><img decoding=\"async\" src=\"http:\/\/www.searchenginegenie.com\/images-company\/yahoo-4mb.gif\" \/><\/p>\n<p><img decoding=\"async\" src=\"http:\/\/www.searchenginegenie.com\/images-company\/yahoo-5mb.gif\" \/><\/p>\n<p>3. MSN Search: MSN is very unpredictable, but in our experiment it cached up to 3 MB. We did not test this any further; someone on their search quality team can probably answer that.<\/p>\n<p>I don't think we need to worry about any other search engines. I am sure this data will be useful to someone out there at some point. I know we have some PDFs and large documents that need to be indexed. 
It is very important that we know the cache limit for these.<\/p>\n<p>SEO Blog Team<\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of the most frequently asked questions in forums and on message boards is how far into a page a search engine will crawl: what is the maximum size of an HTML or PDF file that the top search engines can index? 1. Google: According to our latest research, Google crawls and caches at most [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[10],"tags":[],"class_list":["post-394","post","type-post","status-publish","format-standard","hentry","category-search-engines"],"_links":{"self":[{"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/posts\/394","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/comments?post=394"}],"version-history":[{"count":1,"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/posts\/394\/revisions"}],"predecessor-version":[{"id":1336,"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/posts\/394\/revisions\/1336"}],"wp:attachment":[{"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/media?parent=394"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/categories?post=394"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/tags?post=394"}],"curies":[{"name":"wp","href":"https:\/\
/api.w.org\/{rel}","templated":true}]}}