{"id":899,"date":"2011-05-10T07:18:01","date_gmt":"2011-05-10T11:18:01","guid":{"rendered":"http:\/\/www.searchenginegenie.com\/blog-seo\/?p=899"},"modified":"2011-05-10T07:18:01","modified_gmt":"2011-05-10T11:18:01","slug":"how-can-i-optimize-for-deep-web-crawling","status":"publish","type":"post","link":"https:\/\/www.searchenginegenie.com\/blog-seo\/how-can-i-optimize-for-deep-web-crawling\/","title":{"rendered":"How can I optimize for &#8220;deep web&#8221; crawling?"},"content":{"rendered":"<p style=\"text-align: center;\"><object classid=\"clsid:d27cdb6e-ae6d-11cf-96b8-444553540000\" width=\"560\" height=\"349\" codebase=\"http:\/\/download.macromedia.com\/pub\/shockwave\/cabs\/flash\/swflash.cab#version=6,0,40,0\"><param name=\"allowFullScreen\" value=\"true\" \/><param name=\"allowscriptaccess\" value=\"always\" \/><param name=\"src\" value=\"http:\/\/www.youtube-nocookie.com\/v\/Ob4hLE9_Etk?fs=1&amp;hl=en_US&amp;rel=0\" \/><param name=\"allowfullscreen\" value=\"true\" \/><embed type=\"application\/x-shockwave-flash\" width=\"560\" height=\"349\" src=\"http:\/\/www.youtube-nocookie.com\/v\/Ob4hLE9_Etk?fs=1&amp;hl=en_US&amp;rel=0\" allowscriptaccess=\"always\" allowfullscreen=\"true\"><\/embed><\/object><\/p>\n<p style=\"text-align: justify;\">We have a question from Brighton. Danny asks: \u201cWhat are Google\u2019s plans for indexing the deep web? Are there best practices for form construction to optimize for this?\u201d<\/p>\n<p style=\"text-align: justify;\">Great question! We recently published a paper at VLDB, which I believe stands for Very Large Data Bases, that talks about exactly our criteria, although we tried to do it safely, so if there are people who don\u2019t want their forms to be crawled, we won\u2019t crawl them. There are various simple things that you can do. Rather than having a text field that has to be filled out, like a zip code, if you could make it a drop-down, for example, that\u2019s much more helpful. 
If you could make it so that it\u2019s not a huge form with 20 things to fill out, but more like one or two drop-downs, that\u2019s going to be a lot easier as well. I definitely encourage you to go read the paper; there\u2019s nothing super-duper confidential in it. And of course, if you can make it so that you are not part of the deep web at all, you can take the pages in your database and build an HTML site map so that people can reach all the different pages on your site by crawling through categories or geographic areas. Then we don\u2019t have to fill out forms. Google is pretty good at indexing the deep web through forms, but not every search engine does that. So if you can expose that database somewhere where people can get to all the pages on your site just by clicking, not by submitting a form, then you are going to open yourself up to an even wider audience. If you can do that, that\u2019s what I recommend. But if you can\u2019t, then check out this paper from the VLDB conference, where the team talks about it in more detail.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We have a question from Brighton. Danny asks: \u201cWhat are Google\u2019s plans for indexing the deep web? Are there best practices for form construction to optimize for this?\u201d Great question! 
We recently published a paper at VLDB, which I believe stands for Very Large Data Bases, that talks about exactly our criteria, although [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[14],"tags":[],"class_list":["post-899","post","type-post","status-publish","format-standard","hentry","category-mattcutts-video-transcript"],"_links":{"self":[{"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/posts\/899","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/comments?post=899"}],"version-history":[{"count":2,"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/posts\/899\/revisions"}],"predecessor-version":[{"id":916,"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/posts\/899\/revisions\/916"}],"wp:attachment":[{"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/media?parent=899"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/categories?post=899"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.searchenginegenie.com\/blog-seo\/wp-json\/wp\/v2\/tags?post=899"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}
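The two suggestions in the transcript, offering a drop-down instead of a free-text field and exposing database pages through an HTML site map grouped by category or area, might look roughly like the sketch below. It is only an illustration: the `LISTINGS` records, the URLs, and the helper names `render_select` and `render_sitemap` are hypothetical and do not come from the video or the VLDB paper.

```python
from collections import defaultdict
from html import escape

# Hypothetical records standing in for rows of a site's database.
LISTINGS = [
    {"category": "Brighton", "title": "Seafront Cafe", "url": "/listings/1"},
    {"category": "Brighton", "title": "Pier Bookshop", "url": "/listings/2"},
    {"category": "London", "title": "Canal Bakery", "url": "/listings/3"},
]


def render_select(name, options):
    """Render a drop-down so every valid value is listed in the markup."""
    items = "".join(
        f'<option value="{escape(value)}">{escape(value)}</option>'
        for value in options
    )
    return f'<select name="{escape(name)}">{items}</select>'


def render_sitemap(listings):
    """Render an HTML site map: one heading per category, one link per page."""
    by_category = defaultdict(list)
    for row in listings:
        by_category[row["category"]].append(row)

    parts = []
    for category in sorted(by_category):
        parts.append(f"<h2>{escape(category)}</h2>")
        parts.append("<ul>")
        for row in by_category[category]:
            href = escape(row["url"])
            title = escape(row["title"])
            parts.append(f'<li><a href="{href}">{title}</a></li>')
        parts.append("</ul>")
    return "\n".join(parts)
```

With a page like the one `render_sitemap(LISTINGS)` produces, every record is reachable by following plain links, so a crawler never has to submit a form to find it.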