How can I optimize for “deep web” crawling?

We have a question from Brighton. Danny asks: “What are Google’s plans for indexing the deep web? Are there best practices for form construction to optimize for this?”

Great question! We recently published a paper in VLDB, which I believe stands for Very Large Data Bases, that talks about exactly our criteria. All the way through, we tried to do it safely, so if there are people who don’t want their forms to be crawled, we won’t crawl them.

So there are various simple things that you can do. Rather than having a text field that has to be filled out, like a zip code, if you could make it a drop-down, for example, that’s much more helpful. If you could make it so that it’s not a huge form with 20 things to fill out but more like one drop-down or two drop-downs, that’s going to be a lot easier as well. I definitely encourage you to go read the paper; there’s nothing super-duper confidential in it.

And of course, if you can make it so that you are not part of the deep web at all, that’s better still. You can take the pages in your database and build an HTML site map, so that people can reach all the different pages on your site by clicking through categories or geographic areas, and then we don’t have to fill out forms. Google is pretty good at indexing the deep web through forms, but not every search engine does that. So if you can expose that database somewhere where people can get to all the pages on your site just by clicking, not by submitting a form, then you are going to open yourself up to an even wider audience.

If you can do that, that’s what I recommend. But if you can’t, then I’d say check out this paper from the VLDB conference, where the team talks about it in more detail.
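To make the site map suggestion concrete, here is a minimal sketch of how database records might be turned into a plain HTML site map grouped by category, so every page is reachable through ordinary links. The function name `build_html_sitemap`, the `(category, title, url)` record shape, and the sample URLs are all assumptions made for illustration; they do not come from the talk or from the VLDB paper.

```python
from collections import defaultdict
from html import escape


def build_html_sitemap(records):
    """Return a nested <ul> linking to every record, grouped by category.

    `records` is an iterable of (category, title, url) tuples. This shape is
    a hypothetical stand-in for rows pulled from the site's own database.
    """
    by_category = defaultdict(list)
    for category, title, url in records:
        by_category[category].append((title, url))

    lines = ["<ul>"]
    for category in sorted(by_category):
        lines.append(f"  <li>{escape(category)}")
        lines.append("    <ul>")
        for title, url in sorted(by_category[category]):
            # A plain anchor tag: a crawler can follow it without
            # submitting any form.
            link = f'<a href="{escape(url, quote=True)}">{escape(title)}</a>'
            lines.append(f"      <li>{link}</li>")
        lines.append("    </ul>")
        lines.append("  </li>")
    lines.append("</ul>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample rows, grouped here by geographic area.
    sample = [
        ("Brighton", "Hotels", "/brighton/hotels"),
        ("Austin", "Cafes & Bars", "/austin/cafes"),
    ]
    print(build_html_sitemap(sample))
```

The grouping key could just as well be a geographic area, a product type, or anything else that splits the database into browsable sections; the point is only that each page ends up behind a link rather than behind a form.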
