Why does Google index blogs faster than other sites?

Lots of questions from the UK! Lee Willis from Cumbria, UK asks “Why does Google crawl/index blogs (specifically sites notified by ‘WordPress XMLRPC pings’) so much faster than a ‘normal’ site submitting a revised Sitemap? What is the impact of that on the overall ‘quality’ of the index?”

Well, we always try to maximize the quality, the relevance and the accuracy of our index. You want to make a distinction between crawling and indexing, because a Sitemap submission does not guarantee that we will crawl the URLs on that list. It is very helpful for discovering new URLs or for making canonicalization decisions, but we don’t guarantee that if you submit a Sitemap we will go ahead and crawl it. Some people have run experiments where they saw that happen, but I’m not going to confirm or deny it, because the policies on exactly how we use Sitemap submissions can always change. Crawling and indexing are different things. If you do a ping, a lot of the time Google will come and crawl you, but often it’s Google Blog Search, because the sites sending WordPress, weblog or FeedBurner pings are usually blogs. So Blog Search can come and crawl you five minutes later, but then you might show up in the Blog Search corpus and not in our main web index corpus. Just because you get crawled doesn’t mean that you are getting any sort of index boost or anything like that. We do try to rationally decide what the best quality of data is and how we get it. Sometimes that means crawling stuff immediately, as with Blog Search, where you have very fast, very real-time results. And sometimes it means taking Sitemaps, which might result in crawling at a different pace, or you may not get any boost at all, but we do use that information rather than waste it, to help us improve canonicalization and the quality of the index. So I wouldn’t say that pinging is the way to automatically get crawled or anything like that. If you make great content and get to be well known, we will probably crawl you relatively frequently and see updated content any time you make a good change.
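
For readers unfamiliar with the pings mentioned in the question, here is a minimal sketch that builds (but does not send) the XML-RPC request body a blog platform would typically post to a ping service. The method name `weblogUpdates.ping` is the conventional weblog ping call; the blog name, the URL and the helper name `build_ping_payload` are placeholders for illustration, and nothing here describes how Google processes such pings.

```python
import xmlrpc.client


def build_ping_payload(blog_name, blog_url):
    """Return the XML-RPC request body for a weblogUpdates.ping call.

    Nothing is sent over the network; this only serializes the call.
    """
    return xmlrpc.client.dumps((blog_name, blog_url),
                               methodname="weblogUpdates.ping")


# Placeholder blog details, for illustration only.
payload = build_ping_payload("Example Blog", "http://example.com/")
print(payload)
```

A Sitemap, by contrast, is a file the site publishes for crawlers to fetch; as the answer notes, neither mechanism guarantees a crawl or any place in the main web index.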

Are Google SERPs moving to Ajax?

Here’s a question from Owen in London, who asks “Can you confirm if the Google SERPs are moving to AJAX, http://tinyurl.com/be5shp? If so, do you think it’ll affect analytics which rely on the keyword information being in the URL?”

So, Google did roll out a change a few weeks ago. For a very small percentage of users, under 1% right now, we are serving what you might call JavaScript-enhanced search results. You show up on Google’s page and, as you are typing, JavaScript can do neat things to make the experience faster and smoother for users; there’s a lot of really smart stuff you can do. The team really didn’t think about referrers and how that might break analytics packages and other things downstream. So this is being trialed on a very small percentage of people, and people are thinking about whether there are ways to keep referrers working, because referrers are so useful. Maybe ten years from now referrers won’t work in the conventional browser sense, and browsers could return everything after the pound sign. The part after the # mark isn’t officially part of the URL that gets sent to the server, but if browsers were to pass it along, that would help all sorts of referrer-based analytics packages. So the way I think about it right now is that we have to try experiments on how to make search results better, faster and cleaner. It’s not the intent to break referrers, but we have to keep trying out new things, and we do want analytics packages to be able to continue to work.
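
The referrer issue comes down to where the keyword sits in the URL. The sketch below, using Python’s standard `urllib.parse`, shows that a keyword in the query string survives when the fragment is stripped (as browsers do when they send a referrer), while a keyword after the `#` does not. The two URLs and the helper names are illustrative assumptions, not Google’s actual URL formats.

```python
from urllib.parse import urldefrag, urlsplit, parse_qs


def as_referrer(url):
    """Approximate what a browser sends as the referrer: the URL minus its fragment."""
    return urldefrag(url).url


def keywords_from_referrer(referrer):
    """Read the 'q' parameter from the query string, as many analytics tools do."""
    return parse_qs(urlsplit(referrer).query).get("q", [])


classic = "http://www.google.com/search?q=blue+widgets"   # keyword in the query string
ajax_style = "http://www.google.com/#q=blue+widgets"      # keyword after the # mark

print(keywords_from_referrer(as_referrer(classic)))       # ['blue widgets']
print(keywords_from_referrer(as_referrer(ajax_style)))    # []
```

If browsers passed the fragment along, an analytics package could parse it the same way it parses the query string today.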

More than one H1 on a page: good or bad?

A very short, to-the-point question from Erin, south of Boston. Erin asks “More than one H1 on a page: good or bad?”

Well, if there’s a logical reason to have multiple sections, then it’s not bad to have multiple H1s. I would be careful about overdoing it: if your entire page is H1, that looks pretty cruddy. So don’t make everything an H1 and then use CSS to make it look like regular text. We see competitors complain about that, and if users ever turn off CSS, or if the CSS doesn’t load, it looks really bad. It’s all right to have a little bit of H1 here, and then maybe there are two sections on a page, so you have a little bit of H1 there. But you should really use it for headers or headings, which is the intent, and not just throw H1 everywhere you can on the page. People have tried to abuse that, so our algorithms try to take it into account, and it doesn’t really do you that much good. So I would use it where it makes sense, and sparingly, but you can have it multiple times.
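
As a quick self-check for the advice above, here is a minimal sketch that counts the H1 elements on a page with Python’s standard `html.parser`. The class and function names are made up for this example, and the answer gives no numeric limit, so what counts as “too many” is left to the reader’s judgment.

```python
from html.parser import HTMLParser


class HeadingCounter(HTMLParser):
    """Counts opening <h1> tags (HTMLParser lowercases tag names)."""

    def __init__(self):
        super().__init__()
        self.h1_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.h1_count += 1


def count_h1(html_text):
    parser = HeadingCounter()
    parser.feed(html_text)
    return parser.h1_count


page = "<h1>Products</h1><p>...</p><h1>Support</h1><p>...</p>"
print(count_h1(page))  # 2
```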

Will Google add guest accounts to Webmaster Tools?

Here is a question from Ian M from the United Kingdom. “Is Google planning to create read-only ‘Guest accounts’ for Webmaster Tools? Many clients (particularly in heavily regulated industries e.g. banks) are very reluctant to provide access to a third party.”

Great feature suggestion! I have no idea! The Webmaster Tools team has to plan its resources and what it works on just like any other team. I can see a valid use for this; at the same time, there are other things the tools folks are working on that are really useful. Some people want infrastructure updates so that backlink reports are always rock solid or new data is really, really fresh, and it’s hard to trade those off. So it’s a valid suggestion and I appreciate it. I don’t know what level of priority they give it, because it probably has a relatively limited impact compared to making reports rock solid or overhauling our UIs, things that are going to be useful for every single person and not just a smaller fraction. But that’s something I can imagine us doing in the future, so we’ll definitely take it into account, and we appreciate the suggestion.

Does the position of keywords in the URL affect ranking?

Interesting question from Adeel from Manchester, UK: “Does the position of keywords in the URL have a significant impact: example.com/keyword/London is better than example.com/London/keyword?”

Truthfully, I wouldn’t obsess about it at that level of detail. It does help a little bit to have keywords in the URL, but it doesn’t help so much that you should go stuffing a ton of keywords into your URL. If there is a convenient way that is good for users, where you have four or five keywords, that might be worthwhile. But I wouldn’t obsess about how deep in the path the keyword is or how you combine them. For example, on my blog, when I do a post I’ll take the first four or five words, or two or three words related to that post, and use that as the URL. But you don’t need 7, 8, 10 or 20 words, because that just looks spammy to users, and people will probably not click through as much in the first place. So the position of keywords in URLs is going to be a very, very second-order kind of thing. I would not worry about that so much as having great content that people want to link to and want to find out about.
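
The “first few words of the post” approach is easy to automate. The following is a minimal sketch, not the code any particular blog uses: the function name `slug_from_title` and the default of five words are assumptions chosen to match the four-or-five-word range mentioned above.

```python
import re


def slug_from_title(title, max_words=5):
    """Build a short URL slug from the first few words of a post title."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words[:max_words])


title = "Does the position of keywords in the URL affect ranking?"
print(slug_from_title(title))               # does-the-position-of-keywords
print(slug_from_title(title, max_words=3))  # does-the-position
```

Capping the word count keeps the URL short and readable, which is the point of the advice: a slug with 20 words helps nobody.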

Is redirecting a large number of domains suspicious?

Cweave from Dallas asks a really interesting question: “When permanently redirecting (301) a large number of domains (read: more than 10) to 1 domain does Google flag this as suspicious? What considerations does Google look at? For the purposes of this question let’s assume this as a consolidation move.”

I think there are plenty of valid reasons why somebody might do this. Take Google, for example: a ton of people have registered Google typos, and we try to get hold of those because we don’t want people to get confused or to get malware. So we end up with a portfolio of lots of Google-related domains, even things like Google sex and Google porn. I think it’s perfectly logical for misspellings of Google and all that stuff to just do a 301 to Google’s home page, and that’s the kind of thing I think Cweave means by a consolidation move. At the same time, if we see a ton of 301s all going to one domain, then we might take a look at that. You could certainly imagine someone trying to abuse that or do spam, so we could take a second look or scrutinize it. But if all you are doing is trying to consolidate misspellings or a bunch of brands, and by brands I mean a bunch of domains you have registered that are very similar to the one domain you really use, I don’t foresee that being a problem. If someone checked it out or reported it as spam and we took a look, we’d just see, oh yeah, they are just consolidating their brand. So Google might take a look, but I don’t consider that to be a large problem.
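
A consolidation move of this kind amounts to a lookup from alias domains to one preferred host. The sketch below shows only the decision logic; the host names, the `ALIAS_HOSTS` set and the `redirect_for` helper are hypothetical, and a real site would normally configure this in its web server rather than in application code.

```python
# Hypothetical misspellings and alternate domains owned by the same site.
CANONICAL_HOST = "www.example.com"
ALIAS_HOSTS = {"exmaple.com", "www.exmaple.com", "example.net"}


def redirect_for(host, path="/"):
    """Return (status, location) for an alias host, or None if no redirect is needed."""
    if host.lower() in ALIAS_HOSTS:
        return 301, "http://" + CANONICAL_HOST + path
    return None


print(redirect_for("exmaple.com"))       # (301, 'http://www.example.com/')
print(redirect_for("www.example.com"))   # None
```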

Will Google use non-link references as a signal?

This question comes from Boston, MA; Eric Enge asks “Do you think web search will ever make use of references (web site mentions that are not links) as a ranking signal?”

So there are two answers. The first one is that I never want to take a ranking signal off the table. I’ve joked that if the phase of the moon can help us rank search results better, I’m willing to use the phase of the moon. At the same time, think about how people would attack the use of references. Right now a lot of people rely on getting links. If all they had to do was write example.com in text, they could leave it in comments all over the web, on blogs and on forums; almost anywhere there is user-generated content, people would be leaving those references. That’s the reason you might be skeptical about us using this sort of signal: people could abuse it by leaving mentions of URLs even where they can’t generate links. But I’ll say we are willing to look at it. We would run the analysis and ask whether there is a way to pull a signal out of that noisy data that improves our results. It would definitely be the sort of thing that people would try to abuse, though.

Should large corporations use rel=canonical?

Terry Cox from Orlando, Florida asks “In regards to the new canonicalization tag, does it make sense for large corporations to consider placing that tag on every page due to marketing tracking codes and large levels of duplicate URLs like faceted pages and load balancing servers?”

So this is a great question: should you put the canonical tag on every single page? There is a short-term answer and a long-term answer. The short-term answer is that I would probably say not right now. Take a little bit of time, study your site architecture, and think about URL normalization, beautification, whatever you want to call it. Think about the structure of URLs you want to have, and take a few weeks or a couple of months to assess where you want to go. I don’t think you should throw the canonical tag on every single page of your site immediately and then move it around, because it is a powerful tool and people do have the ability to shoot themselves in the foot. On the plus side, we’ve seen a quarter of a million pages using this canonicalization tag show up within just a few days, which is fantastic. It is good to see the traction and the adoption move so quickly. On the down side, we have seen one company, a very large computer company that I won’t call out by name, whose home page was doing a redirect and also had a canonical tag, and the canonical tag pointed to a page that we hadn’t crawled at all. In those sorts of cases it can be very difficult to do the right thing. We do the right thing, but it can take us a couple of days to sort it out or to go and find that URL and crawl it. So I wouldn’t just jump into the deep end of the pool without doing some planning.

The longer-term answer is that it doesn’t hurt to have this on every single page of your site. Ideally you’d find other ways to solve canonicalization, but it doesn’t hurt to say on every single page that this page maps to this canonicalized, very pretty, preferred version of the URL. What you want to make sure of is that you use absolute URLs, that it ideally resolves in one hop, and that it’s a logical system you designed rather than something you just jumped in and started playing around with. I don’t see any harm in having that sort of thing, because we’ll just follow those as what we almost think of as mini 301 redirects within the site, and we’ll try to canonicalize according to those suggestions. We don’t guarantee that we’ll do it, but it should work just fine with no problems. So feel free to do that, but take some time and plan it out a little bit.
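
To make the tracking-code case from the question concrete, here is a minimal sketch that derives one absolute, preferred URL from a tracked or load-balanced variant and renders the corresponding `<link rel="canonical">` element. The parameter list, the host name and both helper names are assumptions for illustration; which parameters are safe to drop depends entirely on the site, which is why the planning step above matters.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed tracking parameters; a real site would audit its own list.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def canonical_url(url, canonical_host="www.example.com"):
    """Drop tracking parameters and the fragment, and pin the preferred host."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key.lower() not in TRACKING_PARAMS]
    return urlunsplit((parts.scheme, canonical_host, parts.path or "/",
                       urlencode(kept), ""))


def canonical_link_tag(url):
    """Render the link element with an absolute, HTML-escaped href."""
    return '<link rel="canonical" href="%s">' % escape(canonical_url(url), quote=True)


tracked = "http://www2.example.com/shoes?utm_source=mail&color=red#top"
print(canonical_link_tag(tracked))
# <link rel="canonical" href="http://www.example.com/shoes?color=red">
```

Here the `color` parameter is kept because it presumably changes the page content, while the tracking parameter and the load-balancer host name are normalized away.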

Will Google provide a rank-checking service?

Mark Lykle, from Oslo, Norway asks “When will Google create a software similar to Web Position so that SEOs, spam fighters and regular webmaster can check rankings etc. without violating the guidelines? Why not make a better product instead of going to war against these programs?”

Well, I wouldn’t call it going to war. Our guidelines have said the same thing for five, six or seven years, which is essentially: please don’t hit us with automated queries. The reason we say that is that people do hit us with automated queries, and that takes up server capacity. So if someone is scraping Google and we know who that person is, we may write to them and say, hey, please stop scraping; it violates our guidelines and it takes server capacity, and we’d appreciate it if you wouldn’t scrape us. We also have automated systems to protect ourselves against denial-of-service attacks and scrapers. There are some viruses, Trojans and malware that try to spread themselves by searching Google for vulnerable software, so we try to find those things and block them. If something is taking a sizable amount of our server resources, we have automated systems that attempt to stop it.

That said, we do have tools. For example, in the Webmaster Tools console at google.com/webmasters you can sign in and see the sorts of words that you are ranking for and the sorts of words that people click through to your site on. I think we have a philosophy that it doesn’t do you as much good to pay a ton of attention to ranking reports. It’s much better to look at your server logs, see what queries people are really showing up for, and maybe find queries where you rank at number 4 or 5 that could rank at number 1, 2 or 3, or queries where you rank on the second page that could move to the first page. You can also look at those queries and try to improve your ROI. Say one percent of the people who land on your site convert into people who subscribe to your newsletter or buy your products. If you can improve that so that more people convert, that’s a much faster way to improve your bottom line than just trying to rank for everything when it isn’t necessarily relevant.

So I think it’s a little bit of philosophy: we don’t want to encourage people to get obsessed with their rankings when in fact they should be paying attention to what they already have in their server logs and thinking about how to convert better. That said, I would support giving people more ability to see the sorts of things they rank for in Google’s Webmaster Tools console. It’s just a question of resources: is it better for an engineer to work on something like the canonical link tag or on ranking reports? At least historically we have said, let’s have all these newer features, let’s show all of your backlinks, let’s show you what your latency looks like when Googlebot fetches your page, and not concentrate or obsess about ranking reports. That’s a little bit of background on how we feel about it.
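
The “look at your server logs” advice can be sketched in a few lines: tally the search queries found in referrer URLs, then track the conversion rate for the traffic you already get. The referrer strings, the crude `google.` host check and the function names are assumptions for illustration; real log formats and referrer contents vary.

```python
from collections import Counter
from urllib.parse import parse_qs, urlsplit


def query_counts(referrers):
    """Tally search queries (the 'q' parameter) from Google referrer URLs."""
    counts = Counter()
    for referrer in referrers:
        parts = urlsplit(referrer)
        # Crude host check, good enough for a sketch.
        if parts.hostname and "google." in parts.hostname:
            for query in parse_qs(parts.query).get("q", []):
                counts[query.lower()] += 1
    return counts


def conversion_rate(conversions, visits):
    """Fraction of visits that converted; 0.0 when there were no visits."""
    return conversions / visits if visits else 0.0


# Hypothetical referrers pulled from an access log.
referrers = [
    "http://www.google.com/search?q=blue+widgets",
    "http://www.google.co.uk/search?q=Blue+Widgets&hl=en",
    "http://www.google.com/search?q=widget+repair",
    "http://example.org/links.html",
]
print(query_counts(referrers).most_common(1))  # [('blue widgets', 2)]
print(conversion_rate(10, 1000))               # 0.01
```

With the same 1,000 visits, moving from 10 to 20 conversions doubles the result without any change in rankings, which is the trade-off the answer describes.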

Two questions about nofollow

Let’s talk a little bit about nofollow. Here are a few questions regarding this: Vince Samios from the UK asks “Do you feel the widespread and blanket use of nofollow tags is devaluing Google’s search algorithms?”

Let me interject before I finish the question. Even though SEOs may feel like nofollow is everywhere on the web, if you look at the percentage of links that have nofollow, it’s actually pretty minuscule. So nofollow links aren’t as common on the web as the perception of them might suggest.

(Continues with the question) “Examples such as Wikipedia, where all external links are nofollow. Does Wikipedia mean nothing to Google’s algorithms?”

And Jonaths from Brighton, UK asks “Do Google take into account quality factors from nofollowed links when the links come from well established authority websites, such as Wikipedia?”

We are not trusting or taking into account the links from Wikipedia, because they are nofollowed. So don’t bother spamming Wikipedia; getting a link there is not going to make any difference in search engine rankings, because that link will be nofollowed. If you have a great resource and people find it via Wikipedia, and it’s just fantastic and people link to it because of that, or you get traffic from the link in terms of direct surfers or visitors, then that might benefit your site. But it’s not going to get any search engine ranking boost just because Wikipedia links to you with those nofollow links.

Now let me take one slight detour and mention that if a particular site does trust the person who is making the link, then there are plenty of good reasons to make that link flow PageRank and take the nofollow off. For example, Wikipedia has experimented with all kinds of different ways to improve their process; maybe anonymous edits have to be approved before they go live. You could certainly imagine a scenario in which a Wikipedia editor who is very trusted, who has made a ton of edits without ever being reverted, or who other editors have vouched for, however they want to define trust, might have the nofollow taken off their links. A very simple thing to do when you are under attack from spam is to add that nofollow tag, and then it doesn’t benefit spammers anymore. But if you run a blog or a forum or Wikipedia or whatever, and you can come up with a good metric to say, OK, these are the links that we do trust, that we think are editorially given and valuable for users, then there are plenty of good reasons to go ahead and make those links flow PageRank. In general, though, nofollow links are a relatively small percentage of the web, and nofollow does prevent a lot of sites from getting spammed. We don’t use those links from Wikipedia currently, but if Wikipedia wanted to put more nuanced policies in place, I would definitely support that.
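
The trust-based policy described above can be sketched as a single rendering decision: add `rel="nofollow"` to user-submitted links by default and omit it only when the site’s own trust metric says so. The `render_link` helper and its `trusted` flag are hypothetical; how a site defines trust (edit history, moderation, vouching) is left open, as it is in the answer.

```python
from html import escape


def render_link(url, text, trusted=False):
    """Render an anchor; untrusted links get rel="nofollow"."""
    rel = "" if trusted else ' rel="nofollow"'
    return '<a href="%s"%s>%s</a>' % (escape(url, quote=True), rel, escape(text))


print(render_link("http://example.com/", "Example"))
# <a href="http://example.com/" rel="nofollow">Example</a>
print(render_link("http://example.com/", "Example", trusted=True))
# <a href="http://example.com/">Example</a>
```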
