SEO Forum
 
google

 
Posted: Mon Oct 15, 2007 7:07 am
  werror

Joined: 15 Oct 2007
Posts: 18

I originally discussed this at HighRankings and am in the process of refining my thoughts on the subject. Matt Cutts has mentioned "reputation" a couple of times on his blog. A lot of people speak about "reputation", and it has accumulated many contexts within the search engine and search optimization industries.

I'll start out with a few definitions.

Web site (aka website) - A collection of Web pages that form a collective whole. A Web site may cover one topic or many topics but it is structurally unified in the creator's view and intent.

Host - Any domain (domain.name) or sub-domain (sub.domain.name). This expression is used in the technical patents and papers published by search engineers in the academic and professional communities. I don't believe it should be equated with "Web site". A host may contain many distinct, separately owned Web sites within its content (e.g., Geocities.com, Stormpages.com, and others).

Link Popularity - The measure of a page's importance or value to the Web community as determined by the raw number of links pointing to it. Variations on link popularity have been proposed, such as qualifying links before counting them, disqualifying links, and normalizing links that point to secondary pages by counting them as if they point to the main pages of sites.

Click Popularity - The measure of a page's importance or value to the surfing community as determined by the raw number of clicks on the page's URL in a directory or search engine. Variations have been proposed, such as qualifying clicks by time spent on target sites, whether users click on the BACK button, etc.
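
To make the difference between raw and qualified click counting concrete, here is a minimal sketch. The click log, the URLs, and the 10-second threshold are all made up for illustration; nothing here reflects how any particular directory or search engine actually qualifies clicks.

```python
from collections import Counter

# Hypothetical click log: (url, seconds spent on target, user hit BACK)
clicks = [
    ("a.example/page", 45.0, False),
    ("a.example/page", 2.0, True),
    ("b.example/page", 30.0, False),
    ("a.example/page", 60.0, False),
    ("b.example/page", 1.5, True),
]

def raw_click_popularity(log):
    """Count every click on a URL, with no qualification."""
    return Counter(url for url, _, _ in log)

def qualified_click_popularity(log, min_seconds=10.0):
    """Count only clicks where the visitor stayed and did not go BACK.

    min_seconds is an assumed threshold, chosen only for this example.
    """
    return Counter(
        url
        for url, seconds, went_back in log
        if seconds >= min_seconds and not went_back
    )

raw = raw_click_popularity(clicks)
qualified = qualified_click_popularity(clicks)
```

With this log, the raw count credits a.example/page with three clicks, while the qualified count drops the two-second visit that ended with the BACK button.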

PageRank - Larry Page and Sergey Brin's controversial method for measuring the importance or value of a document to both the Web and surfing communities as determined by the number and value of links pointing to it. PageRank is a probabilistic measurement of the chances that a surfer will land on a given page by randomly clicking on links. The combined sum of all PageRanks cannot exceed 1 (probabilities are measured as values between 0 and 1).

PageRank is arbitrarily assigned evenly to all indexed documents (individual Web pages, not sites or hosts). A series of iterations then follows in which PageRank values are adjusted on the basis of the value of links pointing to the documents. A link's value is equal to the current PageRank of its source document divided by the number of normalized links contained in that document, adjusted by an arbitrary damping factor. It is assumed that normalization includes discarding duplicate links so that each document is treated as pointing to any other document only once.

Links pointing to unindexed documents are discarded until the last iteration, when the unindexed documents are assigned their PageRanks.

Documents with no outbound links are treated as if they link to every other document in the collection.
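
The iteration described above can be sketched in a few lines. This is a simplified illustration, not Google's implementation: the four-document graph, the damping factor of 0.85, and the fixed 50 iterations are assumptions, every document is treated as indexed (the special handling of unindexed documents is left out), and a document with no outbound links spreads its value over all documents including itself, which is a common simplification.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each document to the list of documents it links to."""
    docs = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(docs)
    # Normalization: duplicate links are discarded, so each document
    # points to any given target at most once.
    out = {d: set(links.get(d, ())) for d in docs}
    # PageRank starts out assigned evenly to every document.
    rank = {d: 1.0 / n for d in docs}
    for _ in range(iterations):
        # Documents with no outbound links are treated as linking to everything.
        dangling = sum(rank[d] for d in docs if not out[d])
        new_rank = {d: (1.0 - damping) / n + damping * dangling / n for d in docs}
        for d in docs:
            if out[d]:
                # Each link passes on an equal share of its source's PageRank.
                share = damping * rank[d] / len(out[d])
                for target in out[d]:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical graph: A links to B twice (counted once) and to C.
graph = {"A": ["B", "C", "B"], "B": ["C"], "C": ["A"], "D": []}
ranks = pagerank(graph)
```

The values always sum to 1, and in this graph D ends up with the lowest PageRank because nothing links to it.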

In the Link Popularity model all links share a single value. In the PageRank model, each link's value is dependent upon its document's PageRank value and the number of outbound links on the document.
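
A small sketch of that contrast, using invented numbers: under Link Popularity every inbound link counts as 1, while under the PageRank model the value of a link depends on where it comes from. The damping factor of 0.85 and the example PageRank values are assumptions for illustration only.

```python
def link_popularity(links):
    """Raw inbound link count per document; every link is worth the same."""
    counts = {}
    for source, targets in links.items():
        for target in set(targets):  # count each source once per target
            counts[target] = counts.get(target, 0) + 1
    return counts

def link_value(source_pagerank, outbound_links, damping=0.85):
    """Value passed by one link under the PageRank model described above."""
    return damping * source_pagerank / outbound_links

# Hypothetical graph: two documents link to "target".
popularity = link_popularity({"big": ["target", "x", "y", "z"], "small": ["target"]})

# One link from a strong page with four outbound links...
from_big = link_value(source_pagerank=0.2, outbound_links=4)
# ...versus one link from a weak page with a single outbound link.
from_small = link_value(source_pagerank=0.01, outbound_links=1)
```

Link Popularity scores "target" as 2 either way, but under PageRank the link from the strong page is worth about five times as much as the link from the weak one, even though the strong page splits its value four ways.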

Mike Grehan reports that engineers at Ask and Yahoo! believe Google has not yet (fully) implemented PageRank. Google does not actually rank search results by PageRank, except in its directory. But they do claim that PageRank is one of more than 100 factors used to rank search results. Several technical papers published by academic and professional search engineers have proposed methods for reducing the amount of time and resources required to calculate PageRank for the Web.

Link Popularity, Click Popularity, and PageRank are all vulnerable to manipulation. PageRank researchers have proposed a variety of methods for refining the PageRank calculation process to account for (and filter out) manipulative links.

Like other people who have followed these issues through the years, I have studied the various technical papers and patents, and my feeling is that Google, Yahoo!, Ask, and MSN probably maintain core sets of Good or Trusted Sites from which value (like PageRank) is conferred out to other sites. Based on comments made by Matt Cutts and other people, my feeling is that these search engines also maintain core sets of Flagged or Suspicious Sites from which outgoing value is reduced or blocked.

Google seems to be taking punitive action against Web sites in one of three ways:

Delisting - This is the most radical action. A document is completely removed from the searchable index and won't even come up for its own title tag or URL.

Penalization - A document appears in the index but won't rank well for any search expressions except the most obscure. You usually have to find the document by URL.

Devaluation - A document appears in the index and may even rank well on the basis of its own factors. But none of its outbound links confer PageRank or reputation. The links may still be crawled, but Matt has not indicated whether this is so.

Google also affects page performance indirectly as a consequence of taking punitive actions against other sites. That is, innocent documents may experience:

Scope Reduction - Having done nothing wrong, a document suddenly loses position or all ranking for one or more (but not all) of its targeted search expressions.

Rank Loss - Having done nothing wrong, a document suddenly loses significant position within the search results.

Rank Depression - Having done nothing wrong, a document suddenly loses all position within the search results. It suffers as if it has been Penalized for no apparent reason.

Scope Reduction, Rank Loss, and Rank Depression are believed to be due to the loss of the value of inbound links from other documents. Very specifically, the July 2005 and October 2005 Google updates either delisted or devalued many documents matching the footprint (form and structure) of directory pages. Not all directory page-like documents were devalued or delisted, but many were. The most notable delistings were for SpamAd pages created to rank solely for the purpose of presenting third-party ads to visitors.
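
The knock-on effect can be illustrated with a toy calculation. This is only a sketch of the hypothesis above, not a description of what Google does: the four documents are invented, Devaluation is modeled by simply removing the devalued document's outbound links before computing a simplified PageRank, and the damping factor and iteration count are assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank; links maps a document to the documents it links to."""
    docs = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(docs)
    out = {d: set(links.get(d, ())) for d in docs}
    rank = {d: 1.0 / n for d in docs}
    for _ in range(iterations):
        # Value held by documents without outbound links is spread evenly.
        dangling = sum(rank[d] for d in docs if not out[d])
        new_rank = {d: (1.0 - damping) / n + damping * dangling / n for d in docs}
        for d in docs:
            for target in out[d]:
                new_rank[target] += damping * rank[d] / len(out[d])
        rank = new_rank
    return rank

def devalue(links, devalued):
    """Assumed model of Devaluation: the document stays, its links confer nothing."""
    return {d: ([] if d in devalued else list(t)) for d, t in links.items()}

# Hypothetical graph: "innocent" gets its only inbound link from "directory".
graph = {
    "hub": ["directory", "other"],
    "directory": ["innocent"],
    "other": ["hub"],
    "innocent": ["hub"],
}

before = pagerank(graph)
after = pagerank(devalue(graph, {"directory"}))
```

Comparing before["innocent"] with after["innocent"] shows the innocent document losing most of its score even though nothing about the document itself changed.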

Every document appears to have an inherent Reputation within Google's internal database. This Reputation may consist of a single valuation or it may be a function of several separate, distinct valuations. Document Reputation appears to reflect something like PageRank (Importance), Trust (Conferring value), and Status (Good, Unknown, or Bad).
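
One way to picture a Reputation made of several separate valuations is as a small record. Everything in this sketch is hypothetical: the field names, the 0-to-1 scales, and especially the rule for how much value a document confers are invented to illustrate the idea and are not taken from any search engine.

```python
from dataclasses import dataclass
from enum import Enum

class Status(Enum):
    GOOD = "good"
    UNKNOWN = "unknown"
    BAD = "bad"

@dataclass
class Reputation:
    importance: float  # something like PageRank, assumed to lie in 0..1
    trust: float       # how much value the document may confer, assumed 0..1
    status: Status = Status.UNKNOWN

    def conferred_value(self) -> float:
        """Hypothetical rule: Bad documents pass nothing on;
        others pass on their importance scaled by their trust."""
        if self.status is Status.BAD:
            return 0.0
        return self.importance * self.trust

trusted = Reputation(importance=0.4, trust=0.5, status=Status.GOOD)
flagged = Reputation(importance=0.4, trust=0.5, status=Status.BAD)
```

Under this made-up rule the two documents are equally important, but only the trusted one passes any value through its outbound links.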


Posted: Wed Mar 12, 2008 5:58 pm
  siteadmin
Site Admin

Joined: 18 Dec 2005
Posts: 301

Good point. Hang around; someone will be in a better position to discuss it.