

Google Basics: Links to Google-provided information.
General Google Links
Google Site Map
Google TOS
Google Privacy Policy
Google Help Central

Technology Google Links
Google Technology Overview
PageRank Explained
Search Features

ToolBar, SPAM, PR

Google Toolbar
Internet Explorer add-on that directly accesses Google's search and translation functions, and shows the user a graphical approximation of a page's rank (PageRank: Google's measure of the value of your web page).

Google Webmaster Information

Webmasters Guide Index
Google Facts & Fiction

Add/Remove sites/content

Add Url
Add your root url here; Google will find your other pages by crawling (following) your links.

Remove Content
Does Google have certain pages that you DIDN'T want indexed? This tells you how to remove them.

Spam Report
Generic link to Google's spam form.
Any unethical technique used by an SEO to gain better search engine rankings should be reported via the Spam Report link, as requested by Google. Be sure the spam is clear-cut and not just a site that you would like to see removed.

You can also report from any SERP - at the bottom of the first results page there is a link, "Dissatisfied with your results? Help us improve." When you click that link (the 'help us improve' part), you will be taken to a custom spam report that tells Google what search you did to find the spam. If you do use the generic form, please be sure to state what your search was, the offending url(s), and any other relevant information. Keep it simple and factual, short and to the point.

Common Forum Questions & Answers:

1) Q: What is the 'Google Dance'?

A: The 'Google Dance' is the common name for the monthly update. It refers to the update process that is visible in the index. The update takes about a week to complete, and during that time you will see the listings vary wildly between new data and old, while positions jump up and down in the SERPs (and sometimes completely out of them).

2) Q: When does Google update?

A: Google has a standard monthly update which generally occurs at the end of every month. The data for this update is gathered during the preceding weeks. Google has been known to delay updates (or process them sooner than expected) when the mood strikes. Delays are most likely due to new algorithms being implemented and tested.

3) Q: What are "Everflux", "Freshbot", and "Freshcrawl"?

A: Everflux is when the SERPs appear to change (sites move up and down) while the monthly update is NOT in effect. It is brought about by the Freshbot (or Freshcrawl) and is Google's way of ensuring that their data is the 'freshest' on the web. When the phenomenon first occurred, it was identifiable by the word 'Fresh!' and a date appearing on the last line of a listing, between the URL and the 'Cached' link. The word 'Fresh!' was misleading to the public at large, as it appeared to be a comment on the quality of the data rather than the recency of the crawl (which is what it actually indicated), so the word was quickly removed, but the date remains. When you see this date attached to your listing, you know that it is the product of a 'fresh crawl', or 'Everflux'.

What does this do for you? It enables Google to present the most recent version of your site in their listings, and it can cause your listing to move up or down in the SERPs. This is not a permanent change to your information within their database, and if your site is not re-crawled by the 'freshbot', any changes associated with this crawl will fall out in approximately 48 hours. If your site is new to Google and is picked up by a fresh crawl, it can likewise appear in the SERPs quickly but may drop out again until it makes it into the main index.

Note: there is no official bot called 'freshbot'; it is a term used in forums to describe the googlebot activity that effects these specific changes within the SERPs. Don't email Google for information regarding the 'freshbot', as they will respond that it doesn't exist (because it doesn't).

4) Q: When will Google spider my site and how can I tell?
Q: What is googlebot?

A: Googlebot (Google's spider) finds sites by following links from other sites. The more links to your site it finds, the better the chance of Googlebot finding you. Googlebot also follows links submitted via the Add URL form, but Google prefers to find links to you instead. You can check to see if/when Googlebot has visited by reviewing your site's logs, located on your server. When Googlebot visits, you will see a user-agent entry such as 'Googlebot/2.1 (+http://www.googlebot.com/bot.html)' - it announces who it is. For more on locating and viewing your server logs, speak to your hosting company. Many common website statistics (site stats) programs will gather this information and present an easy-to-read report for you.
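As a rough sketch of what checking the logs looks like, this Python snippet filters log lines for Googlebot; the sample entries and the Apache-style "combined" log format are assumptions for illustration - your host's format may differ:

```python
# Minimal sketch: find Googlebot hits among web-server log lines.
# Real logs live on your server (ask your host where); the sample
# lines below imitate the common Apache "combined" format.
def googlebot_hits(log_lines):
    # Googlebot announces itself in the user-agent field of each entry.
    return [line for line in log_lines if "Googlebot" in line]

sample_log = [
    '1.2.3.4 - - [12/Sep/2002] "GET / HTTP/1.0" 200 1043 "-" "Mozilla/4.0"',
    '64.68.82.1 - - [12/Sep/2002] "GET / HTTP/1.0" 200 1043 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
]
print(googlebot_hits(sample_log))
```

A site-stats program does exactly this kind of filtering for you, with nicer reporting.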

5) Q: What is PageRank?

A: PageRank is Google's method of ranking web pages. Google basically counts the number of incoming links to your web PAGE, and considers various other factors, to determine the importance of your web page and where it should rank. Keep in mind that this is only one factor in ranking.

6) Q: Will Google index my (asp, php, html, xml, etc) site?

A: Probably. Here is a list of what Google indexes as of 9/12/02:

HyperText Markup Language (html, asp, jsp, hdml, shtml, xml, cfml)
Adobe Portable Document Format (pdf)
Adobe PostScript (ps)
Lotus 1-2-3 (wk1, wk2, wk3, wk4, wk5, wki, wks, wku)
Lotus WordPro (lwp)
MacWrite (mw)
Microsoft Excel (xls)
Microsoft PowerPoint (ppt)
Microsoft Word (doc)
Microsoft Works (wks, wps, wdb)
Microsoft Write (wri)
Rich Text Format (rtf)
Text (ans, txt)
Usenet messages

7) Q: I reported spam I found in Google, why is it still there?

A: Google does not remove spam just because it was reported. They review the report as soon as they can to determine if it really is spam (imagine what would happen if it were that easy and your competitor got you removed). If it is particularly bad, they may manually remove the offensive site or page, but they do try to avoid this whenever possible. If it is not, then the site or page will most likely become test material for new filter tweaks. Google always prefers to adjust their filters and algorithms to catch spam.

8) Q: Common listing description questions:
- Can I change my description?
- What is my description?
- That's not the description I wanted!

A: The description Google provides with your listing is made up of what they call Snippets. Snippets are created by taking the searched term and pulling it, along with the surrounding text, from the text on your page. This Snippet is variable and changes with the search term.

If your site/url has an ODP/DMOZ listing then Google will list the ODP description below the Snippet, indicated by the word 'Description:' preceding it. The only way to alter this is by getting your ODP description altered.

9) Q: What are internal links, external links, and backlinks?

A: Internal links are links to your own site from within your own site. External links are links to your site from someone else's site. Backlinks describes the links that Google shows as counting towards your site. The Google ToolBar has a button labelled "Backward Links" which, when clicked, is supposed to show you sites linking to the page your browser is currently viewing. Note that this button does not show all links indexed or credited to your site. Common belief is that it only shows links with a PR4 or above, and that it also only shows about half of the links credited to your site/page.

10) Q: What do links have to do with Google?

A: Links, internal and external, help Googlebot find the pages within your site so that it can index them. Links also count, to various degrees depending on the link, towards your PageRank.

11) Q: What is Google Wireless?

A: Google offers wireless services for several devices and goes so far as to convert your html site for viewers using wireless appliances (such as cell phones and PDAs). See Google Wireless and the wireless user guide for details.

Google's PageRank Explained
Not long ago, there was just one well-known PageRank Explained paper, to which most interested people referred when trying to understand the way that PageRank works. In fact, I used it myself. But when I was writing the PageRank Calculator, I realized that the original paper was misleading in the way that the calculations were done. It uses its own form of PageRank, which the author calls "mini-rank". Mini-rank changes Google's PageRank equation for no apparent reason, making the results of the calculations very misleading.

Even though the author abandoned mini-rank as a result of this and another paper, the original, unchanged paper is still available on the web. So if you come across a PageRank Explained paper that uses "mini-rank", it has been superseded and is best ignored.

What is Page Rank?

PageRank is a numeric value that represents how important a page is on the web. Google figures that when one page links to another page, it is effectively casting a vote for the other page. The more votes that are cast for a page, the more important the page must be. Also, the importance of the page that is casting the vote determines how important the vote itself is. Google calculates a page's importance from the votes cast for it, and how important each vote is is taken into account in the calculation.

PageRank is Google's way of deciding a page's importance. It matters because it is one of the factors that determines a page's ranking in the search results. It isn't the only factor that Google uses to rank pages, but it is an important one. From here on in, we'll occasionally refer to PageRank as "PR".


Not all links are counted by Google. For instance, they filter out links from known link farms. Some links can cause a site to be penalized by Google. They rightly figure that webmasters cannot control which sites link to their sites, but they can control which sites they link out to. For this reason, links into a site cannot harm the site, but links from a site can be harmful if they link to penalized sites. So be careful which sites you link to. If a site has PR0, it is usually a penalty, and it would be unwise to link to it.

How is Page Rank Calculated?

To calculate the PageRank for a page, all of its inbound links are taken into account. These are links from within the site and links from outside the site.

PR(A) = (1-d) + d(PR(t1)/C(t1) + ... + PR(tn)/C(tn))

That's the equation that calculates a page's PageRank. It's the original one that was published when PageRank was being developed, and it is probable that Google uses a variation of it but they aren't telling us what it is. It doesn't matter though, as this equation is good enough.
In the equation, 't1 - tn' are pages linking to page A, 'C' is the number of outbound links that a page has, and 'd' is a damping factor, usually set to 0.85.

We can think of it in a simpler way:-

a page's PageRank = 0.15 + 0.85 * (a "share" of the PageRank of every page that links to it)

"share" = the linking page's PageRank divided by the number of outbound links on the page.

A page "votes" an amount of PageRank onto each page that it links to. The amount of PageRank that it has to vote with is a little less than its own PageRank value (its own value * 0.85). This value is shared equally between all the pages that it links to.
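The share arithmetic above can be sketched in a few lines of Python; the inbound pages and their outbound-link counts below are made up for illustration:

```python
# PageRank of a page from the pages that link to it, using the
# published equation. linkers is a list of (pagerank, outbound_links)
# pairs, one per inbound link; d is the damping factor.
def pagerank(linkers, d=0.85):
    share = sum(pr / outlinks for pr, outlinks in linkers)
    return (1 - d) + d * share

# Hypothetical page with two inbound links: one from a PR4 page with
# 5 outbound links, one from a PR6 page with 30 outbound links.
print(round(pagerank([(4, 5), (6, 30)]), 4))  # → 1.0
```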

From this, we could conclude that a link from a page with PR4 and 5 outbound links is worth more than a link from a page with PR8 and 100 outbound links. The PageRank of a page that links to yours is important but the number of links on that page is also important. The more links there are on a page, the less PageRank value your page will receive from it.

If the PageRank value differences between PR1, PR2,.....PR10 were equal then that conclusion would hold up, but many people believe that the values between PR1 and PR10 (the maximum) are set on a logarithmic scale, and there is very good reason for believing it. Nobody outside Google knows for sure one way or the other, but the chances are high that the scale is logarithmic, or similar.

If so, it means that it takes a lot more additional PageRank for a page to move up to the next PageRank level than it did to move up from the previous PageRank level. The result is that it reverses the previous conclusion, so that a link from a PR8 page that has lots of outbound links is worth more than a link from a PR4 page that has only a few outbound links.
Whichever scale Google uses, we can be sure of one thing. A link from another site increases our site's PageRank. Just remember to avoid links from link farms.

Note that when a page votes its PageRank value to other pages, its own PageRank is not reduced by the value that it is voting. The page doing the voting doesn't give away its PageRank and end up with nothing. It isn't a transfer of PageRank. It is simply a vote according to the page's PageRank value. It's like a shareholders meeting where each shareholder votes according to the number of shares held, but the shares themselves aren't given away. Even so, pages do lose some PageRank indirectly, as we'll see later.

Ok so far? Good. Now we'll look at how the calculations are actually done.
For a page's calculation, its existing PageRank (if it has any) is abandoned completely and a fresh calculation is done where the page relies solely on the PageRank "voted" for it by its current inbound links, which may have changed since the last time the page's PageRank was calculated.

The equation shows clearly how a page's PageRank is arrived at. But what isn't immediately obvious is that it can't work if the calculation is done just once. Suppose we have 2 pages, A and B, which link to each other, and neither have any other links of any kind. This is what happens:-

Step 1: Calculate page A's PageRank from the value of its inbound links

Page A now has a new PageRank value. The calculation used the value of the inbound link from page B. But page B has an inbound link (from page A) and its new PageRank value hasn't been worked out yet, so page A's new PageRank value is based on inaccurate data and can't be accurate.

Step 2: Calculate page B's PageRank from the value of its inbound links

Page B now has a new PageRank value, but it can't be accurate because the calculation used the new PageRank value of the inbound link from page A, which is inaccurate.

It's a Catch 22 situation. We can't work out A's PageRank until we know B's PageRank, and we can't work out B's PageRank until we know A's PageRank.

Now that both pages have newly calculated PageRank values, can't we just run the calculations again to arrive at accurate values? No. We can run the calculations again using the new values and the results will be more accurate, but we will always be using inaccurate values for the calculations, so the results will always be inaccurate.

The problem is overcome by repeating the calculations many times. Each time produces slightly more accurate values. In fact, total accuracy can never be achieved because the calculations are always based on inaccurate values. 40 to 50 iterations are sufficient to reach a point where any further iterations wouldn't produce enough of a change to the values to matter. This is precisely what Google does at each update, and it's the reason why the updates take so long.
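As an illustration of those iterations (a sketch of the published equation, not Google's actual code), here are pages A and B from the example above, linking only to each other:

```python
# Iterate the published equation for two pages that link only to
# each other. Each page has exactly one outbound link, so C(t) = 1.
d = 0.85
pr_a = pr_b = 0.0  # start from scratch - the starting guesses don't matter
for _ in range(50):
    pr_a = (1 - d) + d * pr_b
    pr_b = (1 - d) + d * pr_a
print(round(pr_a, 4), round(pr_b, 4))  # → 1.0 1.0
```

After 40-50 iterations the values stop changing meaningfully, which is the stopping point described above.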

One thing to bear in mind is that the results we get from the calculations are proportions. The figures must then be set against a scale (known only to Google) to arrive at each page's actual PageRank. Even so, we can use the calculations to channel the PageRank within a site around its pages so that certain pages receive a higher proportion of it than others.


You may come across explanations of PageRank where the same equation is stated but the result of each iteration of the calculation is added to the page's existing PageRank. The new value (result + existing PageRank) is then used when sharing PageRank with other pages. These explanations are wrong for the following reasons:-

1. They quote the same, published equation - but then change it from PR(A) = (1-d) + d(......) to PR(A) = PR(A) + (1-d) + d(......)
It isn't correct, and it isn't necessary.

2. We will be looking at how to organize links so that certain pages end up with a larger proportion of the PageRank than others. Adding to the page's existing PageRank through the iterations produces different proportions than when the equation is used as published. Since the addition is not a part of the published equation, the results are wrong and the proportioning isn't accurate.

According to the published equation, the page being calculated starts from scratch at each iteration. It relies solely on its inbound links. The 'add to the existing PageRank' idea doesn't do that, so its results are necessarily wrong.

Internal linking

Fact: A website has a maximum amount of PageRank that is distributed between its pages by internal links.

The maximum PageRank in a site equals the number of pages in the site * 1. The maximum is increased by inbound links from other sites and decreased by outbound links to other sites. We are talking about the overall PageRank in the site and not the PageRank of any individual page. You don't have to take my word for it. You can reach the same conclusion by using a pencil and paper and the equation.

Fact: The maximum amount of PageRank in a site increases as the number of pages in the site increases.

The more pages that a site has, the more PageRank it has. Again, by using a pencil and paper and the equation, you can come to the same conclusion. Bear in mind that the only pages that count are the ones that Google knows about.

Fact: By linking poorly, it is possible to fail to reach the site's maximum PageRank, but it is not possible to exceed it.

Poor internal linkages can cause a site to fall short of its maximum but no kind of internal link structure can cause a site to exceed it. The only way to increase the maximum is to add more inbound links and/or increase the number of pages in the site.
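Here is the pencil-and-paper check done in code: a hypothetical self-contained site of 3 pages, each linking to the other 2, with no links in or out. The site total settles at the number of pages:

```python
# Total PageRank of a fully interlinked 3-page site with no external
# links, iterated with the published equation (d = 0.85).
d = 0.85
n = 3
pr = [0.0] * n
for _ in range(100):
    # Each page receives a share from the other two pages, and each
    # of those pages has 2 outbound links.
    pr = [(1 - d) + d * sum(pr[j] / 2 for j in range(n) if j != i)
          for i in range(n)]
print(round(sum(pr), 4))  # → 3.0
```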

Cautions: Whilst I thoroughly recommend creating and adding new pages to increase a site's total PageRank so that it can be channeled to specific pages, there are certain types of pages that should not be added. These are pages that are all identical or very nearly identical and are known as cookie-cutters. Google considers them to be spam and they can trigger an alarm that causes the pages, and possibly the entire site, to be penalized. Pages full of good content are a must.

Inbound links

Inbound links (links into the site from the outside) are one way to increase a site's total PageRank. The other is to add more pages. Where the links come from doesn't matter. Google recognizes that a webmaster has no control over other sites linking into a site, and so sites are not penalized because of where the links come from. There is an exception to this rule but it is rare and doesn't concern this article. It isn't something that a webmaster can accidentally do.

The linking page's PageRank is important, but so is the number of links going from that page. For instance, if you are the only link from a page that has a lowly PR2, you will receive an injection of 0.15 + 0.85(2/1) = 1.85 into your site, whereas a link from a PR8 page that has another 99 links from it will increase your site's PageRank by 0.15 + 0.85(8/100) = 0.218. Clearly, the PR2 link is much better - or is it?

Once the PageRank is injected into your site, the calculations are done again and each page's PageRank is changed. Depending on the internal link structure, some pages' PageRank is increased, some are unchanged but no pages lose any PageRank.

It is beneficial to have the inbound links coming to the pages to which you are channeling your PageRank. A PageRank injection to any other page will be spread around the site through the internal links. The important pages will receive an increase, but not as much of an increase as when they are linked to directly. The page that receives the inbound link makes the biggest gain.

It is easy to think of our site as being a small, self-contained network of pages. When we do the PageRank calculations we are dealing with our small network. If we make a link to another site, we lose some of our network's PageRank, and if we receive a link, our network's PageRank is added to. But it isn't like that. For the PageRank calculations, there is only one network - every page that Google has in its index. Each iteration of the calculation is done on the entire network and not on individual websites.

Because the entire network is interlinked, and every link and every page plays its part in each iteration of the calculations, it is impossible for us to calculate the effect of inbound links to our site with any realistic accuracy.

Outbound links

Outbound links are a drain on a site's total PageRank. They leak PageRank. To counter the drain, try to ensure that the links are reciprocated. Because of the PageRank of the pages at each end of an external link, and the number of links out from those pages, reciprocal links can gain or lose PageRank. You need to take care when choosing where to exchange links.
When PageRank leaks from a site via a link to another site, all the pages in the internal link structure are affected. (This doesn't always show after just 1 iteration).

The page that you link out from makes a difference to which pages suffer the most loss. Without a program to perform the calculations on specific link structures, it is difficult to decide on the right page to link out from, but the generalization is to link from the one with the lowest PageRank.

Many websites need to contain some outbound links that are nothing to do with PageRank. Unfortunately, all 'normal' outbound links leak PageRank. But there are 'abnormal' ways of linking to other sites that don't result in leaks. PageRank is leaked when Google recognizes a link to another site. The answer is to use links that Google doesn't recognize or count. These include form actions and links contained in javascript code.

Form actions

A form's 'action' attribute does not need to be the url of a form parsing script. It can point to any html page on any site. Try it.

<form name="myform" action="">
<a href="javascript:document.myform.submit()">Click here</a>
</form>

To be really sneaky, the action attribute could be in some javascript code rather than in the form tag, and the javascript code could be loaded from a 'js' file stored in a directory that is barred to Google's spider by the robots.txt file.

Example: <a href="javascript:goto('wherever')">Click here</a>

Like the form action, it is sneaky to load the javascript code, which contains the urls, from a separate 'js' file, and sneakier still if the file is stored in a directory that is barred to googlebot by the robots.txt file.

The "rel" attribute
As of 18th January 2005, Google, together with other search engines, is recognizing a new attribute to the anchor tag. The attribute is "rel", and it is used as follows:-

<a href="" rel="nofollow">link text</a>

The attribute tells Google to ignore the link completely. The link won't help the target page's PageRank, and it won't help its rankings. It is as though the link doesn't exist. With this attribute, there is no longer any need for javascript, forms, or any other method of hiding links from Google.

So how much additional PageRank do we need to move up the toolbar?

First, let me explain in more detail why the values shown in the Google toolbar are not the actual PageRank figures. According to the equation, and to the creators of Google, the billions of pages on the web average out to a PageRank of 1.0 per page. So the total PageRank on the web is equal to the number of pages on the web * 1, which equals a lot of PageRank spread around the web.

The Google toolbar range is from 1 to 10. (They sometimes show 0, but that figure isn't believed to be a PageRank calculation result). What Google does is divide the full range of actual PageRanks on the web into 10 parts - each part is represented by a value as shown in the toolbar. So the toolbar values only show what part of the overall range a page's PageRank is in, and not the actual PageRank itself. The numbers in the toolbar are just labels.

Whether or not the overall range is divided into 10 equal parts is a matter for debate - Google aren't saying. But because it is much harder to move up a toolbar point at the higher end than it is at the lower end, many people (including me) believe that the divisions are based on a logarithmic scale, or something very similar, rather than the equal divisions of a linear scale.

Let's assume that it is a logarithmic, base 10 scale, and that it takes 10 properly linked new pages to move a site's important page up 1 toolbar point. It will take 100 new pages to move it up another point, 1000 new pages to move it up one more, 10,000 to the next, and so on. That's why moving up at the lower end is much easier than at the higher end.
In reality, the base is unlikely to be 10. Some people think it is around the 5 or 6 mark, and maybe even less. Even so, it still gets progressively harder to move up a toolbar point at the higher end of the scale.
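Whatever the real base turns out to be, the shape of the arithmetic is the same. This sketch uses the base-10 assumption and the 10-pages-for-the-first-point figure from the example above - both are illustrative guesses, not known values:

```python
# If each toolbar point needs "base" times the new pages of the
# previous point, the page counts grow geometrically.
base = 10                    # assumption - the real base is unknown
pages_for_first_point = 10   # assumption from the example above
for step in range(4):
    needed = pages_for_first_point * base ** step
    print("toolbar point", step + 1, "needs ~", needed, "new pages")
```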

Note that as the number of pages on the web increases, so does the total PageRank on the web, and as the total PageRank increases, the positions of the divisions in the overall scale must change. As a result, some pages drop a toolbar point for no 'apparent' reason. If the page's actual PageRank was only just above a division in the scale, the addition of new pages to the web would cause the division to move up slightly and the page would end up just below the division. Google's index is always increasing and they re-evaluate each of the pages on more or less a monthly basis. It's known as the "Google dance". When the dance is over, some pages will have dropped a toolbar point. A number of new pages might be all that is needed to get the point back after the next dance.

The toolbar value is a good indicator of a page's PageRank but it only indicates that a page is in a certain range of the overall scale. One PR5 page could be just above the PR5 division and another PR5 page could be just below the PR6 division - almost a whole division (toolbar point) between them.


Domain names and Filenames

To a spider, www.domain.com/, domain.com/, www.domain.com/index.html and domain.com/index.html are different urls and, therefore, different pages. Surfers arrive at the site's home page whichever of the urls are used, but spiders see them as individual urls, and it makes a difference when working out the PageRank. It is better to standardize the url you use for the site's home page. Otherwise each url can end up with a different PageRank, whereas all of it should have gone to just one url.

If you think about it, how can a spider know the filename of the page that it gets back when requesting www.domain.com/? It can't. The filename could be index.html, index.htm, index.php, default.html, etc. The spider doesn't know. If you link to index.html within the site, the spider could compare the 2 pages, but that seems unlikely. So they are 2 urls and each receives PageRank from inbound links. Standardizing the home page's url ensures that the PageRank it is due isn't shared with ghost urls.

Example: Go to my UK Holidays and UK Holiday Accommodation site - how's that for a nice piece of link text ;). Notice that the url in the browser's address bar contains "www.". If you have the Google Toolbar installed, you will see that the page has PR5. Now remove the "www." part of the url and get the page again. This time it has PR1, and yet they are the same page.

Actually, the PageRank is for the unseen frameset page. When this article was first written, the non-www URL had PR4 due to using different versions of the link URLs within the site. It had the effect of sharing the page's PageRank between the 2 pages (the 2 versions) and, therefore, between the 2 sites. That's not the best way to do it. Since then, I've tidied up the internal linkages and got the non-www version down to PR1 so that the PageRank within the site mostly stays in the "www." version, but there must be a site somewhere that links to it without the "www." that's causing the PR1.

Imagine the page, www.domain.com/index.html. The index page contains links to several relative urls; e.g. products.html and details.html. The spider sees those urls as www.domain.com/products.html and www.domain.com/details.html. Now let's add an absolute url for another page, only this time we'll leave out the "www." part - say, domain.com/more.html. This page links back to the index.html page, so the spider sees the index page as domain.com/index.html. Although it's the same index page as the first one, to a spider, it is a different page because it's on a different domain. Now look what happens. Each of the relative urls on the index page is also different because it belongs to the domain.com domain. Consequently, the link structure is wasting a site's potential PageRank by spreading it between ghost pages.
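To illustrate why standardizing helps, this sketch collapses the variants of a home page url into one canonical form; the normalization rules here are my own example of the principle, not anything Google publishes:

```python
# Collapse the "www."/non-"www." and default-filename variants of a
# home page url into one canonical form.
def canonical(url):
    # Prefer the "www." host (an arbitrary choice - just be consistent).
    url = url.replace("http://domain.com", "http://www.domain.com")
    # Assume index.html is the server's default file for "/".
    if url.endswith("/"):
        url += "index.html"
    return url

variants = [
    "http://www.domain.com/",
    "http://domain.com/",
    "http://www.domain.com/index.html",
    "http://domain.com/index.html",
]
print({canonical(u) for u in variants})  # one url, not four
```

In practice the same end is achieved by linking consistently to a single url form throughout the site (and, where possible, redirecting the other forms to it).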

Adding new pages
There is a possible negative effect of adding new pages. Take a perfectly normal site. It has some inbound links from other sites and its pages have some PageRank. Then a new page is added to the site and is linked to from one or more of the existing pages. The new page will, of course, acquire PageRank from the site's existing pages.

The effect is that, whilst the total PageRank in the site is increased, one or more of the existing pages will suffer a PageRank loss due to the new page making gains. Up to a point, the more new pages that are added, the greater is the loss to the existing pages. With large sites, this effect is unlikely to be noticed but, with smaller ones, it probably would be. So, although adding new pages does increase the total PageRank within the site, some of the site's pages will lose PageRank as a result. The answer is to link new pages in such a way within the site that the important pages don't suffer, or add sufficient new pages to make up for the effect (that can sometimes mean adding a large number of new pages), or better still, get some more inbound links.
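The loss can be seen with the published equation on a tiny hypothetical site: pages A and B link to each other (each settles at PR 1.0), then we add a page C that is linked to and from A. The site total rises to 3.0, but B ends up below its old 1.0:

```python
# A links to B and C (2 outbound links); B and C each link back to A.
d = 0.85
a = b = c = 0.0
for _ in range(100):
    a = (1 - d) + d * (b / 1 + c / 1)
    b = (1 - d) + d * (a / 2)
    c = (1 - d) + d * (a / 2)
print(round(a, 2), round(b, 2), round(a + b + c, 2))  # → 1.46 0.77 3.0
```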


The Google toolbar
If you have the Google toolbar installed in your browser, you will be used to seeing each page's PageRank as you browse the web. But all isn't always as it seems. Many pages that Google displays the PageRank for haven't been indexed in Google and certainly don't have any PageRank in their own right. What is happening is that one or more pages on the site have been indexed and a PageRank has been calculated. The PageRank figure for the site's pages that haven't been indexed is allocated on the fly - just for your toolbar. The PageRank itself doesn't exist.

It's important to know this so that you can avoid exchanging links with pages that really don't have any PageRank of their own. Before making exchanges, search for the page on Google to make sure that it is indexed.

Some people believe that Google drops a page's PageRank by a value of 1 for each sub-directory level below the root directory. E.g. if the value of pages in the root directory is generally around 4, then pages in the next directory level down will be generally around 3, and so on down the levels. Other people (including me) don't accept that at all. Either way, because some spiders tend to avoid deep sub-directories, it is generally considered to be beneficial to keep directory structures shallow (directories one or two levels below the root).

ODP and Yahoo!
It used to be thought that Google gave a PageRank boost to sites that are listed in the Yahoo! and ODP (a.k.a. DMOZ) directories, but these days general opinion is that they don't. There is certainly a PageRank gain for sites that are listed in those directories, but the reason for it is now thought to be this:-

Google spiders the directories just like any other site and their pages have decent PageRank and so they are good inbound links to have. In the case of the ODP, Google's directory is a copy of the ODP directory. Each time that sites are added and dropped from the ODP, they are added and dropped from Google's directory when they next update it. The entry in Google's directory is yet another good, PageRank boosting, inbound link. Also, the ODP data is used for searches on a myriad of websites - more inbound links!

Listings in the ODP are free but, because sites are reviewed by hand, it can take quite a long time to get in. The sooner a working site is submitted, the better. For tips on submitting, see the ODP's own submission guidelines.
Google's Fresh Crawl explained
Google does two types of crawl:- the main crawl and the fresh crawl. The main crawl is done once a month; the fresh crawl is done more-or-less daily, but only some pages are crawled. Google is still experimenting with which sites and pages to crawl and how deep to crawl. Neither type of crawl puts any new pages into Google's main index. That only happens at the next update - at the conclusion of the next Google Dance. Fresh crawls can be distinguished from main crawls by the IP addresses used by Googlebot. Fresh crawl: 64.68.82...; Main crawl: 216.239.46...
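The IP distinction above can be turned into a trivial log check. A minimal sketch, assuming a plain dotted-quad string as input; the address prefixes are the historical ones quoted above and would need updating if Google changes them:

```python
# Classify a Googlebot visit as fresh crawl or main crawl by its IP
# prefix. The ranges are the historical ones from the article, not a
# guaranteed, current list.

def classify_crawl(ip: str) -> str:
    """Return 'fresh', 'main', or 'other' for a crawler IP address."""
    if ip.startswith("64.68.82."):
        return "fresh"
    if ip.startswith("216.239.46."):
        return "main"
    return "other"

print(classify_crawl("64.68.82.13"))   # fresh
print(classify_crawl("216.239.46.5"))  # main
```

Run this over the client addresses in your server logs and you can tell at a glance which kind of Googlebot visit you just received.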

The fresh crawl recrawls pages that are already in the index, picking up new pages along the way. Fresh-crawled new pages are evaluated in some way and inserted into the search results straight away, which means that new pages can be found by surfers almost immediately, even though they are not yet in Google's main index. A new page can be added to a site today and traffic could start arriving on it within hours.

Also, updated pages that are already in Google's main index, are re-evaluated in some way and inserted into the search results in places that reflect the changes. E.g. the day after the link to this site's SEO Copywriting page was placed on the index page, the index page showed up at #3 for the search term "seo copywriting". The index page was well established in Google's main index, but the SEO copywriting part of it was new, and was given the "fresh" treatment. Very soon after that, the SEO copywriting page itself was 'fresh' ranked at #1.

This is good news for surfers and webmasters, although some websites can suffer for a while due to fresh-crawled new pages pushing them down the rankings.

In practice, many new fresh-crawled pages enjoy a flurry of traffic while they are not in the main index. When they have been included in the main index, they take their place in the rankings according to their evaluated merit, and the traffic tends to be reduced, unless the page actually merits its 'fresh' ranking, of course.

At the time of writing, the fresh crawl is still new, but my theory of the experience of a new page is this:- Sometime during a month, the new page is found by Google and fresh-crawled. It is evaluated in some way and placed in a 'fresh' index. From there it is inserted into the rankings, according to its 'fresh' evaluation.

The page is involved in the next end-of-month dance but, because it hasn't yet been main-crawled, it isn't included in the actual update and isn't placed in the main index. It continues to be a 'fresh' page.

Then the main crawl gets underway. If the page still exists, it is crawled and will be included in the following update, when it will enter the main index. During this period, it may keep the 'fresh' ranking that it achieved, provided that other new pages don't come along to push it down. It is only after the page enters the main index that its true ranking is seen. Because of the page's revised evaluation when entering the main index, traffic from it is likely to drop. That's assuming that the page didn't really merit its 'fresh' ranking.

It should be noted that Google is continually updating the rankings and 'fresh' rankings are very volatile in that they come, go and change during a page's 'fresh' period.

As I said, the fresh crawl is still quite new and not yet fully understood. The experience of a new page from fresh crawl to main index is what I believe I have observed, but my conclusions could easily be wrong. The reason I believe that new pages don't enter the main index until the dance and update after they have been main-crawled, even though they have usually been involved in one dance, is because Google still shows no links to them until after the update following their first main crawl. This is my theory of a new page's experience but, like any theory, it may need to be revised in the light of new observations.


As of the New Year 2003 update, Google is applying Toolbar PR0 (zero PageRank) values to some new pages. PR0 normally indicates that a page has been penalized, but these PR0s are not penalties. From my observations, it appears that the values apply to pages that have been fresh-crawled and have gone through an update following the fresh-crawl. Such pages don't get into the main index until after they have been main-crawled and gone through the update after that. It appears that, between the two updates, Google applies PR0 to the pages.

The reason for it may be to do with how Google inserts 'fresh' pages into the rankings or it may be for some other reason entirely. Also, it may be that different PR values are applied to different pages, but this behavior is brand new and, as yet, I have seen only PR0 values applied.
The Definitive Secret to SEO Revealed
There are two physical sensations that are at once difficult to describe to anyone who has never experienced them, and yet, quite universal to anyone who has. One is the sense of urbane anticipation that comes from silently flicking your hole cards to request a "hit" at the blackjack table, and the other is the surprisingly solid weight of moving a quality chess piece across the board.

I first learned to play chess when I was about ten, and immediately fell in love with the game. For several years, I even thought I was pretty competent, regularly beating most of the other kids in the neighborhood. When I left the playgrounds of elementary school, however, for the halls of junior high, I quickly discovered otherwise. My first lesson in humility came at the hands of my algebra teacher, who not only beat me in about twelve moves, but did it while carrying on an animated conversation with half a dozen other students. I swear, to this day, Mr. Kirsten never more than glanced at the board before launching devastating attack after attack into my defenses. He taught me something important that day, and it wasn't just about chess.

Knowing something and mastering something are two very different things.

I don't much go in for hype, and heaven knows the title to this article smacks badly of it, but I honestly believe the title is nonetheless accurate. There really is a Secret to SEO. And I'm going to tell you what it is. I doubt it's going to set the world on fire, however, because knowing the secret probably won't give you immediate mastery of search engine optimization. Knowing and mastering are two different things.

The Definitive Secret to SEO is ... keywords.

I know. It sounds so simple, doesn't it? Well, learning how to move a knight or bishop is simple, too, at least until someone puts a can of whoop on you like my algebra teacher did. There are complexities in them thar hills, my friend. Let's start at the beginning ...

----- What Are You Selling? -----

This is probably the most important business question you will ever ask yourself. Unfortunately, it's also often one of the most complex to analyze and the most difficult to answer. Historically, business owners have been warned that there are two different levels to the question, both of pretty obvious importance. Harley Davidson sells motorcycles, and that's one level, what one might call the surface level. When you're building factories or contracting with suppliers, it's an important level to understand.

Beyond that, however, Harley doesn't really sell vehicles. It sells prestige and status.

That distinction is of paramount importance when it comes time to actually sell the product to a living, breathing human being. The customer, of course, doesn't walk into a store and ask to take some status on a test drive. The customer "thinks" he wants a motorcycle, but if that's all you try to sell him, you're badly missing the boat. Your sales pitch, the company literature, the image projected in advertising - everything you do has to sell what the customer is really there to buy. And that ain't just a bike.

This is old news, of course, available in any good business book. As a netrepreneur, you still need to understand what you are selling. You need to differentiate between what the customer thinks they are buying and what they actually want to buy. You will then make or purchase the former, while all of your sales and marketing efforts will be directed at the latter.

The Web, however, adds a third level to the question.

This third level exists where the line between the other two levels becomes blurry. It exists because, on the Internet, people have to FIND YOU before they can buy from you.

Technically, this is equally true in the brick-and-mortar world and is a problem faced by every business. People still need to find your business. One solution, often an expensive one, is to rely on branding and advertising. Another solution, one especially used by local businesses, is something called cross-advertising. The smart businessman doesn't just buy space in the Yellow Pages, but rather buys space in multiple categories in the Yellow Pages. Different people will look for a business in different ways.

The problem of people finding your site becomes a third level, of equal importance to the other two levels, because of the unique nature of the Internet. Unlike the Yellow Pages of the brick-and-mortar world, alphabetical categories for a business just don't work very well. People can still find your site through branding, through frequent advertising, even occasionally through browsing. But most of the people have become much more accustomed to SEARCHING for what they want on the Web.

Let's say you wanted to buy a Harley Davidson on the Internet. How would you find one?

Chances are good you would go to your favorite search engine. If you enter "motorcycles" as your search term, however, you're going to be quickly inundated with several million irrelevant results. Even though that is the first level answer to our "What are you selling?" question, it's far too broad a keyword to get the buyer to where they want to go.

The second level answer is even more useless. If they enter "status" or "prestige" (which we know is what the customer really wants to buy) it's highly unlikely they'll ever find a site to buy their Harley.

The third level answer to our question is usually where the other two levels meet. They might enter "expensive motorcycle," or maybe "motorcycle sportster," or perhaps even "American motorcycle chopper hog."

NOW, we're talking keywords -- the words or phrases people actually use to search for something.

(Yea, I know the first search term they use will probably be "Harley." That is not where our first and second level answers merge, but rather is the result of very effective branding. It's important, but doesn't help you much unless the name of your business has been just as heavily branded. Someday soon, we'll talk about the importance of branding, but right now we'll continue to concentrate on the more short-term importance of keywords.)

----- Keywords are the Key -----

What are you selling? When you find the answers for all three levels to this question, you're well on your way to developing a keyword list. Just like goodwill, this is an intangible business asset.

Most people think of a keyword list in the same way a fourteen-year-old thinks of a chess game. Move the knight here, the bishop there, and it's a wham bam wanna-play-again game. You can play chess like that, and yea, you can put together a keyword list in maybe thirty minutes, but neither approach will make you a master.

Developing a good keyword list is a complicated, very iterative process that requires time and effort. But that's not a bad thing. If it was too easy, after all, your keyword list wouldn't be worth much as a business asset. The time and effort are investments that can eventually pay big dividends.

Unfortunately, to some extent, keywords fall into the chicken and egg paradox. Which comes first? You can't really fine-tune your keywords until you have an active, running business. But you shouldn't even think about starting a business until you have chosen your primary keywords. Conundrum city!

However, while you can't fine-tune or finalize your keywords until you start using them, what you can do is understand some basic principles and concepts. With those as your foundation, there is hope you can select at least the most important keywords for your site.

----- Building a Foundation (or, This is IMPORTANT Stuff!) -----

If we were to graph the relative importance of keywords across the Web, they would be best described by a standard bell curve.

At one end of the spectrum are the keywords that are so highly competitive as to often be useless to you. Go to Google and search for "travel." Notice all the sponsored links? There might be a few government or educational sites (.gov or .edu), but you'll probably notice that most of the results are extremely well known, very long established web sites. Unless you have a few million dollars in the bank, I wouldn't recommend trying to compete with Microsoft or Travelocity for this particular keyword.

This side of our bell curve is filled by the words and phrases that are the most searched on the Internet. They include travel, mp3, jobs, sex (of course), music, food and many more. Millions of people enter these words into a search engine every single day. And thousands of web sites are trying to capture that very big audience. It's a bit like the lottery. The rewards for winning seem gargantuan, but your chances are abysmally slim. You compete for these keywords at your own risk!

Later, I'll offer some tools and references to help you recognize the highly competitive keywords. Much later, on another day, we'll also talk about ways you CAN learn to compete with the big boys with a margin of success. After all, you can't win the lottery unless you at least buy a ticket.

(Did anyone notice I switched from selling motorcycles to selling something in the travel industry? The best keyword for our hypothetical motorcycle site would still include "harley," and I wanted to avoid complicating this discussion with trademarks.)

At the other end of our bell curve are keywords that are so highly targeted that John Q. Public will never think to use them when searching. The keyword "peregrination" may be a delightful synonym for travel, but it won't bring you a lot of visitors.

In the middle of our bell curve lies the bull's eye. These are the words and phrases people actually will USE when searching for products or services. And that brings us to another, albeit slightly different, way of looking at the same thing.

----- Painting a Target on Your Customer -----

Contrary to popular opinion, you don't really want visitors. You want customers.

If you could build a web site that came up on the first page when someone searched for "britney spears," you might get a ton of visitors. But you'd be a bit daft to think you could sell many Harley Davidson bikes. The traffic you received would be very untargeted and your bandwidth costs would probably far exceed the occasional and random sale.

Selecting the keywords for which you will optimize your web site is how you prequalify your traffic. Say that three times, real slow. Let it sink in. It's important.

If you select keywords that target your traffic, your conversion rate (how many people visit versus how many people buy) should be fairly high. But, uh, not too high. If your keywords are TOO targeted, you're almost certainly missing a large chunk of your potential market. Gee, that sounds a bit like a curve, too, doesn't it?

It's not really a different curve, though - it's a subset of the same one we talked about previously. The subset includes only the keywords specific to your product or service. At one end are the keywords with the potential to bring hordes of largely untargeted visitors to your web site, at the other end are the words that will bring a very few highly targeted people, and in the middle are the ones that represent an equitable ROI (Return on Investment is something we'll discuss in just a minute).

Unless you're selling U.S. dollar bills for eighty cents each, you will never want to promote keywords too far to the left on the curve. This is a critical point to understand, so let's talk about it for a minute.

----- Using ROI to Select Your Keywords -----

Simplified, we can say Return On Investment = Profit / Cost * 100.

Let's say you get 1,000 visitors per day to your site from the keyword "e-commerce," and it costs you one U.S. penny for each one. That cost comes from several different places, but we don't need to worry about those yet. You just invested $10 USD. Unfortunately, these are very untargeted visitors and you manage to sell only one e-commerce widget, for $5 USD. Your Return on Investment (ROI) is -50%, resulting in a net loss of five bucks. Oops.

Generic keywords can bring lots of traffic, but the traffic is very poorly qualified. Those visitors don't stick around your site, and they rarely buy anything. They are window shopping, plain and simple. That doesn't mean you don't want them, of course, because they do have potential value down the road. It just means you don't want to spend all your resources attracting them and, because these terms are often highly competitive ones, they DO require a lot of resources.

Increasing your search engine rank for the keyword "e-commerce" won't help!

Getting 10,000 equally untargeted visitors every day only means you lose $50 instead of $5. When you're losing money, you can't make it up on volume. A negative ROI is always bad. You are too far to the left on the Keyword ROI curve and the only solution is to move to the right, to a more highly targeted keyword.

Generic keywords don't always go negative, of course. But they do invariably result in a lower ROI, sometimes too low for a business to survive.

Okay, so you make some changes to your web site copy, targeting the keyword "e-commerce widgets in Southern Michigan." You now get 10 visitors a day to your site, and it takes a week to sell one of those visitors an e-commerce widget. Those seventy visitors still cost a penny a piece, seventy cents, and you still made $5 on your single sale. Your net profit is $4.30, with a very healthy ROI of 614%.

If you can increase your SE rank for "e-commerce widgets in Southern Michigan," you should get more visitors and your ROI will likely remain the same. Ergo, more money in the cash register. However, if you already have the number one rank for that keyword (and chances are VERY good you would), you'll have to learn to live on $4.30 a week profit. You have a great ROI, but miserable cash flow. Oops.

Of course, what's happened here is that you optimized your site for a keyword too far to the right on our curve. It is highly targeted, with good conversion and a really impressive ROI, but there just isn't enough volume because very few people search that way.

So, you go back and optimize for the keyword "e-commerce widget." You get 500 visitors a day and sell two widgets. That's a net profit of $5 a day, or a ROI of 100%. Note that your ROI has dropped from a very high 614%, but you're making substantially more money. (We are grossly over-simplifying ROI by including only variable costs, but it should still serve to make the point.)
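The three worked examples above can be checked with the article's simplified formula (ROI = profit / cost * 100). A small Python sketch, with the numbers taken straight from the scenarios:

```python
def roi(visitors, cost_per_visitor, sales, price_per_sale):
    """Simplified ROI = profit / cost * 100, as defined above."""
    cost = visitors * cost_per_visitor
    profit = sales * price_per_sale - cost
    return profit / cost * 100

# "e-commerce": 1,000 visitors at $0.01 each, one $5 sale
print(round(roi(1000, 0.01, 1, 5.0), 1))  # -50.0 (a net loss)
# "e-commerce widgets in Southern Michigan": 70 visitors/week, one sale
print(round(roi(70, 0.01, 1, 5.0), 1))    # 614.3
# "e-commerce widget": 500 visitors/day, two sales
print(round(roi(500, 0.01, 2, 5.0), 1))   # 100.0
```

As in the text, this grossly oversimplifies ROI by including only variable costs, but it makes the trade-off between volume and targeting easy to experiment with.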

Selecting your keywords is the foundation upon which everything else in SEO must be built.

----- Search Engine Keywords -----

The secret to SEO is not to select a single keyword, or even a handful of keywords. You want scores, hundreds, possibly even thousands. Instead of a few highly visible keywords that bring in a ton of untargeted traffic, you want a hundred less competitive keywords that TOGETHER can bring in almost as much very targeted traffic.

So, how do you choose the right keywords? The first step in selecting the keywords for your site is to make a whole lot of educated guesses.

Try to find your competitors using just a search engine. Keep good records of what keywords you try and what works best. Note the rank for each keyword for each competitor. Did they come up number one on the search? Or were they buried on page nine? A spreadsheet makes an excellent tool for this early analysis.

The only keywords that really matter are the ones that work, but you can also often get an idea of what your competitors "think" are good keywords. Most browsers will let you right-click on your competitor's page and do a "View Source." Look near the top of their source code for something like META NAME="keywords" and then examine the Contents field that immediately follows it. These are the keywords your competitor has chosen to target, and they might give you a few fresh ideas. If nothing else, you can use these Meta Keywords to fuel additional searches.
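The View Source step can also be automated. A minimal sketch using only the Python standard library to pull the meta keywords out of a page's HTML; the sample markup is invented for illustration:

```python
# Extract the contents of a <meta name="keywords"> tag, the same tag
# the View Source technique above inspects by eye.
from html.parser import HTMLParser

class MetaKeywordParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("name", "").lower() == "keywords":
            self.keywords += [k.strip() for k in a.get("content", "").split(",")]

# Invented sample of a competitor's page source:
sample = '<head><meta name="keywords" content="motorcycle, sportster, chopper"></head>'
p = MetaKeywordParser()
p.feed(sample)
print(p.keywords)  # ['motorcycle', 'sportster', 'chopper']
```

Feed it the saved source of each competitor page and you can build up their claimed keyword lists far faster than by reading the markup yourself.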

Once you've built a list of keywords, take it and set it aside.

Depending on your industry and your own history within that industry, what you used to find your competitors might be of dubious quality. There's a real good chance you know too much to be a "typical" customer.

It's equally true that your competitors may be similarly blinded by their own industry experience. You may be finding their sites only because they targeted the wrong keywords, the keywords on the far end of the curve that only insiders would typically use. If you make the same mistake, the only people who will ever find your site will be your competition!

Ask your friends, someone outside your industry, to do the same thing you already did. Have them find your competitors using a search engine, while maintaining good records of what works and what doesn't work. They should record not only what keywords were used, but the position and page of the relevant sites found. The more people you enlist, the better will be your initial keywords list.

When you compare your multiple lists (yours and those of your friends), and combine them where applicable, you'll have the nucleus of a keyword list. This is NOT your keyword list, though. It belongs to your competitors. It's how your competitors can be found. If you use this list, you'll be on a par with your competition (assuming you use it well). That's cool, but we want a birdie.
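Combining the lists can be as simple as counting votes: a keyword that several people independently used to find your competitors is probably one worth keeping. A sketch with invented example lists:

```python
# Merge keyword lists from yourself and your friends. Each list "votes"
# once per keyword; higher vote counts suggest more natural search terms.
from collections import Counter

yours   = ["harley dealer", "american motorcycle", "motorcycle sportster"]
friend1 = ["harley dealer", "expensive motorcycle"]
friend2 = ["harley dealer", "motorcycle sportster", "chopper hog"]

combined = Counter()
for lst in (yours, friend1, friend2):
    combined.update(set(lst))  # set() so each list votes once per keyword

for kw, votes in combined.most_common():
    print(votes, kw)
```

Keywords at the top of the tally were found by the most people; the singletons near the bottom may be the insider terms the section warns about.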

----- Extending your Keyword List -----

Now that you have a starting point using people (yourself and friends), we're going to see if we can use some software tools to extend the list.

To see this software in action, so we can better understand what to expect, we'll first look at it using example keywords. Go to AltaVista ( ) and do a keyword search for "poetry." When the first page of results are returned, look for the RELATED SEARCHES section to the right. This is an often unrealized and untapped gold mine!

When people go to AltaVista and search for a generic keyword like "poetry," they very often don't like the results. Too much forest and not enough trees. So they try another search, a refinement of their previous one, in hopes of getting better results. Indeed, studies have shown that most searchers refine their attempts two or three times before getting good results. AltaVista tracks those refinements and when it finds a refined search used by a lot of people, it offers that as a suggestion to others. So, we know that people who start out searching for "poetry" will also very often search for "love poetry" and "poetry contests."

Can you spell i-n-v-a-l-u-a-b-l-e?

We did this first with a simple example, because AltaVista won't offer suggestions unless your keyword is fairly generic. You should try it with each item in your keyword list, but don't be surprised (or discouraged) if many or even most don't return a suggestion list. That just means you already have nicely targeted keywords. Where you do get a suggestion list from AV, however, you should use those to extend your list.

If you think this kind of research is time-consuming, entering each of your keywords into AltaVista, I almost hate to tell you that you're just getting started.

AltaVista isn't the only search engine that includes a suggestion list, and you need to duplicate your efforts on a number of others. Here's a list of some that will help you extend your keyword list - but you should also realize that search engines change almost daily and this list will never be all-inclusive. Every time you use a search engine, check to see if it has been updated to include the Suggestion feature. Then use it!


Gettin' tired yet? I hope not, because we still have a few more stops to make.

Some day soon in the Gazette, we're going to spend time talking about what many call the PPC (Pay Per Click) Search Engines, but we need to understand a little about them right now. There are a lot of PPC Engines, but the two largest are Overture and Google. PPC simply means that when a visitor clicks on the link returned in a search, the web site (who has now become an advertiser) will pay the search engine a few pennies (or dollars) for the visit. How much they pay can be determined in several different ways, none of which matter to us right now.

What is important to us is that the PPC Engines want people to advertise their keywords.

To entice web sites to advertise, some are willing to tell us how many times a specific keyword has been used in their search engine. Virtually all of the PPC Engines will at least give us some good ideas on selecting keywords that are both popular and relevant. Additionally, there are a few tools available designed specifically for PPC bidding.

WordSpot, for example, offers a paid service where they will create a fairly comprehensive report for your specific keywords. A similar paid service can be found at WordTracker, along with a free trial. Google AdWords offers a tool called Keyword Suggestions that gives you a rough idea of popularity (and does offer excellent variations for your keywords), though it doesn't give a true numerical index of the keyword popularity. Like Google's tool, Overture's will offer variations of your keywords, but it will also give a true numerical count of the number of times each keyword was entered by a human searcher. A relatively new tool, at Digital Point Solutions, brings the results of WordTracker and Overture together for a convenient comparison.

Here are the links:







----- Putting It All Together -----

You now have the genesis of a Keyword List. Conservatively, you should have several hundred keyword phrases, and several thousand isn't at all unusual. So far, we've been learning how to move knights and bishops around the board. Building a comprehensive list of relevant keywords is a lot of work, but it's still just work.

It's time now, to move into Mastery.

To refine your list, we should ideally look at the ROI for each keyword you've gathered. But, uh, what if you don't HAVE any returns on your investment yet? That's what makes this a chicken and egg paradox. Your best keywords will depend on how much profit they bring, but we can't determine profit before we promote the web site, and we need keywords before we start promoting. A vicious circle, indeed.

Remember our bell curve? One side is occupied by keywords that bring a lot of traffic but not necessarily a corresponding number of sales. Most of the people on the Internet make the mistake of optimizing for one of these keywords, often making those terms very competitive. I mean, only so many sites can get a first page ranking, right? You want to avoid these keywords, at least for right now. Let someone else fight over those. Your investment will be too high, your return too low.

The other side of the bell curve is occupied by keywords that don't bring a ton of visitors, but which otherwise convert really well. These are a real goldmine, but often represent a whole lot of work (we'll talk about why in a minute).

In the middle of the curve are the keywords that bring a decent amount of traffic that results in a decent amount of sales.

Your job, now, is to go through your list of keywords and assign each of them a priority number. Make the most competitive, highest traffic phrase number one. Then find the second most competitive, highest traffic keyword. Keep going until every keyword in your list is assigned a number, then sort the list accordingly. Obviously, there's going to be a whole lot of guessing. How well you guess, in large part, defines your mastery of SEO. That's the part no one can teach you.
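The prioritization step above is just an assignment and a sort. In this sketch the competitiveness/traffic scores are pure guesses, which is exactly the point the section makes; the keywords and numbers are invented:

```python
# Assign each keyword a guessed competitiveness/traffic score, then
# sort so the most competitive, highest-traffic phrase is number one.
keywords = {
    "motorcycle": 95,                    # huge traffic, fierce competition
    "american motorcycle chopper": 40,
    "motorcycle sportster dealer": 25,
    "expensive motorcycle michigan": 5,  # tiny traffic, easier to win
}

ranked = sorted(keywords, key=keywords.get, reverse=True)
for priority, kw in enumerate(ranked, start=1):
    print(priority, kw)
```

Per the next step in the article, you then start from the BOTTOM of this list and work your way up until you're no longer confident of a top-ten result.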

Ready? Go to the last keyword on your list, the least competitive, lowest traffic one, and type it into Google. Visit each of the ten sites on the first page and see if you can determine why the search engine thought that site was relevant to that keyword. Ask yourself, could you build a page that was more relevant? If you can honestly answer yes, if you are confident you could get a web page listed in the top ten, go to the next keyword in your list and repeat the whole process. You want to work your way up your list of keywords until you are no longer certain of success.

THAT is where you want to start optimizing your web site.

You've just found the point on your bell curve that takes into account relevance, an estimate of ROI based on competitiveness, and your own gut feelings for your current ability to get results.

Create a web page optimized for that specific keyword. Get some links to it, both internal and, if possible, external. Then, while you are waiting for it to be visited and indexed, move down your list to the next keyword you feel confident is attainable. Repeat this process until you run out of keywords.

Of course, in truth, the process is going to depend on the type of web site you're building. You can't just build pages randomly, and the structure of the site will determine the best way to proceed. However, within that structure, you need to always keep your real goal in mind.

The definitive secret to SEO is deceptively simple, just like learning to play chess. Create a bell curve, pick a point on the curve, then optimize everything to one side of that point. In all likelihood, there won't be a single keyword on that side of the curve that will make your SEO campaign a huge success. Taken collectively, however, they're going to be dynamite. Instead of getting a lot of traffic from just a few high profile keywords, you're going to be getting a trickle from bunches and bunches of keywords, each of which will bring you highly targeted visitors. Make sense?

Here's the good part.

By the time you've exhausted the keywords on one side of your originating point, you should have a really solid foundation. You've built a ton of great content, which in turn, should have brought you some good links. It's now time to return to that point on the curve and start building web pages for the keywords going in the other direction. With your new foundation, you're going to find it much easier to start attaining some of those more competitive terms.

The secret to SEO is keywords, and the secret to keywords is targeted diversity.

* Epilogue (or, Do Not Open Until Christmas) : As your SEO campaign progresses over time, you want to periodically revisit your keyword list and convert all those guessed ROI numbers into real ones. Use your server logs to track which keywords are bringing in visitors and, where possible, which visitors are converting to customers. The more you know about your keywords, the more success you can expect.
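Once real data starts arriving, the guessed numbers can be replaced by measured ones. A minimal sketch, assuming you have already tallied visits and sales per keyword from your server logs and order records; the figures here reuse the earlier widget examples:

```python
# Per-keyword conversion rate (visitors vs. buyers) from tallied data.
# The (keyword, visits, sales) tuples are invented sample tallies.
stats = [
    ("e-commerce widget", 500, 2),
    ("e-commerce widgets in southern michigan", 70, 1),
    ("e-commerce", 1000, 1),
]

conversion = {kw: sales / visits * 100 for kw, visits, sales in stats}
for kw, rate in conversion.items():
    print(f"{kw}: {rate:.2f}% conversion")
```

Feed these measured rates back into the priority numbers on your keyword list and the guesswork steadily turns into knowledge.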

Highly Competitive Keywords
In our lead article this issue, we briefly talked about highly competitive keywords. These are the search terms that are among the most frequently used on the Internet, words like "travel" and "music." Most of the time you'll want to avoid targeting these keywords. Competition for these keywords is fierce and going head-to-head with the Big Boys is often a mistake. It will be hard to get a decent ROI because your Investment will necessarily be high to compete.

To avoid them, obviously you need to know what they are. And that's my excuse for providing this list of resources where you can find the most common search terms used across the Internet. Beyond excuses, however, it can be just plain fun to discover what interests the world this week.

Wordspot ( ) is my personal favorite. They used to provide a list of the top 200 keywords directly on their site, but have recently changed their policy and now give it only when you sign up for their newsletter. But that's okay. With the newsletter coming directly to your email box, you won't have to remember to occasionally check on their list. As with all things on the Internet, their list of words should be taken with a grain of salt. Their methodology is good, always getting better, but certainly not perfect.

CNet ( ) offers a list of the top 100 keywords from their own utility which should again (need I say it?) be taken with a grain of salt. While this list is derived from a very substantial pool of data, it doesn't necessarily represent the Internet's "typical" searcher. CNet, after all, tends to attract the more computer-savvy user.

The Lycos 50 ( ) is updated weekly to reflect the top 50 searches at Lycos. I particularly like the way they also show how many weeks a keyword has made the list, as this is obviously important information. If your product or service is seasonal, be sure to check out their archives, too.

Yahoo Buzz Index ( ) is very similar to the Lycos 50, but is limited to the top 20 keywords from the extensive Yahoo database.

Google Zeitgeist ( ) pulls from an absolutely huge database of search queries and is extremely well organized (especially if you're interested in non-English keywords), but is sadly limited to just the Top 10 keywords in multiple categories.

Article By Mario Dorizas