Google Webmaster Tools: Part 2, Diagnostic Reports

Once you have set up your site with Google Webmaster Tools, you’ll be able to view two types of reports: Diagnostic Reports and Statistics Reports. In this post, I’ll review the information available in the Diagnostic Tab.

Under the Diagnostic tab, there are three main sections: Summary, Crawl errors, and Tools. Let’s take a look at each one below:

Summary

The summary page shows the following:
1. Whether pages from your site are included in the Google index.
2. The date Googlebot last successfully accessed the home page of your site.
3. Whether you have submitted a sitemap to Google.
4. Crawl errors Google found, including:

  • HTTP errors
  • Not found
  • URLs not followed
  • URLs restricted by robots.txt
  • URLs timed out
  • Unreachable URLs

This error report is useful for identifying incorrect links to your site, especially internal links.

Crawl errors

The Web crawl report under Crawl errors shows any crawl errors in more detail. The Mobile Web report shows any crawl errors for your mobile site in CHTML, WML/XHTML.

Tools

robots.txt analysis: This report shows whether Google found a robots.txt file on your site. You can also experiment with changing the content of the robots.txt file and see how that affects Google’s crawlers.
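The same kind of experiment can be tried offline. Below is a minimal sketch using Python's standard urllib.robotparser module; the draft rules and the www.sitename.com URLs are made-up examples, and Python's parser may not match Googlebot's behavior in every detail, so treat it as a rough check rather than a substitute for Google's own report.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() works on the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical draft rules: keep every crawler out of /private/.
DRAFT_RULES = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for path in ("/index.html", "/private/notes.html"):
        url = "http://www.sitename.com" + path
        print(path, is_allowed(DRAFT_RULES, "Googlebot", url))
```

Changing DRAFT_RULES and rerunning shows which paths a draft would block before you upload it.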

Manage site verification: This report displays information webmasters need for verifying that they are indeed the owner of the site.

Preferred domain: Google allows you to specify whether you want it to treat www.sitename.com and sitename.com as the same site. I think this is the most valuable feature in Google Webmaster Tools. Since you have no control over how other people link to your site, you’ll want to make sure that Google knows that links to www.sitename.com and sitename.com point to the same site (this should usually be the case). This way, your site can get full credit for all its incoming links.
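To illustrate what treating the two forms as the same means, here is a small sketch that rewrites either form of a URL to one preferred host. The function name to_preferred_host is my own, and this is only an illustration of the idea, not a description of how Google handles it internally.

```python
from urllib.parse import urlsplit, urlunsplit


def to_preferred_host(url: str, preferred_host: str = "www.sitename.com") -> str:
    """Map the www and bare forms of a host to preferred_host; leave other hosts alone."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Derive the bare domain so both spellings can be recognized.
    bare = preferred_host[4:] if preferred_host.startswith("www.") else preferred_host
    if host in (bare, "www." + bare):
        # Keep an explicit port if one was given (any user:password part is dropped).
        netloc = preferred_host if parts.port is None else f"{preferred_host}:{parts.port}"
        parts = parts._replace(netloc=netloc)
    return urlunsplit(parts)


if __name__ == "__main__":
    for link in ("http://sitename.com/page.html", "http://www.sitename.com/page.html"):
        print(link, "->", to_preferred_host(link))
```

Counting inbound links after this kind of normalization gives one total per page instead of two partial ones.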

Google Webmaster Tools: Part 1, Setting Up

In this post, I’ll talk about how to set up your site in Google Webmaster Tools. In subsequent posts, I’ll look at the reports available in Google Webmaster Tools.

To add a site to Google Webmaster Tools, do the following:

1. Go to http://www.google.com/webmasters/sitemaps.

2. Log in with your Google account.

3. Type your site’s URL (starting with http://) into the text box and click on the OK button.

4. Google will show you some initial information it has on the site, such as whether pages from this site are included in Google’s index, and the date Googlebot last accessed your home page.

5. Click on the “Verify your site” link to verify that you are the owner of the site.

6. There are two ways to verify: add a meta tag to the site’s homepage (Google will tell you what the meta tag should look like), or upload an HTML file to the site’s root directory (Google will tell you the file name to use). Choose your method, and you’ll be given directions for setting it up properly.

7. Once you have either added the meta tag to the site’s homepage or uploaded the HTML file to the site’s root directory, you can click on the “Verify” button. You don’t need to do this right away; you can always come back later when you are ready.

8. You may also submit a sitemap to Google. To add a sitemap, click on the “Add a Sitemap” link for the site after you log in to Google Webmaster Tools.

9. Next, select whether you are submitting a regular web sitemap or a mobile sitemap.

10. Specify the location of the sitemap in the textbox.

11. Before you click on the “Add Web Sitemap” button, you’ll need to generate a sitemap for your site. A simple way to generate one is covered in an earlier article titled Creating a Simple Google Sitemap. Once the sitemap is generated and uploaded to your site, click on the “Add Web Sitemap” button.

12. That’s it. In the next post, I’ll take a look at the reports you can see in Google Webmaster Tools.

Yahoo Site Explorer

Yahoo Site Explorer (http://siteexplorer.search.yahoo.com) is a service provided by Yahoo that shows what the Yahoo search engine knows about your site, specifically which pages are indexed, and the number of inbound links. If you register, you can submit a feed to Yahoo to ensure that Yahoo knows all your pages.

The most useful part of Yahoo Site Explorer is the number of inbound links. As we all know, the link: command provided by Google is notoriously inaccurate, and it appears to me that the count returned by Yahoo Site Explorer is more reliable. In addition, Yahoo has made it easy to exclude inbound links from the same domain or subdomain. This is helpful as webmasters often want to know both the total number of inbound links, as well as the inbound links from external sites. It’s true that the above information was already available on Yahoo before Site Explorer, but it is much easier to get at the information now.

One nice feature of Yahoo Site Explorer is that it’s easy to explore different URLs. For example, as you look through the pages that link to a particular site, an “Explore URL” button appears as you mouse over each page. You can click on that button and instantly get the information for that page. I found that very convenient.

I also authenticated my site (you’ll need to use your Yahoo ID) to see what additional information I could get. I found that basically the only benefit of authenticating your site is that you can send a feed (basically a sitemap) to Yahoo, and Yahoo tells you when its crawler last accessed that feed. This is less than what I was hoping for. For example, Yahoo does not tell you when it cannot find a page listed in the feed, nor does it tell you how many times your site was clicked on from within Yahoo Search.

Overall, Yahoo Site Explorer is a useful product for finding out more information about a site or page. As a webmaster tool, it lags behind Google Webmaster Tools. I would recommend that you authenticate your site through Yahoo Site Explorer so that you can submit your feed and ensure Yahoo picks up your new pages, but that’s pretty much it.

Comparing Major Search Engines

When it comes to the complexity of search engine algorithms, it is generally held that MSN is the least sophisticated, Yahoo (Inktomi) is better than MSN, and Google is the most advanced. With that in mind, it was slightly surprising to me to find the following rankings for my SQL Tutorial site:

Query Term = SQL Tutorial
Google rank: 3
Yahoo rank: 9
MSN rank: 12

Note that those rankings are all coming from the .com version of the site.

The relative rankings were somewhat unexpected because I had applied all the basic SEO techniques to this site, which should lead to the site ranking well on MSN and Yahoo, which focus more on page content than Google does. But as you can see, this is not the case.

I ran across an article by Aaron Wall of SEO Book.com on search engine relevancy, which shed some light on this matter. Aaron pointed out that Yahoo and MSN results tend to favor commercial sites, while Google favors information/content sites. As my site is clearly content-oriented (confirmed by doing a search at Yahoo Mindset), this explains why it ranks better on Google than on Yahoo and MSN.

Google Trends

Google Trends (trends.google.com) is a product from Google that allows users to look at how query term volumes have changed over time.

To use Google Trends, type one or more query terms (separated by commas) into the search box, and click the “Search Trends” button. A graph is shown displaying the relative number of times each query term has been searched since the beginning of 2004. You can also view the query term distribution by region (this is basically country), city, and language.

Using Google Trends, a user is able to:

1) Understand the relative search volume of a query term over time.
2) Understand seasonality of query terms.
3) Compare multiple query terms.
4) Examine how the above trends vary with geography and language.

Many people can benefit from this tool, from SEOs doing keyword research to marketers trying to understand seasonality trends. Personally, I sometimes use this tool just to satisfy my own curiosity. For example, by typing in “nfl, nba”, I found that the NFL is more popular in the U.S. than the NBA. In virtually all other countries, though, the NBA is more popular (i.e., searched more frequently) than the NFL.

One thing to note is that Google doesn’t give out the absolute number of times each query term is searched. This does not diminish this product’s effectiveness.

Creating a Simple Google Sitemap

You can tell Google what pages exist on your website by creating a sitemap in XML format. Note that this is different from a sitemap HTML page, even though both share the same purpose of making sure the search engine sees your pages. The main difference is that your human visitors will not be viewing the XML sitemap file, whereas the HTML sitemap page is geared towards human visitors and search engine bots alike.

There are several pieces of information you can tell Google about your pages, and I will only discuss the most useful portion — listing all the pages. Other information, such as frequency of page updates, is used by Google only as a suggestion, and I don’t use it myself.

Listing all your pages is easy, and you can generate the file on your own. The syntax is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
  <url>
    <loc>[URL 1]</loc>
  </url>
  <url>
    <loc>[URL 2]</loc>
  </url>
  …
</urlset>

Simply replace [URL 1], [URL 2], etc., with your own pages.

That’s it! Next, save this file as sitemap.xml. Then, you’ll want to create a Google account and use Google Webmaster Tools to inform Google about the presence of this sitemap file. This way, you can ensure that Google knows about all of your pages. Going forward, whenever you add new pages to your site, just add them to the sitemap.xml file and Google will know about those pages.
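If you have more than a handful of pages, the file can also be generated with a short script. The following is a minimal sketch in Python that produces the format shown above from a list of URLs; the function name build_sitemap and the example page URLs are my own, and the namespace is simply the one used in this article, passed as a parameter in case you need a different one.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the example in this article.
SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"


def build_sitemap(urls, namespace=SITEMAP_NS):
    """Return sitemap XML as bytes, with one <url><loc> entry per page."""
    urlset = ET.Element("urlset", xmlns=namespace)
    for page in urls:
        url_el = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in the URL for us.
        ET.SubElement(url_el, "loc").text = page
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical page list; replace with your own pages.
    pages = [
        "http://www.sitename.com/",
        "http://www.sitename.com/about.html",
    ]
    print(build_sitemap(pages).decode("utf-8"))
```

The returned bytes can be written to sitemap.xml (opened in "wb" mode) and uploaded to your site. Rerunning the script with an updated page list takes care of appending new pages.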

Google Analytics

I have been testing Google Analytics for a couple of months now, and thought it appropriate to share my experience with this tool.

Setup

You’ll sign up using your Google/Gmail account. There used to be a wait time before Google would give you access, but now you can use Google Analytics instantly. For each page you want to track, you will stick a short JavaScript snippet at the end of your HTML page. This can be time-consuming if you have lots of pages and no way to do it automatically.
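For static pages, one way to automate this step is a small script that inserts the snippet just before the closing body tag. The sketch below is only an illustration: TRACKING_SNIPPET is a placeholder for the code Google Analytics gives you, and add_snippet is a name I made up.

```python
import re

# Placeholder only: paste the snippet Google Analytics gives you here.
TRACKING_SNIPPET = "<script>/* tracking code goes here */</script>"

# Matches </body>, </BODY>, and variants with trailing whitespace.
_BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def add_snippet(html: str, snippet: str = TRACKING_SNIPPET) -> str:
    """Insert snippet just before the last </body> tag, at most once per page."""
    if snippet in html:
        return html  # already tagged; avoid inserting it twice
    matches = list(_BODY_CLOSE.finditer(html))
    if not matches:
        # No closing body tag found: fall back to appending at the end.
        return html + "\n" + snippet + "\n"
    pos = matches[-1].start()
    return html[:pos] + snippet + "\n" + html[pos:]


if __name__ == "__main__":
    print(add_snippet("<html><body><p>Hello</p></body></html>"))
```

Running this over every HTML file in your site directory and writing the results back would tag all pages in one pass. Because the function skips pages that already contain the snippet, it is safe to run again after adding new pages.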

Update Frequency

Data seems to be updated twice a day, once just past noon and once just past midnight (Pacific Time).

Type of Metrics Reported

At the summary level, it shows Visits/Pageviews, New/Returning Visitors, Visitor Geography, and Visits by Source (referer). If you are only interested in a high-level understanding of your traffic, this summary page by itself would be sufficient. There are additional reports that would allow you to dig deeper. In addition, Google Analytics also provides a way to track goals (basically you are specifying a webpage as your destination page, or the end of your funnel) as well as integration with AdWords.

Other Comments

There are some concerns in the webmaster community that Google might have too much of your site’s information, and that one day Google will turn this against webmasters. Personally, I am not concerned. My views are:

1) Allowing Google to get more information on my visitors could in the future help them drive more traffic to my site, so actually it might be a positive.

2) Google can already track a lot of that information via the Google Toolbar anyway, so using Google Analytics for website traffic analysis really doesn’t pose an additional risk.

3) Google has repeatedly said that the data store for Google Analytics is separate from all other systems at Google, so the webmasters should not worry. Actually I am a bit more suspicious about this claim, because the fact that data is housed separately doesn’t guarantee that other groups within Google cannot access that information.

Summary

At the high level, this is a nice tool for someone who doesn’t have another way of analyzing his/her site traffic. Setup is straightforward, and the reporting interface is easy to use. At the same time, given that it only gives you the top channels (for example, top referers), Google Analytics is not sufficient for detailed web site traffic analysis. For that, you’ll want to go with a third-party tool or write your own script. In the future, I’ll talk about my experience in analyzing my site traffic.

Name for Danny Sullivan’s Next Conference

Writing about the possibility that Danny Sullivan might start his own conference got me thinking… what would be a good name for that conference? Below are some of my ideas:

Search Engine Look
Search Engine Monitor
Search Engine Series

But I think he should use the name of…

Search Engine World

This would be most fitting, as participation in SES from outside the US is definitely growing, and even though the most dominant search engines are still US-based, there are a number of local search engines that are doing well at the local level. This name would best represent the global nature of this conference.

Danny, what do you think?