Sunday, July 10, 2011

SEO Tip: Dynamic Google XML Sitemaps Generation in C#

Sitemaps are a way to tell Google about pages on your site that it might not otherwise discover. In its simplest terms, an XML Sitemap (usually called Sitemap, with a capital S) is a list of the pages on your website. Creating and submitting a Sitemap helps make sure that Google knows about all the pages on your site, including URLs that may not be discoverable by Google's normal crawling process.

Introduction


As the paragraph above says, a Sitemap is useful to Googlebot because it contains the addresses of the pages of your website, particularly those it might not be able to crawl by itself. But there's more to it than that. Each URL can have attributes specified when it is added to the Sitemap, so there's not only the "what" but also the "how". For example, you can specify the priority of a URL (more on that in a moment), which gives you some influence over how your site is crawled. And nowadays, every bit you can squeeze out of an SEO technique counts. In fact, many well-known sites discuss Google Sitemaps, so they are really something to consider when establishing an overall SEO strategy for your site.

Sitemaps XML Format and Protocol


A Sitemap is an XML file containing the addresses of the pages of a website (all of them or only a subset). If a page is not in your Sitemap but is reachable by Googlebot, don't worry: Google can still index it (provided it is not excluded by a robots.txt file). So it is not mandatory to list all your pages in your Sitemap once you begin to use one. Here's what an XML Sitemap looks like:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://blog.mikecouturier.com/</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.9</priority>
  </url>
  <!-- [...] -->
</urlset>

Once you have defined your Sitemap, it must reside on your web server, under the same domain as your site, and be submitted through Google Webmaster Tools (explained later).

The meaning of each property of a URL is detailed on the sitemaps.org website.
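As a rough illustration of two of those properties, here is a minimal sketch that checks values before they go into a Sitemap. It relies on two rules from the sitemaps.org protocol: changefreq takes one of seven fixed values and priority ranges from 0.0 to 1.0. The class name SitemapValues and its methods are hypothetical helpers made up for this example, not part of any library.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of any library): checks Sitemap values
// against the ranges described by the sitemaps.org protocol.
public static class SitemapValues
{
    // The seven changefreq values listed by the protocol.
    private static readonly string[] ChangeFrequencies =
    {
        "always", "hourly", "daily", "weekly", "monthly", "yearly", "never"
    };

    public static bool IsValidChangeFrequency(string value)
    {
        return Array.IndexOf(ChangeFrequencies, value) >= 0;
    }

    // priority must lie between 0.0 and 1.0 inclusive.
    public static bool IsValidPriority(double value)
    {
        return value >= 0.0 && value <= 1.0;
    }

    // Formats a priority with one decimal and a '.' separator,
    // regardless of the server's culture settings.
    public static string FormatPriority(double value)
    {
        if (!IsValidPriority(value))
            throw new ArgumentOutOfRangeException("value", "priority must be between 0.0 and 1.0");
        return value.ToString("0.0", CultureInfo.InvariantCulture);
    }
}
```

For instance, SitemapValues.IsValidChangeFrequency("daily") is accepted while "sometimes" is rejected, and a priority of 1.5 is reported as out of range.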

The Goal


The real purpose of this article is to show you how to generate the content of your Sitemap dynamically in C#. This is particularly useful when, for example, a single page is responsible for showing every product in your database. Instead of hand-editing your Sitemap every time your inventory changes, with my code you can query your database and add each of the URLs dynamically!
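To make the product example a bit more concrete, here is a minimal sketch of turning database values into URLs. It assumes each product has a text identifier (a "slug") that you have already read from your database. The class ProductUrls, the method Build and the example.com addresses are all hypothetical names chosen for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: builds one absolute URL per product slug.
// The slugs are assumed to come from your own database query.
public static class ProductUrls
{
    public static List<string> Build(string baseUrl, IEnumerable<string> slugs)
    {
        var urls = new List<string>();

        // Make sure exactly one '/' separates the base URL and the slug.
        string prefix = baseUrl.TrimEnd('/') + "/";

        foreach (string slug in slugs)
        {
            // Skip rows that have no usable slug.
            if (string.IsNullOrEmpty(slug))
                continue;

            // Escape characters (spaces, '&', ...) that are not URL-safe.
            urls.Add(prefix + Uri.EscapeDataString(slug));
        }

        return urls;
    }
}
```

Each string returned by ProductUrls.Build("http://example.com/products", slugs) could then be added to the Sitemap as the Url of a new Location, as shown in Step #1 of this article.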

Outputting a Sitemap from an aspx page is also very convenient because whenever Google requests the page to obtain the Sitemap, it always gets the latest version of the skeleton of your site.

Anyway, if this does not make sense to you right now, just read the rest of the article to get a clearer picture!

Generating a Dynamic Sitemap in C#



Step #1 - Create a Page That Will Serve Your XML Sitemap

Let's say you create a page called sitemap.aspx at the root of your site. All you have to do is use the classes I provide at the end of the article to generate your Sitemap dynamically, like this:

protected void Page_Load(object sender, EventArgs e)
{
  Sitemap sitemap = new Sitemap();

  sitemap.Add(new Location()
  {
    Url = "http://blog.mikecouturier.com/",
    LastModified = DateTime.UtcNow.AddDays(-1)
  });

  sitemap.Add(new Location()
  {
    Url = "http://blog.mikecouturier.com/2011/07/create-zoomable-images-using-google.html"
  });

  sitemap.Add(new Location()
  {
    Url = "http://blog.mikecouturier.com/p/sitemap.html",
    ChangeFrequency = Location.eChangeFrequency.daily,
    Priority = 0.8D
  });

  // one more random example
  for (int i = 0; i < 3; i++)
    sitemap.Add(new Location()
    {
      Url = "http://blog.mikecouturier.com/dynamic-url/" + i
    });

  // In MVC you would return a custom XML ActionResult instead; in WebForms
  // you can write to the response directly like this...
  Response.Clear();
  XmlSerializer xs = new XmlSerializer(typeof(Sitemap));
  Response.ContentType = "text/xml";
  xs.Serialize(Response.Output, sitemap);
  Response.End();
}

So whenever sitemap.aspx is requested, it generates the following XML Sitemap on the fly:

<?xml version="1.0" encoding="utf-8"?>
<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://blog.mikecouturier.com/</loc>
    <lastmod>2011-07-04T23:04:05.859375Z</lastmod>
  </url>
  <url>
    <loc>http://blog.mikecouturier.com/2011/07/create-zoomable-images-using-google.html</loc>
  </url>
  <url>
    <loc>http://blog.mikecouturier.com/p/sitemap.html</loc>
    <changefreq>daily</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>http://blog.mikecouturier.com/dynamic-url/0</loc>
  </url>
  <url>
    <loc>http://blog.mikecouturier.com/dynamic-url/1</loc>
  </url>
  <url>
    <loc>http://blog.mikecouturier.com/dynamic-url/2</loc>
  </url>
</urlset>
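The xmlns:xsi and xmlns:xsd attributes in the output above are added by XmlSerializer by default. The sitemaps.org example earlier in this article does not include them. If you would rather leave them out, one option is to pass an XmlSerializerNamespaces instance to Serialize. The sketch below uses deliberately tiny stand-in classes (MiniSitemap, MiniLocation and SitemapXml are hypothetical names for this example only) so that it is self-contained.

```csharp
using System.IO;
using System.Text;
using System.Xml.Serialization;

// Stand-in classes for this sketch only (hypothetical names).
[XmlRoot("urlset", Namespace = SitemapXml.Ns)]
public class MiniSitemap
{
    [XmlElement("url")]
    public MiniLocation[] Locations { get; set; }
}

public class MiniLocation
{
    [XmlElement("loc")]
    public string Url { get; set; }
}

public static class SitemapXml
{
    public const string Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // StringWriter reports UTF-16 by default; override it so the XML
    // declaration says utf-8 instead.
    private sealed class Utf8StringWriter : StringWriter
    {
        public override Encoding Encoding
        {
            get { return Encoding.UTF8; }
        }
    }

    public static string Serialize(MiniSitemap sitemap)
    {
        // Declaring only the default namespace keeps XmlSerializer from
        // adding its usual xmlns:xsi and xmlns:xsd attributes.
        var namespaces = new XmlSerializerNamespaces();
        namespaces.Add(string.Empty, Ns);

        var serializer = new XmlSerializer(typeof(MiniSitemap));
        using (var writer = new Utf8StringWriter())
        {
            serializer.Serialize(writer, sitemap, namespaces);
            return writer.ToString();
        }
    }
}
```

In a WebForms page, the same namespaces object could be passed as a third argument to the Serialize call that writes to Response.Output.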

Step #2 - Put Your Modifications Online

This step should be relatively easy: deploy the page to your web server. You can then test your Sitemap by navigating to, for example, http://yourdomain.com/sitemap.aspx.

Step #3 - Submit Your Sitemap to Google Webmaster Tools

This assumes you have already added your site to Google Webmaster Tools. If not, Google's "Adding a site" help page explains how.

  1. Navigate to the Google Webmaster Tools website
  2. Choose your website from the list
  3. Open up the Site configuration menu
  4. Click on Sitemaps
  5. Click on Submit a Sitemap

A special note about paths (taken from sitemaps.org): The location of a Sitemap file determines the set of URLs that can be included in that Sitemap. A Sitemap file located at http://example.com/catalog/sitemap.xml can include any URLs starting with http://example.com/catalog/ but can not include URLs starting with http://example.com/images/.
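The path rule quoted above can be sketched as a small check. This is only an illustration of the rule as worded by sitemaps.org: a URL is accepted if it starts with the directory that contains the Sitemap file. The class SitemapScope and its Covers method are hypothetical names invented for this example.

```csharp
using System;

// Hypothetical helper illustrating the sitemaps.org location rule.
public static class SitemapScope
{
    // Returns true when pageUrl starts with the directory that holds
    // the Sitemap file (same scheme, same host, same path prefix).
    public static bool Covers(string sitemapUrl, string pageUrl)
    {
        var sitemapUri = new Uri(sitemapUrl);

        // Resolving "." against the Sitemap URL yields its containing
        // directory, e.g. http://example.com/catalog/
        var directory = new Uri(sitemapUri, ".");

        var page = new Uri(pageUrl);

        return page.AbsoluteUri.StartsWith(directory.AbsoluteUri, StringComparison.Ordinal);
    }
}
```

With the sitemaps.org example, a Sitemap at http://example.com/catalog/sitemap.xml covers http://example.com/catalog/item1 but not http://example.com/images/logo.png. A sitemap.aspx placed at the root of the site covers every URL of that site.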

That's it!

Having Multiple XML Sitemaps


In case you want to have multiple Sitemaps, instead of submitting each and every one of them to Webmaster Tools, here's what I do:

Define a sitemaps.xml file at the root of your site (adjust the file to your liking):

<?xml version="1.0" encoding="utf-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <sitemap>
      <loc>http://example.com/existing_sitemap.xml</loc>
   </sitemap>
   <sitemap>
      <loc>http://example.com/dynamic_sitemap.aspx</loc>
   </sitemap>
</sitemapindex>

Then you only need to submit this one file to Google Webmaster Tools, as the inclusion of the Sitemaps listed inside it will be done automagically by Google! (Google rereads your Sitemap regularly, not only when you submit it, so any changes made to the file will be picked up.)
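If the list of Sitemaps itself changes often, the index file could be generated as well. The following is a minimal sketch using LINQ to XML. It is an alternative to the hand-written file, not something this article's classes provide, and the names SitemapIndex and Build are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: builds a sitemapindex document from a list of
// Sitemap URLs.
public static class SitemapIndex
{
    private static readonly XNamespace Ns =
        "http://www.sitemaps.org/schemas/sitemap/0.9";

    public static XDocument Build(IEnumerable<string> sitemapUrls)
    {
        return new XDocument(
            new XDeclaration("1.0", "utf-8", null),
            new XElement(Ns + "sitemapindex",
                // One <sitemap><loc>...</loc></sitemap> entry per URL.
                sitemapUrls.Select(url =>
                    new XElement(Ns + "sitemap",
                        new XElement(Ns + "loc", url)))));
    }
}
```

Calling SitemapIndex.Build with the two addresses from the file above yields a document with the same structure, which can then be written out with XDocument.Save.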

C# Code - The Sitemap and Location Classes


Here are the Sitemap and the Location classes you need to include in your project for the whole thing to work:

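using System;
using System.Collections;
using System.Xml.Serialization;

// Root <urlset> element of the Sitemap
[XmlRoot("urlset", Namespace = "http://www.sitemaps.org/schemas/sitemap/0.9")]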
public class Sitemap
{
  private ArrayList map;

  public Sitemap()
  {
    map = new ArrayList();
  }

  [XmlElement("url")]
  public Location[] Locations
  {
    get
    {
      Location[] items = new Location[map.Count];
      map.CopyTo(items);
      return items;
    }
    set
    {
      if (value == null)
        return;
      Location[] items = (Location[])value;
      map.Clear();
      foreach (Location item in items)
        map.Add(item);
    }
  }

  public int Add(Location item)
  {
    return map.Add(item);
  }
}

// A single <url> entry in the Sitemap
public class Location
{
  public enum eChangeFrequency
  {
    always,
    hourly,
    daily,
    weekly,
    monthly,
    yearly,
    never
  }

  [XmlElement("loc")]
  public string Url { get; set; }

  [XmlElement("changefreq")]
  public eChangeFrequency? ChangeFrequency { get; set; }
  public bool ShouldSerializeChangeFrequency() { return ChangeFrequency.HasValue; }

  [XmlElement("lastmod")]
  public DateTime? LastModified { get; set; }
  public bool ShouldSerializeLastModified() { return LastModified.HasValue; }

  [XmlElement("priority")]
  public double? Priority { get; set; }
  public bool ShouldSerializePriority() { return Priority.HasValue; }
}
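A reader asked in the comments about the W3C date format that sitemaps.org requires for lastmod. The timestamp XmlSerializer produces in the output shown earlier (2011-07-04T23:04:05.859375Z) appears to be an acceptable W3C Datetime form already. If you prefer a shorter value, such as a date only, you could format it yourself. The sketch below is one way to do that. The class LastModFormat and its methods are hypothetical names, and wiring the result into the Location class (for example through a string property) is left as an assumption.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: formats a DateTime in two of the W3C Datetime
// forms accepted for <lastmod>.
public static class LastModFormat
{
    // Date only, e.g. 2011-07-04
    public static string DateOnly(DateTime value)
    {
        // Note: ToUniversalTime treats an Unspecified kind as local time.
        return value.ToUniversalTime()
                    .ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
    }

    // Date and time in UTC with a trailing Z, e.g. 2011-07-04T23:04:05Z
    public static string DateAndTime(DateTime value)
    {
        // 'T' and 'Z' are quoted so they are copied literally.
        return value.ToUniversalTime()
                    .ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);
    }
}
```

The invariant culture is used so that the digits and separators do not change with the server's regional settings.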

Conclusion


So, we saw how to generate a Sitemap dynamically so you can:

  1. List every dynamic page of your website
  2. Set the attributes that describe each URL (last modification date, change frequency and priority)

I hope you liked the article.

If you have any comments or questions, don't hesitate!

See ya

8 comments:

hey!!!
I found this blogging is really great one which teaches me to make a sitemap for right SEO.

Thanks Mike! You saved me a ton of time rolling all that C# code myself.

Anonymous said...

wow fantastic,this is a best artice where i have visited related to sitemap

samuel said...

Very Good Article, about creating sitemap.. saved a day's time ;)

Thanks

Matt said...

Great code, thankyou. As above - saved a decent amount of time's effort doing it myself... :-)

Shaminder said...

Gr8 code !! You saved my time..
Thanx very much.. keep up the gud Work..

Anonymous said...

sitemaps.org mentions that the last-modified date should be w3c compliant.

Did you have any issues with that?

Thanks for the useful article. I have a question: how do I separate the news section and the image/video section in my sitemap so that Google can find my news and images/videos?

©2009-2011 Mike Gleason jr Couturier All Rights Reserved