      I am setting up a new community site and wanted to get people's opinions on sitemaps and what you use to generate them.

          Sitemaps are important for user navigation and for enabling search engines to crawl your website efficiently.  Time is money, and Google et al. budget the amount of time they'll spend crawling your site. You really want to optimize this process for them to ensure all your content is crawled and indexed, which will bring you more organic search traffic.


          On SCN, we are using a document as a site index.  It's linked from the footer of every page in the community.


          We also have extensive XML sitemaps that are manually maintained; we hope to have these automated soon.  There are very specific rules for XML sitemaps (e.g. size limitations), and you should submit them to Google and Bing webmaster tools.  Our sitemaps include the following content as long as it is anonymously accessible (i.e. no login required):

          • All space overview pages, with the corresponding content tab and blog
          • Main gateway pages: welcome, communications, activity, etc.
          • Most recent discussion threads (~200,000)
          • All blogs
          • All documents
          • Active users (those who have points)
          • Can also include polls and videos if you have those
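To illustrate the size rules mentioned above: the sitemaps.org protocol caps each sitemap file at 50,000 URLs (and 50 MB uncompressed), so a large community needs several files plus a sitemap index that points at them. Below is a minimal sketch of that splitting, not the way SCN builds its sitemaps. The function names and the example.com URLs are assumptions made for illustration.

```python
from xml.etree import ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit from the sitemaps.org protocol


def build_sitemaps(urls, max_urls=MAX_URLS_PER_FILE):
    """Split urls into chunks and return one XML sitemap string per chunk."""
    documents = []
    for start in range(0, len(urls), max_urls):
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for url in urls[start:start + max_urls]:
            # Each entry is <url><loc>...</loc></url>; ElementTree escapes the text.
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
        documents.append(ET.tostring(urlset, encoding="unicode"))
    return documents


def build_index(sitemap_urls):
    """Return a sitemap index document that points at each sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = url
    return ET.tostring(index, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical URLs; a tiny max_urls is used only to show the chunking.
    pages = [f"https://community.example.com/docs/DOC-{n}" for n in range(1, 6)]
    files = build_sitemaps(pages, max_urls=2)
    print(len(files), "sitemap files")
    print(build_index([f"https://community.example.com/sitemap-{i}.xml"
                       for i in range(1, len(files) + 1)]))
```

Only the index file then needs to be submitted to the webmaster tools; the individual files are discovered through it.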


          There are other considerations too, such as providing a robots.txt file with a disallow list (e.g. http://scn.sap.com/robots.txt) and indicating to search engines which parameters should be ignored.
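As a rough illustration of how a disallow list interacts with a sitemap, the sketch below uses Python's standard `urllib.robotparser` to drop URLs that robots.txt blocks, so they never end up in the sitemap. The rules and the community.example.com URLs are hypothetical and are not copied from the SCN file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; the real file on any given site will differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /login.jspa
Disallow: /search.jspa
Sitemap: https://community.example.com/sitemap-index.xml
"""


def allowed_urls(urls, robots_txt=ROBOTS_TXT, agent="*"):
    """Return only the URLs that the given robots.txt lets `agent` fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(agent, url)]


if __name__ == "__main__":
    candidates = [
        "https://community.example.com/docs/DOC-1",
        "https://community.example.com/login.jspa?referer=%2Fdocs%2FDOC-1",
    ]
    # Only the document URL survives; the login page is disallowed.
    print(allowed_urls(candidates))
```

Listing a URL in the sitemap while disallowing it in robots.txt sends crawlers mixed signals, so filtering with the same rules keeps the two consistent.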

            For the HTML sitemap, we created a space on the site and added some custom widgets. We put a list of space and/or group IDs into a system property, and the widget semi-automatically generates the sitemap by pulling in all of the sub-spaces for the listed IDs and displaying them. We also have some manual HTML widgets on the page (with translation) for some of the main navigational items we want to include. There is a link to the sitemap in the footer of the site (www.element14.com).
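The idea of generating the HTML sitemap from a short list of root IDs can be sketched roughly as follows. This is not the actual widget code: the `SPACES` table, its IDs, names and paths are all made up for illustration, and a real widget would look the sub-spaces up in the platform instead of a dictionary.

```python
from html import escape

# Hypothetical data: space ID -> (name, URL path, child space IDs).
SPACES = {
    1: ("Products", "/community/products", [2, 3]),
    2: ("Sensors", "/community/products/sensors", []),
    3: ("Power", "/community/products/power", []),
}


def render_space(space_id, spaces=SPACES):
    """Render one space and all of its sub-spaces as a nested HTML list item."""
    name, url, children = spaces[space_id]
    item = f'<li><a href="{escape(url)}">{escape(name)}</a>'
    if children:
        nested = "".join(render_space(child, spaces) for child in children)
        item += f"<ul>{nested}</ul>"
    return item + "</li>"


def render_sitemap(root_ids, spaces=SPACES):
    """Render the configured root space IDs as a single HTML list."""
    return "<ul>" + "".join(render_space(r, spaces) for r in root_ids) + "</ul>"


if __name__ == "__main__":
    # The list of root IDs plays the role of the system property.
    print(render_sitemap([1]))
```

New sub-spaces then show up in the sitemap without anyone editing the page; only a new top-level space requires changing the list of root IDs.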


            For our XML sitemaps, I've been using a tool called InSpyder. It runs on a separate laptop on my desk, starting on the 1st of every month. It takes a very long time to run, though, because there is so much content on the site for it to crawl through. Initial configuration is a right pain because of all the parameters and permutations of URLs that Jive has, but now that I have all the exclusions in, it just runs. It has some nice features that let you automate uploading the file to the server via FTP when it completes, pinging the search engines to let them know the sitemap has been updated, etc.
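To give a feel for what the exclusions have to deal with, here is a small sketch of collapsing parameter permutations down to one URL per page. It is not InSpyder's configuration format. The parameter names and path prefixes are assumptions chosen for illustration, and the real list depends on the site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical values; the real exclusion list depends on the site.
IGNORED_PARAMS = {"start", "tstart", "sort", "view"}
EXCLUDED_PREFIXES = ("/login", "/search", "/create")


def canonical_url(url):
    """Return url without ignored query parameters, or None if it is excluded."""
    parts = urlsplit(url)
    if parts.path.startswith(EXCLUDED_PREFIXES):
        return None
    query = [(k, v) for k, v in parse_qsl(parts.query) if k not in IGNORED_PARAMS]
    # Drop the fragment as well; it never reaches the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(query), ""))


def dedupe(urls):
    """Canonicalize crawled URLs, dropping excluded ones and duplicates."""
    seen = set()
    result = []
    for url in urls:
        canon = canonical_url(url)
        if canon is not None and canon not in seen:
            seen.add(canon)
            result.append(canon)
    return result


if __name__ == "__main__":
    crawled = [
        "https://community.example.com/thread/1?start=15&tstart=0",
        "https://community.example.com/thread/1",
        "https://community.example.com/login.jspa",
    ]
    print(dedupe(crawled))
```

Without this kind of normalization, every paging or sorting variant of a thread counts as a separate page, which is a large part of why a full crawl takes so long.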


                A whole space just for the sitemap! Well, it makes sense: it provides a nice clean URL. We just went with a document that's manually maintained and linked from the footer as well: http://scn.sap.com/docs/DOC-19361


                Once you crawl a site on your own, you appreciate the work Google and other search engines do for free and across the entire web.  This is why search engines limit the frequency and time for crawling websites: it's important to make sure those visits are optimized to get more content into their indexes!  I'll look into InSpyder, and it's nice that it handles the upload and the notifications afterwards.


                Sadly, I didn't see anything out of the box for XML or HTML sitemaps in version 6.

                I wrote a small app to create sitemaps. However, for communities with dynamic content (replies, comments) it's hard to do this right.