WebProWorld Part of WebProNews.com

Insider Reports Anyone is welcome to reply and discuss but starting new topics is reserved for WebProWorld staff and MVPs.

#1 (permalink)
06-03-2005, 11:10 AM
Chris
WebProWorld MVP
Join Date: Jul 2003
Location: KCMO
Posts: 1,110
RepRank 4
Google Sitemaps: RSS For The Entire Website?

Well, no, not really. But Google Sitemaps does employ XML technology in order to provide its program members the opportunity to have their site crawled after they make updates or alterations.

Before we get ahead of ourselves, and in case you haven’t heard: yesterday Google launched Sitemaps, a “collaborative crawling” service designed to keep Google informed of modifications to your web site so its search index can reflect those changes… or, as Rusty calls it, a free pay-for-inclusion program.

Sitemaps works by taking advantage of XML and RSS capabilities. By placing XML code on the web server, you inform Google of when changes occur, and they respond by crawling the updated pages and making the necessary updates to the search index. Over at the Google Blog, Engineering Director Shiva Shivakumar indicated why Google launched Sitemaps:

“Initially, we plan to use the URL information webmasters supply to further improve the coverage and freshness of our index. Over time that will lead to our doing an even better job of delivering more search results from more websites.”

Shiva also gave an extensive interview to Danny Sullivan over at the SearchEngineWatch Blog. In it, Shiva reiterates that the Sitemaps program is currently in beta; he won’t guarantee that each submitted URL will be crawled. He did indicate that this was something they were working toward, however.

As mentioned, in order to participate in Sitemaps, you have to have a Google account and you have to place an XML file on the webserver that hosts your site. The file tells Google’s crawlers which URLs to look for and how often those pages change. As pointed out by Rusty, over at SocialPatterns.com, SEM Michael Nguyen broke down an example of his Sitemaps XML code line by line in order to shed some light on what’s actually being done.

The XML file must also contain the URL of each page you want to be in the Sitemaps program. If you have four pages that undergo frequent change, all four page URLs should be listed; if you have an entire site that you want included, you have to include the URL of each page. By employing the changefreq and priority XML tags, you can also indicate how frequently each page changes and how important it is.
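To make the file layout a little more concrete, here is a minimal Python sketch that writes one entry per page with its URL, changefreq and priority. It is not Google's generator. The function name, the example.com URLs, the urlset/url/loc element names and the default namespace URI are all assumptions for illustration, so check them against the format spec you are actually following.

```python
import xml.etree.ElementTree as ET

# Assumption: this namespace URI is only a placeholder default; pass the one
# required by whatever version of the sitemap format you are targeting.
DEFAULT_NAMESPACE = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages, namespace=DEFAULT_NAMESPACE):
    """Return sitemap XML as a string.

    `pages` is a list of dicts with the keys "loc" (page URL),
    "changefreq" (e.g. "daily") and "priority" (a number from 0.0 to 1.0).
    """
    urlset = ET.Element("urlset", xmlns=namespace)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        ET.SubElement(url, "changefreq").text = page["changefreq"]
        # One decimal place is enough to rank pages relative to each other.
        ET.SubElement(url, "priority").text = f"{page['priority']:.1f}"
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical pages, purely for demonstration.
    print(build_sitemap([
        {"loc": "http://www.example.com/", "changefreq": "daily", "priority": 0.8},
        {"loc": "http://www.example.com/about.html", "changefreq": "monthly", "priority": 0.3},
    ]))
```

ElementTree escapes characters such as & in URLs automatically, which is easy to get wrong when the XML is assembled by hand.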

After the XML is complete, you must submit it to the Sitemaps program. This is where the Google account comes in. Once the URL of the sitemap is submitted, your task is complete.

There are a couple of methods you can use in order to get an XML sitemap. A sitemap generator can be downloaded from Google or you can develop one. The generator is an open-source Python file that has to be uploaded to the webserver. According to their FAQ the sitemap generator “can create sitemaps from URL lists, webserver directories, or from access logs.”
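As a rough idea of what "creating a sitemap from webserver directories" involves, the sketch below walks a document root and maps each HTML file to a URL. This is not the code of Google's generator; the function name, the file extensions and the example.com base URL are assumptions made only for this example.

```python
import os
from urllib.parse import quote


def urls_from_directory(root_dir, base_url, extensions=(".html", ".htm")):
    """Map every matching file under root_dir to a URL below base_url."""
    urls = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            if not name.lower().endswith(extensions):
                continue  # skip images, scripts and other non-page files
            relative = os.path.relpath(os.path.join(dirpath, name), root_dir)
            # Use forward slashes and percent-encode anything unsafe in a URL.
            path = quote(relative.replace(os.sep, "/"))
            urls.append(base_url.rstrip("/") + "/" + path)
    return sorted(urls)


if __name__ == "__main__":
    # Hypothetical document root and site address.
    for url in urls_from_directory("/var/www/html", "http://www.example.com/"):
        print(url)
```

A walk like this only sees static files, so dynamically generated pages would still need to come from a URL list or a database query.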

You can also develop your own XML sitemap if you so choose. This will have to be submitted as well. The final method Google accepts is a text file containing the URLs you want in the program. Obviously, this method is saved for those who have little-to-no experience dealing with webservers or structural web alterations. It also seems like the text files will be given the lowest priority, at least until the program is off and running.
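For the text-file route, a sketch along these lines would do, assuming the file is simply one URL per line. The function name, the sitemap.txt filename and the example URLs are hypothetical, and Google's FAQ should be checked for any further requirements on the file.

```python
def write_url_list(urls, path="sitemap.txt"):
    """Write one URL per line, dropping blanks and duplicates.

    Returns the list of URLs that were actually written, in order.
    """
    seen = set()
    kept = []
    for url in urls:
        url = url.strip()
        if url and url not in seen:
            seen.add(url)
            kept.append(url)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(kept) + "\n")
    return kept


if __name__ == "__main__":
    # Hypothetical URLs, purely for demonstration.
    write_url_list([
        "http://www.example.com/",
        "http://www.example.com/products.html",
        "http://www.example.com/contact.html",
    ])
```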

Whether or not you should take part in the Google Sitemaps program is quite simple: if search engines play any role in your business whatsoever, you should be a part of the program. Having Google’s index (or any other search engine’s, for that matter) reflect changes in your site quickly will only benefit your search engine presence. Or as Nathan Weinberg says, it’d be stupid not to.

An additional area of interest is that Google made Sitemaps as open-source as possible… at least on the XML end. By making the sitemap generator in Python and releasing it under the Attribution/Share Alike Creative Commons license, Google is only furthering their embrace of open-source. This also allows the program to be adapted in order to support other search engines.

Update: Someone who emailed me installed the sitemap generator on his webserver, and evidently the server went boom... or it at least was overtaxed. Here's a quote from his post discussing the event: "Running it brought down my 3200MHz Pentium 4 running Debian Linux and 2 GB of RAM."

Read Theo's report and see for yourselves.
__________________
Former WebProWorld Admin
#2 (permalink)
06-03-2005, 12:59 PM
suesheboy
WebProWorld Pro
Join Date: Apr 2004
Location: Boca Raton Florida
Posts: 161
RepRank 0

Obviously anyone not taking advantage of this new feature ought to have their head examined!
#3 (permalink)
06-03-2005, 01:25 PM
strum4life
WebProWorld Member
Join Date: Jul 2004
Posts: 50
RepRank 0
Google Sitemaps

This is great, but it will take me forever to accomplish for all my sites.
__________________
Fantasy Blitz
#4 (permalink)
06-03-2005, 02:00 PM
strum4life
WebProWorld Member
Join Date: Jul 2004
Posts: 50
RepRank 0
Re: Google Sitemaps: RSS For The Entire Website?

Quote:
Originally Posted by CRich
Here's a quote from his post discussing the event: "Running it brought down my 3200MHz Pentium 4 running Debian Linux and 2 GB of RAM."
If this is the case, screw the SiteMap Generator. If you're using PHP and MySQL you can easily create a dynamic XML file that will never bring down a Pentium 4. Then create a cron job to tell Google when your sitemap has been updated. My cron job looks like this:

This tells Google that your sitemap has been updated every hour.
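For anyone wanting to try the same idea, here is a sketch of a script an hourly cron job could run. It is not the poster's actual cron entry. The endpoint below is a made-up placeholder and the "sitemap" query parameter is an assumption, so substitute the ping address and parameter given in Google's Sitemaps documentation.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: placeholder endpoint, NOT a real Google address.
PING_ENDPOINT = "http://ping.example.com/ping"


def build_ping_url(sitemap_url, endpoint=PING_ENDPOINT):
    """Return the endpoint with the sitemap URL percent-encoded as a query value."""
    # Assumption: the parameter is called "sitemap"; adjust to match the docs.
    return endpoint + "?" + urlencode({"sitemap": sitemap_url})


def send_ping(ping_url, timeout=10):
    """Issue a GET request and return the HTTP status code."""
    with urlopen(ping_url, timeout=timeout) as response:
        return response.status


# Example (hypothetical sitemap location):
#   send_ping(build_ping_url("http://www.example.com/sitemap.xml"))
```

An hourly crontab line such as `0 * * * * python3 /path/to/ping_sitemap.py` (path hypothetical) would run a script that makes that call. Pinging only when the sitemap has really changed is probably kinder to everyone than pinging on a fixed schedule.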
__________________
Fantasy Blitz
#5 (permalink)
06-06-2005, 01:44 AM
espectations
WebProWorld Pro
Join Date: Jan 2004
Location: South Africa
Posts: 269
RepRank 0

Hi, I clicked on the links in the article and none of them worked?

Did Google take down the pages since launch of this article?

There are a lot of opportunities with this sort of thing - I am glad they are moving in this direction ......
#6 (permalink)
06-06-2005, 10:12 AM
KeithO
WebProWorld Veteran
Join Date: Apr 2005
Location: Winter Park, FL
Posts: 605
RepRank 0

only problem is we don't have python installed on our servers. :(
#7 (permalink)
06-06-2005, 10:41 AM
ReviewGolf.com
WebProWorld Pro
Join Date: Mar 2004
Location: USA
Posts: 231
RepRank 0

Actually, it's not such a bad idea, especially if some of the content you'd like to be spidered is dynamically driven...
__________________
Site for sale: http://reviewgolf.com

"Web design is the area saturated by amateurs that confuse software capabilities with their own talent." ~~ me
#8 (permalink)
06-08-2005, 08:01 AM
Spectur
WebProWorld Pro
Join Date: Mar 2005
Location: Tampa,Fl
Posts: 239
RepRank 0

I tried it, but once again Python is not installed at my ISP. It's a great idea, but why not give us a DTD file and let us create and submit a sitemap XML to Google?

This would be an interesting option.

Just a hint to the Google guys: not everyone has access to the command line on their webserver.
#9 (permalink)
06-08-2005, 11:22 AM
flood6
WebProWorld 1,000+ Club
Join Date: Sep 2003
Location: Texas
Posts: 1,283
RepRank 0

Quote:
Originally Posted by Spectur
I tried it, but once again Python is not installed at my ISP. It's a great idea, but why not give us a DTD file and let us create and submit a sitemap XML to Google?
You can generate and submit an xml sitemap using whatever means you want. They supplied the python code to get the ball rolling.

The specs for the format they want are here.
#10 (permalink)
06-10-2005, 09:21 AM
greeneagle
WebProWorld 1,000+ Club
Join Date: Dec 2003
Location: Houston
Posts: 5,716
RepRank 0

This protocol seems to have the distinct possibility of tilting in the favor of an elite group of professional spammers.

Ken
#11 (permalink)
06-13-2005, 12:53 PM
se-survivor
WebProWorld Member
Join Date: Feb 2004
Location: cyberspace
Posts: 57
RepRank -1

Quote:
As mentioned, in order to participate in Sitemaps, you have to have a Google account ...
I'm not so sure about this. You may be able to generate a simple .xml file with a site like this
http://www.googlesitemap.info/thesimplerway/ (ignore steps 1 & 5)
or
http://www.better-business-directory...-generator.htm

and then ping the server anonymously without a Google account:
https://www.google.com/webmasters/si...en/faq.html#s4

..for those of us worried about being assimilated by the Borg's database...which is what this is really all about.
#12 (permalink)
06-14-2005, 04:31 PM
Chris
WebProWorld MVP
Join Date: Jul 2003
Location: KCMO
Posts: 1,110
RepRank 4

se-survivor:

you are correct sir. my mistake.

Google says:

6. Do I need to sign up for a Google Account?

You don't need an account to generate and submit a Sitemap. However, we encourage you to sign up for an account so that you can track the status of your Sitemaps and view diagnostic information for your submissions. Having an account will not affect your site's ranking within our results. If you already use Gmail, Groups, My Search History, Alerts, or Froogle Shopping List, you already have a Google Account and can sign in with your existing account to use Google Sitemaps.
__________________
Former WebProWorld Admin
#13 (permalink)
06-15-2005, 09:31 AM
kgun
WebProWorld 1,000+ Club
Join Date: May 2005
Location: Norway
Posts: 4,565
RepRank 3
Digression regarding RSS and XML.

Is RSS (readers) the next source of fraud / black hat?

There is one advantage with browsers, they are mature. Netscape invented the HTTPS protocol as far as I know.

There are now a lot of various RSS and atom readers and aggregators. Will they be used for creative inventions?

If you use a reader or aggregator, check the company. Call them and talk to them. Myself I use NewsGator. Talked to them for an hour.

What can be exported / imported via RSS (XML)?


KBleivik
http://multifinanceit.com/
http://www.multifinansit.no/

P.S. Have you seen the sites that change content of other sites? It is not so easy on secure sites.
Possible to manipulate news (superimpose your own news)? I only ask.
#14 (permalink)
07-12-2005, 11:10 AM
se-survivor
WebProWorld Member
Join Date: Feb 2004
Location: cyberspace
Posts: 57
RepRank -1
submission change

Hey Crich:

Guess what. They closed the door. Now you're right and I'm wrong. World domination is back in play...

6. Do I need to sign up for a Google Account?

You need an Account to submit a Sitemap the first time. This allows you to track the status of your Sitemaps and view diagnostic information for your submissions. If you already use Gmail, Groups, My Search History, Alerts, or Froogle Shopping List, you already have a Google Account and can sign in with your existing Account to use Google Sitemaps. Otherwise, you can sign up for one.

They totally removed the anonymous option.

BTW, I noticed that MSN search indexed my page changes almost the same day I put up my sitemap, where Google is still ignoring it. I think MSN search is loving this.
#15 (permalink)
07-12-2005, 01:37 PM
Chris
WebProWorld MVP
Join Date: Jul 2003
Location: KCMO
Posts: 1,110
RepRank 4

se-survivor:

nice of them to switch up like that. :-D
__________________
Former WebProWorld Admin
#16 (permalink)
07-16-2005, 06:10 PM
TheWebDoctor(tm)
WebProWorld Pro
Join Date: Jun 2003
Location: USA
Posts: 249
RepRank 0
Inherent problems

After a month of using Google Sitemaps, it's provided no real value. In fact, in an A-B test with two sites launched the same day, the site without the Google Sitemap has more of its pages in the Google index than the other site.

We're actually expecting the Google Sitemaps to be no more advantageous than the Add URL for Google.
__________________
Lee Roberts
#17 (permalink)
07-16-2005, 09:51 PM
brian.mark
Administrator
Join Date: Jul 2004
Location: Omaha
Posts: 2,717
RepRank 2
Re: Inherent problems

Quote:
Originally Posted by TheWebDoctor(tm)
After a month of using Google Sitemaps, it's provided no real value. In fact, in an A-B test with two sites launched the same day, the site without the Google Sitemap has more of its pages in the Google index than the other site.

We're actually expecting the Google Sitemaps to be no more advantageous than the Add URL for Google.
We have a site that has been around since 1998. We've been having trouble getting deep crawls (even with static HTML pages), but it's all indexed now that we gave them a sitemap.xml file. For us, it's been huge.

Brian.
__________________
ToolBarn.com, an Internet Retailer Top 500 and Inc. 500 Company | Tool Parts | Pet Supplies
#18 (permalink)
07-19-2005, 10:10 AM
Spectur
WebProWorld Pro
Join Date: Mar 2005
Location: Tampa,Fl
Posts: 239
RepRank 0

So I was digging around in the Dreamweaver Exchange and found a cool add-on that allows you to generate a sitemap.xml for Google. No more need for Python :)

So if you have DW MX 2004 you can now generate sitemaps very easily...
#19 (permalink)
07-19-2005, 10:17 AM
Chris
WebProWorld MVP
Join Date: Jul 2003
Location: KCMO
Posts: 1,110
RepRank 4

Quote:
Originally Posted by Spectur
So I was digging around in the Dreamweaver Exchange and found a cool add-on that allows you to generate a sitemap.xml for Google. No more need for Python :)

So if you have DW MX 2004 you can now generate sitemaps very easily...
gotta link?
__________________
Former WebProWorld Admin