Thread: Spidering Session IDs

  1. #1
    WebProWorld MVP incrediblehelp's Avatar
    Join Date
    Jan 2004
    Posts
    7,567

    Spidering Session IDs

    Hello people,

    I am working on a client's site and noticed a strange occurrence. For some reason Yahoo and MSN are finding URLs with session IDs, then spidering and indexing them instead of the normal URLs:

    MSN:
    http://beta.search.msn.com/results.a...com+&FORM=QBRE

    Yahoo:
    http://search.yahoo.com/search?p=sit...b-t&fl=0&x=wrt

    It is not happening on Google, but Google still needs to come back and see our new original product page info as it is added:
    http://www.google.com/search?hl=en&q...mlogocases.com

    Any ideas as to why Yahoo and MSN are doing this?

  2. #2
    Senior Member sfowler's Avatar
    Join Date
    May 2004
    Posts
    947
    Yahoo had a couple of days of indexing nonexistent pages on my site. I think it was just a hiccup, because it stopped after about two days.

  3. #3
    WebProWorld MVP incrediblehelp's Avatar
    Join Date
    Jan 2004
    Posts
    7,567
    But it is happening in the same manner on MSN.

  4. #4
    WebProWorld MVP incrediblehelp's Avatar
    Join Date
    Jan 2004
    Posts
    7,567
    Bump.

    Any ideas on this? Is it an osCommerce shopping cart issue? It is really affecting the rankings of these pages.

  5. #5

    Re: Spidering Session IDs

    Quote Originally Posted by incrediblehelp
    Hello people,

    I am working on a client's site and noticed a strange occurrence. For some reason Yahoo and MSN are finding URLs with session IDs, then spidering and indexing them instead of the normal URLs:

    MSN:
    http://beta.search.msn.com/results.a...com+&FORM=QBRE

    Yahoo:
    http://search.yahoo.com/search?p=sit...b-t&fl=0&x=wrt

    It is not happening on Google, but Google still needs to come back and see our new original product page info as it is added:
    http://www.google.com/search?hl=en&q...mlogocases.com

    Any ideas as to why Yahoo and MSN are doing this?

    Great question ... to complicate this even more, I have an offshoot question from it:

    Once these URLs are spidered and indexed by Yahoo or MSN with the session ID embedded in them ... how does one ever get them removed from their indices?

    Technically speaking, they're valid URLs and will work properly if a visitor clicks on them or if the spider comes back periodically to check them. But it looks ridiculous to have the same page indexed 5-10 times with the only differentiator being a unique session ID.

    Will this problem sort itself out over time? Can the search engines be contacted proactively to have them resolve this? Or should we just let it ride?

    Best,

    James @ DVDsPlusMore

  6. #6
    Senior Member
    Join Date
    Oct 2004
    Posts
    433
    This is a kind of bug, IMHO. I've seen affiliate pages with a sid ranking well in beta MSN, and when you click on the result you get a 404. But it is still in beta. I have no reasonable explanation for the Yahoo results.

  7. #7
    Senior Member sfowler's Avatar
    Join Date
    May 2004
    Posts
    947
    There are one or two things about Yahoo! that never got out of beta mode.

    They will probably sort it out in time after everything has been spidered and they come back a few times.

  8. #8
    Junior Member
    Join Date
    Oct 2004
    Posts
    28

    add code to .htaccess files to strip session ID

    I had the same problem with my shopping cart software, ecommercetemplates.com.

    The answer was to put the following in the .htaccess file:
    # Stop PHP from appending the session ID to URLs (applies only when mod_php4 is loaded)
    <IfModule mod_php4.c>
    php_value session.use_trans_sid 0
    </IfModule>

    If your host is running suPHP like mine, you will instead need to create a php.ini file in your root and add the equivalent setting there.

    Hope that helps!
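
    For the suPHP case, the .htaccess `php_value` syntax does not carry over to php.ini, which uses plain `name = value` lines. A minimal sketch of the equivalent entry, assuming the host actually reads a php.ini placed in the site root (this varies by host), might look like:

```ini
; Stop PHP from rewriting URLs to carry the session ID
session.use_trans_sid = 0
```

    With this off, PHP relies on cookies to carry the session, so links in the page source no longer carry the ID that spiders were picking up.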

  9. #9
    Senior Member
    Join Date
    May 2004
    Posts
    150
    Another method is to programmatically check for these bots and remove the session ID if one is detected. This is the way that phpBB boards are usually modified to nix the session ID.
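
    The bot check described above can be sketched roughly as follows. This is an illustration in Python rather than the PHP a phpBB or osCommerce mod would actually use, and the crawler markers, parameter names, and function names are all assumptions chosen for the example, not taken from any particular mod:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical values: adjust to the cart or forum software in use.
BOT_MARKERS = ("googlebot", "slurp", "msnbot")   # Google, Yahoo, MSN crawlers
SESSION_PARAMS = {"osCsid", "sid", "PHPSESSID"}  # assumed session parameter names


def is_bot(user_agent: str) -> bool:
    """Return True if the User-Agent contains a known crawler marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def strip_session_id(url: str) -> str:
    """Return the URL with any session ID query parameters removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in SESSION_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def link_for(url: str, user_agent: str) -> str:
    """Emit a clean link for crawlers and the original link for everyone else."""
    return strip_session_id(url) if is_bot(user_agent) else url
```

    Regular visitors keep their session links, while a request whose User-Agent matches one of the markers only ever sees the plain URL, so there is nothing session-specific for the spider to index.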

  10. #10
    WebProWorld MVP incrediblehelp's Avatar
    Join Date
    Jan 2004
    Posts
    7,567
    The default.php page (the one whose session ID URLs were indexed by the search engines) was changed to index.php in their new updated website, so I simply put a "do not spider" rule for default.php in the robots.txt file.

    Since this was done, the default.php pages have been removed from the search index and the index.php URLs are now starting to appear.
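
    The actual robots.txt was not posted, but a minimal sketch of this kind of rule, assuming default.php sits at the site root and the block should apply to every crawler, might look like:

```
# Keep all crawlers away from the old page, including its session ID variants
User-agent: *
Disallow: /default.php
```

    Because Disallow matches by URL prefix, this also covers default.php URLs with a query string attached, which is where the session IDs were showing up.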
