Google Discussion Forum: for topics specifically related to Google. There is a subforum dedicated to AdSense/AdWords subjects.
|
||||
|
I thought I would ask here before digging into this further. Obviously I have my own blog:
http://www.jaankanellis.com/
Take a look at this site: operator query: http://www.google.com/search?q=site:...&start=10&sa=N
I do have plenty of pages indexed as supplemental results in Google right now, some for good reason. I pretty much understand why, and that is not the concern here. The real concern is Google indexing duplicate postings from the poll plug-in I have installed on the blog. The site: operator results show plenty of them indexed: http://www.jaankanellis.com/page/19/...rue&poll_id=5/
Normally I would not worry, but it seems to be affecting my traffic and indexing in Google. Google has indexed the crappy poll URL: http://www.jaankanellis.com/page/19/...rue&poll_id=5/ as the main way to crawl my web content, when in actuality it needs to be crawling the URL below, which has been placed in the Supplemental Index: http://www.jaankanellis.com/jagger3-here-it-comes/
As I usually preach, and as most here understand, Google normally doesn't do this. Usually they are smart enough to pick the right URL. In this case they have not. So my questions are:
1. How do I block the robots from accessing this "crap" URL?
2. What is the easiest way to fix this other than just blocking the bots from those URLs?
|
||||
|
So you changed the post title. IMO it is important to spend a few seconds thinking of a good post title with relevant keywords. It helps with search engine bot indexing, your own ad, and finding the post at WPW.
|
|
||||
|
Quote:
I wonder if they're following the JavaScript? The href for the view results link?
Dave
|
||||
|
Switch the actual content of each page or blog entry (while keeping a backup); then the information and navigation you wanted on your entrance page will be there. Put the poll page in the entrance page's place.
I place no credence in the ability of any bot to determine what is a head and what is a tail. It's all math. Figure out the math and you're good. Art doesn't enter into it.
|
|||
|
Am I thinking too simply, or am I missing something?
1) Use robots.txt to bar Googlebot from that URL.
2) Use the nofollow meta tag.
3) Use .htaccess to bar Googlebot from that URL.
|
|||
|
A lot of blogs right now are getting hit with a lot of supplemental results because there are too many ways to reach the same information, such as by date, by category, or by the RSS/feed URLs. From a search engine standpoint the only one that should matter is the original blog post URL. Block out the other ways to access that same info in robots.txt and your supplemental index ratio will go way down.
Obviously you don't want to block the feed readers from these URLs, just Googlebot/Slurp/MSNbot.
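To make the idea above concrete, here is a minimal sketch in Python that builds a robots.txt record for only the three search engine bots and then checks it with the standard library's `urllib.robotparser`. The paths `/feed/`, `/category/` and `/2006/` are assumptions for illustration, not the blog's actual URL layout, and `SomeFeedReader` is a made-up user agent.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical duplicate-content paths; substitute the ones your blog really uses.
DUPLICATE_PATHS = ["/feed/", "/category/", "/2006/"]
# User-agent tokens for the three crawlers mentioned in the post.
SEARCH_BOTS = ["Googlebot", "Slurp", "msnbot"]


def build_robots_txt(bots, paths):
    """Return one robots.txt record that applies the same Disallow rules to every bot listed."""
    lines = [f"User-agent: {bot}" for bot in bots]
    lines += [f"Disallow: {path}" for path in paths]
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt, user_agent, url):
    """Ask the standard-library parser whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


robots_txt = build_robots_txt(SEARCH_BOTS, DUPLICATE_PATHS)
print(robots_txt)
print(is_allowed(robots_txt, "Googlebot", "http://example.com/feed/"))
print(is_allowed(robots_txt, "SomeFeedReader", "http://example.com/feed/"))
print(is_allowed(robots_txt, "Googlebot", "http://example.com/my-post/"))
```

Because the record names specific bots and there is no `User-agent: *` record, any other client, feed readers included, is left unrestricted. Note that `urllib.robotparser` only does plain prefix matching, so this check works for directory-style rules but not for wildcard patterns.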
|
||||
|
Quote:
If it were one URL, I wouldn't have to worry too much. In fact it is probably impossible for me to know all the URLs Google has indexed this way...blah.
|
|||
|
Going a little further on what I wrote before: your poll is on every page, so it could be that Google sees the URLs given by the poll as stronger than the URL you are hoping for, because they are used more often on the site. My guess is that if you follow my advice above, disallow Google from the directories you do not want indexed, and submit a sitemap to Webmaster Central, your problem will resolve itself within a reasonable amount of time. Take it a step further and do a 301 redirect, if you can, for the pages that are most affected by this. Redirect the bad URL that is currently indexed to the one you want.
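As a rough illustration of the redirect idea, the Python sketch below works out where a poll URL could be 301-redirected by dropping the poll's query parameters. It assumes the truncated URLs in this thread carry the `jal_no_js` and `poll_id` parameters in an ordinary `?` query string; the function name `redirect_target` and the example.com URLs are made up for the example. Stripping the parameters only yields the same path without the poll state, so it would not by itself map a paged archive URL to an individual post.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Query parameters assumed to be added by the poll plug-in.
POLL_PARAMS = {"jal_no_js", "poll_id"}


def redirect_target(url):
    """Return the URL without poll parameters, or None if no redirect is needed."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(key, value) for key, value in pairs if key not in POLL_PARAMS]
    if len(kept) == len(pairs):
        # Nothing was stripped, so this URL is already clean.
        return None
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(redirect_target("http://example.com/page/19/?jal_no_js=true&poll_id=5/"))
print(redirect_target("http://example.com/some-post/"))
```

A WordPress plug-in or a rewrite rule would apply the same logic on the server and answer with a 301 status whenever the function returns a target.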
|
|
|||
|
Hi incrediblehelp, it's really a problem because there are so many pages. Are these pages orphans, or are they linked from your main site? (You can find this out in your Google webmaster account.) If not, you have to add a noindex, nofollow tag to each page.
|
|
||||
|
Jaan, I am not sure if I really understood the problem.
If it is about URLs like this: http://www.jaankanellis.com/page/19/...rue&poll_id=5/ then you can add this rule to your robots.txt:
Disallow: *?
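For anyone unsure what a wildcard rule like this would catch, here is a small Python sketch of wildcard matching as the major crawlers document it: `*` matches any run of characters, a trailing `$` anchors the end of the URL, and everything else is a prefix match. The helper `rule_matches` is written for this example and is not part of any library, and the `?` in the first test path is an assumption about the truncated poll URL.

```python
import re


def rule_matches(pattern, path):
    """Return True if a robots.txt path pattern matches the given URL path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" wherever the rule had "*".
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    # re.match anchors at the start, which gives the usual prefix behaviour.
    return re.match(regex, path) is not None


print(rule_matches("*?", "/page/19/?jal_no_js=true&poll_id=5/"))
print(rule_matches("*?", "/jagger3-here-it-comes/"))
print(rule_matches("*jal_no_js*", "/page/19/?jal_no_js=true&poll_id=5/"))
```

Keep in mind that `Disallow: *?` blocks every URL containing a question mark, not just the poll ones, so a narrower pattern such as `*jal_no_js*` may be the safer choice. Google's own examples write this kind of rule with a leading slash, as `Disallow: /*?`.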
|
||||
|
Jaan...
I did a search for the following...
jal_no_js=true&poll_id=1/
Here's what I got...
http://www.google.com/search?hl=en&r...poll_id%3D1%2F
Here's a link to the first result I see...
http://forum.semiologic.com/discussi...seo-nightmare/
Perhaps this helps?
Dave
|
||||
|
Quote:
"The fix is to use plugins that enforce permalinks. You then get a 301 redirect to the proper uri when this occurs. End of story."
I am already doing this. Obviously the permalink plugin/code I am using is not picking these up as dups and initiating the 301 redirect. Arrggggghhhh
|
||||
|
Quote:
Quote:
Then perhaps use the console to delete the URLs.
Dave
|
||||
|
I posted at the following locations looking for help:
Forum: Democracy is not an SEO nightmare plugin not 301 redirecting poll URLs by incrediblehelp
|
|||
|
Go here and sign up for Google Sitemaps:
https://www.google.com/accounts/Serv...Fhl%3Den&hl=en
Then go here and download the sitemap XML generator:
http://sourceforge.net/project/showf...kage_id=153422
Run the program on your site to index it and create a sitemap. Once you have a sitemap you can assign each page an importance weight as a number, so your most important page would be weighted 1. You can even exclude pages from the sitemap you hand to the Googlebot.
Next, upload the created sitemap files into the root directory of your website on your server, making note of the file names. Now log into your Google Sitemaps account and, where it says submit a sitemap, submit the XML files one at a time, for example sitemap.xml, then sitemap1.xml. Once submitted, it takes Google about 20 minutes for the bot to come to your site and verify that the sitemaps are correct; then they will start to read the sitemap and the bot will act accordingly.
You will also receive the following data from Google in the sitemap admin panel:
Crawl errors
Web crawl
Mobile Web
robots.txt analysis
Crawl rate
Preferred domain
Enhanced Image Search
This is from Google's help files on sitemaps:
Index stats use our advanced operators to provide you with sample results about how your site is indexed. We've used these advanced operators to return information about your home page. Click on the link to view a list of results. Stats that may be available are:
Indexed pages in your site - uses the site: operator to return a sample list of your indexed pages.
Pages that refer to your site's URL - uses the allinurl: operator to return a sample list of pages that mention your site's URL.
Pages that link to your site - uses the link: operator to return a sample list of pages that link to your site.
The current cache of your site - uses the cache: operator to return the current cache of your home page. If you don't want Google to cache your site, you can specify this in the <head> section of your pages.
Information we have about your site - uses the info: operator to return the description we have of your site.
Pages that are similar to your site - uses the related: operator to return a sample list of pages that we consider similar to your site.
Page analysis stats provide information about how the Googlebot views the crawled pages of your site. Stats that may be available are:
Type - the content type of your crawled pages. We use content type for the File Format search option in our Advanced Search.
Encodings - the encoding used by your crawled pages.
Common words - words in your site's content, and in external anchors to your site.
Query stats provide information about search queries that have returned pages from your site. If your site is listed in Google Mobile Web Search results, these queries are listed as well. Average top position is the highest position any page from your site ranked for that query, averaged over the last three weeks. Since our index is dynamic, this may not be the same as the current position of your site for this query. For detailed information about our search query syntax, see the Google Web API reference. You can click on any listed search query to view the results of that query. Stats that may be available are:
Top search queries - list the top queries that return results from your site. Note that this list is unrelated to where your site is listed in the search results.
Top search query clicks - the top search queries that directed traffic to your site. These are the top searches that caused users to click on a link to your site.
Hope this helps you
Freddie Molto
CMO
http://www.sdcmedia.com
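If you would rather not run a separate generator, a sitemap file with per-page weights can be sketched in a few lines of Python using only the standard library. The two URLs are the ones mentioned earlier in this thread; the priority values are illustrative assumptions. In the sitemap protocol the priority runs from 0.0 to 1.0, with 1.0 the highest.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, priority) pairs; priority is between 0.0 and 1.0."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, priority in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "priority").text = f"{priority:.1f}"
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Illustrative priorities: the home page highest, an individual post a bit lower.
pages = [
    ("http://www.jaankanellis.com/", 1.0),
    ("http://www.jaankanellis.com/jagger3-here-it-comes/", 0.8),
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Leaving a URL out of the sitemap does not stop a bot from crawling it if it is linked elsewhere on the site; blocking is still a job for robots.txt or a redirect.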
|
|||||
|
Quote:
Quote:
Quote:
My advice in order of priority:
|
|
||||
|
Just to clarify:
1. I could fix this through robots.txt, and I will as a last resort. Here is the code I could use:
User-agent: Googlebot
Disallow: *jal_no_js*
and
User-agent: Slurp
Disallow: *jal_no_js*
2. I would like to 301 redirect the crap URLs to the correct ones, and I am working on that now.
3. It is not about blocking the URLs. First, Google shouldn't be giving these any weight to begin with. They should be using the correct ones, but we can't cry over spilled milk.
4. Building a sitemap or using the Webmaster console has nothing really to do with this. In fact I am a big fan of NOT using a sitemap if your website/blog is getting adequate spidering without it, and mine is.
5. There is no "incorrect" code here. Google is just picking the wrong URLs.
6. I am using permalinks, as you can see from the URLs being rewritten on the fly with the post title as the subdirectory.
7. It is counterproductive to look up 100s or 1000s of URLs and manually delete them.
8. I really appreciate all comments thus far.
|
||||
|
Quote:
He had a lot of slogans, among others this one: "I cannot answer every question or solve every problem, but I can hit this button, and ask one of my employees." But perhaps that is not an option when you have done a redirect. Myself, I love permalinks.
|
|||
|
WebProWorld is an iEntry, Inc. ® site - © 2009 All Rights Reserved. iEntry, Inc., 2549 Richmond Rd., Lexington, KY 40509