Risky Technique For Luring Yahoo Spider Into Your Site



Garrett
05-18-2004, 08:32 AM
If you've got one or a few pages in Yahoo's index but are having trouble getting more pages indexed, one thing that might work is putting links to your deeper, unindexed pages high in the code of the pages that are already indexed.

EGOL in an SEOChat forum discussion (http://forums.seochat.com/t10504/s.html) said, "I am getting new pages in the index by making them one of the first links on one of my already indexed pages."

He goes on to say that he's not putting links to hundreds of pages up there - just the ones he considers the most important at the time. And he's doing this only in similarly-themed sites so that the links don't seem out of place.

"I have rotated them in and out of my top-of-page navigation - and I don't think that it stunk up my design or content - these are all similar-theme sites."

The rotation allows him to get more than a few pages crawled from the already existing page.

Another poster in the thread suggested that those concerned with keeping the site's current layout could try hiding the links in a noscript tag.

Scott Harris, our head designer, suggested a more efficient, and definitely risky, method of hiding your links using CSS. Don't try this on your main site - you could kill your business. If you're going to use high-risk techniques, try them on a test site to see what happens.

His first example was simply to set the visibility of your links to "hidden."

Here's the code as he sent it to me:
<a href="http://www.link.com" style="visibility: hidden;">Risky Technique</a>

That link would only show up in the HTML source and not on the rendered page.

As Yahoo could possibly check links for the phrase "visibility: hidden," you might want to move the rule into an external style sheet and give the class a name that doesn't make your intentions evident to the spider.

Your hidden link would appear to the spider this way:
<a href="http://www.link.com" class="risky">Risky Technique</a>

The search engine won't know that "risky" means:

.risky {
position: absolute;
visibility: hidden;
z-index: -1;
}

If this technique works in Yahoo, it would, in all likelihood, work in Google as well.

While this is probably an ancient high-risk concept to some SEOs, I've not read much about this technique for hiding links thus far, and therefore don't know whether it's currently effective or not. If you're considering using this or any high-risk optimization method, be sure to test it out on a sandbox site first. And remember - you're optimizing at your own risk.

If you notice that your competitors have links and text in their source code that don't appear on their site, you could look for the .css link (press Ctrl+F and type ".css"). In that style sheet, press Ctrl+F and type "visibility: hidden" to see which class names stand for hidden text in their source code. And brush up on your CSS (http://www.w3schools.com/css/default.asp), because there are probably more ways than that to hide text and links.
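The style sheet check described above could be automated along these lines. This is only a rough Python sketch: the function name find_hidden_selectors and the sample style sheet are made up for illustration, the "display: none" pattern is an added assumption (the post itself only mentions "visibility: hidden"), and the simple regular expression will not cope with nested rules such as @media blocks.

```python
import re

# Declarations that hide content. "display: none" is an added assumption;
# the post above only mentions "visibility: hidden".
HIDING_PATTERNS = [
    re.compile(r"visibility\s*:\s*hidden", re.IGNORECASE),
    re.compile(r"display\s*:\s*none", re.IGNORECASE),
]

# Matches "selector { declarations }" for simple, non-nested rules.
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def find_hidden_selectors(css_text):
    """Return the selectors whose rule body hides the element."""
    # Drop /* ... */ comments so they are not mistaken for rules.
    css_text = re.sub(r"/\*.*?\*/", "", css_text, flags=re.DOTALL)
    hidden = []
    for selector, body in RULE_RE.findall(css_text):
        if any(pattern.search(body) for pattern in HIDING_PATTERNS):
            hidden.append(selector.strip())
    return hidden


# Hypothetical style sheet, reusing the ".risky" rule from the post.
sample_css = """
.risky {
position: absolute;
visibility: hidden;
z-index: -1;
}
.nav { color: blue; }
"""

print(find_hidden_selectors(sample_css))
```

Any selector it reports is only a lead. You would still need to search the page source for that class name to see what, if anything, is actually being hidden.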

This may help you uncover some hard evidence of high-risk techniques that you can take to the search engine officiators to get your competitors banned.

Mel
05-19-2004, 04:35 AM
What happened to using good old fashioned site maps to ensure that all pages get indexed?

effisk
05-19-2004, 04:58 AM
Hi,

I'm not sure Google or Yahoo don't visit the .css files. Can anyone tell us if they do? And what about menus stored in .js files? If Google doesn't visit these files, then the rest of the website will never be indexed if links to the various pages aren't displayed anywhere else.
And .swf Flash animations? Can Google read the text contained in the animations?

cheers
ok, I've got the answer for the flash animations:
http://www.webproworld.com/viewtopic.php?t=18475

effisk
05-19-2004, 05:01 AM
What happened to using good old fashioned site maps to ensure that all pages get indexed?
The risk is that your sitemap page gets a higher ranking than any other page on your website. I've seen that happen more than once. It's probably not such a bad thing though...

Mel
05-19-2004, 08:49 AM
If your sitemap gets a better ranking than your pages constructed to rank well on a topic, then the optimization must be pretty poor.

spidermonkey
05-19-2004, 08:40 PM
Where did all this "ethical" crap come from?

Let's get back on topic!

CSS can be used to hide stuff that the Google Algo has no chance of seeing - fact!

If you are really sneaky you can put your "dodgy" CSS files in a folder that is excluded in your robots.txt file - fact!

So the only way the Google Algo can spot you is to break the robots exclusion protocol - fact!
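To illustrate the robots.txt point, here is a rough Python sketch of how a crawler that honours the robots exclusion protocol would treat a style sheet in an excluded folder. The robots.txt contents, the /styles/ folder, the example.com URLs and the "ExampleBot" user agent are all made up for illustration, and nothing here says how Google or Yahoo actually behave.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that excludes a style sheet folder.
robots_txt = """\
User-agent: *
Disallow: /styles/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def may_fetch(url, agent="ExampleBot"):
    """Return True if a rule-abiding crawler is allowed to request the URL."""
    return parser.can_fetch(agent, url)


# The style sheet is off limits, while the page that links to it is not.
print(may_fetch("http://www.example.com/styles/hidden.css"))
print(may_fetch("http://www.example.com/index.html"))
```

A crawler that obeys these rules never downloads the style sheet, so it cannot read the rule that hides the link. The link itself is still in the page's HTML.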

If some "white hat" lamo reports you it will take months to be penalised - if at all - Google have expressed a preference for letting the Algo find the cheaters - fact!

Thanks Garrett for this introduction.

I am planning an "advanced" posting sometime soon.

Mike

GilesGuthrie
05-20-2004, 02:44 PM
Where did all this "ethical" crap come from?

Let's get back on topic!

CSS can be used to hide stuff that the Google Algo has no chance of seeing - fact!

If you are really sneaky you can put your "dodgy" CSS files in a folder that is excluded in your robots.txt file - fact!

So the only way the Google Algo can spot you is to break the robots exclusion protocol - fact!

If some "white hat" lamo reports you it will take months to be penalised - if at all - Google have expressed a preference for letting the Algo find the cheaters - fact!

Thanks Garrett for this introduction.

I am planning an "advanced" posting sometime soon.

Mike

The problem here is that the robots could then decide to exclude any <a> tag that has a class attached to it. This would be a very bad thing for people (such as myself) who are using CSS to control the behaviour of different styles of links in different areas (i.e. navigation vs content) in the same page.

eightfifteen
05-20-2004, 04:47 PM
If you want to add links to your page, but keep the integrity, why not use spacer images with links? Does Google consider that an evil thing to do?

carnail
05-20-2004, 06:06 PM
Hold on now; this thing might be new to me: are you guys saying that bots (or rather the software "looking at the html") are indexing the parsed and "rendered" output? I thought most of them ran through the raw HTML that was output to the client (the bot in this case) before it was rendered.

See it like this: If I disabled CSS in my web browser, I would see your hidden link, right? The answer is yes.
So who says that a bot that doesn't care about makeup first renders the page (with CSS enabled) and then scrapes data from the result? If you have a blue background and make the text blue, would the algo care? Is the algorithm blind???? Come on ...!

sri_gan
05-20-2004, 06:21 PM
I'm not sure this is worth testing.

'Cause it's pretty clear from the CSS statement that it's a hidden object.

The Googlebot (crawler) might not find it, but the page parser might track this down very easily.

I'm sure if Google catches this, they will definitely enforce a ban, 'cause it's mentioned in their guidelines that they don't encourage hidden links.

Not sure about Yahoo yet. It's not impossible to track.

Mel
05-20-2004, 11:13 PM
Hold on now; this thing might be new to me: are you guys saying that bots (or rather the software "looking at the html") are indexing the parsed and "rendered" output? I thought most of them ran through the raw HTML that was output to the client (the bot in this case) before it was rendered.

See it like this: If I disabled CSS in my web browser, I would see your hidden link, right? The answer is yes.
So who says that a bot that doesn't care about makeup first renders the page (with CSS enabled) and then scrapes data from the result? If you have a blue background and make the text blue, would the algo care? Is the algorithm blind???? Come on ...!

Bots don't parse or render anything; they simply crawl and report the results of the crawl back to the search engine.

While Google is a very intelligent engine, there is evidence to support the idea that their normal algo cannot recognize even the most elementary hidden text, created by using text the same color as the background.

They can and do, however, make a special effort from time to time to ferret out hidden text (and other spam), and at that time pass out penalties to thousands of sites.

carnail
05-21-2004, 01:45 AM
CSS can be used to hide stuff that the Google Algo has no chance of seeing - fact!


K, so this is bollocks then.
If the <a> link is in the output to the bot, it will be used in calculations by the algo whether or not it's hidden from the human eye, because the algo doesn't understand CSS (as the top poster has shown by trying to cloak the "hidden" value with a call to a CSS file instead).

"Hiding" done on the client side won't stop algorithms from seeing stuff in an unparsed document retrieved from the server. It is only useful for making pages look nice even with any number of links at the top.
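The point about raw HTML can be shown with a rough Python sketch of a link extractor that never applies any CSS. The LinkExtractor class, the extract_links helper and the sample page are made up for illustration; this is not a claim about how any real search engine spider is written.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect href values from <a> tags, ignoring all styling."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The class attribute is never looked up in any style sheet,
        # so a link hidden by CSS is collected like any other.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return every link target found in the raw HTML, in document order."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page: the first link uses the ".risky" class from the first post.
page = """
<html><body>
<a href="http://www.link.com" class="risky">Risky Technique</a>
<a href="/about.html">About</a>
</body></html>
"""

print(extract_links(page))
```

Both links come back, because nothing in this process ever looks at what the "risky" class means.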

Dave Hawley
05-21-2004, 02:17 AM
Agree with Mel (and that doesn't happen every day ;o).

Why carry on bad site design by placing band-aids all over it? Create a site map if you have problems with pages getting indexed.


The risk is that your sitemap page gets a higher ranking than any other page on your website. I've seen that happen more than once. It's probably not such a bad thing though...

"Risk"? Funny way to look at having a page ranking well. If my site map ranks higher than another page, I'm NOT going to get rid of the site map! I will try to get the page I would rather see above it ranking better. If, no matter what I try, it doesn't work, so what?

If your handbrake on your car is better than your disk brakes, would you work on your handbrake or disk brakes?

Dave Hawley
05-21-2004, 02:23 AM
Where did all this "ethical" crap come from?

Let's get back on topic!

CSS can be used to hide stuff that the Google Algo has no chance of seeing - fact!

If you are really sneaky you can put your "dodgy" CSS files in a folder that is excluded in your robots.txt file - fact!

So the only way the Google Algo can spot you is to break the robots exclusion protocol - fact!

If some "white hat" lamo reports you it will take months to be penalised - if at all - Google have expressed a preference for letting the Algo find the cheaters - fact!

Thanks Garrett for this introduction.

I am planning an "advanced" posting sometime soon.

Mike

WOW! Each to their own, but why chance getting banned from Google? For many, it would mean the end of their business.

You sound like the sort of guy that would burn/hide the garage sale sign of your neighbour if you had one on the same weekend.

OneMoreBite
05-21-2004, 10:50 AM
At the risk of being considered prudish, I'm confused as to why the WebProWorld forum would suggest a technique with dubious benefit and a possibly adverse effect. This seems less than prudent, IMO.

It brings to mind the recent postings by a fellow in the http://weightloss.about.com/mpboards.htm forum who was promoting his "fast food only" diet. Unlike Spurlock who produced the currently showing documentary, Super Size Me, this fellow claimed to have eaten nothing but McDonalds and lost weight. It may have worked for him (he claims) but on the whole it seemed a ridiculous thing to promote to a group of people seeking answers and/or advice in regards to gaining better health.

I visit and participate in these forums to discuss and debate innovative ideas, thinking and theories, not scrape the seedy underbelly of "what's on the bleeding edge of acceptable use."

Kathryn

sri_gan
05-21-2004, 12:21 PM
Where did all this "ethical" crap come from?

Let's get back on topic!

CSS can be used to hide stuff that the Google Algo has no chance of seeing - fact!

If you are really sneaky you can put your "dodgy" CSS files in a folder that is excluded in your robots.txt file - fact!

So the only way the Google Algo can spot you is to break the robots exclusion protocol - fact!

If some "white hat" lamo reports you it will take months to be penalised - if at all - Google have expressed a preference for letting the Algo find the cheaters - fact!

Thanks Garrett for this introduction.

I am planning an "advanced" posting sometime soon.

Mike

I strongly disagree with the bolded part of the quote, I have my own reason to say so.

If you guys have time, spend some time in Googlelabs, you will know why I say this.

Mel
05-22-2004, 04:00 AM
IMO anything that is publicly viewable in Google Labs has not been added to the algo.

spidermonkey
05-22-2004, 04:43 AM
You sound like the sort of guy that would burn/hide the garage sale sign of your neighbour if you had one on the same weekend.

Your burning analogy fits in as much as my making a posting to light a fire under this thread was entirely deliberate.

I spotted something in the original posting that I felt wasn't getting thoroughly examined.

Just because I make a posting that is a little contentious doesn't mean I am recruiting for the dark side of the force. I just wanted to get things moving before the topic got buried.

Best regards to all

Mike

spidermonkey
05-22-2004, 04:59 AM
Incidentally, though this topic references Yahoo, I figured it was really relevant to all search engine robots and the state of artificial intelligence being used to find hidden content at the moment.

Mike

morpheus.100
05-22-2004, 09:58 AM
I linked a site map with links dynamic and static to the top of my homepage with
and Googlebot indexed the entire site from the site map. Also, why do people deem it necessary to place a sitemap link at the bottom of a page? I had this selfsame link at the bottom of my homepage for over a month, and Googlebot was well on its way to another page before it got anywhere near it. It remains to be seen whether Yahoo will index well, as they really don't spider very often, but several other bots have all crawled my site with great success. As for CSS files being found by bots, the answer is yes. Once bots found their way around my site, they were taking an interest in all links, even those in the headers.

Dave Hawley
05-22-2004, 09:27 PM
Spidermonkey, it was your 'Where did all this "ethical" crap come from?' that I was making reference to mostly. I, and I suspect many others, would read that statement as you having no ethics.

People without ethics are best ignored.

spidermonkey
05-23-2004, 04:24 PM
Quote

All men are frauds. The only difference between them is that some admit it. I myself deny it.
H. L. Mencken (1880 - 1956)

Dave Hawley
05-23-2004, 09:52 PM
<sarcasm>
WOW! That must be true if it is quoted from H. L. Mencken.
</sarcasm>

Now, I'm guessing you are also saying you are a fraud as well as having no ethics?

spidermonkey
05-24-2004, 05:02 AM
I make no apologies for taking the role of "Devil's Advocate" to try and enliven a debate.

I would never resort to "off-topic" personal attacks on a poster. - They dilute the thread, become boring very quickly and - strangely for a man with no ethics - I consider them to be bad forum ethics.

Can we get back on subject please?

Before you say it... I am not changing the subject. I am trying to re-establish it.

Mike