Meta tags, file size and ranking
10-17-2003, 03:14 PM
I've just read an assertion that 'search engines' (unspecified) look at the ratio between file size and content to determine the page importance. The assertion continues that the text in a 40K file may be considered less important than that in a 10K file and that meta tags such as Author, Content, Distribution are superfluous and should be removed to make the file smaller. Since I haven't heard this one before, does anyone know if there is any basis for this in fact?
As a web developer I've noticed that ASP pages frequently have loads of whitespace (or carriage returns) which a browser ignores when rendering a page but which certainly contributes to the perceived file size. From that standpoint, we should all be demanding that ISPs compress pages before transmission - but most of us don't.
Another factor is that big budget websites tend to be big users of dynamic content management with ASP, PHP and the like - does this mean that they are marked down by search engines? Seems unlikely...
All this then prompted me to think about how I would write a search engine (I was a software developer before becoming a web developer but now do both). I think I'd be parsing the header and body blocks separately, which means that if I were to consider the ratio between file size and content I would only apply it to the body and not the header.
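To put rough numbers on the whitespace point, here is a minimal Python sketch. The function name `size_report` and the sample markup are made up for illustration; it simply compares the raw byte count of a page with the count after collapsing whitespace runs and after gzip compression. It says nothing about how any search engine treats file size.

```python
import gzip
import re


def size_report(html: str) -> dict:
    """Compare raw, whitespace-collapsed and gzip-compressed sizes in bytes."""
    raw = html.encode("utf-8")
    # Crude collapse: every run of whitespace becomes one space.
    # Note this would also change the contents of <pre> blocks.
    collapsed = re.sub(r"\s+", " ", html).strip().encode("utf-8")
    return {
        "raw_bytes": len(raw),
        "collapsed_bytes": len(collapsed),
        "gzip_bytes": len(gzip.compress(raw)),
    }


# Hypothetical page fragment padded with blank lines and tabs.
sample = "<div>\r\n\r\n\r\n\t\t<p>Hello</p>\r\n\r\n\r\n</div>\r\n" * 50
print(size_report(sample))
```

On very small inputs gzip can come out larger than the raw text because of its fixed header, so the comparison is only meaningful on page-sized input.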
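As a sketch of that idea (purely my own guess at how such a ratio could be computed, not how any real search engine works), the Python below measures visible text against the size of the body block only, so anything in the head, meta tags included, has no effect. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class BodyTextCollector(HTMLParser):
    """Collects visible text that appears inside <body>, skipping script/style."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.in_body and not self.skip_depth:
            self.text_parts.append(data)


def body_text_ratio(html: str) -> float:
    """Visible body text bytes divided by total body markup bytes."""
    parser = BodyTextCollector()
    parser.feed(html)
    parser.close()
    # Normalise whitespace so padding does not count as content.
    text = " ".join(" ".join(parser.text_parts).split())
    lower = html.lower()
    start = lower.find("<body")
    end = lower.rfind("</body>")
    # Fall back to the whole document if no body block is found.
    body = html[start:end] if start != -1 and end != -1 else html
    body_bytes = len(body.encode("utf-8"))
    return len(text.encode("utf-8")) / body_bytes if body_bytes else 0.0
```

With this approach, two pages with identical bodies score the same however many meta tags sit in the head, while extra whitespace inside the body lowers the score.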
Any thoughts anyone?
10-17-2003, 03:57 PM
I don't know about ASP, but PHP outputs only what you tell it to output. It doesn't add any extra spaces or carriage returns. I usually add them deliberately to make the code more readable.
10-17-2003, 04:02 PM
I'm not that familiar with PHP but when I look at PHP generated pages (like this site) there are clearly whitespaces, tabs, carriage returns, etc which increase the file size. The real question is: does it affect how the search engine ranks the page?
To take a look at what the search engines see, go to:
You'll see that the spiders are not seeing much meaningful content.
You should remove all meta tags except title, description, keywords and possibly robots. You should also move any script, CSS, etc. into separate files and link to them from the head.
10-19-2003, 04:23 PM
Thanks for your comments. I already had a look at the spider sim on the SearchEngineWorld site the other day and another one at http://www.webmaster-toolkit.com/. These simulations are certainly worth using but are presumably generic and do not represent any particular search engine, more the mind of the programmer at the time - note that the sims on each site produce slightly different results for the same input page.
Another issue that strikes me as odd is that if file size is such an issue, how does anyone using Dreamweaver (in design mode) get a ranking at all? In the past I've been able to reduce code by 50% on Dreamweaver files without too much trouble and with no loss of structure. I should say that I only use Dreamweaver once a flood, so I'm not particularly au fait with its current performance.
The simulator basically "sees what the spider reads" by skipping the stuff it does not read. Most of the tools that spider a site do about the same thing. That's all it does, and it's very helpful. I use it all the time on my sites. It's one of the things I do to all of them, along with validating, HTML compression, keyword checking, etc. You would be very surprised at what some of the sites I work for, or see here in the forums, turn up.
You could, for example, have a left navigation menu or email signup being read first, and you probably don't want that. Yours says:
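For anyone curious, the general idea can be sketched in a few lines of Python. This is only an assumption about what such a simulator does (the names `SpiderView` and `spider_text` are made up, and the real tools certainly differ in detail): it walks the page in source order and keeps the text while skipping scripts, styles and comments.

```python
from html.parser import HTMLParser


class SpiderView(HTMLParser):
    """Keeps text in source order, skipping script and style contents."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.skipping = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skipping += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skipping:
            self.skipping -= 1

    def handle_data(self, data):
        # HTML comments never reach handle_data, so they are dropped for free.
        if not self.skipping:
            words = data.split()
            if words:
                self.chunks.append(" ".join(words))


def spider_text(html: str) -> str:
    """Return the text a naive spider would read, in the order it appears."""
    parser = SpiderView()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

Because the text comes back in source order, whatever sits first in the markup (a navigation menu, say) is also what gets read first.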
Caz Limited +44 (0)117 941 5920 email@example.com We aim to lead you along the path to the website most appropriate for your enterprise Our website clients include... Sally Walker Language Services Red Gate Software South West Arts Marketing Leiths School of Food and Wine QEH Theatre Hayles & Howe Equimat The English Gardening School Hoppin' Mad dance live! Bristol Julian Murphy --> ...and these are some of our other interests Helios Professional Audio GÓteVite Bellcrown France NEXT 2003 © Caz Limited - 19 October 2003 21:39. +44 (0)117 941 5920 firstname.lastname@example.org
There's not much by way of keywords, or anything else for that matter. You are more of an ad for your clients at this point (seen that before). One thing that really works well for me is to put a strip across the top that includes keywords and perhaps important features.
Your homepage not only has little content, but the content it does have is not very strong. One of my sites:
I just put the strip across the top. It's too early to tell, but if you put it through the test, the first things read are:
Bingo - Instant - Gala - Mainstreet - Party - Mega - Splash - Glamour - Amigo - Fun Time - Galaxy - Astro - Bingo Casinos: Bucky's Casino Cliff Castle Casino Casino of the Sun Golden Ha:San Harrah's Ak-China Cocopah Casino Fort Mcdowell Casino Blue Water Fun Casino AZ Gila River Hon-Dah Casino Mazatzal Casino Tribal Websites and Info: Hualapai Tribe White Mountain Apache Havasupai Tribe Kaibab-Paiute Tribe Fort Mojave Navajo Nation Pascua Yaqui Quechan Tribe San Carlos
This lists all the bingo on the site and all the AZ tribes, especially those that do gaming. When this takes, it should come up under the bingo sites if someone is searching for them. I had hoped I would have better luck with the tribes, but none so far.
As far as reducing Dreamweaver code goes, one thing that seems to be a problem from the outset in matters like this is that the software setup could have been done differently. Once it is set up, you can modify the setup to get rid of unwanted stuff. I haven't used DW for a while, so look around and see what you can find.
I noticed that the site you mentioned in your last post has a "Dreamweaver Cleaner" on the bottom left. You might want to check it out if you haven't already.
10-20-2003, 09:19 PM
Let's not forget that both Dreamweaver and the new FrontPage have options to clean the HTML...this includes whitespace. Let's also remember that if we all wrote our code on one single line...it'd be hard to make sense of...hence carriage returns and indenting, among other things.
And the idea of a spider using file size as a factor for rank is rather silly. Besides being able to list how big the site is, what would be the point? Size doesn't equal content. And if they're indexing it based on how fast it could load due to size, that's not really a relevant search result, now is it?