I've just read an assertion that 'search engines' (unspecified) look at the ratio between file size and content to determine a page's importance. The assertion continues that the text in a 40K file may be considered less important than the same text in a 10K file, and that meta tags such as Author, Content and Distribution are superfluous and should be removed to make the file smaller. Since I haven't heard this one before, does anyone know whether there's any basis in fact for it?
As a web developer I've noticed that ASP pages frequently contain loads of whitespace (indentation and carriage returns) which a browser ignores when rendering the page but which certainly adds to the file size. From that standpoint, we should all be demanding that our hosts compress pages before transmission (gzip via HTTP Content-Encoding) - but most of us don't.
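Out of curiosity I knocked up a quick demonstration of that point. This is just a sketch in Python - the sample markup and the repetition count are invented for illustration, not taken from any real page - showing how cheaply gzip soaks up template whitespace:

import gzip
import re

# Template-generated markup: the browser renders it the same either way,
# but indentation and blank lines inflate the raw byte count.
html = ("<html>\n  <body>\n"
        + "    <p>Hello</p>\n\n\n" * 200
        + "  </body>\n</html>\n").encode("utf-8")

# Collapse runs of whitespace to a single space, as a minifier would.
minified = re.sub(rb"\s+", b" ", html)

print("raw bytes:       ", len(html))
print("minified bytes:  ", len(minified))
print("gzipped raw:     ", len(gzip.compress(html)))
print("gzipped minified:", len(gzip.compress(minified)))

The two gzipped sizes come out very close to each other, which is the point: compression at the transport level makes the whitespace largely irrelevant to bytes on the wire, though a crawler measuring the uncompressed file would of course still see the bloat.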
Another factor is that big-budget websites tend to be heavy users of dynamic content management with ASP, PHP and the like - does this mean that they are marked down by search engines? Seems unlikely...
All this then prompted me to think about how I would write a search engine (I was a software developer before becoming a web developer, but now do both). I think I'd parse the <head> and <body> blocks separately, which means that if I were to consider the ratio between file size and content, I'd apply it only to the body and not the head.
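To make that concrete, here's a minimal sketch of the idea in Python. Everything here is hypothetical - content_ratio and BodyTextExtractor are names I've made up, and no real engine documents working this way - it just shows the head/body split I have in mind:

from html.parser import HTMLParser

class BodyTextExtractor(HTMLParser):
    """Collects the visible text inside <body>, ignoring the <head>."""
    def __init__(self):
        super().__init__()
        self.in_body = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False

    def handle_data(self, data):
        if self.in_body:
            self.chunks.append(data)

def content_ratio(page: str) -> float:
    # Score only the <body> block, so meta tags and other <head>
    # bloat never count against the page.
    lower = page.lower()
    start, end = lower.find("<body"), lower.find("</body>")
    body_markup = page[start:end] if start != -1 and end != -1 else page

    parser = BodyTextExtractor()
    parser.feed(page)
    body_text = " ".join("".join(parser.chunks).split())
    return len(body_text) / len(body_markup) if body_markup else 0.0

page = ("<html><head><title>t</title>"
        "<meta name='author' content='x'></head>"
        "<body>  <p>Some   actual   content.</p>  </body></html>")
print(f"content ratio: {content_ratio(page):.2f}")

Under a scheme like this, the Author and Distribution meta tags from the original assertion would be free: they sit in the <head> and never enter the ratio at all.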
Any thoughts anyone?