Fuzzy Search Engine Math: Who’s Really Bigger?

Just when you thought someone was going to settle this once and for all, you’re thrown a curveball. Yahoo! says they’ve got a bigger index. Google says, “We don’t buy it.” And now a third party comes out on the side of Google, but has some suspect research methods. This is really, really confusing.

Last week, Yahoo!’s Tim Mayer came out twirling impressive numbers on the Yahoo! Search Blog, claiming that with 20 billion indexed items, the #2 search engine had more than double Google’s retrievable information.

Google whirled around in their chairs and let everyone know that Google scientists were unable to verify those numbers.

This week, the National Center for Supercomputing Applications (NCSA), in conjunction with the University of Illinois at Urbana-Champaign, “conducted a brief study” of its own. The researchers, Matthew Cheney and Mike Perry, found that Google actually returned 166.9% more results than Yahoo!. Among the 10,012 test cases run, Yahoo! returned more results than Google only 3% of the time.

That’s a huge disparity in claims between Yahoo! and NCSA. One says it’s twice Google’s size, the other says Yahoo! isn’t even close to being in the same league as Google.
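To see how far apart the two claims are, it helps to turn “166.9% more results” into a plain ratio. The few lines below are only back-of-the-envelope arithmetic on the figure quoted above; the variable names are mine, not the study’s.

```python
# Figure reported in the NCSA study, as quoted in this post.
percent_more = 166.9

# "X% more" means Google's count is (1 + X/100) times Yahoo!'s count.
google_to_yahoo = 1 + percent_more / 100

# Flip the ratio to see Yahoo!'s results as a share of Google's.
yahoo_share = 1 / google_to_yahoo

print(f"Google returned {google_to_yahoo:.3f}x as many results as Yahoo!")
print(f"Yahoo! returned about {yahoo_share:.1%} of Google's results")
```

In other words, Yahoo! says it has more than twice Google’s index, while the study has Yahoo! returning well under half of what Google returns.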

While the study would seem to lend support to Google’s contention that they have not been able to verify Yahoo!’s numbers, the study’s research methods give one pause.

“Unfortunately, both the Yahoo! and Google search engines truncate results returned to the user after 1,000 results. Thus, for the purposes of this study, we were forced to restrict our searches to those queries that returned less than 1,000 results on both Yahoo! and Google. Any search result found to have more than 1,000 returned results on either search engine was disregarded from our sample.”

Because of that cutoff, the researchers could use only obscure search terms, those returning fewer than 1,000 results on both engines, and they found that Google was able to plow the nether regions of the Internet better than Yahoo!.
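The filtering rule the researchers describe can be sketched roughly as follows. This is a guess at the shape of the method, not their actual code: the query names and counts are made up for illustration, and I am assuming the “percent more” figure comes from comparing summed result counts, which the quoted passage does not confirm.

```python
TRUNCATION_LIMIT = 1000  # both engines cut off results at 1,000, per the study


def compare_engines(samples):
    """samples: list of (query, google_count, yahoo_count) tuples."""
    # Drop any query that hit the truncation limit on either engine.
    kept = [
        (query, google, yahoo)
        for query, google, yahoo in samples
        if google < TRUNCATION_LIMIT and yahoo < TRUNCATION_LIMIT
    ]
    google_total = sum(google for _, google, _ in kept)
    yahoo_total = sum(yahoo for _, _, yahoo in kept)
    # Assumed metric: how much larger Google's summed count is than Yahoo!'s.
    percent_more = (google_total / yahoo_total - 1) * 100
    # Share of the kept queries where Yahoo! returned more than Google.
    yahoo_win_rate = sum(1 for _, g, y in kept if y > g) / len(kept)
    return len(kept), percent_more, yahoo_win_rate


# Hypothetical counts, invented purely to exercise the filter.
toy_samples = [
    ("obscure query a", 120, 40),
    ("obscure query b", 15, 30),
    ("popular query c", 5000, 900),   # discarded: Google count over the limit
    ("obscure query d", 300, 100),
    ("popular query e", 800, 2000),   # discarded: Yahoo! count over the limit
]

kept_count, percent_more, yahoo_win_rate = compare_engines(toy_samples)
print(kept_count, round(percent_more, 1), round(yahoo_win_rate, 2))
```

Notice that every popular query falls out of the sample before any comparison happens, so whatever the final number is, it only describes the long tail.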

But is that truly reflective of total size? Or is it only reflective of one search engine’s ability to find trivial items?

I’m no statistician, but these numbers don’t make a lot of sense to me. Would the real Search Giant please stand up?
