The Search Engine Watch forum is off to a grand start with a hyper-scientific keyword research post from Orion, "a formal scientist, with special interest in AI applied to IR technology." His posts focus on "keywords semantic connectivity and what it can do for improving success across search engines."
The gist of his post, as I understand it (and I look forward to other interpretations), is that to get the best traffic for your keywords you should use semantically related keywords on your site.
I derived this from a statement farther down the thread: "thus in Google, k1=car and k2=automobile seem to have a greater synonymity association (semantic connectivity) than k1=car and k2=auto." The word "auto," as Orion says, occurs in languages other than English - it's also the root of several other English words.
It's interesting to note, however, that when he ran his analysis on "car insurance" vs. "auto insurance," "auto insurance" actually had more connectivity.
So how does his formula work? Here's his direct quote:
"Let n1 and n2 be the number of search results containing k1 and k2, respectively and n12 is the number of search results containing both terms. (One actually does a search for k1 then for k2 and finally a composite query consisting of k1 and k2). Using geometry arguments and fuzzy sets, it can be demonstrated that there exists an index, termed correlation index, c, such that
c = n12/(n1 + n2 - n12)"
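To make the quoted formula concrete, here is a minimal Python sketch of it. The function name correlation_index and the sample counts are my own, made up for illustration; they are not from Orion's post. As far as I can tell, the expression has the same shape as the Jaccard similarity coefficient (size of the overlap divided by size of the union), so c should run from 0 for no overlap to 1 for identical result sets.

```python
def correlation_index(n1: int, n2: int, n12: int) -> float:
    """Return c = n12 / (n1 + n2 - n12).

    n1, n2: number of results for k1 and k2 on their own.
    n12: number of results for the combined query (k1 and k2).
    """
    if min(n1, n2, n12) < 0:
        raise ValueError("result counts must be non-negative")
    denominator = n1 + n2 - n12
    if denominator <= 0:
        raise ValueError("n1 + n2 - n12 must be positive")
    return n12 / denominator


# Made-up counts, purely for illustration (not real search results).
print(correlation_index(n1=100, n2=50, n12=25))  # 25 / 125 = 0.2
```

Result counts reported by search engines are estimates, so treat any c computed this way as a rough comparison rather than an exact measurement.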
This formula apparently helps most in comparing the connectivity of keywords - a higher connectivity helps with higher rankings. Why? Well, let's dig into the formula and see what happens.
(Orion emphasized that the keywords you compare have to be related, but not the same. I guess another word for that would be "synonyms.")
I'm using Google, with k1 = "dog" and k2 = "canine."
Here are my results:
c = 999,000/(52,700,000 + 1,860,000 - 999,000) ≈ 0.0187
That's great, but now we need something to compare it with. As I understand it, this formula is for finding the keywords that will have the most connectivity and therefore relevance. So now we'll substitute a different keyword for k2: "pooch."
Our new computation, with the results for pooch put in there, is
c = 999,000/(52,700,000 + 264,000 - 999,000) ≈ 0.0192
Now, if I performed everything properly, these numbers show that the combination of dog and pooch has a slightly higher connectivity than dog and canine. Orion states that "the optimum combination of k1 and k2 is that one with the highest c-index."
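For anyone who would rather not reach for a calculator, the two computations above can be reproduced with a short Python sketch. I'm assuming, as the surrounding text suggests, that 52,700,000 is the count for "dog," 1,860,000 for "canine," and 264,000 for "pooch," and I'm reusing 999,000 as the combined count for both pairs exactly as written above. The names correlation_index, N_DOG, and pairs are mine.

```python
def correlation_index(n1, n2, n12):
    """c = n12 / (n1 + n2 - n12), as quoted from Orion's post."""
    return n12 / (n1 + n2 - n12)


N_DOG = 52_700_000  # assumed: results for k1 = "dog"

# k2 -> (n2, n12), using the counts from the computations above.
pairs = {
    "canine": (1_860_000, 999_000),
    "pooch": (264_000, 999_000),
}

results = {k2: correlation_index(N_DOG, n2, n12) for k2, (n2, n12) in pairs.items()}

# Highest c-index first, since that is the pair Orion calls optimal.
for k2, c in sorted(results.items(), key=lambda item: item[1], reverse=True):
    print(f"dog + {k2}: c = {c:.4f}")
# dog + pooch: c = 0.0192
# dog + canine: c = 0.0187
```

The gap is small: both values are under 0.02, so the two pairs differ by only a few hundredths of a percentage point.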
Honestly, I'm not sure why that would be. Since there are more instances of the word "canine" than "pooch," it seems there would naturally be more connectivity between "canine" and "dog." That's apparently not the case, though.
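One possible explanation, and this is my reading of the arithmetic rather than anything Orion said: both computations above use the same combined count (999,000), so the only thing that changes is n2. With n1 and n12 held fixed, a smaller n2 shrinks the denominator, which pushes c up. The Python sketch below sweeps a few n2 values to show the effect; 264,000 and 1,860,000 come from the computations above, and the other two are hypothetical fillers.

```python
def correlation_index(n1, n2, n12):
    """c = n12 / (n1 + n2 - n12), as quoted from Orion's post."""
    return n12 / (n1 + n2 - n12)


n1, n12 = 52_700_000, 999_000  # held fixed, as in both computations above

# 264,000 and 1,860,000 are from the post; the other values are hypothetical.
n2_values = [264_000, 1_000_000, 1_860_000, 10_000_000]

sweep = [(n2, correlation_index(n1, n2, n12)) for n2 in n2_values]
for n2, c in sweep:
    print(f"n2 = {n2:>10,}  c = {c:.5f}")
```

It's also worth noticing that in the pooch computation the combined count (999,000) is larger than the count for "pooch" alone (264,000). With exact counts that can't happen, since pages containing both words are a subset of pages containing either one. That may mean the combined query needs to be rerun for the new pair, or simply that the engine's estimates are rough.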
Orion pointed out that you should "repeat recipe for other search engines. c-indices may change, which indicates that semantic connectivity is different across databases. (Very important for SEOs!)"
I'm not sure all you SEO copywriters out there should necessarily get your calculators out as you write. I've yet to hear any trusted SEOs weigh in on the subject, and this is the first I've heard of Orion.
Read his post and tell me what you think.