This story posted for Mike McDonald
One of the final sessions of the day was “Meet The News Search Engines,” and I decided to check it out. While it didn’t draw the huge crowds some of the other sessions did, it gave some great insight into how the news gathering side of the search engines comes together.
The session brought in important players including Neil Budde, General Manager at Yahoo! News, Jim Pitkow, CEO at Moreover, Chris Tolles, VP of Sales and Marketing over at Topix.net and Nathan Stall, Google News Product Manager.
First out of the chute with some great information on how the Google News brain works was Nathan Stall. He said Google’s philosophy was “to promote the value of multiple perspectives on the news.” He said the goals of Google’s news division were to:
· Offer easy discovery of multiple perspectives
· Host a conversation around stories in the news
· Invite all news publishers to participate
· Offer comprehensive coverage of online news
Stall discussed how Google gathers news too. He said stories are prioritized based on a number of factors: who published the article, the time of publication, whether the article is original, the importance of the story, the timeliness of the story and the relevance to the query.
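To make the idea of prioritizing on several factors concrete, here is a minimal sketch of a weighted score. It is purely illustrative and is not Google's method: the `Story` fields, the weights, the 12-hour half-life and the choice to fold "time of publication" and "timeliness" into a single freshness term are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Story:
    title: str
    source_reputation: float  # 0..1, hypothetical stand-in for "who published"
    hours_old: float          # age of the article in hours
    is_original: bool         # original reporting vs. a rewrite
    importance: float         # 0..1, hypothetical importance of the story
    query_relevance: float    # 0..1, hypothetical match to the query


# Hypothetical weights; the session gave no numbers.
WEIGHTS = {
    "source": 0.2,
    "freshness": 0.2,
    "original": 0.2,
    "importance": 0.2,
    "relevance": 0.2,
}


def freshness(hours_old: float, half_life_hours: float = 12.0) -> float:
    """Decay from 1.0 toward 0.0, halving every half_life_hours (assumed)."""
    return 0.5 ** (hours_old / half_life_hours)


def score(story: Story) -> float:
    """Weighted sum of the factors listed in the session."""
    return (
        WEIGHTS["source"] * story.source_reputation
        + WEIGHTS["freshness"] * freshness(story.hours_old)
        + WEIGHTS["original"] * (1.0 if story.is_original else 0.0)
        + WEIGHTS["importance"] * story.importance
        + WEIGHTS["relevance"] * story.query_relevance
    )


def rank(stories: list[Story]) -> list[Story]:
    """Return stories ordered from highest to lowest score."""
    return sorted(stories, key=score, reverse=True)
```

With these made-up weights, a fresh original article outranks an older rewrite from the same source on the same story, which is the behavior the list of factors suggests.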
Google gathers news by crawling and groups like stories with a clustering mechanism. Pitfalls publishers can avoid include robots.txt mistakes, reused URLs, missing IDs and complex website formats. He also said to avoid complex article formats, other text near articles and a lack of article focus. One point specifically mentioned was that Google is now riding the RSS/Atom feed highway, so users can pick that up for their feed fix.
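For the robots.txt pitfall, a publisher can check whether its own rules accidentally block article pages. The sketch below uses Python's standard `urllib.robotparser`; the sample rules, the `ExampleNewsBot` user agent and the example.com URLs are made up for illustration and are not taken from the session.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt: str, user_agent: str, urls: list[str]) -> list[str]:
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt that blocks the whole /news/ directory by mistake.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /news/
"""

SAMPLE_URLS = [
    "https://example.com/news/story-123.html",
    "https://example.com/about.html",
]

if __name__ == "__main__":
    for url in blocked_urls(SAMPLE_ROBOTS, "ExampleNewsBot", SAMPLE_URLS):
        print("Blocked for crawlers:", url)
```

Running a check like this against the live robots.txt before launch is one way to catch a rule that hides articles from news crawlers.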
"Avoid complex article formats and try to keep the content in a neat and clean structure"
Neil Budde from Yahoo said they were celebrating the 10th anniversary of Yahoo! News and said Jerry Garcia’s death was the reason. He addressed several factors about the way Yahoo! News works.
A common misconception is that Yahoo! News is built on search technology. Content actually comes from leading news providers, and the service has human editors. Search is just one of the key components of the Yahoo! News experience.
Yahoo keeps track of which links are being passed around its network and assigns importance to the subjects that are being widely circulated. Y!Q offers relevant news articles in a popup when keywords are clicked.
He emphasized three important points about Yahoo! News. They provide users with multiple perspectives in multiple formats and work with publishers of all types and sizes. They also promote greater integration of user generated content.
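The idea of giving weight to widely circulated links can be sketched as a simple tally. This is only an illustration of the concept, not Yahoo's implementation: the function name, the input format (one entry per time a link is passed along) and the sample data are assumptions.

```python
from collections import Counter


def trending_links(shared_links: list[str], top_n: int = 3) -> list[tuple[str, int]]:
    """Count how often each link was passed along and return the most circulated."""
    counts = Counter(shared_links)
    return counts.most_common(top_n)


# Hypothetical log: each entry is one time a link was shared on the network.
SAMPLE_SHARES = [
    "example.com/story-a",
    "example.com/story-b",
    "example.com/story-a",
    "example.com/story-c",
    "example.com/story-a",
    "example.com/story-b",
]

if __name__ == "__main__":
    for link, count in trending_links(SAMPLE_SHARES, top_n=2):
        print(link, count)
```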
Chris Tolles from Topix mentioned some important information too. He discussed the difference between the two types of news searchers: the reference, or goal-directed, web vs. the incremental, or subject-fed, web. Reference is mission-based: finding info on ‘x’. It has gone from bookmarks to algorithms. Incremental is fresher and non-directed, and old RSS readers are now moving to algorithms too. His message was that a similar progression to algorithms is the common thread between the two basic types of news searchers.
Jim Pitkow at Moreover threw out some big statistics about search-based news. He said there were 12,000 international news sources producing 200,000 articles a day. They span 125 countries, 36 languages and 380 categories.
Bloggers and press are separated distinctly. Moreover doesn’t necessarily consider bloggers to be journalists. He said the press writes the news and the bloggers write opinions about the news.
All in all, this session was incredibly informative about the search-based news industry. The panelists discussed where it’s been and where it’s going, and the major players gave their input on how best to utilize their products. Once again though, a major message from this session, like others, is that original content is absolutely important.