Help make Burf.com a better search engine
08-26-2006, 08:04 AM
Well people, anyone fancy helping Burf.com give better results?
Well, you can now ban sites from Burf via the search page. So if anyone fancies doing a search and sees a site that they do not think should be there, hit "Ban site" and BANG, it's gone before your eyes (it may reappear if I think you were wrong).
Any help would be great :) virtual pints all round
I'd just like to say thanks to WPW for some very helpful replies, you are helping to make the future Burf!
08-27-2006, 04:31 AM
Funnily enough I tried to submit to Burf.com on Friday and got an error message. I've just tried again with a different website on a different computer and got this:
Warning: main(http://...@illustrated-history.net&name=Simon%20Palmer): failed to open stream: Connection timed out in /home/burfcom/public_html/submit2.php on line 9
Warning: main(): Failed opening 'http://...@illustrated-history.net&name=Simon%20Palmer' for inclusion (include_path='.:/usr/lib/php:/usr/local/lib/php') in /home/burfcom/public_html/submit2.php on line 9
Any ideas what's going wrong? Other than that your changes sound great!
08-27-2006, 04:53 AM
So anyone can ban a competitor?
1. Is it a metasearch engine or a PHP powered database driven SQL engine? You do not need to answer, but everybody can make up their own opinion by "view source."
2. It is very slow in Norway.
3. In my view the hits for "Financial stability" are irrelevant. The first has Burf Rank 852. What does that mean? Compare the hits to the Google or Yahoo search for the same KWs.
4. If it is a PHP / MySQL powered SQL search, you may distribute / sell the code as a good site search that may be modified:
Example wildcard search (http://www.webproworld.com/viewtopic.php?t=66614) etc.
08-31-2006, 02:06 PM
Hello all from Morocco (Burf's on holiday).
Sorry for the timeout message, I am working on hosting it on a proper big server.
Currently anyone can ban anything, and I then go through the list and double-check it.
Burf.com is not a meta engine, nor does it use MySQL. It is completely hand-written in Equinox (a language that my company makes).
I am thinking of porting it to either .NET or PHP when I get a spare moment.
So for the banning: should the site go straight away and then be checked by me, or stay until I remove it? (I could hide it from the user via cookies.)
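The second option above (the site stays until it is reviewed, but is hidden from the user who banned it) could be sketched roughly like this in Python. This is only an illustration under assumptions: the `BanQueue` class, its method names, and the `user_id` value (standing in for a cookie) are all hypothetical and are not how Burf.com actually works.

```python
from dataclasses import dataclass, field


@dataclass
class BanQueue:
    """Hypothetical moderation queue; not Burf.com's actual code."""
    # url -> ids of the users (e.g. cookie values) who reported it
    pending: dict[str, set[str]] = field(default_factory=dict)
    # urls whose ban the admin has confirmed, hidden for everyone
    confirmed: set[str] = field(default_factory=set)

    def report(self, user_id: str, url: str) -> None:
        # Record the report; other users still see the site.
        if url not in self.confirmed:
            self.pending.setdefault(url, set()).add(user_id)

    def review(self, url: str, approve: bool) -> None:
        # The admin confirms the ban or rejects it. A rejected
        # site reappears for the users who reported it.
        self.pending.pop(url, None)
        if approve:
            self.confirmed.add(url)

    def visible(self, user_id: str, results: list[str]) -> list[str]:
        # Drop confirmed bans for everyone and pending bans
        # only for the users who reported them.
        return [
            url for url in results
            if url not in self.confirmed
            and user_id not in self.pending.get(url, set())
        ]
```

With this shape a wrong or malicious ban only affects the person who made it until it has been double-checked.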
08-31-2006, 03:16 PM
I did a search and banned some pages. I like that you can ban pages and not entire sites.
But you know, I would rather not ban pages completely. The pages I banned might have been OK, but they were all for the same site, over and over again. I would rather be able to "promote" or "demote" pages depending on whether I think they are better or worse. Of course the ban option is nice, but a little too harsh. Besides, we are users and we really should not have that kind of power...
My view is that the best you can do is to make a search engine that has a reputation for giving relevant hits and that is difficult to spam.
1. Delete broken links.
2. Combating spam with a trust metric (http://dbpubs.stanford.edu:8090/pub/showDoc.Fulltext?lang=en&doc=2004-17&format=pdf&compression=). You may filter the sites as is done in that article. Study it in detail.
3. A good starting point for your spider may be the BIS (Bank for International Settlements) central bank website (http://www.bis.org/cbanks.htm). My proposal is to follow links from that starting point.
4. Why not start on the chat SE (http://www.subjex.net/servlet/SubjexServlet?showmain=true)? You can do it by using artificial intelligence.
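To illustrate points 2 and 3, here is a minimal Python sketch of spreading trust outwards from a hand-picked seed page along its links. It is a simplified illustration in the spirit of a trust metric, not the algorithm from the linked article; the function name, the `damping` and `iterations` values, and the example page names other than bis.org are assumptions made up for the example.

```python
def propagate_trust(links, seeds, damping=0.85, iterations=20):
    """Spread trust from seed pages along outgoing links.

    links: dict mapping a page to the list of pages it links to.
    seeds: non-empty set of hand-picked trusted pages.
    All names and parameter values here are illustrative assumptions.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    # Only seed pages start with (and keep receiving) trust.
    seed_score = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(seed_score)
    for _ in range(iterations):
        nxt = {p: (1 - damping) * seed_score[p] for p in pages}
        for page, targets in links.items():
            if targets:
                # A page splits its damped trust evenly among its links.
                share = damping * trust[page] / len(targets)
                for t in targets:
                    nxt[t] += share
        trust = nxt
    return trust


# Hypothetical link graph: the seed links to two banks,
# while two spam pages only link to each other.
example_links = {
    "bis.org": ["bank-a.example", "bank-b.example"],
    "bank-a.example": ["bank-b.example"],
    "spam-1.example": ["spam-2.example"],
    "spam-2.example": ["spam-1.example"],
}
scores = propagate_trust(example_links, seeds={"bis.org"})
```

Pages that no trusted page links to, directly or indirectly, end up with a score of zero, so they could be filtered out or ranked lower.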
Related WPW posts:
Next Gen Search: Thinking Engines (http://www.webproworld.com/viewtopic.php?t=49758)
"TrustRank" vs. "PageRank" (http://www.webproworld.com/archive.php/o_t__t_60595__start_0__index.html)
Backlinking and smartlinks. (http://www.webproworld.com/viewtopic.php?t=50056)
I like this Australian topic based SE (http://www.factbites.com/). They also have their own IE and Firefox toolbar (http://www.factbites.com/toolbar.php).
If you search with the KWs
1. There is a note in red that says: "these results are not from the primary (high quality) database".
2. Under this condition, the results are relevant, but IMO, still not as relevant as Google's and Yahoo's first hits on the SERPs.
At the bottom it has a link to Try your search on: Qwika (all wikis) (http://www.qwika.com/find/).
I tell the students I have advised: think of speaking (writing) to a specific person when you write (code).
Imagine the following task:
You shall make a search engine for the Bank of England (http://www.bankofengland.co.uk/).
Hope you succeed with your SE and that you have got some ideas for improvement and / or evolution.
1. Google Revealed: The IT Strategy That Makes It Work (http://www.informationweek.com/management/showArticle.jhtml?articleID=192300292)
"A unique mix of internally developed software, open source, made-to-order hardware, and people management is the secret behind the search engine.
But a day spent with some of the company's IT leaders reveals there's more to Google's IT operations than a search engine running on a massive server farm. Behind the seeming simplicity is a mash-up of internally developed software, made-to-order hardware, artificial intelligence, obsession with performance, and an unorthodox approach to people management."
2. Google Goes Its Own Way In The Data Center (http://www.informationweek.com/story/showArticle.jhtml?articleID=192300294)
"George Mason University professor Paul Strassmann suggested in a lecture last December that Google's Linux-based infrastructure is considerably cheaper to buy and maintain than a comparable setup of Sun Microsystems servers or Windows servers would cost. For IT shops that spend half their budgets just keeping machines up and patched, the implications are significant. Strassmann said IT pros know where they need to go--toward a Google-style architecture.
Some of the new servers going into Google's data centers are probably equipped with AMD Opteron multicore processors. Google won't confirm that, but one of the reasons AMD chips are selling so well to other companies is that they don't throw off as much heat as older alternatives. Google engineers, who pay close attention to microprocessor efficiency and heat dissipation, must find AMD chips hard to ignore. Intel is racing to improve the performance-per-watt of its own processor line
Electric power may be expensive, but it's cheap compared to brainpower"
Conclusion: It is not least about hardware, bandwidth and cheap energy supply, and Google seems to have the right brainpower, too.
09-14-2006, 08:38 AM
Thanks for your input, there is so much to read and do.
Plans: as Christian_SEO mentioned, it would be nice to promote and demote a site!
There is a function to rate a site now. This is in testing like banning, but it will eventually play a part in the Burf Rank (lol), and thus you will be able to promote and demote sites by rating them.
A thought I had on holiday was to create "Your search", a cookie / login based search where if you ban a site, you don't see it again but others do. That could give people their own individual look at the engine!
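As a rough idea of how ratings could feed into a rank, here is a small Python sketch. How the Burf Rank is really calculated is not stated in this thread, so everything below is an assumption: that a higher rank is better, that a rating is +1 (promote) or -1 (demote), and the hypothetical `weight` and `min_votes` values.

```python
def adjusted_rank(base_rank: float, ratings: list[int],
                  weight: float = 0.1, min_votes: int = 3) -> float:
    """Nudge a rank up or down by the average user rating.

    ratings: +1 for each promote, -1 for each demote.
    Assumes a higher rank is better; weight and min_votes are
    made-up values, not anything Burf.com is known to use.
    """
    if len(ratings) < min_votes:
        # Too few votes to trust, so leave the rank alone.
        return base_rank
    mean = sum(ratings) / len(ratings)  # between -1 and 1
    return base_rank * (1 + weight * mean)
```

Capping the effect with `weight` and ignoring sites with only a few votes means a single user cannot promote or demote a site very far.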
10-19-2006, 10:47 AM
This now has an adult option and proper caching, so you should notice faster results. Please give it a go!
01-16-2007, 01:34 AM
It has been a long time since we last spoke. I need to get your new search string. I am getting ready to re-open yoddle and noticed that you are using some AJAX or at least a really fast cache. Get in touch.