 |

08-02-2006, 07:23 AM
|
 |
WebProWorld Member
|
|
Join Date: Aug 2005
Location: india
Posts: 89
|
|
free internal site search code ?
Can anyone provide me with free internal site search code that does not use a DB and indexes the site? Maybe I would pay a few bucks for it. I have tried a couple of the free ones available, but they did not work.
Can anybody help, please?
|

08-02-2006, 12:41 PM
|
 |
WebProWorld Pro
|
|
Join Date: Jul 2003
Location: Guelph, Ontario, Canada
Posts: 157
|
|
You can try Swish-e (it's open-source software):
http://www.swish-e.org/
I haven't used it myself, but it looks very good. I'd like to hear if it works for you!
|

08-02-2006, 01:20 PM
|
 |
WebProWorld MVP
|
|
Join Date: Jul 2003
Location: Denver, Colorado USA
Posts: 1,263
|
|
www.freefind.com
I use it in paid mode for 15+ sites.
Use the free version on 2 websites.
|

08-03-2006, 05:00 AM
|
 |
WebProWorld Member
|
|
Join Date: Nov 2004
Location: UK
Posts: 504
|
|
We use the Fluid Dynamics Search Engine. There is a $45 one-off fee for the licence. You can see it at work on all of our sites.
http://www.xav.com/scripts/search/
|

08-03-2006, 03:10 PM
|
 |
Moderator
|
|
Join Date: Jul 2006
Posts: 89
|
|
Very good info guys! Bumped to sticky :D
|

08-06-2006, 10:16 AM
|
|
WebProWorld Member
|
|
Join Date: Mar 2005
Location: Milano, Italy via Northern Ireland
Posts: 72
|
|
I can second Pagetta's recommendation of Fluid Dynamics - it has been working brilliantly for us and can handle re-directs, PDFs and DOCs with a plug-in, as well as opening with search terms highlighted. Definitely recommended.
Take care
|

12-06-2006, 03:50 PM
|
|
WebProWorld Member
|
|
Join Date: Dec 2003
Location: uk
Posts: 324
|
|
Great posts, people!
After spending an hour or two trawling through the web and searching different forums for site search scripts, I found just what I need in a nice handy sticky (good spot, vectorman211).
dave
|

08-23-2007, 01:05 PM
|
 |
WebProWorld 1,000+ Club
|
|
Join Date: May 2005
Location: Norway
Posts: 4,039
|
|
Re: free internal site search code ?
I wrote this thread:
Google site operator and forum site search
before I read this post. The XML book mentioned in my first post explains how to implement a simple site search function. The code comes with the book.
If you structure your site very well, you can make a very efficient site search engine in my view. It is not very difficult to modify the code to your own needs. That site search function will retrieve content by:
- keywords,
- titles,
- and description
and display those pieces that have a status of live.
Retrieving content by n features (e.g. elements and attributes) is a programming task. The method is there.
Last edited by kgun : 08-23-2007 at 01:19 PM.
|

08-30-2007, 07:59 AM
|
 |
WebProWorld 1,000+ Club
|
|
Join Date: May 2005
Location: Norway
Posts: 4,039
|
|
Re: free internal site search code ?
Here is a related thread that goes into more detail:
An XML powered site search engine.
|

08-30-2007, 09:34 AM
|
 |
WebProWorld Veteran
|
|
Join Date: Nov 2006
Location: Steinbach, Manitoba, Canada
Posts: 990
|
|
Re: free internal site search code ?
I get asked about this all the time and finally came across ZoomSearch a few weeks ago. After trying it out, I've decided to include it in a new site I'm working on. The free solution is perfect for little sites that don't have big budgets but still want a simple search feature.
ZoomSearch is easy to set up and free for small sites up to 50 pages. The "Standard" version will index 100 pages for $49 US, the "Professional" solution is only $99 US and will search up to 100,000 pages. The unlimited "Enterprise" license is only $299 US.
You run their software from your desktop and specify file types and locations. The tool indexes your site and compiles a few files. You then upload the files to your server and you're good to go.
The search box, extended search and results pages are simple to integrate with your site too.
WrenSoft Website Search Engine Software
Enjoy.
|

08-30-2007, 12:25 PM
|
 |
WebProWorld 1,000+ Club
|
|
Join Date: May 2005
Location: Norway
Posts: 4,039
|
|
Re: free internal site search code ?
But the important question is what the search engine indexes and finds. Does it scan every page in your site for:
"bug"
" bug "
for example?
The code I mention above is so simple that it should be no problem to modify it to find whatever you are looking for in your documents, especially if you have structured your documents and files well.
If you buy an engine, it scans documents based on another programmer's chosen criteria. If you program it yourself, it scans on what you specify. Not least for large companies that make their own XML meta language, this may be very important. It can be made very efficient by choosing the correct tags and attributes.
How many of you have used engines like freefind, yahoo, google etc. for site search on your sites and searched for words you know are in your documents without finding them? The reason is that the engine does not scan the complete document as a string and match the word(s) you specify.
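To make the "bug" versus " bug " question concrete, here is a minimal sketch of the two matching styles. The helper names containsSubstring and containsWord are made up for this illustration and are not part of the code from the book. The whole-word version assumes the term starts and ends with a letter or digit, since that is what the \b word boundary relies on.

```php
<?php
// Hypothetical helpers for illustration; not part of the code from the book.

// Substring match: true if $term occurs anywhere in $text, ignoring case.
function containsSubstring(string $text, string $term): bool
{
    return $term !== '' && stripos($text, $term) !== false;
}

// Whole-word match: true only if $term is delimited by word boundaries.
function containsWord(string $text, string $term): bool
{
    if ($term === '') {
        return false;
    }
    // preg_quote() escapes any regex metacharacters in the user's term.
    $pattern = '/\b' . preg_quote($term, '/') . '\b/i';
    return preg_match($pattern, $text) === 1;
}

$text = 'The debugger found a Bug in the parser.';
var_dump(containsSubstring($text, 'bug'));          // substring also hits "debugger"
var_dump(containsWord($text, 'bug'));               // matches the standalone word "Bug"
var_dump(containsWord('Use the debugger.', 'bug')); // no standalone word here
```

Which of the two you want depends on the site: substring matching finds more, whole-word matching returns fewer false hits.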
There is no theoretical limit on the number of documents that can be scanned. The engine goes in a loop using XMLReader or another parser and loads one file at a time into an object or variable, whose tags and attributes are scanned for the chosen criteria.
Here is the most important and difficult part of the code:
Code:
// Assumes $fileDir ends with a slash and $term holds the search term.
$handle = opendir($fileDir);
$items = array();
while (($file = readdir($handle)) !== FALSE) {
    if (is_dir($fileDir . $file)) continue;
    // eregi() was removed in PHP 7; preg_match() with the i flag replaces it.
    if (!preg_match('/^(news|article|webcopy).*\.xml$/i', $file)) continue;
    $xmlItem = simplexml_load_file($fileDir . $file);
    if ($xmlItem === FALSE) continue; // Skip files that fail to parse.
    if ((stripos($xmlItem->keywords, $term) !== FALSE ||
         stripos($xmlItem->headline, $term) !== FALSE ||
         stripos($xmlItem->description, $term) !== FALSE) &&
        (string)$xmlItem->status == 'live') {
        $item = array();
        $item['id'] = (string)$xmlItem['id'];
        $item['headline'] = (string)$xmlItem->headline;
        $items[] = $item;
    } // The if test ends here.
} // The while loop ends here.
closedir($handle);
Note that $items is an array of $item arrays. The if condition (shown in red in the original post) is the part you can modify to your own needs. As you will see, this particular engine only returns documents with a status of live. You can choose whatever you want. simplexml_load_file loads the document into an object whose elements may be scanned as strings. For example, "stripos($xmlItem->headline, $term)" checks whether the headline contains the $term you choose. stripos automatically casts its first argument to a string. That is not done in the last red line, where you have to cast the $xmlItem->status object to a string yourself. Casting is everyday work in OOP.
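For anyone who wants to try the condition without a directory of files, here is a small sketch. The sample document is made up; only the element and attribute names (id, headline, keywords, description, status) are taken from the loop above, and the real files from the book may well look different.

```php
<?php
// Made-up sample document; only the element and attribute names come from the loop above.
$xml = <<<XML
<article id="42">
  <headline>Site search without a database</headline>
  <keywords>search, xml, php</keywords>
  <description>Scanning XML files for a term.</description>
  <status>live</status>
</article>
XML;

$xmlItem = simplexml_load_string($xml);
$term = 'XML';

// Same test as in the loop, here with explicit string casts on every element.
$matches = (stripos((string)$xmlItem->keywords, $term) !== false
    || stripos((string)$xmlItem->headline, $term) !== false
    || stripos((string)$xmlItem->description, $term) !== false)
    && (string)$xmlItem->status === 'live';

var_dump($matches);
var_dump((string)$xmlItem['id']);
```

Changing the status element to anything other than live makes $matches false, which is an easy way to check that the filter behaves as described.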
If you looked up the two related posts at the W3 Schools forum, you may have noted that my signature there is:
$MyProfile = simplexml_load_file('myprofile.xml');
echo $MyProfile->name;
:: Kjell Gunnar Bleivik
echo $MyProfile->xpath('/personalinfo/profile[@profileID="W3Schools"]');
:: Why learn the bad dialect of HTML when you get it free when tagging XML?
Learn to tag with XML, program with XSL, then improve using Ruby on Rails, PHP...
Later you may learn to fly using BETA.
I liked the Borland C++ Builder with the possibility to use inline ASM {statements, but I am an economist}.
Overall principle, make it simple, as simple as possible but no simpler.
as of 30 August 2007. It is when you combine the parsers with XPath (the xpath line above, red in the original post), XPointer, XLink and XSL(T) that the power of XML comes into its own.
Last edited by kgun : 08-30-2007 at 01:18 PM.
|

09-02-2007, 11:37 AM
|
 |
WebProWorld 1,000+ Club
|
|
Join Date: May 2005
Location: Norway
Posts: 4,039
|
|
Re: free internal site search code ?
If there is something called a web 2.0 site search engine / function, that is especially well suited for XML and its family of technologies and XML parsers. There should be no problem for a clever programmer who knows:
- XML, XPath, XPointer and XLink
- PHP or another server-side language with good XML parsers.
- AJAX.
to build a fairly advanced site search engine with Google suggest functionality. Maybe there is one already.
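As a rough idea of the server-side half of such a suggest feature, here is a minimal sketch. The function name suggest, its parameters and the sample headlines are all made up for this illustration. It assumes the headlines have already been collected from the XML files and only does case-insensitive prefix matching, which is far simpler than what Google suggest does.

```php
<?php
// Hypothetical helper: returns up to $limit headlines that start with $prefix.
function suggest(array $headlines, string $prefix, int $limit = 5): array
{
    $prefix = trim($prefix);
    if ($prefix === '') {
        return array();
    }
    $matches = array();
    foreach ($headlines as $headline) {
        // stripos() === 0 means the headline begins with the prefix (case-insensitive).
        if (stripos($headline, $prefix) === 0) {
            $matches[] = $headline;
        }
    }
    // Sort alphabetically, ignoring case, before cutting the list down.
    sort($matches, SORT_STRING | SORT_FLAG_CASE);
    return array_slice($matches, 0, $limit);
}

// Made-up headlines standing in for values collected from the XML files.
$headlines = array('XML site search', 'XLink basics', 'PHP parsers', 'xpath tips');

// An AJAX request handler would send this JSON back to the search box.
echo json_encode(suggest($headlines, 'x', 2)), "\n";
```

The JavaScript side would call such a handler on each keystroke and show the returned list under the search box.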
You find more information in my Web 2.0 static link collection of resources. Especially the links with anchor text:
- Improving Web linking using XLink.
- XML linking language.
These technologies make it possible to have dynamic and generic links, multi-source, multi-destination links and much more. Using link bases, location sets, arcs and associations between these sets in your XML documents, the search engine may be fairly advanced and flexible.
Here is a link base example:
"Example: Annotating a Specification
Following is a non-normative set of declarations for an extended link that specializes in providing linkbase arcs:
Code:
<!ELEMENT basesloaded ((startrsrc|linkbase|load)*)>
<!ATTLIST basesloaded xlink:type (extended) #FIXED "extended">
<!ELEMENT startrsrc EMPTY>
<!ATTLIST startrsrc xlink:type (locator) #FIXED "locator" xlink:href CDATA #REQUIRED xlink:label NMTOKEN #IMPLIED>
<!ELEMENT linkbase EMPTY>
<!ATTLIST linkbase xlink:type (locator) #FIXED "locator" xlink:href CDATA #REQUIRED xlink:label NMTOKEN #IMPLIED>
<!ELEMENT load EMPTY>
<!ATTLIST load xlink:type (arc) #FIXED "arc" xlink:arcrole CDATA #FIXED "http://www.w3.org/1999/xlink/properties/linkbase" xlink:actuate (onLoad |onRequest |other |none) #IMPLIED xlink:from NMTOKEN #IMPLIED xlink:to NMTOKEN #IMPLIED>
Following is how an XML element using these declarations might look. This would indicate that when a specification document is loaded, a linkbase full of annotations to it should automatically be loaded as well, possibly necessitating re-rendering of the entire specification document to reveal any regions within it that serve as starting resources in the links found in the linkbase.
Code:
<basesloaded>
<startrsrc xlink:label="spec" xlink:href="spec.xml" />
<linkbase xlink:label="linkbase" xlink:href="linkbase.xml" />
<load xlink:from="spec" xlink:to="linkbase" actuate="onLoad" />
</basesloaded>
Following is how an XML element using these declarations might look if the linkbase loading were on request. This time, the starting resource consists of the words "Click here to reveal annotations." If the starting resource were the entire document as in the example above, a reasonable behavior for allowing a user to actuate traversal would be a confirmation dialog box".
Code:
<basesloaded>
<startrsrc xlink:label="spec" xlink:href="spec.xml#string-range(//*,'Click here to reveal annotations.')" />
<linkbase xlink:label="linkbase" xlink:href="linkbase.xml" />
<load xlink:from="spec" xlink:to="linkbase" actuate="onRequest" />
</basesloaded>
Source: XML Linking Language (XLink) Version 1.0
For those who only know HTML, there should be nothing revolutionary in the above markup. The difficult part is to learn the new markup languages and the important concept of XML namespaces that you use to bind different resources. Note that an Internationalized Resource Identifier (IRI) is a generalization of a URI, which is in turn a generalization of a URL.
So to sum up: to make a very efficient Web 2.0 XML powered site search engine with AJAX functionality (like Google suggest), the technology is there already. It is very important to think thoroughly when you structure your XML (CMS) site. Clever and smart use of tags, nodes, attributes, link bases, location sets etc. may make your site search engine stand out. If you take the time to write one based on the above ideas, please cite this source and give me an example for free. Google suggest (AJAX) functionality will be much appreciated. Preferably, use PHP parsers to make the code compact. Use streaming parsers like XMLReader to make it efficient.
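Here is a minimal sketch of what the streaming approach could look like with XMLReader. The function name findLiveHeadline and the sample document are made up; the element names headline and status are borrowed from the earlier SimpleXML code, and it assumes each file holds a single item.

```php
<?php
// Hypothetical helper: streams one XML file and returns its headline
// if the status is "live" and the headline contains $term, else null.
function findLiveHeadline(string $path, string $term): ?string
{
    $reader = new XMLReader();
    if (!$reader->open($path)) {
        return null;
    }
    $headline = null;
    $status = null;
    // read() advances node by node, so the whole file is never held in memory.
    while ($reader->read()) {
        if ($reader->nodeType !== XMLReader::ELEMENT) {
            continue;
        }
        if ($reader->name === 'headline') {
            $headline = $reader->readString();
        } elseif ($reader->name === 'status') {
            $status = $reader->readString();
        }
    }
    $reader->close();
    if ($status === 'live' && $headline !== null && stripos($headline, $term) !== false) {
        return $headline;
    }
    return null;
}

// Made-up sample document using the element names from the earlier code.
$path = tempnam(sys_get_temp_dir(), 'news');
file_put_contents($path, '<news id="1"><headline>Site search with XML</headline><status>live</status></news>');
var_dump(findLiveHeadline($path, 'search'));
unlink($path);
```

For small files SimpleXML is more compact to write; the streaming version should mainly pay off when individual documents get large.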
Last edited by kgun : 09-02-2007 at 11:43 AM.
|

09-21-2007, 10:36 AM
|
|
WebProWorld Member
|
|
Join Date: Sep 2006
Posts: 35
|
|
Re: free internal site search code ?
Hi All,
I wrote my own internal site search for the website I work on, derived from the project I worked on for my MSc. I used a ranking based on natural language and stem formation of words. (For example, if someone writes reporting, reported or report, these would all be converted to their stem form, which is 'report'.) I could then compare words easily once I had set a reference within a database table to each page on my site, including references to the key phrases on each page.
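To illustrate the stemming idea, here is a deliberately naive sketch. It is not the algorithm described in this post, whose source was not shown; it only strips a few common English suffixes, and the function name naiveStem is made up. A real engine would more likely use an established algorithm such as the Porter stemmer.

```php
<?php
// Deliberately naive suffix stripper, for illustration only.
function naiveStem(string $word): string
{
    $word = strtolower($word);
    foreach (array('ing', 'ed', 's') as $suffix) {
        $stemLength = strlen($word) - strlen($suffix);
        // Only strip when at least three letters would remain.
        if ($stemLength >= 3 && substr($word, -strlen($suffix)) === $suffix) {
            return substr($word, 0, $stemLength);
        }
    }
    return $word;
}

// All four words reduce to the same stem, so they can be compared directly.
foreach (array('reporting', 'reported', 'report', 'Reports') as $word) {
    echo $word, ' => ', naiveStem($word), "\n";
}
```

Both the indexed key phrases and the visitor's query would be run through the same function before comparing them.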
This method allows me to rank pages based on what and how I want them to rank and appear - and of course, because I built it, it's free - woo-hoo! The only bummer is I haven't written anything to automatically index new pages, so I have to update the external page search params via MSSQL. If anyone wants to know any more or access the source, drop me a line.
cheers
mamola
|

09-21-2007, 12:55 PM
|
 |
WebProWorld 1,000+ Club
|
|
Join Date: May 2005
Location: Norway
Posts: 4,039
|
|
Re: free internal site search code ?
I would like to see the code.
|

10-11-2007, 09:53 PM
|
 |
WebProWorld 1,000+ Club
|
|
Join Date: Aug 2003
Location: Edmonton, AB, Canada
Posts: 3,406
|
|
Re: free internal site search code ?
Quote:
Originally Posted by kgun
If there is something called a web 2.0 site search engine / function, that is especially well suited for XML and its family of technologies and XML parsers. [...]
|
This is incredible. I have to learn XLink and XForms, but I use AJAX and of course the basics of XSLT and schemas (and DTD).
I am getting a new website this weekend, with JSP. This is XML at its most long-developed API.
Dynamic search responses work very well using JavaScript and XML files.
__________________
What I am is what I am, are you what you are, or what.
Eddie Brickel
|

11-05-2007, 02:34 AM
|
|
WebProWorld New Member
|
|
Join Date: Nov 2007
Posts: 6
|
|
Re: free internal site search code ?
Thanks, guys.
I like this site because I get useful info here.
|

11-18-2007, 01:57 AM
|
 |
WebProWorld 1,000+ Club
|
|
| | |