There is a plethora of search options available for your web site.
- Use a search service to spider and index your site. You place the search box
on your web pages; the service runs the search engine and returns results which
point back to your site.
The actual search index and spidering of the web site are handled by the
search service. Google and others can provide this service.
It is the equivalent of a Google search with:
search-word site:your-domain.com
The HTML form which calls the Google search engine can also embed the
domain to achieve the same effect. Google and other search firms
provide both free and for-fee services.
- Index your site and provide your own search capabilities.
Commercial and open source solutions exist. This can be a CGI program
which performs a grep-style search on the site contents each time it is called, or it can
use a previously generated index of the web site contents for faster results.
- A separate search "appliance" or search server can spider your web site
and provide the search facility for your site.
This works best for sites with multiple web servers or for intranets with
multiple file and web servers.
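The difference between the grep-style search and the previously generated index mentioned in the second option can be illustrated with a minimal Python sketch. This is not how any particular product listed on this page works. The `PAGES` dictionary, the file names in it and all function names are hypothetical and exist only for illustration; a real indexer would read the files from the web server's document root.

```python
import re
from collections import defaultdict

# Hypothetical in-memory "site": path -> page text (an assumption for this
# sketch; a real indexer would read files from the document root).
PAGES = {
    "/index.html": "Welcome to the Linux tutorials site",
    "/network.html": "Linux networking tutorial: routing and gateways",
    "/security.html": "Internet security tools for Linux servers",
}

def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def grep_search(pages, word):
    """Grep-style approach: scan every page again on each query."""
    word = word.lower()
    return sorted(path for path, text in pages.items() if word in tokenize(text))

def build_index(pages):
    """Build an inverted index once: word -> set of paths containing it."""
    index = defaultdict(set)
    for path, text in pages.items():
        for word in tokenize(text):
            index[word].add(path)
    return index

def index_search(index, query):
    """Return the paths that contain every word of the query (implicit AND)."""
    words = tokenize(query)
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)

if __name__ == "__main__":
    index = build_index(PAGES)
    print(grep_search(PAGES, "linux"))
    print(index_search(index, "linux security"))
```

The grep-style function does work proportional to the size of the whole site on every query, while the indexed version pays that cost once in `build_index` and then answers each query with a few set lookups. That is the reason a pre-built index gives faster results on larger sites.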
| Commercial Search Services: |
-
| Commercial Search Engine Software Vendors: |
-
| Vendor | Product |
| Focuseek | Searchbox2: Indexes HTML, PDF, MS Word, RTF and plain text documents |
| Folio | Folio Site Director |
| Google | Google enterprise solutions (based on Stanford research) |
| SLI Systems | Learning search |
| Lycos | Inmagic |
| Maxum Development Corp. | Phantom |
| Netscape | Compass Server |
| Quadralay Corp | Web Works Search |
| HotBot | www.hotbot.com |
| Opentext | Livelink |
| Verity | Ultraseek overview |
Multi-Media
-
| List of Open Source options: |
-
| Search Engine | Description |
| perl_site_search | Simplest search to implement. |
| SWISH | Version 1.1: Use on a small number of local pages only. |
| SWISH++ | The fastest SWISH. Written in C++. |
| Lucene | From the Apache group. Written in Java; web front ends can run on Tomcat. |
| WebGlimpse/Glimpse | Original U of Arizona and commercial versions. Written in Perl and C. Indexes HTML, PDF, Word and other formats. |
| freeWais | Can perform "And", "Or" and "Not" type searches. |
| freeWais-sf | One of the first available content indexing/search engines. The SF stands for "Structured Fields". These fields are used for information types such as author, title and date. Can perform "And", "Or" and "Not" type searches. |
| DataParkSearch | Indexes HTML, plain text, MP3 audio and GIF images. Supports synonyms and fuzzy search. Multi-character support. Index and CGI. GPL. |
| Spider/Robot Index Engine | |
| ht://Dig | Searches and indexes a single site resident on the server, or spiders remote WWW servers. Supports robots.txt exclusions. HTML and plain text documents. GPL. (San Diego State U.) |
| Harvest (Robot Indexer) | Supports HTML as well as TeX, DVI, PS, full text, mail, man pages, news, troff, WordPerfect, RTF, Microsoft Word/Excel, SGML, C sources and PDF (using Xpdf). Modular. Written in Perl. |
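The robots.txt exclusions mentioned for the spider/robot engines above can be illustrated with a short Python sketch that uses the standard library `urllib.robotparser` module. It is not the code of any engine in the table. The robots.txt content, the user agent name `example-spider` and the candidate URLs are hypothetical values chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for this sketch). A real
# spider would download it from http://your-domain.com/robots.txt first.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
"""

def make_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser without network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

def allowed_urls(parser, user_agent, urls):
    """Keep only the URLs this user agent is permitted to fetch."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]

if __name__ == "__main__":
    parser = make_parser(ROBOTS_TXT)
    candidates = [
        "http://your-domain.com/index.html",
        "http://your-domain.com/cgi-bin/search.cgi",
        "http://your-domain.com/private/notes.html",
    ]
    # Only the URLs outside the disallowed directories are printed.
    for url in allowed_urls(parser, "example-spider", candidates):
        print(url)
```

A spider that filters its crawl queue this way never indexes the excluded directories, so pages under them do not appear in the search results.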
Online reviews.
Comprehensive list of search sites. See:
| Online Reviews of Search Engines: |
- "Web Publishing Unleashed, HTML, JAVA, CGI, VRML, SGML"
ISBN #1-57521-051-7, SAMS
This book dedicates an entire chapter to WAIS and search engines.
|