1. Egothor | By: | | License: | | URL: | http://www.egothor.org/ | Description: | Egothor is an open source, high-performance, full-featured text search engine written entirely in Java. It is suitable for nearly any application that requires full-text search, especially cross-platform applications. It can be configured as a standalone engine, a metasearcher, or a peer-to-peer hub, and it can also be used as a library by an application that needs full-text search.
|
2. Apache Lucene | By: | | License: | Apache Software License | URL: | http://jakarta.apache.org/lucene/docs/index.html | Description: | Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is suitable for nearly any application that requires full-text search, especially cross-platform applications.
|
3. Oxyus | By: | | License: | Apache Software License | URL: | http://oxyus.sourceforge.net/ | Description: | Oxyus is an open source search engine written in 100% Java.
|
4. BDDBot | By: | | License: | GNU General Public License (GPL) | URL: | http://www.twmacinta.com/bddbot/ | Description: | What is a BDDBot, you ask? BDDBot is a web robot, search engine, and web server written entirely in Java(TM). It was written by Tim Macinta for his book (co-authored with Wes Sonnenreich), a Web Developer's Guide to Search Engines, from Wiley Publishing. It was written as an example for a chapter on how to write your own search engine, and as such it is very simplistic. While not as heavy-duty as other free search engines such as ht://Dig, the BDDBot offers the following advantages:
* Its simplicity makes it a good learning tool for how search engines work. The aforementioned book provides a good top-level overview of how it works so please go buy the book (insert goofy smiley face emoticon here).
* Its simplicity also makes it easily expandable. You can very easily expand it so that it can index document types besides HTML and plain text. You can also very easily expand it so that it can crawl using different protocols (e.g., gopher, wais) by using the standard Java method for adding protocols.
* It comes with its own built-in web server - we don't know of any other free search engine out there that does this. If you do, please let us know.
* It's completely free, à la the GNU General Public License. ht://Dig is the only other free search engine we know of that's under the GPL.
* It's written in Java, which provides several advantages in and of itself. Because it's written in Java:
o The BDDBot can run on any machine that has a stable Java Virtual Machine (at least as long as Microsoft continues to fail at making Java a Windows-specific language).
o It is written in an easy-to-understand and powerful language.
o It is object oriented for even greater extensibility.
o It's very small - just over 100K including source code, binaries, and configuration files at last count.
* Its indexes are very small. They are on the order of 10% of the size of the text on your site even though they index every single alphanumeric word.
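The idea of indexing every alphanumeric word can be illustrated with a minimal inverted index. The sketch below is not BDDBot's code: the class `TinyInvertedIndex` and its methods are hypothetical names chosen for illustration, and it assumes simple ASCII-only tokenization.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical illustration of a word index; not taken from BDDBot. */
final class TinyInvertedIndex {
    // Maps each lowercased word to the sorted set of documents containing it.
    private final Map<String, Set<String>> postings = new TreeMap<>();

    /** Splits the text on anything that is not an ASCII letter or digit and records each word. */
    void add(String docName, String text) {
        for (String word : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!word.isEmpty()) {
                postings.computeIfAbsent(word, k -> new TreeSet<>()).add(docName);
            }
        }
    }

    /** Returns the names of the documents containing the word, in sorted order. */
    List<String> search(String word) {
        Set<String> docs = postings.get(word.toLowerCase(Locale.ROOT));
        if (docs == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(docs);
    }
}
```

Because each word is stored once no matter how often it occurs, an index of this kind can be much smaller than the text it covers, although the exact ratio depends on the content.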
|
5. Zilverline | By: | | License: | | URL: | http://www.zilverline.org/ | Description: | Zilverline is what you could call a 'Reverse Search Engine': Zilverline is a search engine that offers web access to your personal or intranet content.
Zilverline is a 'Lucene Desktop' comparable to Google Desktop, but based on Lucene.
Zilverline supports collections: a collection is a set of files and directories under one directory. Zilverline extracts content from PDF, Word, Excel, PowerPoint, RTF, txt, java, and CHM files, as well as from zip, rar, and many other archives. A collection can be indexed and searched. The results of a search can be retrieved from local disk or from the intranet. Files inside zip, rar, chm, and other archives are extracted during indexing and can be preserved for searches. Otherwise they are extracted 'on-the-fly'.
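Extracting archive members so that their text can be indexed can be sketched with the JDK's own zip classes. This is not Zilverline's implementation: the class `ArchiveTextExtractor` and its methods are hypothetical, the sketch handles zip archives only (rar and CHM would need third-party libraries), and it assumes every entry is UTF-8 text.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical illustration of archive extraction; not taken from Zilverline. */
final class ArchiveTextExtractor {

    /** Reads every non-directory entry of a zip stream as UTF-8 text, keyed by entry name. */
    static Map<String, String> extractText(InputStream zipData) throws IOException {
        Map<String, String> contents = new LinkedHashMap<>();
        byte[] buffer = new byte[4096];
        try (ZipInputStream zip = new ZipInputStream(zipData)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int n;
                // read() returns -1 at the end of the current entry, not of the whole archive.
                while ((n = zip.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                contents.put(entry.getName(),
                        new String(out.toByteArray(), StandardCharsets.UTF_8));
            }
        }
        return contents;
    }

    /** Builds a one-entry zip in memory so the example needs no files on disk. */
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("notes/readme.txt"));
            zip.write("hello lucene".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }
}
```

Calling `ArchiveTextExtractor.extractText(new ByteArrayInputStream(ArchiveTextExtractor.sampleZip()))` returns a map whose single key is the entry name; the extracted text could then be handed to whatever indexer is in use.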
|
6. YaCy | By: | | License: | GNU General Public License (GPL) | URL: | http://www.yacy.net/yacy | Description: | The YaCy project is a new approach to building a P2P-based Web indexing network.
* Anonymous, independent, uncensored web search
* No central server, no storage of user behaviour
* You can crawl the web and feed pages that you selected to the global index
* Run your peer to support other YaCy crawlers; they support your crawler in return
* Host information on your peer using the built-in HTTP server, file-sharing zone and wiki
* Easy installation! No additional database required!
* GPL'ed, freeware
|
7. Compass | By: | | License: | GNU Library or Lesser General Public License (LGPL) | URL: | http://compass.sourceforge.net/ | Description: | Compass is a first-class open source Java Search Engine Framework that brings the power of Search Engine semantics to your application stack declaratively. Built on top of the amazing Lucene Search Engine, Compass integrates seamlessly with popular development frameworks like Hibernate and Spring. It provides search capability for your application data model and synchronizes changes with the datasource. With Compass: write less code, find data quicker.
As of version 0.8, Compass also provides a Lucene Jdbc Directory implementation, which allows a Lucene index to be stored within a database, both for pure Lucene applications and for Compass-enabled applications. Note that when using Compass, using a database as the index storage requires only updating configuration settings.
|
8. LIUS | By: | | License: | GNU General Public License (GPL) | URL: | http://www.bibl.ulaval.ca/lius/ | Description: | LIUS is an indexing framework developed in Java and based on the Jakarta Lucene project. LIUS adds to Lucene indexing support for several document types, such as: MS Word, MS Excel, MS PowerPoint, RTF, PDF, XML, HTML, TXT, the OpenOffice suite, and JavaBeans.
Indexing JavaBeans can be very useful when you want to index databases, and more specifically when the persistence layer (the connection to the database) is programmed using an ORM (Object Relational Mapping) tool such as Hibernate, JDO, TopLink, or Torque.
|