Java Doc for Indexer.java in Search-Engine » BDDBot » bdd » search » spider

java.lang.Object

java.lang.Thread

bdd.search.spider.Indexer

Indexer
public class Indexer extends Thread
	Written by Tim Macinta, 1997. Distributed under the GNU Public License (a copy of which is enclosed with the source). The Indexer is a thread that indexes URLs that have been cached using the URLStatus class. Use the queueURL() method to add cached URLs to the Indexer's queue. Once the start() method is called, the Indexer begins processing the URLs in its queue. More URLs can be added after calling start(); in fact, this may be the best way to use the Indexer. Calling the stopWhenDone() method causes the Indexer thread to stop as soon as its queue empties.
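
The queue-then-stop lifecycle described above can be sketched roughly as follows. This is not the BDDBot source: QueueWorkerSketch, queueItem(), the String queue, and the processed list are hypothetical stand-ins for Indexer, queueURL(), URLStatus, and the real indexing work. Only the roles of q, q_mutex, and exit_when_done are taken from the field summary, and how the real class uses them is an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for Indexer: a thread that drains a queue and
// stops once the queue is empty and stopWhenDone(true) has been called.
final class QueueWorkerSketch extends Thread {
    private final Queue<String> q = new ArrayDeque<String>(); // role of "q"
    private final Object qMutex = new Object();               // role of "q_mutex"
    private boolean exitWhenDone = false;                     // guarded by qMutex
    final List<String> processed =
            Collections.synchronizedList(new ArrayList<String>());

    // Analogue of queueURL(): may be called before or after start().
    void queueItem(String item) {
        synchronized (qMutex) {
            q.add(item);
            qMutex.notifyAll();
        }
    }

    // Analogue of stopWhenDone().
    void stopWhenDone(boolean exit) {
        synchronized (qMutex) {
            exitWhenDone = exit;
            qMutex.notifyAll();
        }
    }

    @Override
    public void run() {
        while (true) {
            String item;
            synchronized (qMutex) {
                // Sleep until there is work or a stop has been requested.
                while (q.isEmpty() && !exitWhenDone) {
                    try {
                        qMutex.wait();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
                if (q.isEmpty()) {
                    return; // queue drained and stop was requested
                }
                item = q.remove();
            }
            processed.add(item); // placeholder for the real indexing work
        }
    }

    // Usage: queue one item, start, queue another, then ask for a clean stop.
    static List<String> demo() {
        QueueWorkerSketch w = new QueueWorkerSketch();
        w.queueItem("http://example.com/a");
        w.start();
        w.queueItem("http://example.com/b");
        w.stopWhenDone(true);
        try {
            w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<String>(w.processed);
    }
}
```

In this sketch, items queued after start() are still processed because the stop request only takes effect once the queue is empty, which mirrors the usage the class description recommends.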

Field Summary
final static String	TMP_NAME
final static String	TMP_NAME_2
Crawler	crawler
boolean	exit_when_done
EnginePrefs	prefs
FIFOQueue	q
Object	q_mutex
boolean	running
long	total_bytes
File	working_dir

Constructor Summary
public	Indexer(File working_dir, Crawler crawler, EnginePrefs prefs) "working_dir" should be a directory that only this Indexer and a given Crawler will be accessing.

Method Summary
void	addNewURLs(LinkExtractor urls) Adds new URLs to the crawler's queue.
void	cleanUp() Removes all the ".db" and ".tmp" files in the directory "working_dir".
void	merge(File file1, File file2, File target) Takes two search databases, "file1" and "file2", and merges their contents with the results being placed in "target".
void	mergeDatabases(File temporary) Repeatedly attempts to merge "temporary" with other temporary databases which have been merged the same number of times.
void	pipe(InputStream in, OutputStream out) Pipes "in" to "out" until "in" is exhausted, then closes "in".
public void	queueURL(URLStatus url) Use this method to add a cached URL to the Indexer.
void	replaceMainIndex() Completes the merging of all temporary databases and replaces the main database with the final product.
public void	run() This is where the actual indexing is done.
public void	start() Starts the Indexer.
public void	stopWhenDone(boolean exit_when_done) Causes this Indexer to stop whenever it finishes indexing the URLs in its queue.

Field Detail

TMP_NAME
final static String TMP_NAME

TMP_NAME_2
final static String TMP_NAME_2

crawler
Crawler crawler

exit_when_done
boolean exit_when_done

prefs
EnginePrefs prefs

q
FIFOQueue q

q_mutex
Object q_mutex

running
boolean running

total_bytes
long total_bytes

working_dir
File working_dir

Constructor Detail

Indexer
public Indexer(File working_dir, Crawler crawler, EnginePrefs prefs)
	"working_dir" should be a directory that only this Indexer and a given Crawler will be accessing. This means that if several Indexers are running simultaneously, each should be given a different "working_dir" directory. Also, no other threads should write to this directory (except for the selected Crawler).

Method Detail

addNewURLs
void addNewURLs(LinkExtractor urls)
	Adds new URLs to the crawler's queue.

cleanUp
void cleanUp()
	Removes all the ".db" and ".tmp" files in the directory "working_dir".
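
A minimal sketch of this kind of clean-up, assuming the files are matched purely by name suffix (the documentation does not say how the real method selects them). CleanUpSketch and its demo() helper are hypothetical names.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.util.Arrays;

final class CleanUpSketch {
    // Deletes every regular file in workingDir whose name ends in ".db"
    // or ".tmp" and returns how many files were deleted.
    // Assumption: suffix matching only; subdirectories are left alone.
    static int cleanUp(File workingDir) {
        File[] files = workingDir.listFiles();
        if (files == null) {
            return 0; // not a directory, or it could not be read
        }
        int deleted = 0;
        for (File f : files) {
            String name = f.getName();
            boolean matches = name.endsWith(".db") || name.endsWith(".tmp");
            if (f.isFile() && matches && f.delete()) {
                deleted++;
            }
        }
        return deleted;
    }

    // Usage: fill a temporary directory, clean it, and list what is left.
    static String[] demo() {
        try {
            File dir = Files.createTempDirectory("indexer-sketch").toFile();
            for (String name : new String[] {"0.db", "scratch.tmp", "keep.txt"}) {
                new File(dir, name).createNewFile();
            }
            cleanUp(dir);
            String[] left = dir.list();
            Arrays.sort(left);
            return left;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```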

merge
void merge(File file1, File file2, File target) throws IOException
	Takes two search databases, "file1" and "file2", and merges their contents, placing the result in "target". "file2" must exist, but "file1" need not. If "file1" does not exist, "file2" is copied to "target".
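
The contract above (copy "file2" when "file1" is missing, otherwise merge) can be illustrated with the sketch below. The on-disk format of a BDDBot search database is not described here, so this example assumes a hypothetical format of one sorted entry per line and merges the two files like sorted lists, dropping duplicates. MergeSketch and demo() are illustrative names only.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.List;

final class MergeSketch {
    // Assumed format: UTF-8 text, one entry per line, sorted ascending.
    static void merge(File file1, File file2, File target) throws IOException {
        if (!file1.exists()) {
            // "file1" need not exist: fall back to a plain copy of "file2".
            Files.copy(file2.toPath(), target.toPath(),
                    StandardCopyOption.REPLACE_EXISTING);
            return;
        }
        try (BufferedReader a = Files.newBufferedReader(file1.toPath(), StandardCharsets.UTF_8);
             BufferedReader b = Files.newBufferedReader(file2.toPath(), StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(target.toPath(), StandardCharsets.UTF_8)) {
            String x = a.readLine();
            String y = b.readLine();
            while (x != null || y != null) {
                // c < 0: take x; c > 0: take y; c == 0: same entry in both.
                int c = (x == null) ? 1 : (y == null) ? -1 : x.compareTo(y);
                out.write(c <= 0 ? x : y);
                out.newLine();
                if (c <= 0) x = a.readLine();
                if (c >= 0) y = b.readLine();
            }
        }
    }

    // Usage: merge with or without an existing "file1" and return the result.
    static List<String> demo(boolean file1Exists) {
        try {
            Path dir = Files.createTempDirectory("merge-sketch");
            File file1 = dir.resolve("file1.db").toFile();
            File file2 = dir.resolve("file2.db").toFile();
            File target = dir.resolve("target.db").toFile();
            if (file1Exists) {
                Files.write(file1.toPath(), Arrays.asList("banana", "cherry"),
                        StandardCharsets.UTF_8);
            }
            Files.write(file2.toPath(), Arrays.asList("apple", "cherry"),
                    StandardCharsets.UTF_8);
            merge(file1, file2, target);
            return Files.readAllLines(target.toPath(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```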

mergeDatabases
void mergeDatabases(File temporary) throws IOException
	Repeatedly attempts to merge "temporary" with other temporary databases that have been merged the same number of times. In other words, this method first tries to merge "temporary" with any database that hasn't been merged yet. If that succeeds, the resulting database is then merged with any database that has been merged once. If that succeeds, the result is merged with any database that has been merged twice, and so on. Databases are named after the number of times they have been merged. For example, a file called "6.db" has been merged six times, while a file called "9.db" has been merged nine times. The "temporary" file is assumed not to have been merged at all.
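
One reading of this cascade is that it behaves like incrementing a binary counter: a database carries upward through the levels until it reaches a free slot. The sketch below follows that interpretation and may differ from the real implementation. To stay self-contained it models the working directory as an in-memory map from file name to contents, and "merging" is plain list concatenation. CascadeSketch, the map, and demo() are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CascadeSketch {
    // dir maps a name such as "2.db" to that database's entries.
    // Assumption: an unmerged database that finds no partner is stored as "0.db".
    static void mergeDatabases(Map<String, List<String>> dir, List<String> temporary) {
        List<String> current = temporary;
        int level = 0; // number of merges "current" has been through
        while (dir.containsKey(level + ".db")) {
            // A database at the same level exists: merge with it and move up.
            List<String> merged = new ArrayList<String>(dir.remove(level + ".db"));
            merged.addAll(current); // placeholder for a real database merge
            current = merged;
            level++;
        }
        dir.put(level + ".db", current);
    }

    // Usage: feed n single-entry databases through the cascade and
    // return the names of the databases that remain, in sorted order.
    static List<String> demo(int n) {
        Map<String, List<String>> dir = new TreeMap<String, List<String>>();
        for (int i = 0; i < n; i++) {
            List<String> temporary = new ArrayList<String>();
            temporary.add("doc" + i);
            mergeDatabases(dir, temporary);
        }
        return new ArrayList<String>(dir.keySet());
    }
}
```

Under this reading, three unmerged databases leave "0.db" and "1.db" behind, and a fourth collapses everything into "2.db", so each entry is rewritten only a logarithmic number of times.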

pipe
void pipe(InputStream in, OutputStream out) throws IOException
	Pipes "in" to "out" until "in" is exhausted, then closes "in".
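
A stream copy of this shape is commonly written as a buffered read/write loop. The sketch below is one such version, not the BDDBot code: the 8192-byte buffer size, the flush, and the choice to close "in" even when copying fails are assumptions. "out" is left open, as the description only mentions closing "in".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class PipeSketch {
    // Copies everything from in to out, then closes in (but not out).
    static void pipe(InputStream in, OutputStream out) throws IOException {
        try {
            byte[] buf = new byte[8192]; // buffer size is an arbitrary choice
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            out.flush();
        } finally {
            in.close();
        }
    }

    // Usage: pipe a string through and report whether "in" was closed.
    static String demo(String text) {
        final boolean[] closed = {false};
        InputStream in = new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)) {
            @Override
            public void close() throws IOException {
                closed[0] = true;
                super.close();
            }
        };
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            pipe(in, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String copied = new String(out.toByteArray(), StandardCharsets.UTF_8);
        return copied + (closed[0] ? " [in closed]" : " [in open]");
    }
}
```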

queueURL
public void queueURL(URLStatus url)
	Use this method to add a cached URL to the Indexer.

replaceMainIndex
void replaceMainIndex() throws IOException
	Completes the merging of all temporary databases and replaces the main database with the final product.

run
public void run()
	This is where the actual indexing is done.

start
public void start()
	Starts the Indexer.

stopWhenDone
public void stopWhenDone(boolean exit_when_done)
	Causes this Indexer to stop whenever it finishes indexing the URLs in its queue.

Fields inherited from java.lang.Thread

final public static int MAX_PRIORITY
final public static int MIN_PRIORITY
final public static int NORM_PRIORITY

Methods inherited from java.lang.Thread

public static int activeCount()
final public void checkAccess()
native public int countStackFrames()
native public static Thread currentThread()
public void destroy()
public static void dumpStack()
public static int enumerate(Thread[] tarray)
public static Map<Thread, StackTraceElement[]> getAllStackTraces()
public ClassLoader getContextClassLoader()
public static UncaughtExceptionHandler getDefaultUncaughtExceptionHandler()
public long getId()
final public String getName()
final public int getPriority()
public StackTraceElement[] getStackTrace()
public State getState()
final public ThreadGroup getThreadGroup()
public UncaughtExceptionHandler getUncaughtExceptionHandler()
native public static boolean holdsLock(Object obj)
public void interrupt()
public static boolean interrupted()
final native public boolean isAlive()
final public boolean isDaemon()
public boolean isInterrupted()
final public synchronized void join(long millis) throws InterruptedException
final public synchronized void join(long millis, int nanos) throws InterruptedException
final public void join() throws InterruptedException
final public void resume()
public void run()
public void setContextClassLoader(ClassLoader cl)
final public void setDaemon(boolean on)
public static void setDefaultUncaughtExceptionHandler(UncaughtExceptionHandler eh)
final public void setName(String name)
final public void setPriority(int newPriority)
public void setUncaughtExceptionHandler(UncaughtExceptionHandler eh)
native public static void sleep(long millis) throws InterruptedException
public static void sleep(long millis, int nanos) throws InterruptedException
public synchronized void start()
final public void stop()
final public synchronized void stop(Throwable obj)
final public void suspend()
public String toString()
native public static void yield()

Methods inherited from java.lang.Object

native protected Object clone() throws CloneNotSupportedException
public boolean equals(Object obj)
protected void finalize() throws Throwable
final native public Class getClass()
native public int hashCode()
final native public void notify()
final native public void notifyAll()
public String toString()
final native public void wait(long timeout) throws InterruptedException
final public void wait(long timeout, int nanos) throws InterruptedException
final public void wait() throws InterruptedException
