Java Doc for URLStatus.java in Search-Engine » BDDBot » bdd » search » spider

java.lang.Object

bdd.search.spider.URLStatus

URLStatus
public class URLStatus
	Written by Tim Macinta, 1997. Distributed under the GNU Public License (a copy of which is enclosed with the source). This class holds information about the content at a particular URL. It can also be used to fetch and parse a URL.

Field Summary
final static int	DUPLICATE
final static int	IO_ERROR
final static int	LOADED
final static int	MISC_ERROR
final static int	MISSING
final static int	MOVED
final static int	NOT_LOADED
final static int	TIMED_OUT
final static int	UNSUPPORTED_MIMETYPE
URL	actual_url
String	email_address
EnginePrefs	eng_prefs
URL	given_url
String	mime_type
int	status
File	temp_file
String	user_agent

Constructor Summary
public	URLStatus(URL url, File temp_file, EnginePrefs eng_prefs) "url" is the location of the information and "temp_file" is the temporary file that can be used to store the contents of this url.

Method Summary
public void	dumpToDatabase(DataOutputStream out) Creates a database containing just this URL.
void	dumpWords(DataOutputStream out) Dumps the words contained in this URL in database format to "out".
public void	finalize() Gets rid of the temporary file.
public File	getCacheFile() Returns the file that is used to cache the contents of this URL.
public long	getContentLength() Returns the length of the content, or 0 if it's unknown.
public LinkExtractor	getLinkExtractor() Returns a LinkExtractor that can handle this URL's mime type. To add support for new mime types add a LinkExtractor that handles those mime types here and add appropriate WordExtractors to the getWordExtractor() method.
public WordExtractor	getWordExtractor() Returns a WordExtractor that can handle this URL's mime type. To add support for new mime types add a WordExtractor that handles those mime types here and add appropriate LinkExtractors to the getLinkExtractor() method.
public boolean	loaded() Returns true if and only if this URL was loaded without an error.
public boolean	mimeTypeUnderstood(String mime_type) Returns true if and only if this mime type can be processed.
public boolean	moved() Returns true if and only if this URL causes a redirection.
void	pipe(InputStream in, OutputStream out) Pipes "in" to "out" until "in" is exhausted then closes "in".
public void	readContent() Downloads the content of the given URL and stores it in a temporary cache file.
void	readGeneric() This method provides a fallback to the default Java implementation for protocols which have not been re-implemented.
void	readHTTP() Downloads a file using the HTTP protocol.
String	readLine(PushbackInputStream in) A replacement for readLine() in java.io.DataInputStream, which does not return the line-ending characters as it should.

Field Detail

DUPLICATE
final static int DUPLICATE

IO_ERROR
final static int IO_ERROR

LOADED
final static int LOADED

MISC_ERROR
final static int MISC_ERROR

MISSING
final static int MISSING

MOVED
final static int MOVED

NOT_LOADED
final static int NOT_LOADED

TIMED_OUT
final static int TIMED_OUT

UNSUPPORTED_MIMETYPE
final static int UNSUPPORTED_MIMETYPE

actual_url
URL actual_url

email_address
String email_address

eng_prefs
EnginePrefs eng_prefs

given_url
URL given_url

mime_type
String mime_type

status
int status

temp_file
File temp_file

user_agent
String user_agent

Constructor Detail

URLStatus
public URLStatus(URL url, File temp_file, EnginePrefs eng_prefs)
	"url" is the location of the information and "temp_file" is the temporary file that can be used to store the contents of this url.

Method Detail

dumpToDatabase
public void dumpToDatabase(DataOutputStream out) throws IOException
	Creates a database containing just this URL.

dumpWords
void dumpWords(DataOutputStream out) throws IOException
	Dumps the words contained in this URL in database format to "out".

finalize
public void finalize() throws Throwable
	Gets rid of the temporary file.

getCacheFile
public File getCacheFile()
	Returns the file that is used to cache the contents of this URL.

getContentLength
public long getContentLength()
	Returns the length of the content, or 0 if it's unknown.

getLinkExtractor
public LinkExtractor getLinkExtractor() throws IOException
	Returns a LinkExtractor that can handle this URL's mime type. To add support for new mime types add a LinkExtractor that handles those mime types here and add appropriate WordExtractors to the getWordExtractor() method. Also, add the mime type to the list in the mimeTypeUnderstood() method.

getWordExtractor
public WordExtractor getWordExtractor() throws IOException
	Returns a WordExtractor that can handle this URL's mime type. To add support for new mime types add a WordExtractor that handles those mime types here and add appropriate LinkExtractors to the getLinkExtractor() method. Also, add the mime type to the list in the mimeTypeUnderstood() method.

loaded
public boolean loaded()
	Returns true if and only if this URL was loaded without an error.

mimeTypeUnderstood
public boolean mimeTypeUnderstood(String mime_type)
	Returns true if and only if this mime type can be processed.
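
The descriptions of getLinkExtractor(), getWordExtractor() and mimeTypeUnderstood() say that supporting a new mime type means touching all three methods. A minimal sketch of that three-place dispatch is shown below. It is not the BDDBot source: the class MimeDispatchSketch, the stub extractor types, the normalize() helper and the two mime types ("text/html", "text/plain") are all assumptions made for illustration.

```java
import java.io.IOException;
import java.util.Locale;

// Hypothetical stand-ins: the real BDDBot LinkExtractor and WordExtractor
// classes are not shown in this document, so empty stub types are used here.
final class MimeDispatchSketch {
    interface LinkExtractorStub { }
    interface WordExtractorStub { }
    static final class HtmlLinks implements LinkExtractorStub { }
    static final class NoLinks implements LinkExtractorStub { }
    static final class HtmlWords implements WordExtractorStub { }
    static final class PlainWords implements WordExtractorStub { }

    // Drops parameters such as "; charset=UTF-8" and lower-cases the type.
    static String normalize(String mimeType) {
        if (mimeType == null) {
            return "";
        }
        int semi = mimeType.indexOf(';');
        String base = semi >= 0 ? mimeType.substring(0, semi) : mimeType;
        return base.trim().toLowerCase(Locale.ROOT);
    }

    // Place 1 of 3: the list of mime types that can be processed (assumed values).
    static boolean mimeTypeUnderstood(String mimeType) {
        switch (normalize(mimeType)) {
            case "text/html":
            case "text/plain":
                return true;
            default:
                return false;
        }
    }

    // Place 2 of 3: choose a link extractor for the mime type.
    static LinkExtractorStub getLinkExtractor(String mimeType) throws IOException {
        switch (normalize(mimeType)) {
            case "text/html":
                return new HtmlLinks();
            case "text/plain":
                return new NoLinks(); // plain text carries no markup links
            default:
                throw new IOException("Unsupported mime type: " + mimeType);
        }
    }

    // Place 3 of 3: choose a word extractor for the mime type.
    static WordExtractorStub getWordExtractor(String mimeType) throws IOException {
        switch (normalize(mimeType)) {
            case "text/html":
                return new HtmlWords();
            case "text/plain":
                return new PlainWords();
            default:
                throw new IOException("Unsupported mime type: " + mimeType);
        }
    }
}
```

In this sketch, adding a type means adding one case to each of the three switch statements, which mirrors the instructions in the method descriptions above.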

moved
public boolean moved()
	Returns true if and only if this URL causes a redirection.

pipe
void pipe(InputStream in, OutputStream out) throws IOException
	Pipes "in" to "out" until "in" is exhausted then closes "in".
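
The contract described for pipe() (copy until the input is exhausted, then close only the input) can be sketched as follows. This is an illustrative stand-in, not the BDDBot implementation: the class name PipeSketch and the 4096-byte buffer size are assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in class; not the BDDBot source.
final class PipeSketch {

    // Copies every byte from "in" to "out", then closes "in" (but not "out").
    static void pipe(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096]; // buffer size is an arbitrary choice
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            out.flush();
        } finally {
            // Closing in a finally block also closes "in" if the copy fails,
            // which goes slightly beyond the documented behaviour.
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        pipe(new ByteArrayInputStream("hello".getBytes(StandardCharsets.US_ASCII)), sink);
        System.out.println(new String(sink.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```

Leaving "out" open lets the caller keep writing to it, for example when several inputs are piped into the same cache file.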

readContent
public void readContent()
	Downloads the content of the given URL and stores it in a temporary cache file.

readGeneric
void readGeneric() throws IOException
	This method provides a fallback to the default Java implementation for protocols which have not been re-implemented.

readHTTP
void readHTTP() throws IOException
	Downloads a file using the HTTP protocol. It was necessary to write this method from scratch rather than use Java's default method because:
	There is no means of specifying the user agent with the default method.
	A bug in the Java 1.0 implementation makes it incompatible with HTTP version 1.1.
	Redirects are followed automatically (at least in Java 1.0), with no way to determine whether a redirect has occurred.
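
The reasons listed above suggest what a hand-written HTTP fetch needs: an explicit User-Agent header and access to the raw status line so that redirects can be detected rather than silently followed. The sketch below illustrates those pieces under stated assumptions. It is not the BDDBot readHTTP() code: the class HttpRequestSketch, its method names, the choice of HTTP/1.0 and the set of status codes treated as redirects are all illustrative.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in class; not the BDDBot source.
final class HttpRequestSketch {

    // Builds a minimal HTTP/1.0 GET request with an explicit User-Agent header.
    static String buildRequest(URL url, String userAgent) {
        String path = url.getFile().isEmpty() ? "/" : url.getFile();
        String host = url.getPort() == -1
                ? url.getHost()
                : url.getHost() + ":" + url.getPort();
        return "GET " + path + " HTTP/1.0\r\n"
                + "Host: " + host + "\r\n"
                + "User-Agent: " + userAgent + "\r\n"
                + "\r\n";
    }

    // Extracts the status code from a line such as "HTTP/1.1 301 Moved Permanently".
    static int parseStatusCode(String statusLine) {
        String[] parts = statusLine.trim().split("\\s+");
        if (parts.length < 2 || !parts[0].startsWith("HTTP/")) {
            throw new IllegalArgumentException("Not an HTTP status line: " + statusLine);
        }
        return Integer.parseInt(parts[1]);
    }

    // Reports whether the status code asks the client to fetch another location.
    static boolean isRedirect(int statusCode) {
        return statusCode == 301 || statusCode == 302 || statusCode == 303
                || statusCode == 307 || statusCode == 308;
    }

    // Sends the request over a plain socket and returns only the status line,
    // leaving the decision about redirects to the caller.
    static String fetchStatusLine(URL url, String userAgent) throws IOException {
        int port = url.getPort() == -1 ? 80 : url.getPort();
        try (Socket socket = new Socket(url.getHost(), port)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(url, userAgent).getBytes(StandardCharsets.US_ASCII));
            out.flush();
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.ISO_8859_1));
            return reader.readLine();
        }
    }
}
```

A caller could pass the result of fetchStatusLine() to parseStatusCode() and isRedirect() to decide whether to record the URL as moved instead of following it.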

readLine
String readLine(PushbackInputStream in) throws IOException
	A replacement for readLine() in java.io.DataInputStream, which does not return the line-ending characters as it should.
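
The description suggests that this replacement keeps the line-ending characters in the returned string. Assuming that reading is correct, a line reader of this kind might look like the sketch below. The class name ReadLineSketch, the treatment of bytes as ISO-8859-1 characters and the null return at end of stream are assumptions, not details taken from the BDDBot source.

```java
import java.io.IOException;
import java.io.PushbackInputStream;

// Hypothetical stand-in class; not the BDDBot source.
final class ReadLineSketch {

    // Reads one line, including its terminator ("\n", "\r\n" or a lone "\r").
    // Returns null when the stream is already exhausted.
    static String readLine(PushbackInputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            line.append((char) c); // each byte is treated as one ISO-8859-1 character
            if (c == '\n') {
                break;
            }
            if (c == '\r') {
                // Look one byte ahead to tell "\r\n" apart from a lone "\r".
                int next = in.read();
                if (next == '\n') {
                    line.append('\n');
                } else if (next != -1) {
                    in.unread(next); // not part of this line, so push it back
                }
                break;
            }
        }
        return line.length() == 0 ? null : line.toString();
    }
}
```

The one-byte lookahead is the reason a PushbackInputStream is a natural parameter type here: its default pushback buffer holds exactly the single byte this sketch may need to return to the stream.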

Methods inherited from java.lang.Object

native protected Object clone() throws CloneNotSupportedException
public boolean equals(Object obj)
protected void finalize() throws Throwable
final native public Class getClass()
native public int hashCode()
final native public void notify()
final native public void notifyAll()
public String toString()
final native public void wait(long timeout) throws InterruptedException
final public void wait(long timeout, int nanos) throws InterruptedException
final public void wait() throws InterruptedException
