Java Doc for URLStatus.java in Search-Engine » BDDBot » bdd » search » spider

java.lang.Object
   bdd.search.spider.URLStatus

URLStatus
public class URLStatus
Written by Tim Macinta 1997
Distributed under the GNU General Public License (a copy of which is enclosed with the source).

This class holds information about the content at a particular URL. It can also be used to fetch and parse a URL.


Field Summary
final static int DUPLICATE
final static int IO_ERROR
final static int LOADED
final static int MISC_ERROR
final static int MISSING
final static int MOVED
final static int NOT_LOADED
final static int TIMED_OUT
final static int UNSUPPORTED_MIMETYPE
URL actual_url
String email_address
EnginePrefs eng_prefs
URL given_url
String mime_type
int status
File temp_file
String user_agent

Constructor Summary
public  URLStatus(URL url, File temp_file, EnginePrefs eng_prefs)
     "url" is the location of the information and "temp_file" is the temporary file that can be used to store the contents of this url.

Method Summary
public void dumpToDatabase(DataOutputStream out)
     Creates a database containing just this URL.
void dumpWords(DataOutputStream out)
     Dumps the words contained in this URL in database format to "out".
public void finalize()
     Gets rid of the temporary file.
public File getCacheFile()
     Returns the file that is used to cache the contents of this URL.
public long getContentLength()
     Returns the length of the content, or 0 if it's unknown.
public LinkExtractor getLinkExtractor()
     Returns a LinkExtractor that can handle this URL's mime type. To add support for new mime types, add a LinkExtractor that handles those mime types here and add appropriate WordExtractors to the getWordExtractor() method.
public WordExtractor getWordExtractor()
     Returns a WordExtractor that can handle this URL's mime type. To add support for new mime types, add a WordExtractor that handles those mime types here and add appropriate LinkExtractors to the getLinkExtractor() method.
public boolean loaded()
     Returns true if and only if this URL was loaded without an error.
public boolean mimeTypeUnderstood(String mime_type)
     Returns true if and only if this mime type can be processed.
public boolean moved()
     Returns true if and only if this URL causes a redirection.
void pipe(InputStream in, OutputStream out)
     Pipes "in" to "out" until "in" is exhausted, then closes "in".
public void readContent()
     Downloads the content of the given URL and stores it in a temporary cache file.
void readGeneric()
     This method provides a fallback to the default Java implementation for protocols which have not been re-implemented.
void readHTTP()
     Downloads a file using the HTTP protocol.
String readLine(PushbackInputStream in)
     A replacement for java.io.DataInputStream.readLine(), which doesn't return the line ending characters like it should.

Field Detail
DUPLICATE
final static int DUPLICATE

IO_ERROR
final static int IO_ERROR

LOADED
final static int LOADED

MISC_ERROR
final static int MISC_ERROR

MISSING
final static int MISSING

MOVED
final static int MOVED

NOT_LOADED
final static int NOT_LOADED

TIMED_OUT
final static int TIMED_OUT

UNSUPPORTED_MIMETYPE
final static int UNSUPPORTED_MIMETYPE

actual_url
URL actual_url

email_address
String email_address

eng_prefs
EnginePrefs eng_prefs

given_url
URL given_url

mime_type
String mime_type

status
int status

temp_file
File temp_file

user_agent
String user_agent

Constructor Detail
URLStatus
public URLStatus(URL url, File temp_file, EnginePrefs eng_prefs)
"url" is the location of the information and "temp_file" is the temporary file that can be used to store the contents of this url.




Method Detail
dumpToDatabase
public void dumpToDatabase(DataOutputStream out) throws IOException
Creates a database containing just this URL.



dumpWords
void dumpWords(DataOutputStream out) throws IOException
Dumps the words contained in this URL in database format to "out".



finalize
public void finalize() throws Throwable
Gets rid of the temporary file.
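
Finalizers run at an unspecified time, and Object.finalize() has been deprecated since Java 9, so cleanup of this kind is often made explicit instead. The following is a minimal sketch of that idea, not BDDBot code: TempFileHolder and its methods are hypothetical names, and only the goal of deleting the temporary file is taken from the description above.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical holder class; only "delete the temporary file" comes from the documentation.
final class TempFileHolder implements AutoCloseable {
    private final File tempFile;

    TempFileHolder(File tempFile) {
        this.tempFile = tempFile;
    }

    File getFile() {
        return tempFile;
    }

    // Deletes the temporary file if it still exists; calling it twice is harmless.
    @Override
    public void close() throws IOException {
        Files.deleteIfExists(tempFile.toPath());
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("urlstatus", ".tmp");
        try (TempFileHolder holder = new TempFileHolder(tmp)) {
            System.out.println("exists while open: " + holder.getFile().exists());
        }
        System.out.println("exists after close: " + tmp.exists());
    }
}
```

With try-with-resources the file is removed as soon as the block exits, rather than whenever the garbage collector happens to run.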



getCacheFile
public File getCacheFile()
Returns the file that is used to cache the contents of this URL.



getContentLength
public long getContentLength()
Returns the length of the content, or 0 if it's unknown.
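
The "0 if unknown" convention can be illustrated with a small helper. This is a sketch under the assumption that the length comes from an HTTP Content-Length header value; the documentation does not say where URLStatus obtains it, and ContentLengthSketch and contentLengthOrZero are hypothetical names.

```java
// Hypothetical helper illustrating the "0 means unknown" convention described above.
final class ContentLengthSketch {
    // Returns the parsed length, or 0 when the value is missing, malformed, or negative.
    static long contentLengthOrZero(String headerValue) {
        if (headerValue == null) {
            return 0L;
        }
        try {
            long length = Long.parseLong(headerValue.trim());
            return length < 0 ? 0L : length;
        } catch (NumberFormatException e) {
            return 0L;
        }
    }

    public static void main(String[] args) {
        System.out.println(contentLengthOrZero("1024"));
        System.out.println(contentLengthOrZero(null));
    }
}
```

Note that this convention differs from java.net.URLConnection.getContentLengthLong(), which reports an unknown length as -1, so callers cannot distinguish an empty document from an unknown length here.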



getLinkExtractor
public LinkExtractor getLinkExtractor() throws IOException
Returns a LinkExtractor that can handle this URL's mime type. To add support for new mime types, add a LinkExtractor that handles those mime types here and add appropriate WordExtractors to the getWordExtractor() method. Also, add the mime type to the list in the mimeTypeUnderstood() method.



getWordExtractor
public WordExtractor getWordExtractor() throws IOException
Returns a WordExtractor that can handle this URL's mime type. To add support for new mime types, add a WordExtractor that handles those mime types here and add appropriate LinkExtractors to the getLinkExtractor() method. Also, add the mime type to the list in the mimeTypeUnderstood() method.



loaded
public boolean loaded()
Returns true if and only if this URL was loaded without an error.



mimeTypeUnderstood
public boolean mimeTypeUnderstood(String mime_type)
Returns true if and only if this mime type can be processed.
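
getLinkExtractor(), getWordExtractor() and mimeTypeUnderstood() have to be kept in step when a mime type is added. The sketch below shows one way those three places could fit together. Everything in it is an assumption for illustration: the nested interfaces and classes are stand-ins rather than BDDBot's real extractor types, "text/html" and "text/plain" are example mime types, the methods take the mime type as a parameter instead of reading a field, and throwing IOException for an unsupported type is a guess.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class MimeDispatchSketch {
    // Hypothetical stand-ins for the real LinkExtractor and WordExtractor types.
    interface LinkExtractor { }
    interface WordExtractor { }
    static final class HtmlLinkExtractor implements LinkExtractor { }
    static final class NoLinksExtractor implements LinkExtractor { }
    static final class HtmlWordExtractor implements WordExtractor { }
    static final class PlainTextWordExtractor implements WordExtractor { }

    // Place 1 of 3: the list of understood mime types (example values only).
    private static final Set<String> UNDERSTOOD =
            new HashSet<>(Arrays.asList("text/html", "text/plain"));

    // Lower-cases the type and strips parameters such as "; charset=utf-8".
    static String normalize(String mimeType) {
        if (mimeType == null) {
            return "";
        }
        int semicolon = mimeType.indexOf(';');
        String base = semicolon >= 0 ? mimeType.substring(0, semicolon) : mimeType;
        return base.trim().toLowerCase(Locale.ROOT);
    }

    static boolean mimeTypeUnderstood(String mimeType) {
        return UNDERSTOOD.contains(normalize(mimeType));
    }

    // Place 2 of 3: pick a link extractor for the mime type.
    static LinkExtractor getLinkExtractor(String mimeType) throws IOException {
        switch (normalize(mimeType)) {
            case "text/html":
                return new HtmlLinkExtractor();
            case "text/plain":
                return new NoLinksExtractor();
            default:
                throw new IOException("Unsupported mime type: " + mimeType);
        }
    }

    // Place 3 of 3: pick a word extractor for the mime type.
    static WordExtractor getWordExtractor(String mimeType) throws IOException {
        switch (normalize(mimeType)) {
            case "text/html":
                return new HtmlWordExtractor();
            case "text/plain":
                return new PlainTextWordExtractor();
            default:
                throw new IOException("Unsupported mime type: " + mimeType);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(mimeTypeUnderstood("text/html; charset=utf-8"));
        System.out.println(getWordExtractor("text/plain").getClass().getSimpleName());
    }
}
```

Supporting a further type in this sketch would mean touching all three marked places, which mirrors the procedure the method descriptions above spell out.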



moved
public boolean moved()
Returns true if and only if this URL causes a redirection.
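
The relationship between the status constants and the loaded() and moved() predicates can be sketched as follows. The constant names are taken from the Field Summary, but their numeric values are invented here, and the assumption that both predicates simply compare the status field (so that a redirected URL does not count as loaded) is not confirmed by the documentation. StatusSketch and setStatus are hypothetical names.

```java
// Hypothetical model of the status field; constant values are invented for this sketch.
final class StatusSketch {
    static final int NOT_LOADED = 0;
    static final int LOADED = 1;
    static final int MOVED = 2;
    static final int MISSING = 3;
    static final int TIMED_OUT = 4;
    static final int IO_ERROR = 5;
    static final int MISC_ERROR = 6;
    static final int UNSUPPORTED_MIMETYPE = 7;
    static final int DUPLICATE = 8;

    private int status = NOT_LOADED;

    void setStatus(int status) {
        this.status = status;
    }

    // Assumed: true only when the fetch finished with the LOADED status.
    boolean loaded() {
        return status == LOADED;
    }

    // Assumed: true only when the fetch ended in a redirect.
    boolean moved() {
        return status == MOVED;
    }

    public static void main(String[] args) {
        StatusSketch s = new StatusSketch();
        s.setStatus(MOVED);
        System.out.println("loaded=" + s.loaded() + ", moved=" + s.moved());
    }
}
```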



pipe
void pipe(InputStream in, OutputStream out) throws IOException
Pipes "in" to "out" until "in" is exhausted, then closes "in".
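
A copy loop of this kind might look like the sketch below. It follows the description in closing only "in" and leaving "out" open; the 4096-byte buffer, the final flush, and closing "in" even when an exception occurs are choices made for this example rather than documented behavior. PipeSketch is a hypothetical class name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class PipeSketch {
    // Copies everything from "in" to "out", then closes "in" (but not "out").
    static void pipe(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096]; // buffer size is an arbitrary choice
        try {
            int count;
            while ((count = in.read(buffer)) != -1) {
                out.write(buffer, 0, count);
            }
            out.flush();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "some page content".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        pipe(new ByteArrayInputStream(data), out);
        System.out.println(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```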



readContent
public void readContent()
Downloads the content of the given URL and stores it in a temporary cache file.



readGeneric
void readGeneric() throws IOException
This method provides a fallback to the default Java implementation for protocols which have not been re-implemented.
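
Falling back to the default Java implementation presumably means letting java.net.URL's built-in protocol handlers open the stream. A minimal sketch of that approach, assuming the content is copied straight into the cache file, is shown here. ReadGenericSketch and its two-argument readGeneric are hypothetical; the real method takes no parameters and works on the object's own fields.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

final class ReadGenericSketch {
    // Lets the JDK's protocol handler open the URL and copies the bytes into cacheFile.
    static void readGeneric(URL url, File cacheFile) throws IOException {
        try (InputStream in = url.openStream()) {
            Files.copy(in, cacheFile.toPath(), StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // A file: URL stands in for any protocol without a custom implementation.
        File source = File.createTempFile("source", ".txt");
        File cache = File.createTempFile("cache", ".tmp");
        try {
            Files.write(source.toPath(), "hello".getBytes(StandardCharsets.UTF_8));
            readGeneric(source.toURI().toURL(), cache);
            System.out.println(new String(Files.readAllBytes(cache.toPath()), StandardCharsets.UTF_8));
        } finally {
            source.delete();
            cache.delete();
        }
    }
}
```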



readHTTP
void readHTTP() throws IOException
Downloads a file using the HTTP protocol. It was necessary to write a method to do this from scratch rather than using the default method in Java because:

  • There is no means for specifying the user agent using the default method.
  • There is a bug in the Java 1.0 implementation that makes it incompatible with HTTP version 1.1.
  • Redirects are automatically followed (at least in Java 1.0) without providing a way to determine whether a redirect has occurred.
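
Two of the points above, sending a custom user agent and noticing a redirect instead of silently following it, can be sketched with plain request building and response-header parsing. This is an illustration only, not the BDDBot implementation: ReadHttpSketch, ResponseHead, buildRequest and readHead are hypothetical names, the request uses HTTP/1.0 as an assumption, and the socket handling a real client needs is left out so the example stays self-contained.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.net.URI;
import java.net.URL;

final class ReadHttpSketch {
    // Status code plus the Location header, if the server sent one.
    static final class ResponseHead {
        final int statusCode;
        final String location; // null when no Location header was present

        ResponseHead(int statusCode, String location) {
            this.statusCode = statusCode;
            this.location = location;
        }

        boolean isRedirect() {
            return statusCode >= 300 && statusCode < 400 && location != null;
        }
    }

    // Builds a GET request that carries an explicit User-Agent header.
    static String buildRequest(URL url, String userAgent) {
        String path = url.getFile().isEmpty() ? "/" : url.getFile();
        String host = url.getPort() == -1 ? url.getHost() : url.getHost() + ":" + url.getPort();
        return "GET " + path + " HTTP/1.0\r\n"
                + "Host: " + host + "\r\n"
                + "User-Agent: " + userAgent + "\r\n"
                + "\r\n";
    }

    // Reads the status line and headers, stopping at the blank line before the body.
    static ResponseHead readHead(BufferedReader in) throws IOException {
        String statusLine = in.readLine();
        if (statusLine == null) {
            throw new IOException("Empty response");
        }
        String[] parts = statusLine.split(" ", 3);
        if (parts.length < 2 || !parts[0].startsWith("HTTP/")) {
            throw new IOException("Malformed status line: " + statusLine);
        }
        int code;
        try {
            code = Integer.parseInt(parts[1]);
        } catch (NumberFormatException e) {
            throw new IOException("Malformed status code: " + statusLine);
        }
        String location = null;
        String line;
        while ((line = in.readLine()) != null && !line.isEmpty()) {
            int colon = line.indexOf(':');
            if (colon > 0 && line.substring(0, colon).trim().equalsIgnoreCase("Location")) {
                location = line.substring(colon + 1).trim();
            }
        }
        return new ResponseHead(code, location);
    }

    public static void main(String[] args) throws IOException {
        URL url = URI.create("http://example.com/index.html").toURL();
        System.out.print(buildRequest(url, "ExampleBot/0.1"));
        String response = "HTTP/1.0 302 Found\r\nLocation: http://example.com/new\r\n\r\n";
        ResponseHead head = readHead(new BufferedReader(new StringReader(response)));
        System.out.println("redirect=" + head.isRedirect() + " to " + head.location);
    }
}
```

Because the caller sees the status code and Location header directly, it can record a redirect (for example, by setting a MOVED-style status) before deciding whether to fetch the new address.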



readLine
String readLine(PushbackInputStream in) throws IOException
A replacement for java.io.DataInputStream.readLine(), which doesn't return the line ending characters like it should.
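
The PushbackInputStream parameter suggests a one-byte look-ahead after a carriage return, so that "\r", "\n" and "\r\n" can all end a line. The sketch below shows that technique under explicit assumptions: it returns each line without its terminator and returns null at end of stream, which the description above does not spell out, and it maps bytes to characters one-to-one (ISO-8859-1 style). ReadLineSketch is a hypothetical class name.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

final class ReadLineSketch {
    // Reads up to "\n", "\r" or "\r\n"; returns the line without the terminator,
    // or null if the stream is already exhausted. These return conventions are assumptions.
    static String readLine(PushbackInputStream in) throws IOException {
        int c = in.read();
        if (c == -1) {
            return null;
        }
        StringBuilder line = new StringBuilder();
        while (c != -1 && c != '\n') {
            if (c == '\r') {
                // Peek at the next byte: swallow it if it completes "\r\n", otherwise push it back.
                int next = in.read();
                if (next != '\n' && next != -1) {
                    in.unread(next);
                }
                break;
            }
            line.append((char) c); // each byte becomes one char
            c = in.read();
        }
        return line.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n".getBytes(StandardCharsets.ISO_8859_1);
        PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(data));
        String line;
        while ((line = readLine(in)) != null) {
            System.out.println("[" + line + "]");
        }
    }
}
```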



Methods inherited from java.lang.Object
native protected Object clone() throws CloneNotSupportedException
public boolean equals(Object obj)
protected void finalize() throws Throwable
final native public Class getClass()
native public int hashCode()
final native public void notify()
final native public void notifyAll()
public String toString()
final native public void wait(long timeout) throws InterruptedException
final public void wait(long timeout, int nanos) throws InterruptedException
final public void wait() throws InterruptedException
