Java Doc for HttpDocToFile.java (JoBo web crawler, package net.matuschek.http)



java.lang.Object
   net.matuschek.http.AbstractHttpDocManager
      net.matuschek.http.HttpDocToFile

HttpDocToFile
public class HttpDocToFile extends AbstractHttpDocManager
DocumentManager that will store document contents in a file.
author:
   Daniel Matuschek
version:
   $Revision: 1.11 $



Constructor Summary
public  HttpDocToFile(String baseDir)
    

Method Summary
protected  void createDirs(String filename)
    
public  String getBaseDir()
    
protected  File getCacheFile(URL url)
     Gets the cacheFile of the given URL if its document was stored.
protected  String getExtension(URL url)
     Gets the extension of the given URL if its document was stored.
public  int getMinFileSize()
     Gets the value of minFileSize.
public  boolean getStoreCGI()
    
public  boolean isReplaceAllSpecials()
     Gets the value of replaceAllSpecials. If replaceAllSpecials is true, all special characters in the URL will be replaced by "-".
public  void removeDocument(URL u)
     Removes a previously stored document from the file system.
public  HttpDoc retrieveFromCache(URL u)
     Gets a previously stored document from the file system. Because HttpDocToFile does not store the HTTP headers, only the Content-Type header will exist.
public  void setBaseDir(String baseDir)
    
public  void setMinFileSize(int minFileSize)
    
public  void setReplaceAllSpecials(boolean v)
     Sets the value of replaceAllSpecials. If replaceAllSpecials is true, all special characters in the URL will be replaced by "-".
public  void setStoreCGI(boolean v)
     Sets the value of storeCGI.
public  void storeDocument(HttpDoc doc)
    
protected  String url2Filename(URL u)
    


Constructor Detail
HttpDocToFile
public HttpDocToFile(String baseDir)
Creates a new HttpDocToFile object that will store documents in the given directory.




Method Detail
createDirs
protected void createDirs(String filename) throws IOException
Creates all directories needed to place the file filename, if they do not already exist.
Parameters:
  filename - the full path name of a file
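
The behavior described for createDirs can be sketched with the standard java.io.File API. This is a minimal illustration under assumptions, not the JoBo source: the class name CreateDirsSketch is hypothetical, and the choice to report failure as an IOException simply mirrors the declared throws clause.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical sketch, not the JoBo implementation.
final class CreateDirsSketch {

    // Creates every missing parent directory of the given file path.
    // The file itself is not created.
    static void createDirs(String filename) throws IOException {
        File parent = new File(filename).getAbsoluteFile().getParentFile();
        if (parent == null || parent.isDirectory()) {
            return; // nothing to do: no parent, or it already exists
        }
        // mkdirs() returns false on failure, but also when another thread
        // created the directory first, so re-check before giving up.
        if (!parent.mkdirs() && !parent.isDirectory()) {
            throw new IOException("could not create directory " + parent);
        }
    }
}
```

Calling the method twice with the same path is harmless, because the second call finds the parent directory already present.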



getBaseDir
public String getBaseDir()
Gets the value of baseDir.
Returns:
  the value of baseDir



getCacheFile
protected File getCacheFile(URL url)
Gets the cacheFile of the given URL if its document was stored.
Parameters:
  url - the URL of the document
Returns:
  the cacheFile



getExtension
protected String getExtension(URL url)
Gets the extension of the given URL if its document was stored.
Parameters:
  url - the URL of the document
Returns:
  the extension as a String



getMinFileSize
public int getMinFileSize()
Gets the value of minFileSize. Files smaller than this size (in bytes) will not be saved to disk.
Returns:
  the value of minFileSize



getStoreCGI
public boolean getStoreCGI()
Gets the value of storeCGI. If this is true, the object will store ALL retrieved documents; otherwise it will store only documents from URLs that do not contain a "?".



isReplaceAllSpecials
public boolean isReplaceAllSpecials()
Gets the value of replaceAllSpecials. If replaceAllSpecials is true, all special characters in the URL will be replaced by "-". This is useful for operating systems that cannot handle special characters in filenames (e.g. Windows).
Returns:
  the value of replaceAllSpecials



removeDocument
public void removeDocument(URL u)
Removes a previously stored document from the file system. Because HttpDocToFile does not store the HTTP headers, only the Content-Type header will exist. Even this header may not be correct, because only a simple heuristic is used to determine the possible MIME type.



retrieveFromCache
public HttpDoc retrieveFromCache(URL u)
Gets a previously stored document from the file system. Because HttpDocToFile does not store the HTTP headers, only the Content-Type header will exist. Even this header may not be correct, because only a simple heuristic is used to determine the possible MIME type.
Returns:
  null if this document was not stored before or it seems to be a dynamic document
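
The documentation does not say which heuristic is used to determine the MIME type. One plausible approach, shown here purely as an assumption, is to guess from the file extension. The class name MimeGuessSketch, the extension table, and the application/octet-stream fallback are all hypothetical and are not taken from the JoBo source.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of an extension-based MIME guess; not the JoBo heuristic.
final class MimeGuessSketch {

    // Assumption: fallback type when the extension is missing or unknown.
    static final String DEFAULT_TYPE = "application/octet-stream";

    private static final Map<String, String> TYPES = new HashMap<String, String>();
    static {
        TYPES.put("html", "text/html");
        TYPES.put("htm", "text/html");
        TYPES.put("txt", "text/plain");
        TYPES.put("gif", "image/gif");
        TYPES.put("jpg", "image/jpeg");
        TYPES.put("jpeg", "image/jpeg");
        TYPES.put("png", "image/png");
        TYPES.put("pdf", "application/pdf");
    }

    // Guesses a Content-Type from the extension of the last path segment.
    static String guessContentType(String filename) {
        int slash = filename.lastIndexOf('/');
        int dot = filename.lastIndexOf('.');
        if (dot <= slash) {
            return DEFAULT_TYPE; // no extension in the last path segment
        }
        String ext = filename.substring(dot + 1).toLowerCase(Locale.ROOT);
        String type = TYPES.get(ext);
        return type != null ? type : DEFAULT_TYPE;
    }
}
```

A guess of this kind illustrates why the reconstructed Content-Type header "may not be correct": a server can deliver any media type under any file name.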



setBaseDir
public void setBaseDir(String baseDir)
Sets the value of baseDir.
Parameters:
  baseDir - the new value of baseDir



setMinFileSize
public void setMinFileSize(int minFileSize)
Sets the value of minFileSize.
Parameters:
  minFileSize - the new value of minFileSize
See Also:   HttpDocToFile.getMinFileSize()



setReplaceAllSpecials
public void setReplaceAllSpecials(boolean v)
Sets the value of replaceAllSpecials. If replaceAllSpecials is true, all special characters in the URL will be replaced by "-". This is useful for operating systems that cannot handle special characters in filenames (e.g. Windows).
Parameters:
  v - Value to assign to replaceAllSpecials.
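
The replacement described for replaceAllSpecials might look like the following sketch. The documentation does not define which characters count as "special", so the set used here (everything except ASCII letters, digits, ".", "_", "/" and "-") is an assumption, and the class name ReplaceSpecialsSketch is hypothetical.

```java
// Hypothetical sketch; the exact character set used by JoBo is not documented here.
final class ReplaceSpecialsSketch {

    // Replaces every character outside the assumed "safe" set with "-".
    // The path separator "/" is kept so the directory structure survives.
    static String replaceSpecials(String name) {
        return name.replaceAll("[^A-Za-z0-9._/-]", "-");
    }
}
```

For example, under these assumptions the name host/a b?x=1.html becomes host/a-b-x-1.html, which is a legal filename on Windows.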



setStoreCGI
public void setStoreCGI(boolean v)
Sets the value of storeCGI. If this is true, the object will store ALL retrieved documents; otherwise it will store only documents from URLs that do not contain a "?".
Parameters:
  v - Value to assign to storeCGI.
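
Taken together, the storeCGI and minFileSize settings act as filters on which documents reach the disk. The predicate below is a sketch of that decision as the documentation describes it. The class and method names are hypothetical, and it is an assumption that both checks are applied to every document and that a document exactly minFileSize bytes long is stored.

```java
// Hypothetical sketch of the store/skip decision; not the JoBo source.
final class StoreFilterSketch {

    // Returns true if a document with the given URL and content length
    // would be written to disk under the given settings.
    static boolean shouldStore(String url, int contentLength,
                               boolean storeCGI, int minFileSize) {
        if (contentLength < minFileSize) {
            return false; // files smaller than minFileSize are not saved
        }
        if (!storeCGI && url.indexOf('?') >= 0) {
            return false; // URL contains "?", so it is treated as dynamic
        }
        return true;
    }
}
```

With storeCGI set to false, a URL such as http://host/search.cgi?q=x is skipped regardless of its size, while http://host/index.html is stored as long as it meets the size threshold.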



storeDocument
public void storeDocument(HttpDoc doc) throws DocManagerException
Stores the document (that is, writes it to disk).
Parameters:
  doc - the document to store
exception:
  DocManagerException - if the document can't be stored (some I/O error occurred)



url2Filename
protected String url2Filename(URL u)
Converts a URL to a filename. http://host/path will be converted to basedir/host/path.
Parameters:
  u - a URL to convert, must not be null
Returns:
  a pathname
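
The mapping from http://host/path to basedir/host/path can be sketched with java.net.URL alone. This is an illustration under assumptions rather than the JoBo code: the class name UrlToFilenameSketch is hypothetical, "/" is always used as the separator to keep the result platform-independent, and storing directory-style URLs as index.html is an invented convention the documentation does not mention.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical sketch; not the JoBo implementation.
final class UrlToFilenameSketch {

    // Maps http://host/path to baseDir/host/path.
    static String toFilename(String baseDir, URL u) {
        if (u == null) {
            throw new IllegalArgumentException("URL must not be null");
        }
        String path = u.getPath();
        if (path.isEmpty()) {
            path = "/";
        }
        // Assumption: a directory-style URL is stored as index.html.
        if (path.endsWith("/")) {
            path = path + "index.html";
        }
        return baseDir + "/" + u.getHost() + path;
    }

    // Convenience overload that parses the URL from a string.
    static String toFilename(String baseDir, String url) {
        try {
            return toFilename(baseDir, URI.create(url).toURL());
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("not a valid URL: " + url, e);
        }
    }
}
```

Note that getPath() excludes the query string, so two URLs that differ only after the "?" map to the same file in this sketch.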



Methods inherited from net.matuschek.http.AbstractHttpDocManager
public String findDuplicate(HttpDoc doc) throws IOException
public void finish()
public void processDocument(HttpDoc doc) throws DocManagerException
public void removeDocument(URL url)
public HttpDoc retrieveFromCache(URL u)
public void storeDocument(HttpDoc doc) throws DocManagerException

Methods inherited from java.lang.Object
native protected Object clone() throws CloneNotSupportedException
public boolean equals(Object obj)
protected void finalize() throws Throwable
final native public Class getClass()
native public int hashCode()
final native public void notify()
final native public void notifyAll()
public String toString()
final native public void wait(long timeout) throws InterruptedException
final public void wait(long timeout, int nanos) throws InterruptedException
final public void wait() throws InterruptedException
