Method Summary |
|
protected File | contentFile(String hex, String extension) Returns a File with the mapping of this content to its URLs. |
protected void | finalize() Calls finish and super.finalize(). |
public String | findDuplicate(HttpDoc doc) Returns URL-String of duplicate content (if found). |
public void | finish() Close storageDirectory File. |
protected String | generateFilename(String docURI) Generate a valid filename for the given docURI. |
protected String | getDefaultExtension(String contentType) Get default extension for given contentType. |
public int | getStorageDirDepth() Returns the directory depth of the source set directory. |
public void | processDocument(HttpDoc doc) Collects URLs (duplicates will be skipped).
Parameters: doc - an HttpDoc object to process. |
protected void | readContentFromZipFile(HttpDoc doc, ZipFile contentZip) |
protected boolean | readHeadersFromZipFile(HttpDoc doc, ZipFile zf) |
protected boolean | readLinksFromZipFile(HttpDoc doc, ZipFile zf) |
public void | removeDocument(URL url) Remove document from cache. |
public HttpDoc | retrieveFromCache(java.net.URL url) Retrieves a document from the cache. |
public void | setStorageDirDepth(int depth) |
protected void | storeContent(HttpDoc doc) Creates a file whose name is derived from the content and which contains the URL. |
public void | storeDocument(HttpDoc doc) Stores the given document. |
public String | toString() List collected URLs. |
protected void | writeContentToZipFile(HttpDoc doc, ZipOutputStream zos) |
protected void | writeDirectoryInfo(HttpDoc doc, String filename) Write Directory info. |
protected ZipEntry | writeHeadersToZipFile(HttpDoc doc, ZipOutputStream zos) Write headers to zipFile. |
protected void | writeLinksToZipFile(List links, ZipOutputStream zs) Write links to ZipFile. |
protected ZipEntry | writeUrlToZipFile(HttpDoc doc, ZipOutputStream zos) Write Url to ZipFile. |
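
The summaries of contentFile, findDuplicate and storeContent suggest that duplicates are detected by mapping a content-derived name back to a URL. The following is a minimal sketch of that idea, not the class's actual implementation. It assumes that the hex argument is a digest of the document body (SHA-256 is chosen here only for illustration) and replaces the files on disk with an in-memory map. The class name ContentIndexSketch, the helpers hexDigest and contentKey, and the simplified signatures that take a byte array instead of an HttpDoc are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: content-addressed duplicate detection.
class ContentIndexSketch {
    // Stand-in for the per-content files on disk (assumption):
    // maps "<hex>.<extension>" to the first URL stored with that content.
    private final Map<String, String> contentToUrl = new HashMap<>();

    // Hex-encoded SHA-256 digest of the content. The digest algorithm
    // is an assumption; the original class may use a different one.
    static String hexDigest(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(content)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // Analogous in spirit to contentFile(hex, extension): builds the lookup key.
    static String contentKey(byte[] content, String extension) {
        return hexDigest(content) + "." + extension;
    }

    // Returns the URL already stored for identical content, or null if none.
    String findDuplicate(byte[] content, String extension) {
        return contentToUrl.get(contentKey(content, extension));
    }

    // Records the URL under the content-derived key; the first URL wins.
    void storeContent(String url, byte[] content, String extension) {
        contentToUrl.putIfAbsent(contentKey(content, extension), url);
    }

    public static void main(String[] args) {
        ContentIndexSketch index = new ContentIndexSketch();
        byte[] body = "<html>same page</html>".getBytes(StandardCharsets.UTF_8);

        // Nothing stored yet, so no duplicate is reported.
        System.out.println(index.findDuplicate(body, "html"));

        index.storeContent("http://example.org/a", body, "html");

        // A second URL serving identical content resolves to the first URL.
        System.out.println(index.findDuplicate(body, "html"));
    }
}
```

Keying on a digest of the body rather than on the URL means that two different URLs serving identical content collapse to one entry, which is consistent with processDocument skipping duplicates.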