| org.archive.crawler.framework.WriterPoolProcessor org.archive.crawler.writer.ExperimentalWARCWriterProcessor
Method Summary | |
protected String[] | getDefaultPath() |
protected String | getFirstrecordStylesheet() |
protected URI | getRecordID() |
protected void | innerProcess(CrawlURI curi) Writes a CrawlURI and its associated data to store file. |
protected URI | qualifyRecordID(URI base, String key, String value) |
protected void | saveHeader(String origName, HttpMethodBase method, ANVLRecord headers, String newName) |
protected void | setupPool(AtomicInteger serialNo) |
protected void | write(String lowerCaseScheme, CrawlURI curi) |
protected URI | writeMetadata(ExperimentalWARCWriter w, String timestamp, URI baseid, CrawlURI curi, ANVLRecord namedFields) |
protected URI | writeRequest(ExperimentalWARCWriter w, String timestamp, String mimetype, URI baseid, CrawlURI curi, ANVLRecord namedFields) |
protected URI | writeResponse(ExperimentalWARCWriter w, String timestamp, String mimetype, URI baseid, CrawlURI curi, ANVLRecord namedFields) |
protected URI | writeRevisitDigest(ExperimentalWARCWriter w, String timestamp, String mimetype, URI baseid, CrawlURI curi, ANVLRecord namedFields) |
protected URI | writeRevisitNotModified(ExperimentalWARCWriter w, String timestamp, URI baseid, CrawlURI curi, ANVLRecord namedFields) |
ATTR_WRITE_METADATA | final public static String ATTR_WRITE_METADATA | | Key for whether to write 'metadata' type records where possible |
ATTR_WRITE_REQUESTS | final public static String ATTR_WRITE_REQUESTS | | Key for whether to write 'request' type records where possible |
ATTR_WRITE_REVISIT_FOR_IDENTICAL_DIGESTS | final public static String ATTR_WRITE_REVISIT_FOR_IDENTICAL_DIGESTS | | Key for whether to write 'revisit' type records when consecutive fetches of a URI produce identical digests |
ATTR_WRITE_REVISIT_FOR_NOT_MODIFIED | final public static String ATTR_WRITE_REVISIT_FOR_NOT_MODIFIED | | Key for whether to write 'revisit' type records for server "304 Not Modified" responses |
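The four keys above act as on/off switches for optional WARC record types. The sketch below shows one way such switches could gate which records are written for a single fetch. It is an illustration under stated assumptions, not the processor's actual logic: the class `RecordTypePlanner`, its field names, the `plan` method, and the ordering of the checks are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Heritrix. */
final class RecordTypePlanner {
    // Stand-ins for the values stored under the four ATTR_WRITE_* keys.
    private final boolean writeRequests;
    private final boolean writeMetadata;
    private final boolean writeRevisitForIdenticalDigests;
    private final boolean writeRevisitForNotModified;

    RecordTypePlanner(boolean writeRequests, boolean writeMetadata,
            boolean writeRevisitForIdenticalDigests,
            boolean writeRevisitForNotModified) {
        this.writeRequests = writeRequests;
        this.writeMetadata = writeMetadata;
        this.writeRevisitForIdenticalDigests = writeRevisitForIdenticalDigests;
        this.writeRevisitForNotModified = writeRevisitForNotModified;
    }

    /**
     * Returns the record types to write for one fetch.
     * Assumed rule: a 'revisit' record replaces the 'response' record
     * when the matching switch is on; otherwise a 'response' is written.
     */
    List<String> plan(int httpStatus, boolean digestUnchanged) {
        List<String> types = new ArrayList<>();
        if (httpStatus == 304 && writeRevisitForNotModified) {
            types.add("revisit");
        } else if (digestUnchanged && writeRevisitForIdenticalDigests) {
            types.add("revisit");
        } else {
            types.add("response");
        }
        if (writeRequests) {
            types.add("request");
        }
        if (writeMetadata) {
            types.add("metadata");
        }
        return types;
    }
}
```

With every switch off, `plan` always yields a single 'response' entry, which makes the effect of each key easy to see in isolation.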
ExperimentalWARCWriterProcessor | public ExperimentalWARCWriterProcessor(String name) | | Parameters: name - Name of this writer. |
getFirstrecordStylesheet | protected String getFirstrecordStylesheet() | | |
innerProcess | protected void innerProcess(CrawlURI curi) | | Writes a CrawlURI and its associated data to the store file.
Currently this method understands the following URI schemes: dns, http, and
https.
Parameters: curi - CrawlURI to process. |
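The scheme check described for innerProcess can be pictured as a small dispatch on the lower-cased URI scheme, matching the `lowerCaseScheme` parameter of `write` in the method summary. The following is a minimal sketch, assuming a hypothetical class `SchemeDispatch` and method `recordKindFor`. How the real processor treats other schemes is not stated here, so the sketch only reports them as "unsupported".

```java
import java.net.URI;
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of Heritrix. */
final class SchemeDispatch {
    /**
     * Maps a URI string to the kind of handling its scheme would get.
     * dns maps to "dns"; http and https both map to "http";
     * anything else (or a missing scheme) maps to "unsupported".
     */
    static String recordKindFor(String uri) {
        String scheme = URI.create(uri).getScheme();
        if (scheme == null) {
            return "unsupported";
        }
        // Schemes are case-insensitive, so normalize before comparing.
        switch (scheme.toLowerCase(Locale.ROOT)) {
            case "dns":
                return "dns";
            case "http":
            case "https":
                return "http";
            default:
                return "unsupported";
        }
    }
}
```

Lower-casing with `Locale.ROOT` keeps the comparison independent of the JVM's default locale.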
saveHeader | protected void saveHeader(String origName, HttpMethodBase method, ANVLRecord headers, String newName) | | Saves a header from the given HTTP operation into the
provided headers under a new name.
Parameters: origName - header name to get, if present. Parameters: method - HTTP operation containing the headers. |
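The copy-and-rename behaviour described for saveHeader can be sketched with plain maps. This is an illustration only: the class `HeaderCopy` is hypothetical, and `Map<String, String>` stands in for both the HTTP operation's headers and the ANVLRecord, neither of which is modelled here.

```java
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Heritrix. */
final class HeaderCopy {
    /**
     * Copies the value stored under origName in source to target under
     * newName. Does nothing when the header is absent, mirroring the
     * "if present" wording of the parameter description.
     * Note: this stand-in matches names exactly as the map stores them,
     * whereas real HTTP header names are case-insensitive.
     */
    static void saveHeader(String origName, Map<String, String> source,
            Map<String, String> target, String newName) {
        String value = source.get(origName);
        if (value != null) {
            target.put(newName, value);
        }
    }
}
```

For example, calling `HeaderCopy.saveHeader("ETag", source, target, "X-Saved-ETag")` leaves `target` unchanged when `source` has no `ETag` entry; the name `X-Saved-ETag` is made up for this example.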