org.archive.crawler.framework.Processor
   org.archive.crawler.processor.recrawl.PersistProcessor
All known subclasses: org.archive.crawler.processor.recrawl.PersistOnlineProcessor, org.archive.crawler.processor.recrawl.PersistLogProcessor
PersistProcessor | abstract public class PersistProcessor extends Processor
Superclass for Processors that use BDB-JE to persist URI state (most notably history).
Author: gojomo
Method Summary
protected static DatabaseConfig | historyDatabaseConfig()
public static void | main(String[] args) | Utility main for importing a log into a BDB-JE environment or moving a database between environments (2 arguments), or simply dumping a log to stdout in a more readable format (1 argument).
public String | persistKeyFor(CrawlURI curi) | Return a preferred String key for persisting the given CrawlURI's AList state.
protected boolean | shouldLoad(CrawlURI curi)
protected boolean | shouldStore(CrawlURI curi)
URI_HISTORY_DBNAME | final public static String URI_HISTORY_DBNAME
Name of the history Database.
PersistProcessor | public PersistProcessor(String name, String string)
Usual constructor.
Parameters: name, string
historyDatabaseConfig | protected static DatabaseConfig historyDatabaseConfig()
Returns the DatabaseConfig for the history Database.
main | public static void main(String[] args) throws DatabaseException, IOException
Utility main for importing a log into a BDB-JE environment or moving a database between environments (2 arguments), or simply dumping a log to stdout in a more readable format (1 argument).
Parameters: args - command-line arguments
Throws: DatabaseException, IOException
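The argument-count convention described for main can be pictured with the following minimal sketch. It is not the Heritrix implementation: the class name `PersistMainSketch`, the method `modeFor`, and the returned strings are all hypothetical, and the sketch does not attempt to show how the real utility tells a log import apart from a database move.

```java
// Hypothetical illustration only; none of these names come from Heritrix.
class PersistMainSketch {

    /** Maps an argument count to the mode that the description above associates with it. */
    static String modeFor(String[] args) {
        switch (args.length) {
            case 1:
                // One argument: dump a log to stdout in a more readable format.
                return "dump log to stdout";
            case 2:
                // Two arguments: import a log into an environment,
                // or move a database between environments.
                return "import log or move database";
            default:
                // Assumption: any other count is treated as a usage error.
                return "usage";
        }
    }

    public static void main(String[] args) {
        System.out.println(modeFor(args));
    }
}
```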
persistKeyFor | public String persistKeyFor(CrawlURI curi)
Return a preferred String key for persisting the given CrawlURI's AList state.
Parameters: curi - CrawlURI
Returns: String key
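To make the idea of a "preferred String key" concrete, here is a small sketch of one possible key scheme. It is an assumption for illustration, not the documented behavior of persistKeyFor: it takes a plain URI string instead of a CrawlURI, and the class `PersistKeySketch` and method `keyFor` are hypothetical. The scheme reverses the host labels so that URIs from the same domain sort next to each other in an ordered store.

```java
import java.net.URI;

// Hypothetical illustration only; not the key format documented for Heritrix.
class PersistKeySketch {

    /**
     * Builds a host-reversed key from a URI string.
     * Port and user info are ignored in this sketch.
     */
    static String keyFor(String uri) {
        URI u = URI.create(uri);
        String scheme = u.getScheme();
        String host = u.getHost();
        if (scheme == null || host == null) {
            // Opaque or scheme-less URI: fall back to the unchanged string.
            return uri;
        }
        String[] labels = host.split("\\.");
        StringBuilder key = new StringBuilder(scheme).append("://(");
        // Append host labels from the top-level domain down to the leftmost label.
        for (int i = labels.length - 1; i >= 0; i--) {
            key.append(labels[i]).append(',');
        }
        key.append(')');
        if (u.getRawPath() != null) {
            key.append(u.getRawPath());
        }
        if (u.getRawQuery() != null) {
            key.append('?').append(u.getRawQuery());
        }
        return key.toString();
    }

    public static void main(String[] args) {
        // Prints: http://(com,example,www,)/a/b?x=1
        System.out.println(keyFor("http://www.example.com/a/b?x=1"));
    }
}
```

Whatever scheme is used, the useful property is that the same URI always maps to the same key, so state stored on one crawl can be found again on the next.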
shouldLoad | protected boolean shouldLoad(CrawlURI curi)
Whether the current CrawlURI's state should be loaded.
Parameters: curi - CrawlURI
Returns: true if state should be loaded; false to skip loading
shouldStore | protected boolean shouldStore(CrawlURI curi)
Whether the current CrawlURI's state should be persisted (to log or direct to database).
Parameters: curi - CrawlURI
Returns: true if state should be stored; false to skip persistence
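The two protected hooks shouldLoad and shouldStore suggest a pattern in which a subclass decides, per URI, whether state is read or written. The sketch below illustrates that pattern with stand-in types only. `FakeUri`, `Base`, `SuccessOnly`, the in-memory map, and the "positive status means success" rule are all assumptions made for this example; they are not Heritrix classes or its default policy.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; every type here is a stand-in.
class PersistHooksSketch {

    /** Stand-in for CrawlURI, holding only what this sketch needs. */
    static class FakeUri {
        final String uri;
        final int fetchStatus; // assumption: a positive value means a successful fetch

        FakeUri(String uri, int fetchStatus) {
            this.uri = uri;
            this.fetchStatus = fetchStatus;
        }
    }

    /** Stand-in base class exposing the two decision hooks. */
    static abstract class Base {
        // In-memory stand-in for the history database.
        final Map<String, Integer> store = new HashMap<>();

        protected boolean shouldLoad(FakeUri curi) { return true; }

        protected boolean shouldStore(FakeUri curi) { return true; }

        /** Returns previously stored state, or null if absent or loading is skipped. */
        Integer loadIfWanted(FakeUri curi) {
            return shouldLoad(curi) ? store.get(curi.uri) : null;
        }

        /** Persists state only when the hook allows it. */
        void storeIfWanted(FakeUri curi) {
            if (shouldStore(curi)) {
                store.put(curi.uri, curi.fetchStatus);
            }
        }
    }

    /** Hypothetical policy: persist only URIs that were fetched successfully. */
    static class SuccessOnly extends Base {
        @Override
        protected boolean shouldStore(FakeUri curi) {
            return curi.fetchStatus > 0;
        }
    }

    public static void main(String[] args) {
        SuccessOnly p = new SuccessOnly();
        p.storeIfWanted(new FakeUri("http://example.com/", 200));
        p.storeIfWanted(new FakeUri("http://example.org/", -2));
        System.out.println(p.store.keySet()); // only the successful URI is kept
    }
}
```

Keeping the decision in small overridable methods lets a subclass change the persistence policy without touching the storage code itself.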