org.archive.crawler.framework.Processor
   org.archive.crawler.postprocessor.LowDiskPauseProcessor
LowDiskPauseProcessor | public class LowDiskPauseProcessor extends Processor | | Processor module which uses 'df -k', where available and with
the expected output format (as on Linux), to monitor available
disk space, and which pauses the crawl if free space on any monitored
filesystem falls below the configured threshold.
|
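To make the 'df -k' monitoring concrete, here is a minimal sketch of how the "Available" and "Mounted on" columns might be pulled out of Linux-style output. The class name `DfOutputSketch`, the method `availableKbByMount`, and the regular expression are illustrative assumptions; this page does not document the value of AVAILABLE_EXTRACTOR, so the pattern below should not be read as the one the class uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DfOutputSketch {
    // Hypothetical pattern (an assumption, not taken from the class):
    // captures the "Available" figure and the "Mounted on" path from
    // lines shaped like Linux 'df -k' output.
    static final Pattern AVAILABLE = Pattern.compile(
            "(?m)\\s(\\d+)\\s+\\d+%\\s+(\\S+)$");

    /** Maps each mount point found in the output to its available space in KB. */
    static Map<String, Long> availableKbByMount(String dfOutput) {
        Map<String, Long> result = new LinkedHashMap<>();
        Matcher m = AVAILABLE.matcher(dfOutput);
        while (m.find()) {
            // group(1) = Available column, group(2) = Mounted on column
            result.put(m.group(2), Long.parseLong(m.group(1)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample text standing in for real 'df -k' output.
        String sample =
              "Filesystem     1K-blocks     Used Available Use% Mounted on\n"
            + "/dev/sda1       41152832 20512344  18526920  53% /\n"
            + "/dev/sdb1      103081248 99000000    400000 100% /data\n";
        System.out.println(availableKbByMount(sample));
    }
}
```

The header row is skipped simply because it contains no numeric "Available" value followed by a percentage. Output in any other layout (for example, long device names that wrap onto a second line on some systems) may need a different pattern.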
Method Summary | |
protected void | innerProcess(CrawlURI curi) Notes a CrawlURI's content size in its running tally. |
ATTR_MONITOR_MOUNTS | final public static String ATTR_MONITOR_MOUNTS | | List of mounts to monitor; each entry should match a value in the "Mounted on" column of the 'df' output.
|
ATTR_PAUSE_THRESHOLD | final public static String ATTR_PAUSE_THRESHOLD | | Available-space level below which a crawl pause should be triggered.
|
ATTR_RECHECK_THRESHOLD | final public static String ATTR_RECHECK_THRESHOLD | | Amount of content received between successive rechecks of free space.
|
AVAILABLE_EXTRACTOR | final public static Pattern AVAILABLE_EXTRACTOR | | |
DEFAULT_MONITOR_MOUNTS | final public static String DEFAULT_MONITOR_MOUNTS | | |
DEFAULT_PAUSE_THRESHOLD | final public static int DEFAULT_PAUSE_THRESHOLD | | |
DEFAULT_RECHECK_THRESHOLD | final public static int DEFAULT_RECHECK_THRESHOLD | | |
contentSinceCheck | protected int contentSinceCheck | | |
LowDiskPauseProcessor | public LowDiskPauseProcessor(String name) | | Parameters: name - Name of this processor. |
innerProcess | protected void innerProcess(CrawlURI curi) | | Notes a CrawlURI's content size in its running tally. If at least
the recheck threshold's worth of content has been received since the last
available-space check, checks available space and pauses the
crawl if any monitored mount is below the configured threshold.
Parameters: curi - CrawlURI to process. |
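The tally-and-recheck behaviour described for innerProcess can be sketched roughly as follows. This is a standalone illustration under stated assumptions, not the class's actual code: `LowDiskCheckSketch`, `noteContent`, `isPauseRequested`, and the injected free-space probe are hypothetical names, and the units (bytes for the content tally, KB for free space) are assumed, since this page does not state them.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.Supplier;

class LowDiskCheckSketch {
    private final long recheckThresholdBytes;   // assumed unit: bytes
    private final long pauseThresholdKb;        // assumed unit: KB
    // Hypothetical probe: returns available KB per monitored mount,
    // standing in for running and parsing 'df -k'.
    private final Supplier<Map<String, Long>> freeSpaceProbe;

    private long contentSinceCheckBytes = 0;
    private boolean pauseRequested = false;

    LowDiskCheckSketch(long recheckThresholdBytes, long pauseThresholdKb,
                       Supplier<Map<String, Long>> freeSpaceProbe) {
        this.recheckThresholdBytes = recheckThresholdBytes;
        this.pauseThresholdKb = pauseThresholdKb;
        this.freeSpaceProbe = freeSpaceProbe;
    }

    /** Adds one URI's content size to the tally and rechecks space if due. */
    void noteContent(long contentBytes) {
        contentSinceCheckBytes += contentBytes;
        if (contentSinceCheckBytes < recheckThresholdBytes) {
            return; // not enough new content since the last check
        }
        contentSinceCheckBytes = 0; // start a new tally
        for (long availableKb : freeSpaceProbe.get().values()) {
            if (availableKb < pauseThresholdKb) {
                pauseRequested = true; // stand-in for pausing the crawl
                return;
            }
        }
    }

    boolean isPauseRequested() {
        return pauseRequested;
    }

    public static void main(String[] args) {
        LowDiskCheckSketch sketch = new LowDiskCheckSketch(
                1000, 50, () -> Collections.singletonMap("/data", 10L));
        sketch.noteContent(500);   // below the recheck threshold: no check yet
        sketch.noteContent(600);   // tally reaches 1100: space is checked
        System.out.println("pause requested: " + sketch.isPauseRequested());
    }
}
```

Tying the check to the volume of content received, rather than running it on every URI, keeps the cost of probing the filesystem low while still bounding how much data can arrive between checks.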