| java.lang.Object org.archive.crawler.framework.Checkpointer
Checkpointer | public class Checkpointer implements Serializable(Code) | | Runs checkpointing.
Also keeps a history of crawl checkpoints. Generally used only by CrawlController,
but it also has static utility methods that classes needing to participate in
a checkpoint can use.
author: gojomo author: stack |
Inner Class: public class CheckpointingThread extends Thread | |
Checkpointer | public Checkpointer(CrawlController cc, File checkpointDir)(Code) | | Creates a new Checkpointer with the given store directory.
Parameters: cc - CrawlController instance that is hosting this Checkpointer. Parameters: checkpointDir - Where to store the checkpoint. |
Checkpointer | public Checkpointer(CrawlController cc, String prefix)(Code) | | Creates a new Checkpointer with the given checkpoint label prefix.
Parameters: cc - CrawlController instance that is hosting this Checkpointer. Parameters: prefix - Prefix for the checkpoint label. |
checkpoint | public void checkpoint()(Code) | | Run a checkpoint of the crawler.
|
checkpointFailed | protected void checkpointFailed(Exception e)(Code) | | Notes that a checkpoint failed.
Parameters: e - Exception the checkpoint failed on. |
checkpointFailed | protected void checkpointFailed(String message)(Code) | | |
checkpointFailed | protected void checkpointFailed()(Code) | | |
clearCheckpointInProgressDirectory | protected void clearCheckpointInProgressDirectory()(Code) | | |
createCheckpointInProgressDirectory | protected File createCheckpointInProgressDirectory()(Code) | | |
formatCheckpointName | public static String formatCheckpointName(String prefix, int index)(Code) | | |
getCheckpointInProgressDirectory | public File getCheckpointInProgressDirectory()(Code) | | Returns the checkpoint directory. The name of the directory is the name of the current checkpoint. Null if no checkpoint is in progress. |
getNextCheckpoint | public int getNextCheckpoint()(Code) | | Returns the nextCheckpoint index. |
getNextCheckpointName | public String getNextCheckpointName()(Code) | | Returns the next checkpoint name (a zero-padded string). |
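The entries for formatCheckpointName and getNextCheckpointName describe a name built from a prefix and a zero-padded index. A minimal sketch of that idea follows. The class name `CheckpointNameSketch`, the method `formatName`, and the five-digit width are assumptions made for illustration; this page does not state the actual padding width or how Checkpointer implements it.

```java
import java.util.Locale;

public class CheckpointNameSketch {
    // Assumption: pad the index to five digits. The real width is not given on this page.
    private static final int ASSUMED_WIDTH = 5;

    /** Joins a label prefix and a zero-padded index, e.g. prefix "cp" and index 7. */
    public static String formatName(String prefix, int index) {
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return String.format(Locale.ROOT, "%s%0" + ASSUMED_WIDTH + "d", prefix, index);
    }

    public static void main(String[] args) {
        System.out.println(formatName("cp", 7));   // prints "cp00007"
        System.out.println(formatName("cp", 42));  // prints "cp00042"
    }
}
```

Zero padding keeps names in the same order whether they are sorted numerically or as plain strings, which is convenient when checkpoint names double as directory names.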
getPredecessorCheckpoints | public List getPredecessorCheckpoints()(Code) | | Returns the predecessorCheckpoints. |
isAtBeginning | public boolean isAtBeginning()(Code) | | Returns whether this context is in a new-crawl, never-checkpointed state. |
isCheckpointErrors | protected boolean isCheckpointErrors()(Code) | | |
isCheckpointFailed | public boolean isCheckpointFailed()(Code) | | True if current/last checkpoint failed. |
isCheckpointing | public boolean isCheckpointing()(Code) | | True if a checkpoint is in progress. |
recover | public void recover(CrawlController cc)(Code) | | Call when recovering from a checkpoint.
Call this after the instance has been revivified post-serialization to
amend counters and directories that affect where checkpoints get stored
from here on out.
Parameters: cc - CrawlController instance. |
setCheckpointErrors | protected void setCheckpointErrors(boolean checkpointErrors)(Code) | | |
writeValidity | protected void writeValidity()(Code) | | |
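The status accessors listed above (isCheckpointing, isCheckpointFailed, getCheckpointInProgressDirectory) can be read as views over a small piece of state. The sketch below is a hypothetical stand-in, not the Checkpointer source: the class `CheckpointStateSketch` and its `begin`, `fail`, and `end` methods are invented for illustration, and tying "in progress" to a non-null directory is an assumption drawn only from the "Null if no checkpoint in progress" note.

```java
import java.io.File;

public class CheckpointStateSketch {
    // Null when no checkpoint is running (mirrors the documented null return).
    private File inProgressDir;
    // Outcome of the current or most recent checkpoint.
    private boolean failed;

    /** Hypothetical: mark a checkpoint as started in the given directory. */
    public void begin(File dir) {
        inProgressDir = dir;
        failed = false;
    }

    /** Hypothetical: record that the running checkpoint failed. */
    public void fail() {
        failed = true;
    }

    /** Hypothetical: mark the checkpoint as finished; the failure flag is kept. */
    public void end() {
        inProgressDir = null;
    }

    public boolean isCheckpointing() {
        return inProgressDir != null;
    }

    public boolean isCheckpointFailed() {
        return failed;
    }

    public File getCheckpointInProgressDirectory() {
        return inProgressDir;
    }
}
```

In this sketch the failure flag survives `end()`, which matches the "current/last checkpoint failed" wording: a caller can still ask about the outcome after the checkpoint has stopped running.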