A partial implementation of the StatisticsTracking interface.
It covers thread handling (launching, pausing, etc.), including
keeping track of the total time actually spent crawling. Several methods
provide access to the start time, the finish time, and related values.
To handle the thread work, the class implements CrawlStatusListener and
uses its events to pause, resume, and stop the logging of statistics. The run()
method calls logActivity() at intervals specified in the crawl order.
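The thread handling described above can be illustrated with a minimal sketch. It is not the actual class: PeriodicTrackerSketch, its pause(), resume() and stop() methods, and the Runnable standing in for logActivity() are hypothetical names chosen for illustration, and the interval is assumed to be a positive number of milliseconds.

```java
/** Hypothetical sketch of a periodic logging thread; not the actual tracker class. */
class PeriodicTrackerSketch implements Runnable {
    private final long intervalMillis;   // assumed positive; wait(0) would block indefinitely
    private final Runnable logActivity;  // stand-in for logActivity()
    private boolean running = true;      // guarded by "this"
    private boolean paused = false;      // guarded by "this"

    PeriodicTrackerSketch(long intervalMillis, Runnable logActivity) {
        this.intervalMillis = intervalMillis;
        this.logActivity = logActivity;
    }

    @Override
    public void run() {
        while (true) {
            synchronized (this) {
                if (!running) {
                    return;
                }
                try {
                    // Sleep for one interval, or wake early when stop() notifies.
                    // A spurious wakeup would merely log a little early.
                    wait(intervalMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (!running) {
                    return;
                }
                if (paused) {
                    continue; // skip logging while paused
                }
            }
            // Log outside the lock so pause()/stop() are never blocked by logging.
            logActivity.run();
        }
    }

    synchronized void pause()  { paused = true; }
    synchronized void resume() { paused = false; }
    synchronized void stop()   { running = false; notifyAll(); }
}
```

In this sketch the status events would simply map onto pause(), resume() and stop(); the real class may organize this differently.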
getLogWriteInterval() The number of seconds to wait between writing snapshot data to the log file.
public void initialize(CrawlController c)
Sets up the Logger (including logInterval) and registers with the
CrawlController for CrawlStatus and CrawlURIDisposition events.
Parameters: c - A crawl controller instance.
Throws: FatalConfigurationException - Not thrown here.
progressStatisticsEvent(EventObject e) A method for logging current crawler state.
This method will be called by run() at intervals specified in
the crawl order file.
If the crawl has ended, returns the time it ended (as given by
System.currentTimeMillis() at that time).
If the crawl is still running, returns the value of
System.currentTimeMillis() at the time of the call.
Returns: The time the crawl ended, or the current time if the crawl has not ended.
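A minimal sketch of this "end time or now" rule follows. The class EndTimeSketch, its method names, and the injected clock are hypothetical; the clock stands in for System.currentTimeMillis() so the behavior can be checked deterministically.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch: reports the end time once ended, otherwise the current time. */
class EndTimeSketch {
    private static final long NOT_ENDED = -1L;
    private final LongSupplier clock;     // stand-in for System::currentTimeMillis
    private long crawlEndTime = NOT_ENDED;

    EndTimeSketch(LongSupplier clock) {
        this.clock = clock;
    }

    /** Records the moment the crawl ended. */
    synchronized void markEnded() {
        crawlEndTime = clock.getAsLong();
    }

    /** Returns the recorded end time, or the current time if the crawl is still running. */
    synchronized long endTimeOrNow() {
        return crawlEndTime == NOT_ENDED ? clock.getAsLong() : crawlEndTime;
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`.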
Get the time when the crawl was last paused/suspended (as given by
System.currentTimeMillis() at that time). Will be 0 if the
crawl is not currently paused.
Returns: The time of the crawl's last pause/suspend, or 0 if the crawl is not currently paused.
Returns the number of milliseconds that the crawl spent paused or
otherwise in an inactive state.
Returns: The number of milliseconds that the crawl was paused or otherwise suspended.
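The pause bookkeeping described above could look roughly like the following sketch. PauseClockSketch and its method names are hypothetical, the clock is injected in place of System.currentTimeMillis(), and the sketch assumes that only completed pauses are added to the total; the actual class may also count a pause that is still in progress.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch of pause-time accounting; not the actual tracker class. */
class PauseClockSketch {
    private final LongSupplier clock;  // stand-in for System::currentTimeMillis
    private long pauseStart = 0L;      // 0 means "not currently paused"
    private long totalPaused = 0L;     // milliseconds spent in completed pauses

    PauseClockSketch(LongSupplier clock) {
        this.clock = clock;
    }

    /** Notes the start of a pause; ignored if already paused. */
    synchronized void paused() {
        if (pauseStart == 0L) {
            pauseStart = clock.getAsLong();
        }
    }

    /** Ends the current pause, if any, and adds its duration to the total. */
    synchronized void resumed() {
        if (pauseStart != 0L) {
            totalPaused += clock.getAsLong() - pauseStart;
            pauseStart = 0L;
        }
    }

    /** Time of the last pause, or 0 if the crawl is not currently paused. */
    synchronized long lastPauseTime() {
        return pauseStart;
    }

    /** Milliseconds spent in completed pauses (assumption: ongoing pause excluded). */
    synchronized long totalPausedMillis() {
        return totalPaused;
    }
}
```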
Notify the tracker that the crawl has begun. Must be called
outside the tracker's own thread, to ensure it is noted
before other threads start interacting with the tracker.
A method for logging current crawler state.
This method will be called by run() at intervals specified in
the crawl order file. It is also invoked when pausing or
stopping a crawl to capture the state at that point. The default behavior is
to call CrawlController.logProgressStatistics so that the CrawlController
can act on the progress statistics event.
Implementations of this method should carefully consider whether it
needs to be synchronized in whole or in part.
Parameters: e - Progress statistics event.
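As an illustration of synchronizing "in part", the sketch below copies the counters under a lock and builds the log line outside it, so crawler threads updating the counters are blocked only briefly. PartialSyncTrackerSketch, its counters, its method names, and the in-memory list standing in for the log file are all hypothetical and are not taken from the actual class.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a partially synchronized snapshot logger. */
class PartialSyncTrackerSketch {
    private final Object lock = new Object();
    private long urisProcessed;  // hypothetical counter, guarded by lock
    private long bytesReceived;  // hypothetical counter, guarded by lock
    final List<String> log = new ArrayList<>();  // stand-in for the log file

    /** Called by crawler threads as work completes. */
    void record(long bytes) {
        synchronized (lock) {
            urisProcessed++;
            bytesReceived += bytes;
        }
    }

    /** Logs one snapshot; only the counter copy is done under the counters' lock. */
    void logSnapshot() {
        long uris;
        long bytes;
        synchronized (lock) {
            // Copy both values together so the snapshot is internally consistent.
            uris = urisProcessed;
            bytes = bytesReceived;
        }
        // Formatting happens without holding the counters' lock.
        String line = "uris=" + uris + " bytes=" + bytes;
        synchronized (log) {
            log.add(line);
        }
    }
}
```

Synchronizing the whole method would be simpler, at the cost of holding up crawler threads while the line is formatted and written.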