org.archive.crawler.admin |
org.archive.crawler.admin package
Contains classes that the web UI uses to monitor and control crawls. Some
utility classes used exclusively or primarily by the UI are also
included.
Most of the heavy-duty work is done by the CrawlJobHandler,
which manages most of the interaction between the UI and the
CrawlController. The CrawlJob class
encapsulates the settings needed to launch one crawl.
This package also provides an implementation of the Statistics Tracking
interface that contains useful methods to access progress data. This is
used for monitoring crawls. While it is technically possible to launch
a job without this statistics tracker, the UI would then be unable to
monitor the progress of that crawl.
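The division of roles described above can be illustrated with a minimal, self-contained sketch. It does not use the package's actual API: the names JobHandlerSketch, Job, Handler, ProgressTracker and CountingTracker, the method names, and the settings paths are all hypothetical and chosen only for illustration. It assumes a handler that queues jobs and runs one at a time, and a tracker that the UI polls for progress.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative sketch only; none of these types belong to org.archive.crawler.admin.
public class JobHandlerSketch {

    /** Hypothetical job: a name plus the location of the settings for one crawl. */
    static final class Job {
        final String name;
        final String settingsPath;

        Job(String name, String settingsPath) {
            this.name = name;
            this.settingsPath = settingsPath;
        }
    }

    /** Hypothetical progress interface that a UI could poll. */
    interface ProgressTracker {
        long urisProcessed();
    }

    /** Simple counter-based tracker, created fresh for each started job. */
    static final class CountingTracker implements ProgressTracker {
        private long processed;

        void recordUri() {
            processed++;
        }

        @Override
        public long urisProcessed() {
            return processed;
        }
    }

    /** Hypothetical handler: queues pending jobs and runs at most one at a time. */
    static final class Handler {
        private final Queue<Job> pending = new ArrayDeque<>();
        private Job current;
        private CountingTracker tracker;

        void addJob(Job job) {
            pending.add(job);
        }

        /** Starts the next pending job if nothing is running; returns true if one started. */
        boolean startNext() {
            if (current != null || pending.isEmpty()) {
                return false;
            }
            current = pending.remove();
            tracker = new CountingTracker();
            return true;
        }

        /** Stands in for the crawler processing one URI of the running job. */
        void simulateFetch() {
            if (current == null) {
                throw new IllegalStateException("no job is running");
            }
            tracker.recordUri();
        }

        void finishCurrent() {
            current = null;
        }

        Job currentJob() {
            return current;
        }

        /** Tracker of the most recently started job, or null if no job has started yet. */
        ProgressTracker currentTracker() {
            return tracker;
        }

        int pendingCount() {
            return pending.size();
        }
    }

    public static void main(String[] args) {
        Handler handler = new Handler();
        handler.addJob(new Job("job-a", "jobs/job-a/settings.xml")); // hypothetical paths
        handler.addJob(new Job("job-b", "jobs/job-b/settings.xml"));

        handler.startNext();
        handler.simulateFetch();
        handler.simulateFetch();

        // The "UI" side only reads state through the handler and the tracker.
        System.out.println(handler.currentJob().name + " processed "
                + handler.currentTracker().urisProcessed() + " URIs, "
                + handler.pendingCount() + " job(s) pending");
    }
}
```

In this sketch the UI never touches the crawl itself. It reads the running job from the handler and the progress figures from the tracker, which is why a job launched without a tracker would leave nothing for the UI to display.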
|
Java Source File Name | Type | Comment |
CrawlJob.java | Class | A CrawlJob encapsulates a 'crawl order' with any and all information and
methods needed by a CrawlJobHandler to accept and execute it.
A given crawl job may also be a 'profile' for a crawl. |
CrawlJobErrorHandler.java | Class | An implementation of the ValueErrorHandler for the UI. |
CrawlJobHandler.java | Class | This class manages CrawlJobs. |
InvalidJobFileException.java | Class | An exception that is thrown when a program encounters a job file that is
corrupt or otherwise incomplete or invalid. |
SeedRecord.java | Class | Record of all interesting info about the most recent
processing of a specific seed. |
StatisticsSummary.java | Class | This class provides descriptive statistics of a finished crawl job by
using the crawl report files generated by StatisticsTracker. |
StatisticsTracker.java | Class | This is an implementation of the AbstractTracker. |