| org.archive.crawler.framework.FrontierHostStatistics
FrontierHostStatistics | public interface FrontierHostStatistics | | An optional interface that Frontiers can implement to provide information
about specific hosts.
Some URIFrontier implementations will want to provide a number of
statistics relating to the progress of particular hosts. This only applies
to those Frontiers whose internal structure uses hosts to split up the
workload and (for example) implement politeness. Some other Frontiers may
also provide this info based on calculations.
author: Kristinn Sigurdsson See Also: org.archive.crawler.framework.Frontier |
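Because the interface is optional, calling code has to check for it before asking for host statistics. The following is a minimal sketch of that check, not code from the crawler itself: the nested `Frontier` and `FrontierHostStatistics` types are trimmed stand-ins so the example compiles on its own, and `summarize`, `HostAwareFrontier` and `PlainFrontier` are hypothetical names introduced only for illustration.

```java
// Sketch only: the nested interfaces are simplified stand-ins for the real
// org.archive.crawler.framework types so that this file compiles on its own.
class HostStatsProbe {
    /** Stand-in for Frontier (the real interface declares many more methods). */
    interface Frontier { }

    /** Stand-in exposing only two of the counting methods described here. */
    interface FrontierHostStatistics {
        int activeHosts();
        int inactiveHosts();
    }

    /** Hypothetical helper: report host counts only if the frontier offers them. */
    static String summarize(Frontier frontier) {
        if (frontier instanceof FrontierHostStatistics) {
            FrontierHostStatistics stats = (FrontierHostStatistics) frontier;
            return "active=" + stats.activeHosts()
                    + " inactive=" + stats.inactiveHosts();
        }
        return "host statistics not available";
    }

    /** Hypothetical frontier that tracks hosts (fixed counts for the demo). */
    static class HostAwareFrontier implements Frontier, FrontierHostStatistics {
        public int activeHosts() { return 3; }
        public int inactiveHosts() { return 1; }
    }

    /** Hypothetical frontier that does not organize its work by host. */
    static class PlainFrontier implements Frontier { }

    public static void main(String[] args) {
        System.out.println(summarize(new HostAwareFrontier()));
        System.out.println(summarize(new PlainFrontier()));
    }
}
```

A caller written this way keeps working with Frontiers that do not split their workload by host.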
Field Summary | |
final public static int | HOST_DEFERRED Host has been deferred for some amount of time and will become ready once that time has elapsed. |
final public static int | HOST_INACTIVE Host has been encountered and all available URIs for it have already been processed. |
final public static int | HOST_INPROCESS Host has URIs currently being processed. |
final public static int | HOST_READY Host has URIs ready to be emitted. |
final public static int | HOST_UNKNOWN Host has not been encountered by the Frontier, or has been encountered but has been inactive so long that it has expired. |
Method Summary | |
public int | activeHosts() Total number of hosts that are currently active. |
public int | deferredHosts() Total number of deferred hosts. |
public int | hostStatus(String host) Get the status of a host. Hosts can be in one of the following states: HOST_UNKNOWN, HOST_READY, HOST_INPROCESS, HOST_DEFERRED or HOST_INACTIVE. Some Frontiers may allow a host to have more than one URI in process at the same time. |
public int | inProcessHosts() Total number of hosts with URIs in process. It is generally assumed that each host can have only one URI in process at a time. |
public int | inactiveHosts() Total number of inactive hosts. |
public int | readyHosts() Total number of hosts that have a URI ready for processing. |
HOST_DEFERRED | final public static int HOST_DEFERRED | | Host has been deferred for some amount of time and will become ready once
that time has elapsed. This is most likely due to politeness or
waiting between retries. Other conditions may exist.
|
HOST_INACTIVE | final public static int HOST_INACTIVE | | Host has been encountered and all available URIs for it have already been
processed. More URIs may or may not become available later.
Inactive hosts may eventually become 'forgotten'.
|
HOST_INPROCESS | final public static int HOST_INPROCESS | | Host has URIs currently being processed.
|
HOST_READY | final public static int HOST_READY | | Host has URIs ready to be emitted.
|
HOST_UNKNOWN | final public static int HOST_UNKNOWN | | Host has not been encountered by the Frontier, or has been encountered
but has been inactive so long that it has expired.
|
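As an illustration of how the status codes might be consumed, the sketch below maps a value such as one returned by hostStatus(String) to a short label. The numeric values are placeholders chosen for this example, since this page does not list the real ones, and `HostStatusLabels` and `describe` are hypothetical names. Real code would reference the constants on FrontierHostStatistics rather than redefine them.

```java
// Sketch only: the constant values below are placeholders, not the values
// defined by the real FrontierHostStatistics interface.
class HostStatusLabels {
    static final int HOST_UNKNOWN = 0;
    static final int HOST_READY = 1;
    static final int HOST_INPROCESS = 2;
    static final int HOST_DEFERRED = 3;
    static final int HOST_INACTIVE = 4;

    /** Hypothetical helper: turn a status code into a human-readable label. */
    static String describe(int status) {
        switch (status) {
            case HOST_READY:
                return "ready";        // URIs waiting to be emitted
            case HOST_INPROCESS:
                return "in process";   // URIs currently being processed
            case HOST_DEFERRED:
                return "deferred";     // waiting out politeness or a retry delay
            case HOST_INACTIVE:
                return "inactive";     // all known URIs already processed
            case HOST_UNKNOWN:
                return "unknown";      // never seen, or expired
            default:
                return "unrecognized status " + status;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(HOST_DEFERRED));
    }
}
```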
activeHosts | public int activeHosts() | | Total number of hosts that are currently active.
Active hosts are considered to be those that are ready, deferred or
in process.
Total number of hosts that are currently active. |
deferredHosts | public int deferredHosts() | | Total number of deferred hosts.
Deferred hosts are currently active hosts that have been deferred
from processing for the time being (because of politeness or waiting
before retrying).
Total number of deferred hosts. |
inProcessHosts | public int inProcessHosts() | | Total number of hosts with URIs in process.
It is generally assumed that each host can have only one URI in
process at a time. However, some Frontiers may implement
politeness differently, meaning that the same host can be both ready and
in process.
activeHosts() will not count such hosts
twice, though.
Total number of hosts with URIs in process. |
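One way to honor the rule that a host which is both ready and in process is counted once in activeHosts() is to keep a set of host names per state and take the size of their union. The sketch below assumes that design. `HostCounterSketch` and its `mark...` methods are hypothetical, and a real Frontier would more likely derive these numbers from its own queues.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch only: a hypothetical bookkeeping class, not part of the crawler.
class HostCounterSketch {
    private final Set<String> ready = new HashSet<>();
    private final Set<String> inProcess = new HashSet<>();
    private final Set<String> deferred = new HashSet<>();

    void markReady(String host) { ready.add(host); }
    void markInProcess(String host) { inProcess.add(host); }
    void markDeferred(String host) { deferred.add(host); }

    int readyHosts() { return ready.size(); }
    int inProcessHosts() { return inProcess.size(); }
    int deferredHosts() { return deferred.size(); }

    /** Union of the three sets, so a host in two states is counted once. */
    int activeHosts() {
        Set<String> active = new HashSet<>(ready);
        active.addAll(inProcess);
        active.addAll(deferred);
        return active.size();
    }

    public static void main(String[] args) {
        HostCounterSketch counter = new HostCounterSketch();
        counter.markReady("a.example");
        counter.markInProcess("a.example"); // same host, also in process
        counter.markDeferred("b.example");
        System.out.println(counter.activeHosts());
    }
}
```

With this bookkeeping, readyHosts() + inProcessHosts() + deferredHosts() can exceed activeHosts() whenever a host sits in more than one state.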
inactiveHosts | public int inactiveHosts() | | Total number of inactive hosts.
Inactive hosts are those hosts that have been active but have now been
exhausted and contain no additional URIs.
Total number of inactive hosts. |
readyHosts | public int readyHosts() | | Total number of hosts that have a URI ready for processing.
Total number of hosts that have a URI ready for processing. |