| org.archive.crawler.framework.CrawlScope org.archive.crawler.scope.ClassicScope
All known subclasses: org.archive.crawler.scope.SeedCachingScope, org.archive.crawler.scope.BroadScope, org.archive.crawler.scope.RefinedScope
ClassicScope | public class ClassicScope extends CrawlScope | ClassicScope: superclass providing the Scope behavior shared by
most common scopes.
Roughly, its logic is captured in innerAccepts(). A URI is
included if:
forceAccepts(uri)
|| (((isSeed(uri)
|| focusAccepts(uri))
|| additionalFocusAccepts(uri)
|| transitiveAccepts(uri))
&& !excludeAccepts(uri));
Subclasses should override focusAccepts, additionalFocusAccepts,
and transitiveAccepts.
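
The expression above can be sketched in a few lines of plain Java. This is a minimal illustration, not the Heritrix source: `ScopeSketch` and `PrefixScopeSketch` are hypothetical names, URIs are modelled as `String` rather than `Object`/`CandidateURI`, and the default return values of the hooks are assumptions made only so the sketch compiles and runs on its own.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical stand-in for ClassicScope; NOT the Heritrix implementation.
abstract class ScopeSketch {
    private final Set<String> seeds;

    ScopeSketch(Set<String> seeds) {
        this.seeds = seeds;
    }

    // Mirrors the documented expression term by term.
    final boolean innerAccepts(String uri) {
        return forceAccepts(uri)
            || (((isSeed(uri)
                || focusAccepts(uri))
                || additionalFocusAccepts(uri)
                || transitiveAccepts(uri))
                && !excludeAccepts(uri));
    }

    boolean isSeed(String uri) { return seeds.contains(uri); }

    // Assumed defaults for this sketch only: every hook accepts nothing.
    protected boolean forceAccepts(String uri) { return false; }
    protected boolean focusAccepts(String uri) { return false; }
    protected boolean additionalFocusAccepts(String uri) { return false; }
    protected boolean transitiveAccepts(String uri) { return false; }
    protected boolean excludeAccepts(String uri) { return false; }

    public static void main(String[] args) {
        ScopeSketch scope = new PrefixScopeSketch(
            Collections.singleton("http://example.org/private/start.html"),
            "http://example.org/");
        System.out.println(scope.innerAccepts("http://example.org/a.html"));             // true
        System.out.println(scope.innerAccepts("http://example.org/private/x.html"));     // false
        System.out.println(scope.innerAccepts("http://example.org/private/robots.txt")); // true
        System.out.println(scope.innerAccepts("http://example.org/private/start.html")); // false
        System.out.println(scope.innerAccepts("http://other.net/page"));                 // false
    }
}

// Hypothetical subclass: the focus is a URI prefix, the exclusion is a path fragment.
class PrefixScopeSketch extends ScopeSketch {
    private final String prefix;

    PrefixScopeSketch(Set<String> seeds, String prefix) {
        super(seeds);
        this.prefix = prefix;
    }

    @Override
    protected boolean focusAccepts(String uri) { return uri.startsWith(prefix); }

    @Override
    protected boolean excludeAccepts(String uri) { return uri.contains("/private/"); }

    @Override
    protected boolean forceAccepts(String uri) { return uri.endsWith("/robots.txt"); }
}
```

As the expression is written, only a force-accept bypasses the exclude check: a seed or an in-focus URI that also matches the exclude filter is rejected, which is why the excluded seed in the sketch comes back `false`.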
The excludeFilter may be specified by supplying
an exclude subelement. If unspecified, an
accepts-none filter will be used, meaning that
no URIs will match the filter and therefore none will be excluded.
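
The fallback for the exclude filter can be pictured as follows. This is a sketch under stated assumptions: `ExcludeDefaultSketch` and `ACCEPTS_NONE` are hypothetical names, and the filter is modelled with `java.util.function.Predicate` rather than the Heritrix filter classes.

```java
import java.util.function.Predicate;

// Hypothetical illustration of the "accepts-none" default; not Heritrix code.
class ExcludeDefaultSketch {
    // Stand-in for an accepts-none filter: it never matches any URI.
    static final Predicate<String> ACCEPTS_NONE = uri -> false;

    private final Predicate<String> excludeFilter;

    // Pass null to model "no exclude subelement was supplied".
    ExcludeDefaultSketch(Predicate<String> configuredExclude) {
        this.excludeFilter = (configuredExclude != null) ? configuredExclude : ACCEPTS_NONE;
    }

    // True means the URI matches the exclude filter and would be excluded.
    boolean excludeAccepts(String uri) {
        return excludeFilter.test(uri);
    }

    public static void main(String[] args) {
        ExcludeDefaultSketch unconfigured = new ExcludeDefaultSketch(null);
        ExcludeDefaultSketch noGifs = new ExcludeDefaultSketch(u -> u.endsWith(".gif"));
        System.out.println(unconfigured.excludeAccepts("http://example.org/a.gif")); // false
        System.out.println(noGifs.excludeAccepts("http://example.org/a.gif"));       // true
        System.out.println(noGifs.excludeAccepts("http://example.org/a.html"));      // false
    }
}
```

With the default in place, `!excludeAccepts(uri)` is always true, so the exclude term never removes a URI from scope.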
author: gojomo |
Method Summary
protected boolean | additionalFocusAccepts(Object o) | Check if URI is accepted by the additional focus of this scope. This method should be overridden in subclasses. Parameters: o - the URI to check.
protected boolean | exceedsMaxHops(Object o) | Check if there are too many hops. Parameters: o - URI to check.
protected boolean | excludeAccepts(Object o) | Check if URI is excluded by any filters. Parameters: o - the URI to check.
protected boolean | focusAccepts(Object o) | Check if URI is accepted by the focus of this scope. This method should be overridden in subclasses. Parameters: o - the URI to check.
protected boolean | forceAccepts(Object o) | Parameters: o - the URI to check.
final protected boolean | innerAccepts(Object o) | Returns whether the given object (typically a CandidateURI) falls within this scope. Parameters: o - Object to test.
public void | kickUpdate() | Take note of a situation (such as settings edit) where involved reconfiguration (such as reading from external files) may be necessary.
protected boolean | transitiveAccepts(Object o) | Parameters: o - the URI to check.
ATTR_EXCLUDE_FILTER | final public static String ATTR_EXCLUDE_FILTER
ATTR_FORCE_ACCEPT_FILTER | final public static String ATTR_FORCE_ACCEPT_FILTER
ATTR_MAX_LINK_HOPS | final public static String ATTR_MAX_LINK_HOPS
ATTR_MAX_TRANS_HOPS | final public static String ATTR_MAX_TRANS_HOPS
ClassicScope | public ClassicScope(String name) | Parameters: name - ignored by superclass
ClassicScope | public ClassicScope() | Default constructor.
additionalFocusAccepts | protected boolean additionalFocusAccepts(Object o) | Check if URI is accepted by the additional focus of this scope.
This method should be overridden in subclasses.
Parameters: o - the URI to check.
Returns: true if the additional focus filter accepts the passed object.
exceedsMaxHops | protected boolean exceedsMaxHops(Object o) | Check if there are too many hops.
Parameters: o - URI to check.
Returns: true if too many hops.
excludeAccepts | protected boolean excludeAccepts(Object o) | Check if URI is excluded by any filters.
Parameters: o - the URI to check.
Returns: true if the exclude filter accepts the passed object.
focusAccepts | protected boolean focusAccepts(Object o) | Check if URI is accepted by the focus of this scope.
This method should be overridden in subclasses.
Parameters: o - the URI to check.
Returns: true if the focus filter accepts the passed object.
forceAccepts | protected boolean forceAccepts(Object o) | Parameters: o - the URI to check.
Returns: true if the force-accepts filter accepts the passed object.
innerAccepts | final protected boolean innerAccepts(Object o) | Returns whether the given object (typically a CandidateURI) falls within
this scope.
Parameters: o - Object to test.
Returns: whether the given object (typically a CandidateURI) falls within this scope.
kickUpdate | public void kickUpdate() | Take note of a situation (such as settings edit) where involved
reconfiguration (such as reading from external files) may be necessary.
transitiveAccepts | protected boolean transitiveAccepts(Object o) | Parameters: o - the URI to check.
Returns: true if the transitive filter accepts the passed object.