| A core CrawlScope suitable for the most common
crawl needs.
Roughly, its logic is that a URI is included if:
(( isSeed(uri) || focusFilter.accepts(uri) )
|| transitiveFilter.accepts(uri) )
&& ! excludeFilter.accepts(uri)
The focusFilter may be specified by either:
- adding a 'mode' attribute to the
scope element. mode="broad" is equivalent
to no focus; modes "path", "host", and "domain"
imply a SeedExtensionFilter will be used, with
the scope element providing its configuration
- adding a focus subelement
If unspecified, the focusFilter will default to
an accepts-all filter.
The transitiveFilter may be specified by supplying
a transitive subelement. If unspecified, a
TransclusionFilter will be used, with the scope
element providing its configuration.
The excludeFilter may be specified by supplying
an exclude subelement. If unspecified, an
accepts-none filter will be used, meaning that
no URI matches the filter and so none is excluded.
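The inclusion rule and the two simple defaults described above can be sketched as follows. This is an illustration only, not the actual class: ScopeSketch, its constructor, and the use of java.util.function.Predicate over URI strings are assumptions made for this example. The transitive filter must be passed explicitly here, because the documented default (a TransclusionFilter) is not modelled.

```java
import java.util.Objects;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical stand-in for the scope logic described above; not the real class. */
final class ScopeSketch {
    private final Set<String> seeds;
    private final Predicate<String> focusFilter;
    private final Predicate<String> transitiveFilter;
    private final Predicate<String> excludeFilter;

    ScopeSketch(Set<String> seeds,
                Predicate<String> focus,
                Predicate<String> transitive,
                Predicate<String> exclude) {
        this.seeds = Objects.requireNonNull(seeds);
        // Documented default: an unspecified focus filter accepts everything.
        this.focusFilter = (focus != null) ? focus : uri -> true;
        // Assumption: always supplied by the caller in this sketch.
        this.transitiveFilter = Objects.requireNonNull(transitive);
        // Documented default: an unspecified exclude filter accepts nothing,
        // so no URI is excluded.
        this.excludeFilter = (exclude != null) ? exclude : uri -> false;
    }

    boolean isSeed(String uri) {
        return seeds.contains(uri);
    }

    /** A URI is in scope if any inclusion test passes and it is not excluded. */
    boolean accepts(String uri) {
        boolean included = isSeed(uri)
                || focusFilter.test(uri)
                || transitiveFilter.test(uri);
        return included && !excludeFilter.test(uri);
    }
}
```

With a focus of "starts with http://example.org/", a transitive filter of "ends with .png", and an exclude filter of "contains /private/" (all made-up predicates), an off-site image would be accepted through the transitive filter, while http://example.org/private/x would be rejected because the exclude filter overrides every inclusion test.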
author: gojomo
DecidingScope |