| java.lang.Object org.archive.extractor.CharSequenceLinkExtractor org.archive.extractor.RegexpCSSLinkExtractor
RegexpCSSLinkExtractor | public class RegexpCSSLinkExtractor extends CharSequenceLinkExtractor (Code) | | This extractor parses URIs from CSS files.
The format of a CSS URL value is 'url(' followed by optional white space
followed by an optional single quote (') or double quote (") character
followed by the URL itself followed by an optional single quote (') or
double quote (") character followed by optional white space followed by ')'.
Parentheses, commas, white space characters, single quotes (') and double
quotes (") appearing in a URL must be escaped with a backslash:
'\(', '\)', '\,'. Partial URLs are interpreted relative to the source of
the style sheet, not relative to the document.
Source: www.w3.org
ROUGH DRAFT IN PROGRESS / incomplete... untested... major changes likely
author: igor gojomo |
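The rule quoted above, that partial URLs are interpreted relative to the style sheet rather than the document, can be illustrated with a minimal sketch. It uses only the JDK's java.net.URI and not the UURI class this extractor works with; the class name CssBaseResolutionSketch, the method name, and the example.com addresses are hypothetical and chosen for illustration.

```java
import java.net.URI;

class CssBaseResolutionSketch {

    /** Resolves a URL found inside a style sheet against the style sheet's own URI. */
    static URI resolveAgainstStyleSheet(URI styleSheet, String cssUrl) {
        return styleSheet.resolve(cssUrl);
    }

    public static void main(String[] args) {
        // Hypothetical addresses: a page and the style sheet it links to.
        URI document = URI.create("http://example.com/pages/index.html");
        URI styleSheet = URI.create("http://example.com/styles/site.css");

        // Correct base: the style sheet.
        // prints http://example.com/styles/img/bg.png
        System.out.println(resolveAgainstStyleSheet(styleSheet, "img/bg.png"));

        // Wrong base: the document that referenced the style sheet.
        // prints http://example.com/pages/img/bg.png
        System.out.println(document.resolve("img/bg.png"));
    }
}
```

The two printed URIs differ, which is why an extractor for CSS content needs to know the URI of the style sheet itself and not only that of the referring page.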
CSS_BACKSLASH_ESCAPE | final static String CSS_BACKSLASH_ESCAPE(Code) | | |
CSS_URI_EXTRACTOR | final static String CSS_URI_EXTRACTOR(Code) | | CSS URL extractor pattern.
This pattern extracts URIs from CSS files.
|
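The actual values of CSS_URI_EXTRACTOR and CSS_BACKSLASH_ESCAPE are not shown on this page. As a rough sketch of how the url(...) format described above could be matched with java.util.regex, the following uses two patterns written for this example only; they are assumptions, not the constants defined by this class, and the class name CssUrlRegexSketch is hypothetical. The sketch follows the quoted rule strictly, so unescaped commas, parentheses, quotes, and white space end the URL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CssUrlRegexSketch {

    // Assumed pattern: 'url(' + optional white space + optional quote (group 1)
    // + URL (group 2) + the same optional quote + optional white space + ')'.
    // Inside the URL, a backslash followed by any character counts as an escape.
    static final Pattern URL_VALUE = Pattern.compile(
            "(?i)url\\(\\s*([\"']?)((?:\\\\.|[^\\\\\"'()\\s,])+)\\1\\s*\\)");

    // Assumed pattern: a backslash followed by the character it escapes.
    static final Pattern BACKSLASH_ESCAPE = Pattern.compile("\\\\(.)");

    /** Returns every URL value found in the CSS text, with backslash escapes removed. */
    static List<String> extractUrls(CharSequence css) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL_VALUE.matcher(css);
        while (m.find()) {
            // Keep only the escaped character and drop the backslash in front of it.
            urls.add(BACKSLASH_ESCAPE.matcher(m.group(2)).replaceAll("$1"));
        }
        return urls;
    }

    public static void main(String[] args) {
        String css = "a { background: url( \"img/bg.png\" ) } "
                + "b { background: url(a\\(1\\).png) } "
                + "c { list-style: url('c.gif') }";
        // prints [img/bg.png, a(1).png, c.gif]
        System.out.println(extractUrls(css));
    }
}
```

The backreference to group 1 makes the closing quote match the opening one, so a value such as url("a.png') is skipped instead of being extracted with a stray quote.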
findNextLink | protected boolean findNextLink()(Code) | | |
reset | public void reset()(Code) | | |
Methods inherited from org.archive.extractor.CharSequenceLinkExtractor |
protected CharSequence charSequenceFrom(InputStream content, Charset charset)(Code)(Java Doc)
protected CharSequence createCharSequenceFrom(InputStream content, Charset charset)(Code)(Java Doc)
public static void extract(CharSequence content, UURI source, UURI base, List<Link> collector, ExtractErrorListener extractErrorListener)(Code)(Java Doc)
abstract protected boolean findNextLink()(Code)(Java Doc)
public boolean hasNext()(Code)(Java Doc)
protected static CharSequenceLinkExtractor newDefaultInstance()(Code)(Java Doc)
public Object next()(Code)(Java Doc)
public Link nextLink()(Code)(Java Doc)
public void remove()(Code)(Java Doc)
public void reset()(Code)(Java Doc)
public void setup(UURI source, UURI base, InputStream content, Charset charset, ExtractErrorListener listener)(Code)(Java Doc)
public void setup(UURI source, UURI base, CharSequence content, ExtractErrorListener listener)(Code)(Java Doc)
public void setup(UURI sourceandbase, CharSequence content, ExtractErrorListener listener)(Code)(Java Doc)
public void setup(UURI sourceandbase, InputStream content, Charset charset, ExtractErrorListener listener)(Code)(Java Doc)
|
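The method list suggests an iterator-style design in which hasNext() and next() are provided by the base class and each subclass supplies findNextLink(). How CharSequenceLinkExtractor actually implements this is not shown on this page, so the following is only a sketch of one way such a contract could work. All names in it (LazyLinkIteratorSketch, Base, RegexLinks) are hypothetical, it yields plain strings instead of Link objects, and its url(...) pattern is deliberately simplified.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyLinkIteratorSketch {

    /** Hypothetical base class: iteration logic lives here, searching is left to subclasses. */
    abstract static class Base implements Iterator<String> {
        /** A link that has been found but not yet returned, or null. */
        protected String next;

        /** Looks for one more link; on success stores it in {@code next} and returns true. */
        protected abstract boolean findNextLink();

        @Override
        public boolean hasNext() {
            // Search only when no link is already waiting to be returned.
            return next != null || findNextLink();
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            String result = next;
            next = null;
            return result;
        }
    }

    /** Hypothetical subclass that finds links with a simplified url(...) pattern. */
    static class RegexLinks extends Base {
        private final Matcher matcher;

        RegexLinks(CharSequence css) {
            matcher = Pattern.compile("url\\(([^)]*)\\)").matcher(css);
        }

        @Override
        protected boolean findNextLink() {
            if (matcher.find()) {
                next = matcher.group(1);
                return true;
            }
            return false;
        }

        /** Restarts the search from the beginning of the input. */
        void reset() {
            next = null;
            matcher.reset();
        }
    }

    public static void main(String[] args) {
        RegexLinks links = new RegexLinks("a{x:url(a.png)} b{y:url(b.png)}");
        while (links.hasNext()) {
            System.out.println(links.next()); // prints a.png, then b.png
        }
    }
}
```

Under this arrangement the matcher advances only when a caller asks for another link, so input is scanned no further than needed.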