| java.lang.Object org.apache.jmeter.protocol.http.parser.HTMLParser
All Known Subclasses: org.apache.jmeter.protocol.http.parser.RegexpHTMLParser, org.apache.jmeter.protocol.http.parser.JTidyHTMLParser, org.apache.jmeter.protocol.http.parser.HtmlParserHTMLParser
HTMLParser | abstract public class HTMLParser | | HtmlParsers can parse HTML content to obtain URLs.
author: Jordi Salvat i Alabart; version: $Revision: 514343 $, updated on $Date: 2007-03-04 03:17:42 +0000 (Sun, 04 Mar 2007) $ |
Constructor Summary | |
protected | HTMLParser() Protected constructor to prevent instantiation except from within
subclasses. |
Method Summary | |
public Iterator | getEmbeddedResourceURLs(byte[] html, URL baseUrl) Get the URLs for all the resources that a browser would automatically
download following the download of the HTML content, that is: images,
stylesheets, javascript files, applets, etc.
URLs should not appear twice in the returned iterator.
Malformed URLs can be reported to the caller by having the Iterator
return the corresponding URL String. |
abstract public Iterator | getEmbeddedResourceURLs(byte[] html, URL baseUrl, URLCollection coll) Get the URLs for all the resources that a browser would automatically
download following the download of the HTML content, that is: images,
stylesheets, javascript files, applets, etc.
All URLs should be added to the Collection.
Malformed URLs can be reported to the caller by having the Iterator
return the corresponding URL String. |
public Iterator | getEmbeddedResourceURLs(byte[] html, URL baseUrl, Collection coll) Get the URLs for all the resources that a browser would automatically
download following the download of the HTML content, that is: images,
stylesheets, javascript files, applets, etc.
N.B. The Iterator returns URLs, but the Collection will contain objects
of class URLString. |
final public static HTMLParser | getParser() |
final public static synchronized HTMLParser | getParser(String htmlParserClassName) |
protected boolean | isReusable() Parsers should override this method if the parser class is reusable, in
which case the class will be cached for the next getParser() call. |
ATT_BACKGROUND | final protected static String ATT_BACKGROUND | | |
ATT_IS_IMAGE | final protected static String ATT_IS_IMAGE | | |
DEFAULT_PARSER | final public static String DEFAULT_PARSER | | |
PARSER_CLASSNAME | final public static String PARSER_CLASSNAME | | |
HTMLParser | protected HTMLParser() | | Protected constructor to prevent instantiation except from within
subclasses.
|
getEmbeddedResourceURLs | public Iterator getEmbeddedResourceURLs(byte[] html, URL baseUrl) throws HTMLParseException | | Get the URLs for all the resources that a browser would automatically
download following the download of the HTML content, that is: images,
stylesheets, javascript files, applets, etc.
URLs should not appear twice in the returned iterator.
Malformed URLs can be reported to the caller by having the Iterator
return the corresponding URL String. Overall problems parsing the HTML
should be reported by throwing an HTMLParseException.
Parameters: html - HTML code; baseUrl - Base URL from which the HTML code was obtained. Returns: an Iterator for the resource URLs |
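The contract described above (resolve against the base URL, return each URL at most once, hand malformed references back as Strings) can be illustrated with a small standalone sketch. It does not use JMeter: the class EmbeddedUrlSketch and its methods resolve and demo are hypothetical names invented for this illustration, the input is a pre-extracted list of references rather than raw HTML bytes, and generics are used where the documented API returns a raw Iterator.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of JMeter: it only illustrates the documented contract.
class EmbeddedUrlSketch {

    // Resolves each raw reference against baseUrl. Well-formed references come
    // back as java.net.URL objects, each at most once; malformed ones come back
    // as the original String so the caller can report them.
    static Iterator<Object> resolve(List<String> refs, URL baseUrl) {
        List<Object> result = new ArrayList<>();
        // De-duplicate on the textual form; URL.equals may trigger DNS lookups.
        Set<String> seen = new HashSet<>();
        for (String ref : refs) {
            try {
                URL url = new URL(baseUrl, ref);
                if (seen.add(url.toExternalForm())) {
                    result.add(url);
                }
            } catch (MalformedURLException e) {
                if (seen.add(ref)) {
                    result.add(ref);
                }
            }
        }
        return result.iterator();
    }

    // Small demonstration with made-up inputs.
    static List<Object> demo() {
        try {
            URL base = new URL("http://example.com/dir/page.html");
            Iterator<Object> it = resolve(
                Arrays.asList("img/a.png", "img/a.png", "/style.css", "bogus://x/y.png"),
                base);
            List<Object> out = new ArrayList<>();
            while (it.hasNext()) {
                out.add(it.next());
            }
            return out;
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A caller iterating over such a result has to check each element's type, since URL objects and String entries share the same Iterator.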
getEmbeddedResourceURLs | abstract public Iterator getEmbeddedResourceURLs(byte[] html, URL baseUrl, URLCollection coll) throws HTMLParseException | | Get the URLs for all the resources that a browser would automatically
download following the download of the HTML content, that is: images,
stylesheets, javascript files, applets, etc.
All URLs should be added to the Collection.
Malformed URLs can be reported to the caller by having the Iterator
return the corresponding URL String. Overall problems parsing the HTML
should be reported by throwing an HTMLParseException.
N.B. The Iterator returns URLs, but the Collection will contain objects
of class URLString.
Parameters: html - HTML code; baseUrl - Base URL from which the HTML code was obtained; coll - URLCollection. Returns: an Iterator for the resource URLs |
getEmbeddedResourceURLs | public Iterator getEmbeddedResourceURLs(byte[] html, URL baseUrl, Collection coll) throws HTMLParseException | | Get the URLs for all the resources that a browser would automatically
download following the download of the HTML content, that is: images,
stylesheets, javascript files, applets, etc.
N.B. The Iterator returns URLs, but the Collection will contain objects
of class URLString.
Parameters: html - HTML code; baseUrl - Base URL from which the HTML code was obtained; coll - Collection, which will contain URLString objects, not URLs. Returns: an Iterator for the resource URLs |
isReusable | protected boolean isReusable() | | Parsers should override this method if the parser class is reusable, in
which case the class will be cached for the next getParser() call.
Returns: true if the Parser is reusable |
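The interplay between isReusable() and getParser(String) can be pictured with a simplified, standalone sketch. This is not JMeter's source: SketchParser, ReusableSketchParser and OneShotSketchParser are hypothetical classes, and the caching policy shown (keep an instance only when it reports itself reusable) is an assumption based on the description above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an abstract parser base class with a static factory.
abstract class SketchParser {
    // Cache of reusable parser instances, keyed by class name.
    private static final Map<String, SketchParser> CACHE = new HashMap<>();

    // Protected constructor: only subclasses can instantiate.
    protected SketchParser() {
    }

    // Subclasses override this to opt in to caching.
    protected boolean isReusable() {
        return false;
    }

    // Returns a cached instance if one exists; otherwise creates one by
    // reflection and caches it only when it reports itself as reusable.
    static synchronized SketchParser getParser(String className) {
        SketchParser cached = CACHE.get(className);
        if (cached != null) {
            return cached;
        }
        try {
            SketchParser parser = (SketchParser) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
            if (parser.isReusable()) {
                CACHE.put(className, parser);
            }
            return parser;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create parser " + className, e);
        }
    }
}

// Stateless parser: safe to share, so it opts in to caching.
class ReusableSketchParser extends SketchParser {
    ReusableSketchParser() {
    }

    @Override
    protected boolean isReusable() {
        return true;
    }
}

// Keeps the default: a fresh instance is created on every request.
class OneShotSketchParser extends SketchParser {
    OneShotSketchParser() {
    }
}
```

Under these assumptions, two calls to getParser with the name of ReusableSketchParser yield the same object, while OneShotSketchParser yields a new object each time. A parser that holds per-document state would therefore keep the default of false.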