Class JsoupBasedHtmlParser
All Implemented Interfaces:
org.apache.jmeter.protocol.http.parser.LinkExtractorParser
public class JsoupBasedHtmlParser extends HTMLParser
HTML parser based on jsoup.
-
Field Summary
Fields
Modifier and Type             Field
public final static String    PARSER_CLASSNAME
public final static String    DEFAULT_PARSER
-
Constructor Summary
Constructors
Constructor
JsoupBasedHtmlParser()
-
Method Summary
Modifier and Type: Iterator<URL>
Method: getEmbeddedResourceURLs(String userAgent, byte[] html, URL baseUrl, URLCollection coll, String encoding)
Description: Get the URLs for all the resources that a browser would automatically download following the download of the HTML content, that is: images, stylesheets, JavaScript files, applets, etc.
Methods inherited from class org.apache.jmeter.protocol.http.parser.HTMLParser
getEmbeddedResourceURLs, getEmbeddedResourceURLs
-
Methods inherited from class org.apache.jmeter.protocol.http.parser.BaseParser
getParser, isReusable
-
Methods inherited from interface org.apache.jmeter.protocol.http.parser.LinkExtractorParser
getEmbeddedResourceURLs
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Method Detail
-
getEmbeddedResourceURLs
Iterator<URL> getEmbeddedResourceURLs(String userAgent, byte[] html, URL baseUrl, URLCollection coll, String encoding)
Get the URLs for all the resources that a browser would automatically download following the download of the HTML content, that is: images, stylesheets, JavaScript files, applets, etc.
All URLs should be added to the Collection.
Malformed URLs can be reported to the caller by having the Iterator return the corresponding URL String. Overall problems parsing the HTML should be reported by throwing an HTMLParseException.
N.B. The Iterator returns URLs, but the Collection will contain objects of class URLString.
Parameters:
userAgent - User Agent
html - HTML code
baseUrl - Base URL from which the HTML code was obtained
coll - URLCollection
encoding - Charset
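The role of baseUrl and the handling of malformed links can be illustrated with a minimal sketch that uses only the Java standard library. It is not JMeter's implementation and does not call jsoup. The class EmbeddedResourceSketch and the method resolveAll are hypothetical names introduced here for illustration, and the malformed list is an assumed stand-in for reporting bad links back to the caller.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: not part of JMeter or jsoup.
public class EmbeddedResourceSketch {

    // Resolves each link found in a page against the page's base URL.
    // Duplicates are collapsed; links that are not valid URI references
    // are collected in 'malformed' instead of aborting the whole pass.
    static Set<String> resolveAll(URI baseUrl, List<String> links, List<String> malformed) {
        Set<String> resolved = new LinkedHashSet<>();
        for (String link : links) {
            try {
                // URI.resolve(String) throws IllegalArgumentException on bad syntax
                resolved.add(baseUrl.resolve(link).toString());
            } catch (IllegalArgumentException e) {
                malformed.add(link);
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        URI base = URI.create("http://example.com/docs/index.html");
        List<String> malformed = new ArrayList<>();
        Set<String> urls = resolveAll(
                base,
                List.of("img/logo.png", "/css/site.css", "img/logo.png", "http://exa mple.com/x.png"),
                malformed);
        urls.forEach(System.out::println);
        System.out.println("malformed: " + malformed);
    }
}
```

Relative links resolve against the directory of the base URL, root-relative links against its host, and repeated links appear once, which mirrors why the caller must supply the URL the HTML was obtained from.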