Package

net.ruippeixotog.scalascraper

browser

package browser

Type Members

  1. trait Browser extends AnyRef

  2. class HtmlUnitBrowser extends Browser

    A Browser implementation based on HtmlUnit, a GUI-less browser for Java programs. HtmlUnitBrowser thoroughly simulates a web browser, executing the JavaScript code in pages in addition to parsing and modelling their HTML content. It supports several compatibility modes, allowing it to emulate browsers such as Internet Explorer.

    Both the Document and the Element instances obtained from HtmlUnitBrowser can be mutated in the background. JavaScript code can change the attributes and content of elements at any time, and those changes are visible both in queries to the Document and through previously stored references to Elements. The Document instance always represents the current page in the browser's "window". This means that the Document's location value, together with its root element, can change when a client-side page refresh or redirection happens. Element instances, however, belong to a fixed DOM tree and stop being meaningful as soon as they are removed from the DOM or a client-side page reload occurs.

  3. class JsoupBrowser extends Browser

    A Browser implementation based on jsoup, a Java HTML parser library. JsoupBrowser provides powerful and efficient document querying, but it doesn't run the JavaScript in pages. As such, it is limited to working strictly with the HTML sent in the page source.

    Currently, JsoupBrowser does not keep separate cookie stores for different domains and paths. Every request sends all previously set cookies, regardless of the domain they were set on. If you make requests to different domains and do not want this behavior, use a separate JsoupBrowser instance for each domain.

    As the documents parsed by JsoupBrowser instances are not changed after loading, Document and Element instances obtained from them are guaranteed to be immutable.
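
As an illustration of the JsoupBrowser notes above, the following is a minimal sketch rather than a definitive usage guide. It assumes scala-scraper 2.x is on the classpath and uses the library's DSL imports, which are not described on this page. The object name `JsoupBrowserSketch`, the sample markup and the `browserForSiteA`/`browserForSiteB` names are hypothetical.

```scala
import net.ruippeixotog.scalascraper.browser.JsoupBrowser
import net.ruippeixotog.scalascraper.dsl.DSL._
import net.ruippeixotog.scalascraper.dsl.DSL.Extract._

object JsoupBrowserSketch {
  // Hypothetical markup, used only for this illustration.
  val html: String =
    """<html><head><title>Demo</title></head>
      |<body>
      |  <h1 id="header">Hello</h1>
      |  <p class="item">one</p>
      |  <p class="item">two</p>
      |</body></html>""".stripMargin

  // Parse static HTML; no JavaScript is executed by JsoupBrowser.
  private val browser = JsoupBrowser()
  private val doc = browser.parseString(html)

  // Query the parsed document with CSS selectors through the DSL.
  val header: String = doc >> text("#header")
  val items: List[String] = (doc >> texts(".item")).toList

  // One browser instance per target domain, following the advice above
  // for keeping cookies set by one domain from being sent to another.
  val browserForSiteA = JsoupBrowser()
  val browserForSiteB = JsoupBrowser()

  def main(args: Array[String]): Unit = {
    println(header)
    println(items)
  }
}
```

Because the parsed document is not changed after loading, `header` and `items` can be computed once and reused safely. The same queries against an HtmlUnitBrowser document might return different results over time if page scripts modify the DOM.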

Value Members

  1. object HtmlUnitBrowser

  2. object JsoupBrowser
