Class Crawler

  • All Implemented Interfaces:
    Interruptible, AutoCloseable

    public final class Crawler
    extends Object
    implements Interruptible, AutoCloseable
    This class coordinates the other classes before and after a page is fetched: it executes threads, calls the fetcher, and adjusts the priority of scheduled requests.
    Author:
    Maksim Tkachenko, Truong Quoc Tuan, Ween Jiann Lee
    • Method Detail

      • builder

        public static Crawler.Builder builder()
        Creates a new instance of Builder.
        Returns:
        an instance of Builder.
      • buildDefault

        public static Crawler buildDefault()
        Builds a new default instance of Crawler.
        Returns:
        an instance of Crawler.
      • getScheduler

        public Scheduler getScheduler()
        Gets the instance of Scheduler used by this crawler.
        Returns:
        the instance of Scheduler used.
      • start

        public Crawler start()
        Starts the crawler by starting a new thread to poll for jobs.
        Returns:
        the instance of Crawler used.
      • startAndClose

        public Crawler startAndClose()
                              throws Exception
        Starts the crawler by starting a new thread to poll for jobs, then closes the crawler once the size of the jobQueue has reached 0.
        Returns:
        the instance of Crawler used.
        Throws:
        Exception - if this resource cannot be closed.
      • interruptAndClose

        public void interruptAndClose()
                               throws Exception
        Interrupts and then closes this object.
        Throws:
        Exception - if an exception is thrown on close.
      • interrupt

        public void interrupt()
        Interrupts the crawler, fetcher and worker threads.
        Specified by:
        interrupt in interface Interruptible
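
The lifecycle described by start, startAndClose, interrupt and interruptAndClose can be illustrated with a minimal, self-contained sketch. It is not this library's implementation: SketchCrawler, SketchCrawlerDemo, submit and isRunning are hypothetical names introduced for illustration. The sketch uses a single polling thread, and it drops the documented "throws Exception" clauses for brevity by handling the interruption inside close.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for illustration only; NOT the documented Crawler class.
final class SketchCrawler implements AutoCloseable {
    private final BlockingQueue<Runnable> jobs = new LinkedBlockingQueue<>();
    private final Thread poller = new Thread(this::poll, "sketch-crawler");
    private volatile boolean closing;

    // Queues a job (plays the role of scheduling a request).
    void submit(Runnable job) {
        jobs.add(job);
    }

    // Starts the polling thread and returns this instance for chaining.
    SketchCrawler start() {
        poller.start();
        return this;
    }

    // Starts polling, then blocks until the queue has drained and the poller exits.
    SketchCrawler startAndClose() {
        start();
        close();
        return this;
    }

    // Asks the polling thread to stop without waiting for queued jobs.
    void interrupt() {
        poller.interrupt();
    }

    // Interrupts first, then waits for the polling thread to exit.
    void interruptAndClose() {
        interrupt();
        close();
    }

    // Assumed helper so callers can observe whether the poller is still alive.
    boolean isRunning() {
        return poller.isAlive();
    }

    // Signals that no more work is expected and waits for the poller to finish.
    @Override
    public void close() {
        closing = true;
        try {
            poller.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
        }
    }

    // Polling loop: runs jobs until interrupted, or until closing with an empty queue.
    private void poll() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                Runnable job = jobs.poll(50, TimeUnit.MILLISECONDS);
                if (job != null) {
                    job.run();
                } else if (closing && jobs.isEmpty()) {
                    return;
                }
            }
        } catch (InterruptedException e) {
            // Interrupted while waiting for a job: exit the loop.
        }
    }
}

final class SketchCrawlerDemo {
    public static void main(String[] args) {
        AtomicInteger processed = new AtomicInteger();
        SketchCrawler crawler = new SketchCrawler();
        for (int i = 0; i < 3; i++) {
            crawler.submit(processed::incrementAndGet);
        }
        crawler.startAndClose(); // returns only after the queued jobs have run
        System.out.println("processed " + processed.get());
    }
}
```

In this sketch the difference between the two shutdown paths is that startAndClose lets queued jobs finish before returning, while interruptAndClose stops the poller as soon as it observes the interrupt, so jobs still in the queue may never run.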