Interface Extractor

  • All Superinterfaces:
    java.io.Serializable


    public interface Extractor
    extends Serializable

    CSS/jQuery-based extractor for HTML pages.

    • Method Summary

      Modifier and Type   Method and Description
      abstract int        extract(String expression, String attribute, int matchNumber, String inputString, List<String> result, int found, String cacheKey)
    • Method Detail

      • extract

        abstract int extract(String expression, String attribute, int matchNumber, String inputString, List<String> result, int found, String cacheKey)
        Parameters:
        expression - Expression used to select the nodes to extract
        attribute - Name of the attribute to return
        matchNumber - Match number
        inputString - Page, or excerpt of a page, to extract from
        result - List to which the results are added
        found - Number of matches found so far
        cacheKey - If not null, the implementation is encouraged to cache the parsing result and use this key as part of the cache key
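
        The parameter list above can be illustrated with a minimal sketch of an implementation. Everything in it beyond the documented signature is an assumption: the local Extractor interface is a stand-in copy so the example compiles on its own, TagNameExtractor is a hypothetical class, the expression is treated as a bare tag name rather than a full CSS/jQuery selector, a matchNumber of zero or less is assumed to mean "collect every match", and the cache layout is invented for illustration.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Local stand-in that mirrors the documented signature (not the library's own type).
interface Extractor extends Serializable {
    int extract(String expression, String attribute, int matchNumber,
                String inputString, List<String> result, int found, String cacheKey);
}

// Hypothetical toy implementation: supports only a bare tag name as the expression.
class TagNameExtractor implements Extractor {
    private static final long serialVersionUID = 1L;

    // Assumed caching scheme: opening tags already located, keyed by cacheKey + expression.
    private final Map<String, List<String>> tagCache = new HashMap<>();

    @Override
    public int extract(String expression, String attribute, int matchNumber,
                       String inputString, List<String> result, int found, String cacheKey) {
        // Re-scan the input only when no cache key is given or nothing is cached yet.
        List<String> tags = (cacheKey == null)
                ? findTags(expression, inputString)
                : tagCache.computeIfAbsent(cacheKey + "|" + expression,
                        k -> findTags(expression, inputString));

        Pattern attrPattern = Pattern.compile(
                "\\s" + Pattern.quote(attribute) + "\\s*=\\s*\"([^\"]*)\"");

        for (String tag : tags) {
            // Assumption: a positive matchNumber caps how many matches are collected.
            if (matchNumber > 0 && found == matchNumber) {
                break;
            }
            Matcher m = attrPattern.matcher(tag);
            // A tag without the attribute contributes an empty string.
            result.add(m.find() ? m.group(1) : "");
            found++;
        }
        return found; // updated running count of matches
    }

    // Collects every opening tag with the given name, e.g. <a href="...">.
    private static List<String> findTags(String tagName, String html) {
        List<String> tags = new ArrayList<>();
        Matcher m = Pattern.compile("<" + Pattern.quote(tagName) + "\\b[^>]*>",
                Pattern.CASE_INSENSITIVE).matcher(html);
        while (m.find()) {
            tags.add(m.group());
        }
        return tags;
    }
}
```

        With this sketch, calling extract("a", "href", -1, html, result, 0, null) on a page containing three a elements, one of them without an href, returns 3 and appends the two href values followed by an empty string. A real implementation would delegate selector evaluation to an HTML parser instead of regular expressions.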