Class ScannerOptions

    • Field Detail

      • serverSideIteratorList

        protected List<IterInfo> serverSideIteratorList
      • timeOut

        protected long timeOut
      • batchTimeOut

        protected long batchTimeOut
      • classLoaderContext

        protected String classLoaderContext
    • Constructor Detail

      • ScannerOptions

        protected ScannerOptions()
    • Method Detail

      • addScanIterator

        public void addScanIterator(IteratorSetting si)
        Description copied from interface: ScannerBase
        Add a server-side scan iterator.
        Specified by:
        addScanIterator in interface ScannerBase
        Parameters:
        si - fully specified scan-time iterator, including all options for the iterator. Any changes to the iterator setting after this call are not propagated to the stored iterator.
      • removeScanIterator

        public void removeScanIterator(String iteratorName)
        Description copied from interface: ScannerBase
        Remove an iterator from the list of iterators.
        Specified by:
        removeScanIterator in interface ScannerBase
        Parameters:
        iteratorName - nickname used for the iterator
      • updateScanIteratorOption

        public void updateScanIteratorOption(String iteratorName,
                                             String key,
                                             String value)
        Description copied from interface: ScannerBase
        Update the options for an iterator. Note that this does not change the iterator options during a scan; it only replaces the given option on a configured iterator before a scan is started.
        Specified by:
        updateScanIteratorOption in interface ScannerBase
        Parameters:
        iteratorName - the name of the iterator to change
        key - the name of the option
        value - the new value for the named option
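
        The behavior described for addScanIterator, removeScanIterator, and updateScanIteratorOption can be pictured with a small toy model. This is only a sketch under stated assumptions, not Accumulo code: the class IteratorConfigSketch, its optionsFor method, and the choice to reject unknown iterator names are all hypothetical. It shows the scanner keeping its own copy of the options, so later changes to the caller's object are not seen, while an explicit update replaces a single option before a scan starts.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of a scanner's iterator configuration; hypothetical, not Accumulo code. */
final class IteratorConfigSketch {
  // iterator name -> the scanner's private copy of that iterator's options
  private final Map<String, Map<String, String>> iterators = new LinkedHashMap<>();

  /** Stores a copy, so later changes to the caller's map are not propagated. */
  void addScanIterator(String name, Map<String, String> options) {
    iterators.put(name, new HashMap<>(options));
  }

  /** Removes the iterator registered under the given nickname, if any. */
  void removeScanIterator(String name) {
    iterators.remove(name);
  }

  /** Replaces one option on an already configured iterator. */
  void updateScanIteratorOption(String name, String key, String value) {
    Map<String, String> options = iterators.get(name);
    if (options == null) {
      // Assumption of this sketch only; the real class may behave differently.
      throw new IllegalArgumentException("unknown iterator: " + name);
    }
    options.put(key, value);
  }

  /** Returns a snapshot of the stored options, or null if the name is not configured. */
  Map<String, String> optionsFor(String name) {
    Map<String, String> options = iterators.get(name);
    return options == null ? null : new HashMap<>(options);
  }
}
```

        In this model, mutating the map that was passed to addScanIterator leaves optionsFor unchanged, whereas updateScanIteratorOption changes what the next scan would see.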
      • fetchColumnFamily

        public void fetchColumnFamily(org.apache.hadoop.io.Text col)
        Description copied from interface: ScannerBase
        Adds a column family to the list of columns that will be fetched by this scanner. By default, when no columns have been added, the scanner fetches all columns. To fetch multiple column families, call this method multiple times.

        This can help limit which locality groups are read on the server side.

        When used in conjunction with custom iterators, the set of column families fetched is passed to the top iterator's seek method. Custom iterators may change this set of column families when calling seek on their source.

        Specified by:
        fetchColumnFamily in interface ScannerBase
        Parameters:
        col - the column family to be fetched
      • fetchColumn

        public void fetchColumn(org.apache.hadoop.io.Text colFam,
                                org.apache.hadoop.io.Text colQual)
        Description copied from interface: ScannerBase
        Adds a column to the list of columns that will be fetched by this scanner. The column is identified by family and qualifier. By default, when no columns have been added, the scanner fetches all columns.

        WARNING: Using this method with custom iterators may have unexpected results. Iterators have control over which column families are fetched. However, iterators have no control over which column qualifiers are fetched. When this method is called, it activates a system iterator that only allows the requested family/qualifier pairs through. This low-level filtering prevents custom iterators from requesting additional column families when calling seek.

        For example, assume fetchColumn(A, Q1) and fetchColumn(B, Q1) are called on a scanner and a custom iterator is configured. The families (A,B) will be passed to the seek method of the custom iterator. If the custom iterator seeks its source iterator using the families (A,B,C), it will never see any data from C, because the system iterator filtering A:Q1 and B:Q1 will prevent the C family from getting through. ACCUMULO-3905 also has an example of the type of problem this method can cause.

        tl;dr: If you use a custom iterator whose seek method adds column families, you may want to avoid this method.

        Specified by:
        fetchColumn in interface ScannerBase
        Parameters:
        colFam - the column family of the column to be fetched
        colQual - the column qualifier of the column to be fetched
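
        The filtering scenario in the warning above can be reproduced with a small toy model. This is a sketch, not Accumulo code: columns are plain "family:qualifier" strings, and FetchColumnSketch, columnFilter, seek, and scan are hypothetical names. columnFilter stands in for the system iterator and seek stands in for a custom iterator seeking its source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Toy model only: columns are "family:qualifier" strings; nothing here is Accumulo API. */
final class FetchColumnSketch {
  private FetchColumnSketch() {}

  static String familyOf(String column) {
    return column.substring(0, column.indexOf(':'));
  }

  /** Stand-in for the system iterator: passes only the requested family/qualifier pairs. */
  static List<String> columnFilter(List<String> table, Set<String> fetchedColumns) {
    List<String> out = new ArrayList<>();
    for (String column : table) {
      if (fetchedColumns.contains(column)) {
        out.add(column);
      }
    }
    return out;
  }

  /** Stand-in for a custom iterator seeking its source with a set of column families. */
  static List<String> seek(List<String> source, Set<String> families) {
    List<String> out = new ArrayList<>();
    for (String column : source) {
      if (families.contains(familyOf(column))) {
        out.add(column);
      }
    }
    return out;
  }

  /** The custom iterator sits above the column filter, so it only sees filtered data. */
  static List<String> scan(List<String> table, Set<String> fetchedColumns,
      Set<String> seekFamilies) {
    return seek(columnFilter(table, fetchedColumns), seekFamilies);
  }

  public static void main(String[] args) {
    List<String> table = List.of("A:Q1", "A:Q2", "B:Q1", "C:Q1");
    // The custom iterator asks for families A, B, and C, but C:Q1 never reaches it.
    System.out.println(scan(table, Set.of("A:Q1", "B:Q1"), Set.of("A", "B", "C")));
    // prints [A:Q1, B:Q1]
  }
}
```

        Because the filter runs beneath the custom iterator in this model, widening the set of families passed to seek cannot bring C:Q1 back.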
      • clearColumns

        public void clearColumns()
        Description copied from interface: ScannerBase
        Clears the columns to be fetched (useful for resetting the scanner for reuse). Once cleared, the scanner will fetch all columns.
        Specified by:
        clearColumns in interface ScannerBase
      • clearScanIterators

        public void clearScanIterators()
        Description copied from interface: ScannerBase
        Clears scan iterators prior to returning a scanner to the pool.
        Specified by:
        clearScanIterators in interface ScannerBase
      • iterator

        public Iterator<Map.Entry<Key,Value>> iterator()
        Description copied from interface: ScannerBase
        Returns an iterator over an Accumulo table. This iterator uses the options that are currently set for its lifetime, so setting options will have no effect on existing iterators.

        Keys returned by the iterator are not guaranteed to be in sorted order.

        Specified by:
        iterator in interface Iterable<Map.Entry<Key,Value>>
        Specified by:
        iterator in interface ScannerBase
        Returns:
        an iterator over Key,Value pairs which meet the restrictions set on the scanner
      • setTimeout

        public void setTimeout(long timeout,
                               TimeUnit timeUnit)
        Description copied from interface: ScannerBase
        This setting determines how long a scanner will automatically retry when a failure occurs. By default, a scanner will retry forever.

        Setting the timeout to zero (with any time unit) or Long.MAX_VALUE (with TimeUnit.MILLISECONDS) means no timeout.

        Specified by:
        setTimeout in interface ScannerBase
        Parameters:
        timeout - the length of the timeout
        timeUnit - the units of the timeout
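
        The "no timeout" rule stated above (zero in any unit, or Long.MAX_VALUE milliseconds) can be sketched as a small normalization helper. This is an illustration under assumptions, not the actual implementation: TimeoutSketch, toMillis, and isUnbounded are hypothetical names, and rejecting negative values is an assumption of this sketch.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper illustrating the documented "no timeout" rule; not Accumulo API. */
final class TimeoutSketch {
  private TimeoutSketch() {}

  /** Converts a timeout to milliseconds, mapping zero (in any unit) to "no timeout". */
  static long toMillis(long timeout, TimeUnit unit) {
    if (timeout < 0) {
      // Assumption of this sketch: negative timeouts are treated as invalid.
      throw new IllegalArgumentException("timeout must be non-negative: " + timeout);
    }
    // "No timeout" is represented here as Long.MAX_VALUE milliseconds.
    return timeout == 0 ? Long.MAX_VALUE : unit.toMillis(timeout);
  }

  /** True when the normalized value means the scanner retries without a deadline. */
  static boolean isUnbounded(long timeoutMillis) {
    return timeoutMillis == Long.MAX_VALUE;
  }

  public static void main(String[] args) {
    System.out.println(toMillis(30, TimeUnit.SECONDS));            // prints 30000
    System.out.println(isUnbounded(toMillis(0, TimeUnit.HOURS)));  // prints true
  }
}
```

        Normalizing to a single unit up front means the retry logic in such a design only has to compare against one sentinel value.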
      • getTimeout

        public long getTimeout(TimeUnit timeunit)
        Description copied from interface: ScannerBase
        Returns the setting for how long a scanner will automatically retry when a failure occurs.
        Specified by:
        getTimeout in interface ScannerBase
        Returns:
        the timeout configured for this scanner
      • close

        public void close()
        Description copied from interface: ScannerBase
        Closes any underlying connections on the scanner. This may invalidate any iterators derived from the Scanner, causing them to throw exceptions.
        Specified by:
        close in interface AutoCloseable
        Specified by:
        close in interface ScannerBase
      • getAuthorizations

        public Authorizations getAuthorizations()
        Description copied from interface: ScannerBase
        Returns the authorizations that have been set on the scanner.
        Specified by:
        getAuthorizations in interface ScannerBase
        Returns:
        The authorizations set on the scanner instance
      • setSamplerConfiguration

        public void setSamplerConfiguration(SamplerConfiguration samplerConfig)
        Description copied from interface: ScannerBase
        Setting this will cause the scanner to read sample data, as long as that sample data was generated with the given configuration. By default this is not set and all data is read.

        One way to use this method is as follows, where the sampler configuration is obtained from the table configuration. Sample data can be generated in many different ways, so it's important to verify that the sample data configuration meets expectations.

         
           // could cache this if creating many scanners to avoid RPCs.
           SamplerConfiguration samplerConfig =
             client.tableOperations().getSamplerConfiguration(table);
           // verify table's sample data is generated in an expected way before using
           userCode.verifySamplerConfig(samplerConfig);
           scanner.setSamplerConfiguration(samplerConfig);
         
         

        Of course, this is not the only way to obtain a SamplerConfiguration; it could also come from a constant, configuration, etc.

        If sample data is not present or sample data was generated with a different configuration, then the scanner iterator will throw a SampleNotPresentException. Also, if a table's sampler configuration is changed while a scanner is iterating over a table, a SampleNotPresentException may be thrown.

        Specified by:
        setSamplerConfiguration in interface ScannerBase
      • setBatchTimeout

        public void setBatchTimeout(long timeout,
                                    TimeUnit timeUnit)
        Description copied from interface: ScannerBase
        This setting determines how long a scanner will wait to fill the returned batch. By default, a scanner waits until the batch is full.

        Setting the timeout to zero (with any time unit) or Long.MAX_VALUE (with TimeUnit.MILLISECONDS) means no timeout.

        Specified by:
        setBatchTimeout in interface ScannerBase
        Parameters:
        timeout - the length of the timeout
        timeUnit - the units of the timeout
      • getBatchTimeout

        public long getBatchTimeout(TimeUnit timeUnit)
        Description copied from interface: ScannerBase
        Returns the timeout to fill a batch in the given TimeUnit.
        Specified by:
        getBatchTimeout in interface ScannerBase
        Returns:
        the batch timeout configured for this scanner
      • setClassLoaderContext

        public void setClassLoaderContext(String classLoaderContext)
        Description copied from interface: ScannerBase
        Sets the name of the classloader context on this scanner. See the administration chapter of the user manual for details on how to configure and use classloader contexts.
        Specified by:
        setClassLoaderContext in interface ScannerBase
        Parameters:
        classLoaderContext - name of the classloader context
      • clearClassLoaderContext

        public void clearClassLoaderContext()
        Description copied from interface: ScannerBase
        Clears the current classloader context set on this scanner.
        Specified by:
        clearClassLoaderContext in interface ScannerBase
      • getClassLoaderContext

        public String getClassLoaderContext()
        Description copied from interface: ScannerBase
        Returns the name of the current classloader context set on this scanner.
        Specified by:
        getClassLoaderContext in interface ScannerBase
        Returns:
        name of the current context
      • setExecutionHints

        public void setExecutionHints(Map<String,String> hints)
        Description copied from interface: ScannerBase
        Set hints for the configured ScanPrioritizer and ScanDispatcher. These hints are available on the server side via ScanInfo.getExecutionHints(). Depending on the configuration, these hints may be ignored. Hints will never impact what data is returned by a scan, only how quickly it is returned.

        Using the hint scan_type=<type> and documenting all of the types for your application is one strategy to consider. This allows administrators to adjust executor and prioritizer config for your application scan types without having to change the application source code.

        The default configuration for Accumulo will ignore hints. See HintScanPrioritizer and SimpleScanDispatcher for examples of classes that can react to hints.

        Specified by:
        setExecutionHints in interface ScannerBase
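
        The scan_type strategy described above could be organized as follows. This is a minimal sketch, assuming an application with two scan types: ScanHintsSketch, ScanType, and the type names are hypothetical. Only the hint key scan_type comes from the description above. The enum keeps all of the application's scan types documented in one place.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical application-side helper; the scan types listed here are examples only. */
final class ScanHintsSketch {
  private ScanHintsSketch() {}

  /** Every scan type the application uses, documented in one place for administrators. */
  enum ScanType {
    /** Short, latency-sensitive lookups. */
    INTERACTIVE,
    /** Long-running scans that can tolerate queuing. */
    BATCH_REPORT
  }

  /** Builds the hints map for a scan type, for example {scan_type=batch_report}. */
  static Map<String, String> forType(ScanType type) {
    return Map.of("scan_type", type.name().toLowerCase(Locale.ROOT));
  }
}
```

        A caller would then pass the map to the scanner, for example scanner.setExecutionHints(ScanHintsSketch.forType(ScanHintsSketch.ScanType.BATCH_REPORT)), leaving executor and prioritizer tuning to server-side configuration.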