BatchCursor

abstract class BatchCursor[+A] extends Serializable

Similar to Java's and Scala's Iterator, the BatchCursor type can be used to iterate over the data in a collection, but it cannot be used to modify the underlying collection.

Inspired by the standard Iterator, BatchCursor provides a way to efficiently apply operations such as map, filter and collect to the underlying collection without requiring those operations to be lazy. For example, when the cursor wraps a standard Array, applying map immediately copies the data into a new Array instance with its elements transformed, so the behavior is strict (eager). When the cursor wraps a potentially infinite collection, such as an Iterable or a Stream, the behavior is lazy.
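The difference can be sketched as follows. This is a minimal illustration that assumes the class lives in the `monix.tail.batches` package and that its companion object offers `fromArray` and `fromIterator` builders; neither is shown in this section. The array-backed cursor yields a transformed copy and leaves the source array untouched, while the iterator-backed cursor wraps an infinite source, so the pipeline has to stay lazy for the call to return.

```scala
import monix.tail.batches.BatchCursor

// Assumption: `fromArray` and `fromIterator` are builders on the companion object.
val source = Array(1, 2, 3)

// Array-backed: per the description above, `map` copies into a new array.
val doubled: List[Int] = BatchCursor.fromArray(source).map(_ * 2).toList

// Iterator-backed and infinite: `map` and `take` must stay lazy here,
// otherwise this expression would never terminate.
val firstEvens: List[Int] =
  BatchCursor.fromIterator(Iterator.from(1)).map(_ * 2).take(3).toList
```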

Sample:

 def sum(cursor: BatchCursor[Int]): Long = {
   var sum = 0L

   while (cursor.hasNext()) {
     sum += cursor.next()
   }
   sum
 }

This class is provided as an alternative to Scala's Iterator because:

  • the list of supported operations is smaller
  • implementations specialized for primitives are provided to avoid boxing
  • depending on the implementation, the behavior of operators (e.g. map, filter) can be eager, but only when the source cursor doesn't need to be consumed (e.g. if the cursor is backed by an array, a new array gets created)
  • the recommendedBatchSize can signal how many elements can be processed in a batch

Used in the Iterant implementation.

Companion: object
Supertypes: trait Serializable, class Object, trait Matchable, class Any

Value members

Abstract methods

def collect[B](pf: PartialFunction[A, B]): BatchCursor[B]

Creates a cursor by transforming values produced by the source cursor with a partial function, dropping those values for which the partial function is not defined.

NOTE: application of this function can be either strict or lazy (depending on the underlying cursor type), but it does not modify the original collection.

Value Params
pf

the partial function which filters and maps the cursor.

Returns

a new cursor which yields pf(x) for each value x produced by this cursor for which pf is defined.
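As a small sketch of filtering and mapping in one step, assuming the `monix.tail.batches` package and a `fromArray` builder on the companion object (both are assumptions, not shown above):

```scala
import monix.tail.batches.BatchCursor

// Assumption: `BatchCursor.fromArray` builds an array-backed cursor.
val cursor: BatchCursor[Int] = BatchCursor.fromArray(Array(1, 2, 3, 4, 5))

// Even values are scaled by 10; odd values are dropped because
// the partial function is not defined for them.
val scaled: List[Int] = cursor.collect { case x if x % 2 == 0 => x * 10 }.toList
```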

def drop(n: Int): BatchCursor[A]

Creates a new cursor that advances this cursor past the first n elements, or the length of the cursor, whichever is smaller.

Value Params
n

the number of elements to drop

Returns

a cursor which produces all values of the current cursor, except it omits the first n values.
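A brief illustration of both cases (n smaller and n larger than the number of available elements). The import path and the `fromArray` builder are assumptions about the surrounding library.

```scala
import monix.tail.batches.BatchCursor

// Assumption: `BatchCursor.fromArray` builds an array-backed cursor.
val remaining: List[Int] = BatchCursor.fromArray(Array(1, 2, 3, 4, 5)).drop(2).toList

// Dropping more elements than are available leaves an empty cursor.
val nothingLeft: List[Int] = BatchCursor.fromArray(Array(1, 2)).drop(10).toList
```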

def filter(p: A => Boolean): BatchCursor[A]

Returns a cursor over all the elements of the source cursor that satisfy the predicate p. The order of the elements is preserved.

NOTE: application of this function can be either strict or lazy (depending on the underlying cursor type), but it does not modify the original collection.

Value Params
p

the predicate used to test values.

Returns

a cursor which produces those values of this cursor which satisfy the predicate p.
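For instance, keeping only the even numbers might look like the sketch below. It assumes the `monix.tail.batches` package and a `fromArray` builder on the companion object, and it also checks that the source array is left unchanged.

```scala
import monix.tail.batches.BatchCursor

val source = Array(1, 2, 3, 4, 5, 6)

// Assumption: `BatchCursor.fromArray` builds an array-backed cursor.
val evens: List[Int] = BatchCursor.fromArray(source).filter(_ % 2 == 0).toList
```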

def hasNext(): Boolean

Tests whether this cursor can provide another element.

This method can be side-effecting, depending on the implementation, and can therefore also throw exceptions. In certain cases the only way to know whether there is a next element involves triggering dangerous side effects.

This method is idempotent until the call to next happens, meaning that multiple hasNext calls can be made and implementations are advised to memoize the result.

Returns

true if a subsequent call to next will yield an element, false otherwise.
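The idempotency described above can be illustrated with a single-element cursor. The sketch assumes the `monix.tail.batches` package and a `fromArray` builder on the companion object.

```scala
import monix.tail.batches.BatchCursor

// Assumption: `BatchCursor.fromArray` builds an array-backed cursor.
val cursor: BatchCursor[Int] = BatchCursor.fromArray(Array(42))

// Repeated calls before `next()` give the same answer.
val firstCheck: Boolean = cursor.hasNext()
val secondCheck: Boolean = cursor.hasNext()

// Only `next()` advances the cursor.
val value: Int = cursor.next()
val afterNext: Boolean = cursor.hasNext()
```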

def map[B](f: A => B): BatchCursor[B]

Creates a new cursor that maps all produced values of this cursor to new values using a transformation function.

NOTE: application of this function can be either strict or lazy (depending on the underlying cursor type), but it does not modify the original collection.

Value Params
f

is the transformation function

Returns

a new cursor which transforms every value produced by this cursor by applying the function f to it.
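A short sketch of a type-changing transformation, assuming the `monix.tail.batches` package and a `fromArray` builder on the companion object:

```scala
import monix.tail.batches.BatchCursor

// Assumption: `BatchCursor.fromArray` builds an array-backed cursor.
val words: BatchCursor[String] = BatchCursor.fromArray(Array("a", "bb", "ccc"))

// Map each word to its length, going from BatchCursor[String] to BatchCursor[Int].
val lengths: List[Int] = words.map(_.length).toList
```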

def next(): A

Produces the next element of this cursor.

This method is side-effecting, as it mutates the internal state of the cursor and can throw exceptions.

Returns

the next element of this cursor if hasNext is true; otherwise the behavior is undefined (it can throw exceptions).
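Since calling next on an exhausted cursor is undefined, callers typically guard it with hasNext. The helper below, `nextOption`, is hypothetical and not part of the documented API; the import path and the `fromArray` builder are likewise assumptions.

```scala
import monix.tail.batches.BatchCursor

// Hypothetical helper: only calls `next()` when `hasNext()` says it is safe.
def nextOption[A](cursor: BatchCursor[A]): Option[A] =
  if (cursor.hasNext()) Some(cursor.next()) else None

// Assumption: `BatchCursor.fromArray` builds an array-backed cursor.
val cursor: BatchCursor[String] = BatchCursor.fromArray(Array("a"))
val first: Option[String] = nextOption(cursor)
val second: Option[String] = nextOption(cursor)
```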

def recommendedBatchSize: Int

In case this cursor is going to be processed eagerly, in batches, this value should be the recommended batch size for the source cursor.

Examples:

  • if this cursor is iterating over a standard collection with a finite size, it can be something generous like 1024
  • if it's iterating over a cheap infinite iterator (e.g. Iterator.range), it could be 128.
  • if it does any sort of I/O or blocking of threads, then the recommended value is 1.

Basically the batch size should be adjusted according to how expensive processing this cursor is. If it's a strict collection of a finite size, then it can probably be processed all at once. But if it's a big database result set that can block threads on reads, then it's probably wise to do it one item at a time.
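One way a consumer might honor this hint is sketched below. The `nextBatch` helper is hypothetical, and the sketch assumes that the companion object's `fromIterator` has an overload accepting the recommended batch size; neither appears in this section.

```scala
import monix.tail.batches.BatchCursor
import scala.collection.mutable.ListBuffer

// Hypothetical helper: pulls at most `recommendedBatchSize` elements in one go.
def nextBatch[A](cursor: BatchCursor[A]): List[A] = {
  val buffer = ListBuffer.empty[A]
  var remaining = cursor.recommendedBatchSize
  while (remaining > 0 && cursor.hasNext()) {
    buffer += cursor.next()
    remaining -= 1
  }
  buffer.toList
}

// Assumption: `fromIterator(iterator, recommendedBatchSize)` exists on the companion.
val cursor: BatchCursor[Int] = BatchCursor.fromIterator(Iterator.range(0, 10), 4)
val batches: List[List[Int]] =
  List(nextBatch(cursor), nextBatch(cursor), nextBatch(cursor))
```

With a batch size of 1 the same helper degrades to item-by-item processing, which matches the advice for cursors that perform I/O.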

def slice(from: Int, until: Int): BatchCursor[A]

Creates a cursor returning an interval of the values produced by this cursor.

Value Params
from

the index of the first element in this cursor which forms part of the slice.

until

the index of the first element following the slice.

Returns

a cursor which advances this cursor past the first from elements using drop, and then takes until - from elements, using take
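The equivalence with drop followed by take can be checked with a small sketch. The `letters` helper is hypothetical, and the import path and `fromArray` builder are assumptions.

```scala
import monix.tail.batches.BatchCursor

// Hypothetical helper: cursors are single-use, so build a fresh one per call.
// Assumption: `BatchCursor.fromArray` builds an array-backed cursor.
def letters(): BatchCursor[String] =
  BatchCursor.fromArray(Array("a", "b", "c", "d", "e"))

// Elements at indices 1 (inclusive) to 3 (exclusive).
val sliced: List[String] = letters().slice(1, 3).toList

// The same interval expressed as drop(from) followed by take(until - from).
val dropThenTake: List[String] = letters().drop(1).take(2).toList
```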

def take(n: Int): BatchCursor[A]

Creates a new cursor that will only emit the first n values of this cursor.

Value Params
n

is the number of values to take

Returns

a cursor producing only the first n values of this cursor, or else the whole sequence, if it produces fewer than n values.
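Both outcomes can be seen in the following sketch, which assumes the `monix.tail.batches` package and a `fromArray` builder on the companion object:

```scala
import monix.tail.batches.BatchCursor

// Assumption: `BatchCursor.fromArray` builds an array-backed cursor.
val firstThree: List[Int] = BatchCursor.fromArray(Array(1, 2, 3, 4, 5)).take(3).toList

// Asking for more values than the cursor can produce yields the whole sequence.
val everything: List[Int] = BatchCursor.fromArray(Array(1, 2)).take(10).toList
```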

def toIterator: Iterator[A]

Converts this cursor into a Scala Iterator.
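The conversion gives access to the full standard Iterator API. A minimal sketch, assuming the `monix.tail.batches` package and a `fromArray` builder on the companion object:

```scala
import monix.tail.batches.BatchCursor

// Assumption: `BatchCursor.fromArray` builds an array-backed cursor.
val iterator: Iterator[Int] = BatchCursor.fromArray(Array(1, 2, 3)).toIterator

// `mkString` comes from the standard Iterator, not from BatchCursor.
val rendered: String = iterator.mkString(", ")
```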

Concrete methods

def foldLeft[R](initial: R)(op: (R, A) => R): R

Applies a binary operator to a start value and all elements of this cursor, going left to right.

NOTE: applying this function on the cursor will consume it completely.

Type Params
R

is the result type of the binary operator.

Value Params
initial

is the start value.

op

the binary operator to apply

Returns

the result of inserting op between consecutive elements of this cursor, going left to right with the start value initial on the left. Returns initial if the cursor is empty.
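The sketch below shows a sum, the left-to-right grouping, and the empty case. It assumes the `monix.tail.batches` package and a `fromArray` builder on the companion object.

```scala
import monix.tail.batches.BatchCursor

// Assumption: `BatchCursor.fromArray` builds an array-backed cursor.
// The accumulator type (Long) can differ from the element type (Int).
val total: Long = BatchCursor.fromArray(Array(1, 2, 3)).foldLeft(0L)(_ + _)

// Parentheses make the left-to-right grouping visible.
val trace: String =
  BatchCursor.fromArray(Array("a", "b", "c")).foldLeft("z")((acc, s) => s"($acc $s)")

// An empty cursor returns the start value unchanged.
val onEmpty: Int = BatchCursor.fromArray(Array.empty[Int]).foldLeft(7)(_ + _)
```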

def isEmpty: Boolean

Returns true in case our cursor is empty or false if there are more elements to process.

Alias for !cursor.hasNext().

def nonEmpty: Boolean

Returns true in case our cursor has more elements to process or false if the cursor is empty.

Alias for hasNext.
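The relationship between isEmpty, nonEmpty and the cursor position can be sketched as follows, assuming the `monix.tail.batches` package and a `fromArray` builder on the companion object:

```scala
import monix.tail.batches.BatchCursor

// Assumption: `BatchCursor.fromArray` builds an array-backed cursor.
val cursor: BatchCursor[Int] = BatchCursor.fromArray(Array(1))

// (isEmpty, nonEmpty) before and after consuming the only element.
val before: (Boolean, Boolean) = (cursor.isEmpty, cursor.nonEmpty)
val consumed: Int = cursor.next()
val after: (Boolean, Boolean) = (cursor.isEmpty, cursor.nonEmpty)
```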

def toArray[B >: A: ClassTag]: Array[B]

Converts this cursor into an Array, consuming it in the process.
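A small sketch of the conversion and of the cursor being consumed by it. It assumes the `monix.tail.batches` package and a `fromIterator` builder on the companion object; the ClassTag for Int is supplied implicitly by the compiler.

```scala
import monix.tail.batches.BatchCursor

// Assumption: `BatchCursor.fromIterator` wraps a standard Iterator.
val cursor: BatchCursor[Int] = BatchCursor.fromIterator(Iterator.range(1, 4))

val array: Array[Int] = cursor.toArray

// The cursor has been drained by the conversion.
val exhausted: Boolean = !cursor.hasNext()
```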

def toBatch: Batch[A]

Converts this cursor into a reusable array-backed Batch, consuming it in the process.
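"Reusable" here means the resulting Batch can be traversed more than once, unlike the cursor it came from. The sketch assumes that `Batch` lives in the same `monix.tail.batches` package and exposes a `cursor()` method returning a fresh BatchCursor, and that the companion object offers `fromIterator`; none of these are shown in this section.

```scala
import monix.tail.batches.{Batch, BatchCursor}

// Assumption: `BatchCursor.fromIterator` wraps a standard Iterator.
val batch: Batch[Int] = BatchCursor.fromIterator(Iterator(1, 2, 3)).toBatch

// Assumption: `Batch#cursor()` hands out a new cursor on every call.
val firstPass: List[Int] = batch.cursor().toList
val secondPass: List[Int] = batch.cursor().toList
```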

def toList: List[A]

Converts this cursor into a Scala immutable List, consuming it in the process.
