AnalyzerPipe

sealed abstract case class AnalyzerPipe[F[_]](readerF: Reader => Resource[F, TokenGetter])(implicit F: Async[F])

AnalyzerPipe provides methods to tokenize a possibly very long Stream[F, String] or Stream[F, Byte], such as one read from a file. When possible, prefer starting with a Stream[F, Byte] and using tokenizeBytes.
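As a quick orientation, here is a minimal sketch of building and using a pipe from a Lucene Analyzer. The companion constructor name fromResource is an assumption made for illustration; consult the companion object for the actual constructors.

    import cats.effect.{IO, Resource}
    import fs2.Stream
    import org.apache.lucene.analysis.Analyzer
    import org.apache.lucene.analysis.standard.StandardAnalyzer

    // Manage the Analyzer as a Resource so it is closed when the stream ends.
    val analyzer: Resource[IO, Analyzer] =
      Resource.make(IO(new StandardAnalyzer()))(a => IO(a.close()))

    // `fromResource` is a hypothetical companion constructor, for illustration only.
    val pipe: AnalyzerPipe[IO] = AnalyzerPipe.fromResource(analyzer)

    val tokens: Stream[IO, String] =
      pipe.tokenizeStrings(Stream.emit("Hello, Walden Pond!"), tokenN = 16)
    // tokens.compile.toList would yield something like List("hello", "walden", "pond")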

Companion: object
Source: AnalyzerPipe.scala
Supertypes:
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any

Value members

Concrete methods

def tokenizeBytes(in: Stream[F, Byte], tokenN: Int): Stream[F, String]

Emits a string for every token, as determined by the Analyzer, in the input stream. Decoding from bytes to strings is done using the default charset.

Value parameters:
  in: input stream to tokenize
  tokenN: maximum number of tokens to read at a time

Source: AnalyzerPipe.scala
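A sketch of the preferred byte-based usage, streaming a file with fs2-io so the whole file is never held in memory (pipe is an AnalyzerPipe[IO] as in the sketch above):

    import cats.effect.IO
    import fs2.Stream
    import fs2.io.file.{Files, Path}

    // Stream the file's bytes and tokenize them incrementally.
    // Note: bytes are decoded with the default charset, per the docs above.
    def fileTokens(pipe: AnalyzerPipe[IO], path: Path): Stream[IO, String] =
      pipe.tokenizeBytes(Files[IO].readAll(path), tokenN = 128)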
def tokenizeStrings(in: Stream[F, String], tokenN: Int): Stream[F, String]

Emits a string for every token, as determined by the Analyzer, in the input stream. A space is inserted between each element in the input stream to avoid accidentally combining words. See tokenizeStringsRaw to avoid this behaviour.

Value parameters:
  in: input stream to tokenize
  tokenN: maximum number of tokens to read at a time

Source: AnalyzerPipe.scala
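The inserted space guarantees that an element boundary is also a word boundary. A small sketch of the effect (reusing pipe from the earlier sketch):

    import cats.effect.IO
    import fs2.Stream

    // "sea" and "side" arrive as separate elements; tokenizeStrings inserts a
    // space between them, so they surface as two tokens rather than "seaside".
    val parts: Stream[IO, String] = Stream("sea", "side")
    val words: Stream[IO, String] = pipe.tokenizeStrings(parts, tokenN = 16)
    // words.compile.toList would yield something like List("sea", "side")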
def tokenizeStringsRaw(in: Stream[F, String], tokenN: Int): Stream[F, String]

Emits a string for every token, as determined by the Analyzer, in the input stream. Be careful: the end of one string will be joined with the beginning of the next in the Analyzer. See tokenizeStrings to automatically intersperse spaces.

Value parameters:
  in: input stream to tokenize
  tokenN: maximum number of tokens to read at a time

Source: AnalyzerPipe.scala
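A sketch of the contrast with tokenizeStrings (again reusing pipe from the earlier sketch): because no separator is inserted, text split mid-word across elements is rejoined by the Analyzer.

    import cats.effect.IO
    import fs2.Stream

    // No space is inserted between elements, so a word split across two
    // elements becomes a single token.
    val chunks: Stream[IO, String] = Stream("sea", "side")
    val words: Stream[IO, String] = pipe.tokenizeStringsRaw(chunks, tokenN = 16)
    // words.compile.toList would yield something like List("seaside")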
