AnalyzerPipe provides methods to tokenize a possibly very long Stream[F, String] or Stream[F, Byte], such as one read from a file. When possible, prefer starting with a Stream[F, Byte] and use tokenizeBytes.
- Companion: object
- Source: AnalyzerPipe.scala
Value members
Concrete methods
tokenizeBytes
Emits a string for every token, as determined by the Analyzer, in the input stream. Decoding from bytes to strings is done using the default charset.
- Value parameters:
  - in: input stream to tokenize
  - tokenN: maximum number of tokens to read at a time
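The decode-then-tokenize behaviour of tokenizeBytes can be modelled in plain Scala. This is a minimal stand-in sketch, not the actual fs2-based API: the analyze helper is hypothetical, splitting on whitespace in place of a real Lucene Analyzer, and tokenizeBytes here takes a plain byte array rather than a Stream[F, Byte].

```scala
// Hypothetical stand-in for an Analyzer: splits on whitespace.
// A real AnalyzerPipe runs a Lucene Analyzer instead.
def analyze(text: String): List[String] =
  text.split("\\s+").toList.filter(_.nonEmpty)

// Model of tokenizeBytes: decode the bytes using the default
// charset (as the real method does), then tokenize the result.
def tokenizeBytes(bytes: Array[Byte]): List[String] =
  analyze(new String(bytes)) // default charset

val tokens = tokenizeBytes("hello world".getBytes)
// tokens == List("hello", "world")
```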
tokenizeStrings
Emits a string for every token, as determined by the Analyzer, in the input stream. A space is inserted between each element in the input stream to avoid accidentally combining words. See tokenizeStringsRaw to avoid this behaviour.
- Value parameters:
  - in: input stream to tokenize
  - tokenN: maximum number of tokens to read at a time
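The space-interspersing behaviour can be sketched in plain Scala. This is a stand-in model, not the real fs2-based API: the whitespace analyze helper is hypothetical, standing in for a Lucene Analyzer, and the stream is modelled as a List.

```scala
// Hypothetical stand-in for an Analyzer: splits on whitespace.
def analyze(text: String): List[String] =
  text.split("\\s+").toList.filter(_.nonEmpty)

// Model of tokenizeStrings: join stream elements with a space so
// that adjacent elements cannot merge into a single token.
def tokenizeStrings(in: List[String]): List[String] =
  analyze(in.mkString(" "))

val tokens = tokenizeStrings(List("foo", "bar"))
// tokens == List("foo", "bar") — the elements stay separate tokens
```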
tokenizeStringsRaw
Emits a string for every token, as determined by the Analyzer, in the input stream. Be careful: the end of one string will be joined with the beginning of the next in the Analyzer. See tokenizeStrings to automatically intersperse spaces.
- Value parameters:
  - in: input stream to tokenize
  - tokenN: maximum number of tokens to read at a time
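The raw joining behaviour cuts both ways, which a plain-Scala stand-in can show. As above, this is a hypothetical model, not the real fs2-based API: analyze is a whitespace splitter standing in for a Lucene Analyzer. Raw concatenation correctly reassembles a word split across element boundaries, but also merges two separate words that happen to be adjacent.

```scala
// Hypothetical stand-in for an Analyzer: splits on whitespace.
def analyze(text: String): List[String] =
  text.split("\\s+").toList.filter(_.nonEmpty)

// Model of tokenizeStringsRaw: concatenate elements as-is, so the
// end of one element is joined with the start of the next.
def tokenizeStringsRaw(in: List[String]): List[String] =
  analyze(in.mkString)

// A word split mid-chunk is correctly reassembled...
val rejoined = tokenizeStringsRaw(List("hel", "lo world"))
// rejoined == List("hello", "world")

// ...but two distinct words at a boundary are accidentally combined.
val merged = tokenizeStringsRaw(List("foo", "bar"))
// merged == List("foobar")
```

Prefer tokenizeStringsRaw when the input elements are arbitrary chunks of one underlying text (as with chunked file reads), and tokenizeStrings when each element is a self-contained piece of text.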