Class BertWordPieceStreamTokenizer
java.lang.Object
  org.deeplearning4j.text.tokenization.tokenizer.BertWordPieceTokenizer
    org.deeplearning4j.text.tokenization.tokenizer.BertWordPieceStreamTokenizer

All Implemented Interfaces: Tokenizer
public class BertWordPieceStreamTokenizer extends BertWordPieceTokenizer
Field Summary
Fields inherited from class org.deeplearning4j.text.tokenization.tokenizer.BertWordPieceTokenizer
splitPattern
Constructor Summary
BertWordPieceStreamTokenizer(InputStream tokens, Charset encoding, NavigableMap<String,Integer> vocab, TokenPreProcess preTokenizePreProcessor, TokenPreProcess tokenPreProcess)
Method Summary
static String readAndClose(InputStream is, Charset encoding)
Methods inherited from class org.deeplearning4j.text.tokenization.tokenizer.BertWordPieceTokenizer
checkIfEmpty, countTokens, findLongestSubstring, getTokens, hasMoreTokens, nextToken, setTokenPreProcessor
Constructor Detail
BertWordPieceStreamTokenizer
public BertWordPieceStreamTokenizer(InputStream tokens, Charset encoding, NavigableMap<String,Integer> vocab, TokenPreProcess preTokenizePreProcessor, TokenPreProcess tokenPreProcess)
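The constructor takes the text as an InputStream with its Charset, plus a vocabulary as a NavigableMap<String,Integer>. The following is a minimal sketch of how the first three arguments might be prepared using only the JDK. The class ConstructorArgsSketch and its helpers vocabFromLines and toStream are hypothetical names, not part of Deeplearning4j. The one-token-per-line vocabulary layout follows the usual BERT vocab file convention, and the natural-order TreeMap is an assumption, since this page does not say whether a particular ordering is expected.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical helper class for illustration; not part of Deeplearning4j.
class ConstructorArgsSketch {

    // Assumption: one token per line, index assigned in order of appearance.
    // Blank lines are skipped here and do not consume an index.
    static NavigableMap<String, Integer> vocabFromLines(String vocabText) {
        NavigableMap<String, Integer> vocab = new TreeMap<>();
        int index = 0;
        for (String line : vocabText.split("\n")) {
            String token = line.trim();
            if (!token.isEmpty()) {
                vocab.put(token, index++);
            }
        }
        return vocab;
    }

    // Wraps in-memory text as an InputStream encoded with the given charset.
    static InputStream toStream(String text, Charset encoding) {
        return new ByteArrayInputStream(text.getBytes(encoding));
    }

    public static void main(String[] args) {
        Charset encoding = StandardCharsets.UTF_8;
        NavigableMap<String, Integer> vocab = vocabFromLines("[UNK]\nplay\n##ing\n");
        InputStream tokens = toStream("playing", encoding);

        // With Deeplearning4j on the classpath and two TokenPreProcess
        // instances of your choosing, the call would take this shape:
        // new BertWordPieceStreamTokenizer(tokens, encoding, vocab,
        //         preTokenizePreProcessor, tokenPreProcess);
        System.out.println(vocab.keySet());
    }
}
```

The same Charset should be used for encoding the stream and for the constructor argument, otherwise multi-byte characters may be decoded incorrectly.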
Method Detail
readAndClose
public static String readAndClose(InputStream is, Charset encoding)
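This page gives no description for readAndClose. Judging by its name and signature alone, it presumably reads the whole stream into a String using the given charset and then closes the stream. The sketch below shows that assumed behaviour with only the JDK. ReadAndCloseSketch is a hypothetical stand-in class, and wrapping IOException in UncheckedIOException is an illustrative choice, not the library's documented error handling.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in class; not the Deeplearning4j implementation.
class ReadAndCloseSketch {

    // Assumed behaviour: read every byte, decode with the charset,
    // and close the stream whether or not reading succeeds.
    static String readAndClose(InputStream is, Charset encoding) {
        try (InputStream in = is) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), encoding);
        } catch (IOException e) {
            // Illustrative choice: surface I/O failures as unchecked.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InputStream is = new ByteArrayInputStream(
                "hello wordpiece".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAndClose(is, StandardCharsets.UTF_8));
    }
}
```

Decoding once after all bytes are collected, rather than chunk by chunk, avoids splitting a multi-byte character across two reads.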