Package org.deeplearning4j.iterator.bert
Class BertMaskedLMMasker
java.lang.Object
    org.deeplearning4j.iterator.bert.BertMaskedLMMasker
All Implemented Interfaces: BertSequenceMasker
public class BertMaskedLMMasker extends Object implements BertSequenceMasker
Field Summary
Modifier and Type    Field
static double        DEFAULT_MASK_PROB
static double        DEFAULT_MASK_TOKEN_PROB
static double        DEFAULT_RANDOM_WORD_PROB
protected double     maskProb
protected double     maskTokenProb
protected Random     r
protected double     randomTokenProb

Constructor Summary
Constructor and Description
BertMaskedLMMasker() - Create a BertMaskedLMMasker with all default probabilities.
BertMaskedLMMasker(Random r, double maskProb, double maskTokenProb, double randomTokenProb) - See: BertMaskedLMMasker for details.

Method Summary
Modifier and Type    Method
org.nd4j.common.primitives.Pair<List<String>,boolean[]>    maskSequence(List<String> input, String maskToken, List<String> vocabWords)

Field Detail

DEFAULT_MASK_PROB
public static final double DEFAULT_MASK_PROB
See Also: Constant Field Values

DEFAULT_MASK_TOKEN_PROB
public static final double DEFAULT_MASK_TOKEN_PROB
See Also: Constant Field Values

DEFAULT_RANDOM_WORD_PROB
public static final double DEFAULT_RANDOM_WORD_PROB
See Also: Constant Field Values

r
protected final Random r

maskProb
protected final double maskProb

maskTokenProb
protected final double maskTokenProb

randomTokenProb
protected final double randomTokenProb

Constructor Detail

BertMaskedLMMasker
public BertMaskedLMMasker()
Create a BertMaskedLMMasker with all default probabilities.

BertMaskedLMMasker
public BertMaskedLMMasker(Random r, double maskProb, double maskTokenProb, double randomTokenProb)
See: BertMaskedLMMasker for details.
Parameters:
r - Random number generator
maskProb - Probability of selecting each token for masking
maskTokenProb - Probability of replacing a selected token with the mask token
randomTokenProb - Probability of replacing a selected token with a random token
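
The three probabilities combine multiplicatively: maskProb applies to every token, while maskTokenProb and randomTokenProb apply only to tokens that were already selected. The sketch below works through that arithmetic. It is not part of DL4J: the class MaskRateSketch and the method effectiveRates are hypothetical names, the example values 0.15, 0.8 and 0.1 are the ones used in the original BERT paper rather than values stated on this page, and the sketch assumes that a selected token which is neither masked nor randomized is left unchanged.

```java
// Sketch only: derives overall per-token rates from the three constructor arguments.
// MaskRateSketch and effectiveRates are hypothetical names, not DL4J API.
final class MaskRateSketch {

    /**
     * Returns {P(replaced by mask token), P(replaced by random token),
     * P(selected but left unchanged)} for a single input token.
     */
    static double[] effectiveRates(double maskProb, double maskTokenProb, double randomTokenProb) {
        if (maskProb < 0.0 || maskProb > 1.0) {
            throw new IllegalArgumentException("maskProb must be in [0, 1]");
        }
        if (maskTokenProb < 0.0 || randomTokenProb < 0.0
                || maskTokenProb + randomTokenProb > 1.0) {
            throw new IllegalArgumentException(
                    "maskTokenProb and randomTokenProb must be non-negative and sum to at most 1");
        }
        // Assumption: whatever remains after the two replacement cases keeps the original token.
        double keepProb = 1.0 - maskTokenProb - randomTokenProb;
        return new double[] {
            maskProb * maskTokenProb,
            maskProb * randomTokenProb,
            maskProb * keepProb
        };
    }

    public static void main(String[] args) {
        // Example values taken from the BERT paper, assumed here for illustration.
        double[] rates = effectiveRates(0.15, 0.8, 0.1);
        System.out.println("mask token:  " + rates[0]);
        System.out.println("random word: " + rates[1]);
        System.out.println("unchanged:   " + rates[2]);
    }
}
```

With those example values, roughly 12% of all tokens would become the mask token, 1.5% would become a random vocabulary word, and 1.5% would be selected but keep their original text.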
Method Detail

maskSequence
public org.nd4j.common.primitives.Pair<List<String>,boolean[]> maskSequence(List<String> input, String maskToken, List<String> vocabWords)
Specified by: maskSequence in interface BertSequenceMasker
Parameters:
input - Input sequence of tokens
maskToken - Token to use for masking, usually something like "[MASK]"
vocabWords - Vocabulary, as a list
Returns:
Pair: the new input tokens (after masking), along with a boolean[] indicating whether each token is masked (same length as the number of tokens). boolean[i] is true if token i was masked.
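
To make the return contract above concrete, here is a minimal standalone sketch of BERT-style masking. It does not call or reproduce the DL4J implementation. The class MaskSequenceSketch and the Masked holder (a stand-in for Pair<List<String>, boolean[]>) are hypothetical, and passing the generator and probabilities as method arguments is a simplification. The sketch also assumes that selection happens first and replacement second, and that a selected token left unchanged is still flagged true.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Standalone illustration; not the DL4J source. All names here are hypothetical.
final class MaskSequenceSketch {

    /** Hypothetical stand-in for Pair<List<String>, boolean[]>. */
    static final class Masked {
        final List<String> tokens;
        final boolean[] isMasked;

        Masked(List<String> tokens, boolean[] isMasked) {
            this.tokens = tokens;
            this.isMasked = isMasked;
        }
    }

    static Masked maskSequence(List<String> input, String maskToken, List<String> vocabWords,
                               Random r, double maskProb, double maskTokenProb,
                               double randomTokenProb) {
        List<String> out = new ArrayList<>(input.size());
        boolean[] masked = new boolean[input.size()];
        for (int i = 0; i < input.size(); i++) {
            String token = input.get(i);
            // Step 1: decide whether this position is selected at all.
            if (r.nextDouble() < maskProb) {
                masked[i] = true;
                // Step 2: decide what to put in the selected position.
                double d = r.nextDouble();
                if (d < maskTokenProb) {
                    token = maskToken;
                } else if (d < maskTokenProb + randomTokenProb) {
                    token = vocabWords.get(r.nextInt(vocabWords.size()));
                }
                // Otherwise the original token is kept but stays flagged (assumption).
            }
            out.add(token);
        }
        return new Masked(out, masked);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("the", "cat", "sat", "on", "the", "mat");
        List<String> vocab = Arrays.asList("the", "cat", "sat", "on", "mat", "dog", "ran");
        // Probabilities from the BERT paper, assumed here for illustration.
        Masked result = maskSequence(input, "[MASK]", vocab, new Random(12345), 0.15, 0.8, 0.1);
        System.out.println(result.tokens);
        System.out.println(Arrays.toString(result.isMasked));
    }
}
```

The output list and the boolean array always have the same length as the input, so position i in one lines up with position i in the other. Seeding the Random makes a run reproducible, which helps when debugging a data pipeline.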