Class BasicModelUtils<T extends SequenceElement>
- java.lang.Object
  - org.deeplearning4j.models.embeddings.reader.impl.BasicModelUtils<T>
- All Implemented Interfaces:
ModelUtils<T>
- Direct Known Subclasses:
FlatModelUtils
public class BasicModelUtils<T extends SequenceElement> extends Object implements ModelUtils<T>
-
-
Nested Class Summary
- static class BasicModelUtils.ArrayComparator
- static class BasicModelUtils.SimilarityComparator
- static class BasicModelUtils.WordSimilarity
-
Field Summary
- static String CORRECT
- static String EXISTS
- protected WeightLookupTable<T> lookupTable
- protected boolean normalized
- protected VocabCache<T> vocabCache
- static String WRONG
-
Constructor Summary
- BasicModelUtils()
-
Method Summary
- Map<String,Double> accuracy(List<String> questions)
  Computes accuracy over a list of questions. Each question is a space-separated string in which the first word is the query word, the next 2 words are negative, and the last word is the word predicted to be nearest.
- protected org.nd4j.linalg.api.ndarray.INDArray adjustRank(org.nd4j.linalg.api.ndarray.INDArray words)
- static List<String> getLabels(List<BasicModelUtils.WordSimilarity> results, int limit)
- void init(@NonNull WeightLookupTable<T> lookupTable)
  Implementations of this method should accept the given lookup table and use it in subsequent calls to the interface methods.
- double similarity(@NonNull String label1, @NonNull String label2)
  Returns the similarity of 2 words.
- List<String> similarWordsInVocabTo(String word, double accuracy)
  Finds all words in the vocab that have similar characters.
- Collection<String> wordsNearest(@NonNull Collection<String> positive, @NonNull Collection<String> negative, int top)
  Words nearest based on positive and negative words.
- Collection<String> wordsNearest(String label, int n)
  Implementations of this method should return the labels of the N elements nearest to the given element's label.
- Collection<String> wordsNearest(org.nd4j.linalg.api.ndarray.INDArray words, int top)
  Words nearest based on positive and negative words.
- Collection<String> wordsNearestSum(String word, int n)
  Gets the top n words most similar to the given word.
- Collection<String> wordsNearestSum(Collection<String> positive, Collection<String> negative, int top)
  Words nearest based on positive and negative words.
- Collection<String> wordsNearestSum(org.nd4j.linalg.api.ndarray.INDArray words, int top)
  Words nearest based on positive and negative words.
-
-
-
Field Detail
-
EXISTS
public static final String EXISTS
- See Also:
- Constant Field Values
-
CORRECT
public static final String CORRECT
- See Also:
- Constant Field Values
-
WRONG
public static final String WRONG
- See Also:
- Constant Field Values
-
vocabCache
protected volatile VocabCache<T extends SequenceElement> vocabCache
-
lookupTable
protected volatile WeightLookupTable<T extends SequenceElement> lookupTable
-
normalized
protected volatile boolean normalized
-
-
Method Detail
-
init
public void init(@NonNull WeightLookupTable<T> lookupTable)
Description copied from interface: ModelUtils
Implementations of this method should accept the given lookup table and use it in subsequent calls to the interface methods.
- Specified by:
init in interface ModelUtils<T extends SequenceElement>
-
similarity
public double similarity(@NonNull String label1, @NonNull String label2)
Returns the similarity of 2 words. The result is in the range [-1,1], where -1.0 is exact opposite similarity, i.e. no similarity, and 1.0 is a total match of the two word vectors. In practice most values fall in the range [0,1], although this depends on the training corpus. Returns NaN if either label does not exist in the vocab or if either label is null.
- Specified by:
similarity in interface ModelUtils<T extends SequenceElement>
- Parameters:
label1 - the first word
label2 - the second word
- Returns:
a normalized similarity (cosine similarity)
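The cosine similarity named in the return value can be illustrated with a minimal, standalone sketch. It works on plain double arrays and is not the library's implementation; the class name CosineSimilaritySketch and the method cosine are hypothetical names chosen for this example.

```java
// Illustrative only: a plain-Java cosine similarity, not DL4J code.
final class CosineSimilaritySketch {

    /**
     * Cosine of the angle between a and b, in [-1, 1].
     * Yields NaN when either vector has zero length (0.0 / 0.0).
     */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

Vectors pointing the same way score close to 1.0, orthogonal vectors score 0.0, and opposite vectors score -1.0, which matches the range described above.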
-
wordsNearest
public Collection<String> wordsNearest(String label, int n)
Description copied from interface: ModelUtils
Implementations of this method should return the labels of the N elements nearest to the given element's label.
- Specified by:
wordsNearest in interface ModelUtils<T extends SequenceElement>
- Parameters:
label - label to return nearest elements for
n - number of nearest words to return
- Returns:
-
accuracy
public Map<String,Double> accuracy(List<String> questions)
Computes accuracy over a list of questions. Each question is a space-separated string in which the first word is the query word, the next 2 words are negative, and the last word is the word predicted to be nearest.
- Specified by:
accuracy in interface ModelUtils<T extends SequenceElement>
- Parameters:
questions - the questions to ask
- Returns:
the accuracy based on these questions
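The four-word question layout can be sketched in plain Java as follows. This is a simplified illustration, not the library's code: it returns a single fraction rather than a Map<String,Double>, it skips malformed lines (an assumption), and the class AccuracySketch and its predictor parameter are hypothetical names standing in for whatever nearest-word lookup a model provides.

```java
import java.util.List;
import java.util.function.Function;

// Illustrative only: scores four-word questions against a caller-supplied predictor.
final class AccuracySketch {

    /**
     * Returns the fraction of well-formed questions whose fourth word equals
     * the predictor's answer for the first three words. Returns NaN when no
     * question has exactly four words.
     */
    static double accuracy(List<String> questions, Function<String[], String> predictor) {
        int correct = 0;
        int total = 0;
        for (String question : questions) {
            String[] words = question.trim().split("\\s+");
            if (words.length != 4) {
                continue; // assumption: malformed lines are skipped
            }
            String[] given = { words[0], words[1], words[2] };
            String expected = words[3];
            if (expected.equals(predictor.apply(given))) {
                correct++;
            }
            total++;
        }
        return total == 0 ? Double.NaN : (double) correct / total;
    }
}
```

A caller would pass a function that maps the first three words of a question to one predicted label, for example a lambda wrapping a nearest-word query.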
-
similarWordsInVocabTo
public List<String> similarWordsInVocabTo(String word, double accuracy)
Finds all words in the vocab that have similar characters.
- Specified by:
similarWordsInVocabTo in interface ModelUtils<T extends SequenceElement>
- Parameters:
word - the word to compare
accuracy - the accuracy: 0 to 1
- Returns:
the list of words that are similar in the vocab
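One way to picture a character-level filter with a 0 to 1 threshold is sketched below. The string metric is an assumption: this example uses a normalized Levenshtein distance, and the library may well use a different measure. SimilarWordsSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: filters a vocabulary by normalized edit-distance similarity.
final class SimilarWordsSketch {

    /** Classic Levenshtein edit distance using two rolling rows. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Similarity in [0, 1]: 1.0 for identical strings, 0.0 for fully different ones. */
    static double similarity(String a, String b) {
        int max = Math.max(a.length(), b.length());
        return max == 0 ? 1.0 : 1.0 - (double) levenshtein(a, b) / max;
    }

    /** Keeps every vocabulary word whose similarity to word is at least threshold. */
    static List<String> similarWords(List<String> vocab, String word, double threshold) {
        List<String> result = new ArrayList<>();
        for (String candidate : vocab) {
            if (similarity(word, candidate) >= threshold) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```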
-
wordsNearest
public Collection<String> wordsNearest(@NonNull Collection<String> positive, @NonNull Collection<String> negative, int top)
Description copied from interface: ModelUtils
Words nearest based on positive and negative words.
- Specified by:
wordsNearest in interface ModelUtils<T extends SequenceElement>
- Parameters:
positive - the positive words
negative - the negative words
top - the top n words
- Returns:
the words nearest the mean of the words
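The idea of ranking words against the mean of positive and negative words can be sketched in plain Java. This is a toy under stated assumptions, not the library's code: vectors live in a Map rather than a lookup table, negative words are subtracted before averaging, input words are excluded from the results, and unknown words are ignored. NearestWordsSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Illustrative only: nearest words to the mean of (positive - negative) vectors.
final class NearestWordsSketch {

    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    static List<String> wordsNearest(Map<String, double[]> vectors,
                                     Collection<String> positive,
                                     Collection<String> negative,
                                     int top) {
        if (vectors.isEmpty()) {
            return new ArrayList<>();
        }
        int dim = vectors.values().iterator().next().length;
        final double[] query = new double[dim];
        int count = 0;
        // Add positive vectors and subtract negative ones.
        for (String w : positive) {
            double[] v = vectors.get(w);
            if (v == null) continue; // assumption: unknown words are ignored
            for (int i = 0; i < dim; i++) query[i] += v[i];
            count++;
        }
        for (String w : negative) {
            double[] v = vectors.get(w);
            if (v == null) continue;
            for (int i = 0; i < dim; i++) query[i] -= v[i];
            count++;
        }
        if (count == 0) {
            return new ArrayList<>();
        }
        for (int i = 0; i < dim; i++) query[i] /= count; // mean of the signed vectors

        // Rank every other word by cosine similarity to the query, highest first.
        List<String> candidates = new ArrayList<>();
        for (String w : vectors.keySet()) {
            if (!positive.contains(w) && !negative.contains(w)) {
                candidates.add(w);
            }
        }
        candidates.sort(Comparator.comparingDouble((String w) -> -cosine(query, vectors.get(w))));
        return new ArrayList<>(candidates.subList(0, Math.min(top, candidates.size())));
    }
}
```

With toy vectors where king = (1,1,0), man = (0,1,0), woman = (0,0,1) and queen = (1,0,1), the positive words king and woman with the negative word man produce a query pointing along (1,0,1), so queen ranks first.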
-
wordsNearestSum
public Collection<String> wordsNearestSum(String word, int n)
Gets the top n words most similar to the given word.
- Specified by:
wordsNearestSum in interface ModelUtils<T extends SequenceElement>
- Parameters:
word - the word to compare
n - the number of words to get
- Returns:
the top n words
-
adjustRank
protected org.nd4j.linalg.api.ndarray.INDArray adjustRank(org.nd4j.linalg.api.ndarray.INDArray words)
-
wordsNearest
public Collection<String> wordsNearest(org.nd4j.linalg.api.ndarray.INDArray words, int top)
Words nearest based on positive and negative words.
- Specified by:
wordsNearest in interface ModelUtils<T extends SequenceElement>
- Parameters:
top - the top n words
- Returns:
the words nearest the mean of the words
-
wordsNearestSum
public Collection<String> wordsNearestSum(org.nd4j.linalg.api.ndarray.INDArray words, int top)
Words nearest based on positive and negative words.
- Specified by:
wordsNearestSum in interface ModelUtils<T extends SequenceElement>
- Parameters:
top - the top n words
- Returns:
the words nearest the mean of the words
-
wordsNearestSum
public Collection<String> wordsNearestSum(Collection<String> positive, Collection<String> negative, int top)
Words nearest based on positive and negative words.
- Specified by:
wordsNearestSum in interface ModelUtils<T extends SequenceElement>
- Parameters:
positive - the positive words
negative - the negative words
top - the top n words
- Returns:
the words nearest the mean of the words
-
getLabels
public static List<String> getLabels(List<BasicModelUtils.WordSimilarity> results, int limit)
-
-