Class BasicModelUtils<T extends SequenceElement>
- java.lang.Object
-
- org.deeplearning4j.models.embeddings.reader.impl.BasicModelUtils<T>
-
- All Implemented Interfaces:
ModelUtils<T>
- Direct Known Subclasses:
FlatModelUtils
public class BasicModelUtils<T extends SequenceElement> extends Object implements ModelUtils<T>
-
-
Nested Class Summary
Nested Classes (Modifier and Type, Class):
- static class BasicModelUtils.ArrayComparator
- static class BasicModelUtils.SimilarityComparator
- static class BasicModelUtils.WordSimilarity
-
Field Summary
Fields (Modifier and Type, Field):
- static String CORRECT
- static String EXISTS
- protected WeightLookupTable<T> lookupTable
- protected boolean normalized
- protected VocabCache<T> vocabCache
- static String WRONG
-
Constructor Summary
Constructors:
- BasicModelUtils()
-
Method Summary
Methods (Modifier and Type, Method: Description):
- Map<String,Double> accuracy(List<String> questions): Accuracy based on questions, each a space-separated list of strings in which the first word is the query word, the next two words are negative, and the last word is the predicted nearest word.
- protected org.nd4j.linalg.api.ndarray.INDArray adjustRank(org.nd4j.linalg.api.ndarray.INDArray words)
- static List<String> getLabels(List<BasicModelUtils.WordSimilarity> results, int limit)
- void init(@NonNull WeightLookupTable<T> lookupTable): Implementations of this method should accept the given lookup table and use it in subsequent calls to the interface methods.
- double similarity(@NonNull String label1, @NonNull String label2): Returns the similarity of two words.
- List<String> similarWordsInVocabTo(String word, double accuracy): Find all words in the vocab with similar characters.
- Collection<String> wordsNearest(@NonNull Collection<String> positive, @NonNull Collection<String> negative, int top): Words nearest based on positive and negative words.
- Collection<String> wordsNearest(String label, int n): Implementations of this method should return the labels of the N elements nearest to the given element's label.
- Collection<String> wordsNearest(org.nd4j.linalg.api.ndarray.INDArray words, int top): Words nearest based on positive and negative words; top is the top n words.
- Collection<String> wordsNearestSum(String word, int n): Get the top n words most similar to the given word.
- Collection<String> wordsNearestSum(Collection<String> positive, Collection<String> negative, int top): Words nearest based on positive and negative words.
- Collection<String> wordsNearestSum(org.nd4j.linalg.api.ndarray.INDArray words, int top): Words nearest based on positive and negative words; top is the top n words.
-
-
-
Field Detail
-
EXISTS
public static final String EXISTS
- See Also:
- Constant Field Values
-
CORRECT
public static final String CORRECT
- See Also:
- Constant Field Values
-
WRONG
public static final String WRONG
- See Also:
- Constant Field Values
-
vocabCache
protected volatile VocabCache<T extends SequenceElement> vocabCache
-
lookupTable
protected volatile WeightLookupTable<T extends SequenceElement> lookupTable
-
normalized
protected volatile boolean normalized
-
-
Method Detail
-
init
public void init(@NonNull WeightLookupTable<T> lookupTable)
Description copied from interface: ModelUtils
Implementations of this method should accept the given lookup table and use it in subsequent calls to the interface methods.
- Specified by:
init in interface ModelUtils<T extends SequenceElement>
-
similarity
public double similarity(@NonNull String label1, @NonNull String label2)
Returns the similarity of two words. The result is in the range [-1, 1], where -1.0 means the two word vectors are exact opposites, i.e. no similarity, and 1.0 means a total match of the two word vectors. In practice most values fall in the range [0, 1], although that depends on the training corpus. Returns NaN if either label is null or does not exist in the vocab.
- Specified by:
similarity in interface ModelUtils<T extends SequenceElement>
- Parameters:
label1 - the first word
label2 - the second word
- Returns:
a normalized similarity (cosine similarity)
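To make the documented contract concrete, here is a minimal plain-Java sketch of cosine similarity with the NaN behaviour described above. It is not the BasicModelUtils implementation: the class name CosineSimilaritySketch and the Map used in place of a WeightLookupTable are hypothetical stand-ins, and the vectors are assumed to have equal length and non-zero norm.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; a Map stands in for the real lookup table.
final class CosineSimilaritySketch {

    /** Cosine of the angle between a and b (assumes equal length, non-zero norms). */
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Mirrors the documented contract: NaN when a label is null or unknown. */
    static double similarity(Map<String, double[]> vectors, String label1, String label2) {
        if (label1 == null || label2 == null) {
            return Double.NaN;
        }
        double[] v1 = vectors.get(label1);
        double[] v2 = vectors.get(label2);
        if (v1 == null || v2 == null) {
            return Double.NaN;
        }
        return cosine(v1, v2);
    }

    public static void main(String[] args) {
        Map<String, double[]> vectors = new HashMap<>();
        vectors.put("up", new double[] {0.0, 1.0});
        vectors.put("down", new double[] {0.0, -1.0});
        vectors.put("right", new double[] {1.0, 0.0});
        System.out.println(similarity(vectors, "up", "up"));      // identical direction
        System.out.println(similarity(vectors, "up", "down"));    // opposite direction
        System.out.println(similarity(vectors, "up", "right"));   // orthogonal
        System.out.println(similarity(vectors, "up", "missing")); // unknown label
    }
}
```

The three toy vectors cover the ends and the middle of the [-1, 1] range, which is a quick way to sanity-check any similarity function before using it on trained embeddings.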
-
wordsNearest
public Collection<String> wordsNearest(String label, int n)
Description copied from interface: ModelUtils
Implementations of this method should return the labels of the N elements nearest to the given element's label.
- Specified by:
wordsNearest in interface ModelUtils<T extends SequenceElement>
- Parameters:
label - label to return nearest elements for
n - number of nearest words to return
- Returns:
-
accuracy
public Map<String,Double> accuracy(List<String> questions)
Accuracy based on questions, each a space-separated list of strings in which the first word is the query word, the next two words are negative, and the last word is the predicted nearest word.
- Specified by:
accuracy in interface ModelUtils<T extends SequenceElement>
- Parameters:
questions - the questions to ask
- Returns:
the accuracy based on these questions
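The question layout described above can be sketched in plain Java as follows. This is only an illustration under stated assumptions: the Question holder, the parse and fractionCorrect helpers, and the pluggable predictor are all hypothetical, the word roles follow the description in this page rather than the library source, and the sketch scores a single fraction instead of the Map<String,Double> that the real method returns.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration of the four-word question layout.
final class AccuracyQuestionSketch {

    /** Hypothetical holder for one parsed question; not a DL4J type. */
    static final class Question {
        final String query;
        final List<String> negatives;
        final String expected;

        Question(String query, List<String> negatives, String expected) {
            this.query = query;
            this.negatives = negatives;
            this.expected = expected;
        }
    }

    /** Splits "query neg1 neg2 expected" on whitespace. */
    static Question parse(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected 4 words: " + line);
        }
        return new Question(parts[0], Arrays.asList(parts[1], parts[2]), parts[3]);
    }

    /** Fraction of questions for which the predictor returns the expected word. */
    static double fractionCorrect(List<String> lines, Function<Question, String> predictor) {
        if (lines.isEmpty()) {
            return 0.0; // assumption: an empty question list scores zero
        }
        int correct = 0;
        for (String line : lines) {
            Question q = parse(line);
            if (q.expected.equals(predictor.apply(q))) {
                correct++;
            }
        }
        return (double) correct / lines.size();
    }

    public static void main(String[] args) {
        List<String> questions = Arrays.asList("a b c d", "e f g h");
        // Toy predictor that always answers "d"; a real one would query the model.
        System.out.println(fractionCorrect(questions, q -> "d"));
    }
}
```

Passing the predictor as a function keeps the parsing and scoring logic testable without a trained model.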
-
similarWordsInVocabTo
public List<String> similarWordsInVocabTo(String word, double accuracy)
Find all words in the vocab with similar characters.
- Specified by:
similarWordsInVocabTo in interface ModelUtils<T extends SequenceElement>
- Parameters:
word - the word to compare
accuracy - the accuracy: 0 to 1
- Returns:
the list of words that are similar in the vocab
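This page does not say which string metric is used to decide that two words have "similar characters". As one possible reading, the sketch below assumes a normalized edit distance (1 minus Levenshtein distance divided by the longer length) and keeps every vocab word whose score is at least the accuracy threshold. The class and method names are hypothetical, and the library may well use a different metric.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; the metric is an assumption, not the library's.
final class StringSimilaritySketch {

    /** Classic two-row Levenshtein edit distance. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Similarity in [0, 1]; 1.0 means identical strings. */
    static double similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        if (longest == 0) {
            return 1.0;
        }
        return 1.0 - (double) editDistance(a, b) / longest;
    }

    /** Keeps every vocab word whose similarity to word is at least accuracy. */
    static List<String> similarWords(List<String> vocab, String word, double accuracy) {
        List<String> result = new ArrayList<>();
        for (String candidate : vocab) {
            if (similarity(word, candidate) >= accuracy) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> vocab = new ArrayList<>();
        vocab.add("colour");
        vocab.add("colon");
        vocab.add("dog");
        System.out.println(similarWords(vocab, "color", 0.75));
    }
}
```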
-
wordsNearest
public Collection<String> wordsNearest(@NonNull Collection<String> positive, @NonNull Collection<String> negative, int top)
Description copied from interface: ModelUtils
Words nearest based on positive and negative words
- Specified by:
wordsNearest in interface ModelUtils<T extends SequenceElement>
- Parameters:
positive - the positive words
negative - the negative words
top - the top n words
- Returns:
the words nearest the mean of the words
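One way to read "the words nearest the mean of the words" is sketched below in plain Java: add the positive vectors, subtract the negative ones, average, and rank the remaining vocabulary by cosine similarity to that mean. This is an assumption-laden illustration, not the library code: the class NearestWordsSketch, the Map standing in for the lookup table, the exclusion of the input words from the result, and the silent skipping of unknown words are all choices made for the sketch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; a Map stands in for the real lookup table.
final class NearestWordsSketch {

    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Adds sign * vector(word) into sum for every known word; returns how many were used. */
    static int accumulate(Map<String, double[]> vectors, Collection<String> words,
                          double sign, double[] sum) {
        int used = 0;
        for (String w : words) {
            double[] v = vectors.get(w);
            if (v == null) {
                continue; // assumption: unknown words are skipped
            }
            for (int i = 0; i < sum.length; i++) {
                sum[i] += sign * v[i];
            }
            used++;
        }
        return used;
    }

    static List<String> wordsNearest(Map<String, double[]> vectors,
                                     Collection<String> positive,
                                     Collection<String> negative,
                                     int top) {
        List<String> result = new ArrayList<>();
        if (vectors.isEmpty() || top <= 0) {
            return result;
        }
        int dim = vectors.values().iterator().next().length;
        double[] mean = new double[dim];
        int used = accumulate(vectors, positive, 1.0, mean)
                 + accumulate(vectors, negative, -1.0, mean);
        if (used == 0) {
            return result;
        }
        // Averaging does not change the cosine ranking; it only matches the "mean" wording.
        for (int i = 0; i < dim; i++) {
            mean[i] /= used;
        }
        Map<String, Double> scores = new HashMap<>();
        for (Map.Entry<String, double[]> e : vectors.entrySet()) {
            String label = e.getKey();
            if (positive.contains(label) || negative.contains(label)) {
                continue; // assumption: input words are excluded from the answer
            }
            scores.put(label, cosine(mean, e.getValue()));
        }
        result.addAll(scores.keySet());
        result.sort((x, y) -> Double.compare(scores.get(y), scores.get(x)));
        return new ArrayList<>(result.subList(0, Math.min(top, result.size())));
    }
}
```

With toy vectors such as man = (1, 0, 0), woman = (0, 1, 0), king = (1, 0, 1) and queen = (0, 1, 1), calling wordsNearest with positive words king and woman and negative word man ranks queen first, which is the familiar analogy pattern this kind of method is used for.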
-
wordsNearestSum
public Collection<String> wordsNearestSum(String word, int n)
Get the top n words most similar to the given word
- Specified by:
wordsNearestSum in interface ModelUtils<T extends SequenceElement>
- Parameters:
word - the word to compare
n - the n to get
- Returns:
the top n words
-
adjustRank
protected org.nd4j.linalg.api.ndarray.INDArray adjustRank(org.nd4j.linalg.api.ndarray.INDArray words)
-
wordsNearest
public Collection<String> wordsNearest(org.nd4j.linalg.api.ndarray.INDArray words, int top)
Words nearest based on positive and negative words
- Specified by:
wordsNearest in interface ModelUtils<T extends SequenceElement>
- Parameters:
top - the top n words
- Returns:
the words nearest the mean of the words
-
wordsNearestSum
public Collection<String> wordsNearestSum(org.nd4j.linalg.api.ndarray.INDArray words, int top)
Words nearest based on positive and negative words
- Specified by:
wordsNearestSum in interface ModelUtils<T extends SequenceElement>
- Parameters:
top - the top n words
- Returns:
the words nearest the mean of the words
-
wordsNearestSum
public Collection<String> wordsNearestSum(Collection<String> positive, Collection<String> negative, int top)
Words nearest based on positive and negative words
- Specified by:
wordsNearestSum in interface ModelUtils<T extends SequenceElement>
- Parameters:
positive - the positive words
negative - the negative words
top - the top n words
- Returns:
the words nearest the mean of the words
-
getLabels
public static List<String> getLabels(List<BasicModelUtils.WordSimilarity> results, int limit)
-
-