Interface WordVectors
-
All Superinterfaces:
org.deeplearning4j.nn.weights.embeddings.EmbeddingInitializer, Serializable
All Known Implementing Classes:
FastText, Node2Vec, ParagraphVectors, SequenceVectors, StaticWord2Vec, Word2Vec, WordVectorsImpl
public interface WordVectors extends Serializable, org.deeplearning4j.nn.weights.embeddings.EmbeddingInitializer
-
-
Method Summary
Modifier and Type, Method - Description
Map<String,Double> accuracy(List<String> questions) - Accuracy based on questions, each a space-separated list of strings where the first word is the query word, the next 2 words are negative, and the last word is the predicted word to be nearest
String getUNK()
double[] getWordVector(String word) - Get the word vector for a given word
org.nd4j.linalg.api.ndarray.INDArray getWordVectorMatrix(String word) - Get the word vector for a given word
org.nd4j.linalg.api.ndarray.INDArray getWordVectorMatrixNormalized(String word) - Returns the word vector divided by the norm2 of the array
org.nd4j.linalg.api.ndarray.INDArray getWordVectors(Collection<String> labels) - Returns a 2D array in which each row represents the corresponding word/label
org.nd4j.linalg.api.ndarray.INDArray getWordVectorsMean(Collection<String> labels) - Returns the mean vector built from the words/labels passed in
boolean hasWord(String word) - Returns true if the model has this word in the vocab
int indexOf(String word)
WeightLookupTable lookupTable() - Lookup table for the vectors
boolean outOfVocabularySupported() - Whether the implementation can vectorize words absent from the vocabulary
void setModelUtils(ModelUtils utils) - Specifies the ModelUtils to be used to access the model
void setUNK(String newUNK)
double similarity(String word, String word2) - Returns the similarity of 2 words
List<String> similarWordsInVocabTo(String word, double accuracy) - Find all words in the vocab with similar characters
VocabCache vocab() - Vocab for the vectors
Collection<String> wordsNearest(String word, int n) - Get the top n words most similar to the given word
Collection<String> wordsNearest(Collection<String> positive, Collection<String> negative, int top) - Words nearest based on positive and negative words
Collection<String> wordsNearest(org.nd4j.linalg.api.ndarray.INDArray words, int top)
Collection<String> wordsNearestSum(String word, int n) - Get the top n words most similar to the given word
Collection<String> wordsNearestSum(Collection<String> positive, Collection<String> negative, int top) - Words nearest based on positive and negative words
Collection<String> wordsNearestSum(org.nd4j.linalg.api.ndarray.INDArray words, int top)
-
-
-
Method Detail
-
getUNK
String getUNK()
-
setUNK
void setUNK(String newUNK)
-
hasWord
boolean hasWord(String word)
Returns true if the model has this word in the vocab.
Parameters:
word - the word to test for
Returns:
true if the model has the word in the vocab
-
wordsNearest
Collection<String> wordsNearest(org.nd4j.linalg.api.ndarray.INDArray words, int top)
-
wordsNearestSum
Collection<String> wordsNearestSum(org.nd4j.linalg.api.ndarray.INDArray words, int top)
-
wordsNearestSum
Collection<String> wordsNearestSum(String word, int n)
Get the top n words most similar to the given word.
Parameters:
word - the word to compare
n - the number of words to get
Returns:
the top n words
-
wordsNearestSum
Collection<String> wordsNearestSum(Collection<String> positive, Collection<String> negative, int top)
Words nearest based on positive and negative words.
Parameters:
positive - the positive words
negative - the negative words
top - the top n words
Returns:
the words nearest the mean of the words
-
accuracy
Map<String,Double> accuracy(List<String> questions)
Accuracy based on questions. Each question is a space-separated list of strings where the first word is the query word, the next 2 words are negative, and the last word is the predicted word to be nearest.
Parameters:
questions - the questions to ask
Returns:
the accuracy based on these questions
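The question format described above can be illustrated with a small parser. This is only a sketch of the four-token layout stated in the description; the class `AnalogyQuestionSketch` and its field names are hypothetical and are not part of the library, and how an implementation scores each question is not specified on this page.

```java
// Hypothetical helper: splits one question line into the parts named in the description.
// It does not call the library and does not compute accuracy.
final class AnalogyQuestionSketch {
    final String query;        // first word: the query word
    final String[] negatives;  // next 2 words: described as "negative"
    final String expected;     // last word: the word predicted to be nearest

    private AnalogyQuestionSketch(String query, String[] negatives, String expected) {
        this.query = query;
        this.negatives = negatives;
        this.expected = expected;
    }

    // Parses a space-separated question; rejects lines that do not have exactly 4 words.
    static AnalogyQuestionSketch parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length != 4) {
            throw new IllegalArgumentException("expected 4 words, got " + tokens.length);
        }
        return new AnalogyQuestionSketch(tokens[0], new String[] {tokens[1], tokens[2]}, tokens[3]);
    }
}
```

A caller might parse each entry of the `questions` list this way before inspecting the per-question results returned in the map.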
-
indexOf
int indexOf(String word)
-
similarWordsInVocabTo
List<String> similarWordsInVocabTo(String word, double accuracy)
Find all words in the vocab with similar characters.
Parameters:
word - the word to compare
accuracy - the accuracy, from 0 to 1
Returns:
the list of words that are similar in the vocab
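One plausible way to read "similar characters" with a 0 to 1 threshold is a normalized edit distance. The sketch below assumes a Levenshtein-based score; this page does not say which string metric an implementation uses, so the metric, the class `CharSimilaritySketch`, and its method names are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical sketch: character-level similarity filtering over a vocabulary.
final class CharSimilaritySketch {
    // Classic two-row Levenshtein edit distance.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] cur = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = cur; cur = tmp; // reuse the two rows
        }
        return prev[b.length()];
    }

    // Assumed score: 1 - distance / longer length, so identical strings score 1.0.
    static double similarity(String a, String b) {
        int max = Math.max(a.length(), b.length());
        if (max == 0) return 1.0;
        return 1.0 - (double) levenshtein(a, b) / max;
    }

    // Keeps vocabulary words whose score against `word` is at least `accuracy`.
    static List<String> similarWords(Collection<String> vocab, String word, double accuracy) {
        List<String> result = new ArrayList<>();
        for (String candidate : vocab) {
            if (similarity(word, candidate) >= accuracy) result.add(candidate);
        }
        return result;
    }
}
```

Under this assumed metric, a higher `accuracy` value returns fewer, closer spellings.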
-
getWordVector
double[] getWordVector(String word)
Get the word vector for a given word.
Parameters:
word - the word to get the vector for
Returns:
the vector for this word, as a double array
-
getWordVectorMatrixNormalized
org.nd4j.linalg.api.ndarray.INDArray getWordVectorMatrixNormalized(String word)
Returns the word vector divided by the norm2 of the array.
Parameters:
word - the word to get the matrix for
Returns:
the looked up matrix
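Dividing a vector by its norm2 (Euclidean length) yields a unit-length vector. The sketch below shows that arithmetic on a plain `double[]` rather than an `INDArray`; the class `L2NormalizeSketch` is hypothetical, and returning all zeros for a zero vector is an assumption of this sketch, not documented behavior.

```java
// Hypothetical sketch: L2 (norm2) normalization of a dense vector.
final class L2NormalizeSketch {
    static double[] normalize(double[] v) {
        double sumSquares = 0.0;
        for (double x : v) sumSquares += x * x;
        double norm = Math.sqrt(sumSquares);

        double[] out = new double[v.length];
        if (norm == 0.0) return out; // assumption: leave a zero vector as zeros
        for (int i = 0; i < v.length; i++) out[i] = v[i] / norm;
        return out;
    }
}
```

For example, `normalize(new double[] {3, 4})` divides each component by 5.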
-
getWordVectorMatrix
org.nd4j.linalg.api.ndarray.INDArray getWordVectorMatrix(String word)
Get the word vector for a given word.
Parameters:
word - the word to get the matrix for
Returns:
the ndarray for this word
-
getWordVectors
org.nd4j.linalg.api.ndarray.INDArray getWordVectors(Collection<String> labels)
This method returns a 2D array in which each row represents the corresponding word/label.
Parameters:
labels - the words/labels to look up
Returns:
a 2D array with one row per word/label
-
getWordVectorsMean
org.nd4j.linalg.api.ndarray.INDArray getWordVectorsMean(Collection<String> labels)
This method returns the mean vector built from the words/labels passed in.
Parameters:
labels - the words/labels to average
Returns:
the mean vector
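A mean vector is the element-wise average of the vectors of the given labels. The sketch below works on a plain `Map<String, double[]>` standing in for the lookup table; the class `MeanVectorSketch` is hypothetical, and skipping labels that are missing from the map is an assumption of this sketch rather than documented behavior.

```java
import java.util.Collection;
import java.util.Map;

// Hypothetical sketch: element-wise mean of the vectors for a set of labels.
final class MeanVectorSketch {
    static double[] mean(Map<String, double[]> table, Collection<String> labels) {
        double[] sum = null;
        int count = 0;
        for (String label : labels) {
            double[] v = table.get(label);
            if (v == null) continue;                // assumption: ignore unknown labels
            if (sum == null) sum = new double[v.length];
            for (int i = 0; i < v.length; i++) sum[i] += v[i];
            count++;
        }
        if (sum == null) return new double[0];      // nothing to average
        for (int i = 0; i < sum.length; i++) sum[i] /= count;
        return sum;
    }
}
```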
-
wordsNearest
Collection<String> wordsNearest(Collection<String> positive, Collection<String> negative, int top)
Words nearest based on positive and negative words.
Parameters:
positive - the positive words
negative - the negative words
top - the top n words
Returns:
the words nearest the mean of the words
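The positive/negative query is the usual analogy lookup (for example, king + woman - man). The sketch below builds a query vector by averaging the positive vectors together with the negated negative vectors, then ranks the remaining words by cosine similarity. Negating the negatives, excluding the input words from the results, and ranking by cosine are assumptions of this sketch; the class `NearestWordsSketch` and the plain `Map` lookup table are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: nearest words to the mean of positive and negated negative vectors.
final class NearestWordsSketch {
    static List<String> wordsNearest(Map<String, double[]> table, Collection<String> positive,
                                     Collection<String> negative, int top) {
        if (table.isEmpty()) return new ArrayList<>();
        int dim = table.values().iterator().next().length;
        double[] query = new double[dim];
        int used = 0;
        for (String w : positive) {
            double[] v = table.get(w);
            if (v == null) continue;
            for (int i = 0; i < dim; i++) query[i] += v[i];
            used++;
        }
        for (String w : negative) {
            double[] v = table.get(w);
            if (v == null) continue;
            for (int i = 0; i < dim; i++) query[i] -= v[i]; // negatives are subtracted
            used++;
        }
        if (used == 0) return new ArrayList<>();
        for (int i = 0; i < dim; i++) query[i] /= used;

        Set<String> exclude = new HashSet<>(positive);
        exclude.addAll(negative);
        List<String> candidates = new ArrayList<>();
        for (String w : table.keySet()) if (!exclude.contains(w)) candidates.add(w);

        final double[] q = query;
        // Sort by descending cosine similarity to the query vector.
        candidates.sort((x, y) -> Double.compare(cosine(table.get(y), q), cosine(table.get(x), q)));
        int n = Math.max(0, Math.min(top, candidates.size()));
        return new ArrayList<>(candidates.subList(0, n));
    }

    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }
        return (na == 0 || nb == 0) ? 0.0 : dot / (Math.sqrt(na) * Math.sqrt(nb));
    }
}
```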
-
wordsNearest
Collection<String> wordsNearest(String word, int n)
Get the top n words most similar to the given word.
Parameters:
word - the word to compare
n - the number of words to get
Returns:
the top n words
-
similarity
double similarity(String word, String word2)
Returns the similarity of 2 words.
Parameters:
word - the first word
word2 - the second word
Returns:
a normalized similarity (cosine similarity)
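Cosine similarity is the dot product of two vectors divided by the product of their Euclidean lengths, giving a value in [-1, 1]. The sketch below computes it on plain `double[]` arrays such as those returned by `getWordVector`; the class `CosineSimilaritySketch` is hypothetical, and returning 0 when either vector is all zeros is an assumption of this sketch.

```java
// Hypothetical sketch: cosine similarity between two dense vectors of equal length.
final class CosineSimilaritySketch {
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) return 0.0; // assumption: undefined case maps to 0
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

Vectors pointing the same way score 1, orthogonal vectors score 0, and opposite vectors score -1.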
-
vocab
VocabCache vocab()
Vocab for the vectors.
Returns:
the vocabulary (VocabCache) for these vectors
-
lookupTable
WeightLookupTable lookupTable()
Lookup table for the vectors.
Returns:
the lookup table (WeightLookupTable) for these vectors
-
setModelUtils
void setModelUtils(ModelUtils utils)
Specifies the ModelUtils to be used to access the model.
Parameters:
utils - the ModelUtils instance to use
-
outOfVocabularySupported
boolean outOfVocabularySupported()
Whether the implementation can vectorize words that are absent from the vocabulary.
Returns:
true if out-of-vocabulary words are supported, false otherwise
-
-