Class ParagraphVectors
- java.lang.Object
-
- org.deeplearning4j.models.embeddings.wordvectors.WordVectorsImpl<T>
-
- org.deeplearning4j.models.sequencevectors.SequenceVectors<VocabWord>
-
- org.deeplearning4j.models.word2vec.Word2Vec
-
- org.deeplearning4j.models.paragraphvectors.ParagraphVectors
-
- All Implemented Interfaces:
Serializable, WordVectors, org.deeplearning4j.nn.weights.embeddings.EmbeddingInitializer
public class ParagraphVectors extends Word2Vec
- See Also:
- Serialized Form
-
-
Nested Class Summary
Nested Classes (Modifier and Type, Class):
class ParagraphVectors.BlindInferenceCallable
static class ParagraphVectors.Builder
class ParagraphVectors.InferenceCallable
-
Nested classes/interfaces inherited from class org.deeplearning4j.models.sequencevectors.SequenceVectors
SequenceVectors.AsyncSequencer
-
-
Field Summary
Fields (Modifier and Type, Field):
protected AtomicLong countFinished
protected AtomicLong countSubmitted
protected org.threadly.concurrent.PriorityScheduler inferenceExecutor
protected Object inferenceLocker
protected LabelAwareIterator labelAwareIterator
protected List<VocabWord> labelsList
protected org.nd4j.linalg.api.ndarray.INDArray labelsMatrix
protected LabelsSource labelsSource
protected boolean normalizedLabels
-
Fields inherited from class org.deeplearning4j.models.word2vec.Word2Vec
sentenceIter, tokenizerFactory
-
Fields inherited from class org.deeplearning4j.models.sequencevectors.SequenceVectors
configuration, configured, elementsLearningAlgorithm, enableScavenger, eventListeners, existingModel, intersectModel, iterator, lockFactor, log, scoreElements, scoreSequences, sequenceLearningAlgorithm, unknownElement, vocabLimit
-
Fields inherited from class org.deeplearning4j.models.embeddings.wordvectors.WordVectorsImpl
batchSize, DEFAULT_UNK, layerSize, learningRate, learningRateDecayWords, lookupTable, minLearningRate, minWordFrequency, modelUtils, negative, numEpochs, numIterations, resetModel, sampling, seed, stopWords, trainElementsVectors, trainSequenceVectors, useAdeGrad, useUnknown, variableWindows, vocab, window, workers
-
-
Constructor Summary
Constructors (Modifier, Constructor):
protected ParagraphVectors()
-
Method Summary
Methods (Modifier and Type, Method, Description):
void extractLabels()
void fit()
  Starts training over
static ParagraphVectors fromJson(String jsonString)
org.nd4j.linalg.api.ndarray.INDArray inferVector(@NonNull List<VocabWord> document)
  This method calculates inferred vector for given list of words, with default parameters for learning rate and iterations
org.nd4j.linalg.api.ndarray.INDArray inferVector(@NonNull List<VocabWord> document, double learningRate, double minLearningRate, int iterations)
  This method calculates inferred vector for given document
org.nd4j.linalg.api.ndarray.INDArray inferVector(String text)
  This method calculates inferred vector for given text, with default parameters for learning rate and iterations
org.nd4j.linalg.api.ndarray.INDArray inferVector(String text, double learningRate, double minLearningRate, int iterations)
  This method calculates inferred vector for given text
org.nd4j.linalg.api.ndarray.INDArray inferVector(LabelledDocument document)
  This method calculates inferred vector for given document, with default parameters for learning rate and iterations
org.nd4j.linalg.api.ndarray.INDArray inferVector(LabelledDocument document, double learningRate, double minLearningRate, int iterations)
  This method calculates inferred vector for given document
Future<org.nd4j.linalg.api.ndarray.INDArray> inferVectorBatched(@NonNull String document)
  This method implements batched inference, based on Java Future parallelism model.
List<org.nd4j.linalg.api.ndarray.INDArray> inferVectorBatched(@NonNull List<String> documents)
  This method does inference on a given List<String>
Future<org.nd4j.common.primitives.Pair<String,org.nd4j.linalg.api.ndarray.INDArray>> inferVectorBatched(@NonNull LabelledDocument document)
  This method implements batched inference, based on Java Future parallelism model.
protected void initInference()
Collection<String> nearestLabels(@NonNull String rawText, int topN)
  This method returns top N labels nearest to specified text
Collection<String> nearestLabels(@NonNull Collection<VocabWord> document, int topN)
  This method returns top N labels nearest to specified set of vocab words
Collection<String> nearestLabels(LabelledDocument document, int topN)
  This method returns top N labels nearest to specified document
Collection<String> nearestLabels(org.nd4j.linalg.api.ndarray.INDArray labelVector, int topN)
  This method returns top N labels nearest to specified features vector
String predict(String rawText)
  Deprecated.
String predict(List<VocabWord> document)
  This method predicts label of the document.
String predict(LabelledDocument document)
  This method predicts label of the document.
Collection<String> predictSeveral(@NonNull LabelledDocument document, int limit)
  Predict several labels based on the document.
Collection<String> predictSeveral(String rawText, int limit)
  Predict several labels based on the document.
Collection<String> predictSeveral(List<VocabWord> document, int limit)
  Predict several labels based on the document.
protected void reassignExistingModel()
void setSequenceIterator(@NonNull SequenceIterator<VocabWord> iterator)
  This method defines SequenceIterator instance, that will be used as training corpus source.
double similarityToLabel(String rawText, String label)
  Deprecated.
double similarityToLabel(List<VocabWord> document, String label)
  This method returns similarity of the document to specific label, based on mean value
double similarityToLabel(LabelledDocument document, String label)
  This method returns similarity of the document to specific label, based on mean value
String toJson()
-
Methods inherited from class org.deeplearning4j.models.word2vec.Word2Vec
setSentenceIterator, setTokenizerFactory
-
Methods inherited from class org.deeplearning4j.models.sequencevectors.SequenceVectors
buildVocab, getElementsScore, getSequencesScore, getUNK, getWordVectorMatrix, initLearners, setUNK, trainSequence
-
Methods inherited from class org.deeplearning4j.models.embeddings.wordvectors.WordVectorsImpl
accuracy, getLayerSize, getWordVector, getWordVectorMatrixNormalized, getWordVectors, getWordVectorsMean, hasWord, indexOf, jsonSerializable, loadWeightsInto, lookupTable, outOfVocabularySupported, setLookupTable, setModelUtils, setVocab, similarity, similarWordsInVocabTo, update, update, vectorSize, vocab, vocabSize, wordsNearest, wordsNearest, wordsNearest, wordsNearestSum, wordsNearestSum, wordsNearestSum
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.deeplearning4j.nn.weights.embeddings.EmbeddingInitializer
jsonSerializable, loadWeightsInto, vectorSize, vocabSize
-
Methods inherited from interface org.deeplearning4j.models.embeddings.wordvectors.WordVectors
accuracy, getWordVector, getWordVectorMatrixNormalized, getWordVectors, getWordVectorsMean, hasWord, indexOf, lookupTable, outOfVocabularySupported, setModelUtils, similarity, similarWordsInVocabTo, vocab, wordsNearest, wordsNearest, wordsNearest, wordsNearestSum, wordsNearestSum, wordsNearestSum
-
-
-
-
Field Detail
-
labelsSource
protected LabelsSource labelsSource
-
labelAwareIterator
protected transient LabelAwareIterator labelAwareIterator
-
labelsMatrix
protected org.nd4j.linalg.api.ndarray.INDArray labelsMatrix
-
normalizedLabels
protected boolean normalizedLabels
-
inferenceLocker
protected final transient Object inferenceLocker
-
inferenceExecutor
protected transient org.threadly.concurrent.PriorityScheduler inferenceExecutor
-
countSubmitted
protected transient AtomicLong countSubmitted
-
countFinished
protected transient AtomicLong countFinished
-
-
Method Detail
-
initInference
protected void initInference()
-
predict
@Deprecated public String predict(String rawText)
Deprecated. This method takes raw text, applies the tokenizer, and returns the most probable label.
- Parameters:
rawText - the raw text to classify
- Returns:
- the most probable label
-
setSequenceIterator
public void setSequenceIterator(@NonNull SequenceIterator<VocabWord> iterator)
This method defines the SequenceIterator instance that will be used as the training corpus source. The main difference from other iterators is that it lets you pass an already tokenized Sequence for training.
- Overrides:
setSequenceIterator in class Word2Vec
- Parameters:
iterator - the SequenceIterator to use as the training corpus source
-
predict
public String predict(LabelledDocument document)
This method predicts the label of the document. It computes a similarity with respect to the mean of the representations of the words in the document.
- Parameters:
document - the document
- Returns:
- the predicted label for the document
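As an illustration of the mean-based comparison described above, here is a minimal, library-independent sketch. It assumes plain double[] vectors and cosine similarity. MeanVectorLabelSketch and its methods are hypothetical names, and this page does not state which similarity measure ParagraphVectors actually applies.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MeanVectorLabelSketch {

    /** Element-wise mean of equally sized word vectors. */
    static double[] mean(List<double[]> wordVectors) {
        if (wordVectors.isEmpty()) {
            throw new IllegalArgumentException("no word vectors");
        }
        double[] out = new double[wordVectors.get(0).length];
        for (double[] v : wordVectors) {
            for (int i = 0; i < out.length; i++) {
                out[i] += v[i];
            }
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= wordVectors.size();
        }
        return out;
    }

    /** Cosine similarity (an assumed measure); returns 0 for a zero vector. */
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / Math.sqrt(normA * normB);
    }

    /** Returns the label whose vector is most similar to the document's mean vector. */
    static String predict(List<double[]> wordVectors, Map<String, double[]> labelVectors) {
        double[] doc = mean(wordVectors);
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, double[]> e : labelVectors.entrySet()) {
            double score = cosine(doc, e.getValue());
            if (score > bestScore) {
                bestScore = score;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy two-dimensional vectors, invented for this example.
        Map<String, double[]> labels = new LinkedHashMap<>();
        labels.put("sports", new double[] {1.0, 0.0});
        labels.put("finance", new double[] {0.0, 1.0});
        List<double[]> doc = List.of(new double[] {0.9, 0.1}, new double[] {0.7, 0.3});
        System.out.println(predict(doc, labels)); // prints "sports"
    }
}
```

In this toy setup the document mean is (0.8, 0.2), which lies much closer to the "sports" vector than to the "finance" vector.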
-
extractLabels
public void extractLabels()
-
inferVector
public org.nd4j.linalg.api.ndarray.INDArray inferVector(String text, double learningRate, double minLearningRate, int iterations)
This method calculates the inferred vector for the given text.
- Parameters:
text - the text to infer a vector for
- Returns:
- the inferred vector
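The learningRate, minLearningRate and iterations parameters suggest a learning rate that decays across inference iterations. The sketch below shows one plausible schedule, a linear decay, purely as an assumption: this page does not document the schedule the library applies, and LearningRateDecaySketch is a hypothetical name.

```java
final class LearningRateDecaySketch {

    /**
     * Builds a per-iteration learning rate that falls linearly from
     * learningRate (first iteration) to minLearningRate (last iteration).
     * This is an assumed schedule, not the library's documented behaviour.
     */
    static double[] linearSchedule(double learningRate, double minLearningRate, int iterations) {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        if (minLearningRate > learningRate) {
            throw new IllegalArgumentException("minLearningRate must not exceed learningRate");
        }
        double[] rates = new double[iterations];
        for (int i = 0; i < iterations; i++) {
            // progress runs from 0.0 on the first iteration to 1.0 on the last one
            double progress = (iterations == 1) ? 0.0 : (double) i / (iterations - 1);
            rates[i] = learningRate - progress * (learningRate - minLearningRate);
        }
        return rates;
    }

    public static void main(String[] args) {
        for (double rate : linearSchedule(0.025, 0.001, 5)) {
            System.out.println(rate);
        }
    }
}
```

More iterations give the inferred vector more update steps at the cost of time, while a lower minLearningRate makes the final steps smaller.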
-
reassignExistingModel
protected void reassignExistingModel()
-
inferVector
public org.nd4j.linalg.api.ndarray.INDArray inferVector(LabelledDocument document, double learningRate, double minLearningRate, int iterations)
This method calculates the inferred vector for the given document.
- Parameters:
document - the document to infer a vector for
- Returns:
- the inferred vector
-
inferVector
public org.nd4j.linalg.api.ndarray.INDArray inferVector(@NonNull List<VocabWord> document, double learningRate, double minLearningRate, int iterations)
This method calculates the inferred vector for the given document.
- Parameters:
document - the document, as a list of vocab words
- Returns:
- the inferred vector
-
inferVector
public org.nd4j.linalg.api.ndarray.INDArray inferVector(String text)
This method calculates the inferred vector for the given text, using default parameters for learning rate and iterations.
- Parameters:
text - the text to infer a vector for
- Returns:
- the inferred vector
-
inferVector
public org.nd4j.linalg.api.ndarray.INDArray inferVector(LabelledDocument document)
This method calculates the inferred vector for the given document, using default parameters for learning rate and iterations.
- Parameters:
document - the document to infer a vector for
- Returns:
- the inferred vector
-
inferVector
public org.nd4j.linalg.api.ndarray.INDArray inferVector(@NonNull List<VocabWord> document)
This method calculates the inferred vector for the given list of words, using default parameters for learning rate and iterations.
- Parameters:
document - the document, as a list of vocab words
- Returns:
- the inferred vector
-
inferVectorBatched
public Future<org.nd4j.common.primitives.Pair<String,org.nd4j.linalg.api.ndarray.INDArray>> inferVectorBatched(@NonNull LabelledDocument document)
This method implements batched inference, based on the Java Future parallelism model. PLEASE NOTE: To use this method, the LabelledDocument passed in must have its Id field defined.
- Parameters:
document - the document to infer a vector for
- Returns:
- a Future holding a Pair of a String and the inferred INDArray
-
inferVectorBatched
public Future<org.nd4j.linalg.api.ndarray.INDArray> inferVectorBatched(@NonNull String document)
This method implements batched inference, based on the Java Future parallelism model. PLEASE NOTE: This method returns a Future<INDArray>, so tracking the relation between each document and its INDArray is your responsibility.
- Parameters:
document - the text to infer a vector for
- Returns:
- a Future holding the inferred vector
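Because the caller has to keep each document associated with its Future, one simple approach is to store the futures in a map keyed by the document. The sketch below shows that pattern with the standard java.util.concurrent API only. FutureTrackingSketch, fakeInfer and inferAll are hypothetical names, and fakeInfer is a toy stand-in that has nothing to do with how the library computes vectors.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class FutureTrackingSketch {

    /** Toy stand-in for inference: {character count, whitespace-separated token count}. */
    static double[] fakeInfer(String text) {
        return new double[] {text.length(), text.trim().split("\\s+").length};
    }

    /** Submits one task per document and keeps each Future keyed by its document. */
    static Map<String, double[]> inferAll(List<String> documents) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Insertion-ordered map: document text -> pending result.
            // Duplicate texts would overwrite each other; use ids as keys in that case.
            Map<String, Future<double[]>> pending = new LinkedHashMap<>();
            for (String doc : documents) {
                Callable<double[]> task = () -> fakeInfer(doc);
                pending.put(doc, pool.submit(task));
            }
            Map<String, double[]> results = new LinkedHashMap<>();
            for (Map.Entry<String, Future<double[]>> e : pending.entrySet()) {
                results.put(e.getKey(), e.getValue().get()); // blocks until that task is done
            }
            return results;
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for inference", ex);
        } catch (ExecutionException ex) {
            throw new IllegalStateException("inference task failed", ex.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, double[]> vectors = inferAll(List.of("hello world", "a b c"));
        vectors.forEach((doc, vec) -> System.out.println(doc + " -> " + vec[0] + ", " + vec[1]));
    }
}
```

With the real class, the same bookkeeping would apply to the Future<INDArray> values returned by inferVectorBatched(String).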
-
inferVectorBatched
public List<org.nd4j.linalg.api.ndarray.INDArray> inferVectorBatched(@NonNull List<String> documents)
This method does inference on a given List<String>.
- Parameters:
documents - the texts to infer vectors for
- Returns:
- INDArrays in the same order as input texts
-
predict
public String predict(List<VocabWord> document)
This method predicts the label of the document. It computes a similarity with respect to the mean of the representations of the words in the document.
- Parameters:
document - the document
- Returns:
- the predicted label for the document
-
predictSeveral
public Collection<String> predictSeveral(@NonNull LabelledDocument document, int limit)
Predicts several labels based on the document. It computes a similarity with respect to the mean of the representations of the words in the document.
- Parameters:
document - the document
- Returns:
- possible labels in descending order
-
predictSeveral
public Collection<String> predictSeveral(String rawText, int limit)
Predicts several labels based on the document. It computes a similarity with respect to the mean of the representations of the words in the document.
- Parameters:
rawText - raw text of the document
- Returns:
- possible labels in descending order
-
predictSeveral
public Collection<String> predictSeveral(List<VocabWord> document, int limit)
Predicts several labels based on the document. It computes a similarity with respect to the mean of the representations of the words in the document.
- Parameters:
document - the document
- Returns:
- possible labels in descending order
-
nearestLabels
public Collection<String> nearestLabels(LabelledDocument document, int topN)
This method returns the top N labels nearest to the specified document.
- Parameters:
document - the document
topN - the number of labels to return
- Returns:
- the top N nearest labels
-
nearestLabels
public Collection<String> nearestLabels(@NonNull String rawText, int topN)
This method returns the top N labels nearest to the specified text.
- Parameters:
rawText - the raw text
topN - the number of labels to return
- Returns:
- the top N nearest labels
-
nearestLabels
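public Collection<String> nearestLabels(@NonNull Collection<VocabWord> document, int topN)
This method returns the top N labels nearest to the specified set of vocab words.
- Parameters:
document - the document, as a collection of vocab words
topN - the number of labels to return
- Returns:
- the top N nearest labels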
-
nearestLabels
public Collection<String> nearestLabels(org.nd4j.linalg.api.ndarray.INDArray labelVector, int topN)
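This method returns the top N labels nearest to the specified features vector.
- Parameters:
labelVector - the features vector to compare against the labels
topN - the number of labels to return
- Returns:
- the top N nearest labels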
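A top-N lookup of this kind can be pictured as scoring every label vector against the query vector and keeping the N best. The following sketch does that with plain double[] arrays. NearestLabelsSketch is a hypothetical name, and cosine similarity is an assumption, since this page does not say which distance the method uses.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class NearestLabelsSketch {

    /** Cosine similarity (an assumed measure); returns 0 for a zero vector. */
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / Math.sqrt(normA * normB);
    }

    /** Returns up to topN labels, most similar first. */
    static List<String> nearestLabels(double[] query, Map<String, double[]> labelVectors, int topN) {
        List<Map.Entry<String, Double>> scored = new ArrayList<>();
        for (Map.Entry<String, double[]> e : labelVectors.entrySet()) {
            scored.add(Map.entry(e.getKey(), cosine(query, e.getValue())));
        }
        // Sort by similarity, highest first.
        scored.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        List<String> out = new ArrayList<>();
        for (int i = 0; i < Math.min(topN, scored.size()); i++) {
            out.add(scored.get(i).getKey());
        }
        return out;
    }

    public static void main(String[] args) {
        // Toy label vectors, invented for this example.
        Map<String, double[]> labels = new LinkedHashMap<>();
        labels.put("a", new double[] {1.0, 0.0});
        labels.put("b", new double[] {0.0, 1.0});
        labels.put("c", new double[] {1.0, 1.0});
        System.out.println(nearestLabels(new double[] {1.0, 0.2}, labels, 2)); // prints [a, c]
    }
}
```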
-
similarityToLabel
@Deprecated public double similarityToLabel(String rawText, String label)
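Deprecated. This method returns the similarity of the document to a specific label, based on the mean value.
- Parameters:
rawText - raw text of the document
label - the label to compare against
- Returns:
- the similarity of the text to the label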
-
fit
public void fit()
Description copied from class: SequenceVectors
Starts training over
- Overrides:
fit in class SequenceVectors<VocabWord>
-
similarityToLabel
public double similarityToLabel(LabelledDocument document, String label)
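This method returns the similarity of the document to a specific label, based on the mean value.
- Parameters:
document - the document
label - the label to compare against
- Returns:
- the similarity of the document to the label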
-
similarityToLabel
public double similarityToLabel(List<VocabWord> document, String label)
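This method returns the similarity of the document to a specific label, based on the mean value.
- Parameters:
document - the document, as a list of vocab words
label - the label to compare against
- Returns:
- the similarity of the document to the label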
-
toJson
public String toJson() throws org.nd4j.shade.jackson.core.JsonProcessingException
-
fromJson
public static ParagraphVectors fromJson(String jsonString) throws IOException
- Throws:
IOException
-
-