public class WordVectorSerializer extends Object
Constructor and Description |
---|
WordVectorSerializer() |
Modifier and Type | Method and Description |
---|---|
static WordVectors | fromPair(Pair<InMemoryLookupTable,VocabCache> pair): Load word vectors from the given pair. |
static WordVectors | fromTableAndVocab(WeightLookupTable table, VocabCache vocab): Load word vectors for the given vocab and table. |
static float | getFloat(byte[] b): Read a float from a byte array. Credit to: https://github.com/NLPchina/Word2VEC_java/blob/master/src/com/ansj/vec/Word2VEC.java |
static Word2Vec | loadGoogleModel(File modelFile, boolean binary): Loads the Google model. |
static Word2Vec | loadGoogleModel(File modelFile, boolean binary, boolean lineBreaks): Loads the Google model. |
static Pair<InMemoryLookupTable,VocabCache> | loadTxt(File vectorsFile): Loads an in-memory cache from the given path (sets syn0 and the vocab). |
static WordVectors | loadTxtVectors(File vectorsFile): Loads an in-memory cache from the given path (sets syn0 and the vocab). |
static float | readFloat(InputStream is): Read a float from a data input stream. Credit to: https://github.com/NLPchina/Word2VEC_java/blob/master/src/com/ansj/vec/Word2VEC.java |
static String | readString(DataInputStream dis): Read a string from a data input stream. Credit to: https://github.com/NLPchina/Word2VEC_java/blob/master/src/com/ansj/vec/Word2VEC.java |
static void | writeTsneFormat(Glove vec, org.nd4j.linalg.api.ndarray.INDArray tsne, File csv): Write the t-SNE format. |
static void | writeTsneFormat(Word2Vec vec, org.nd4j.linalg.api.ndarray.INDArray tsne, File csv): Write the t-SNE format. |
static void | writeWordVectors(InMemoryLookupTable lookupTable, InMemoryLookupCache cache, String path): Writes the word vectors to the given path. |
static void | writeWordVectors(Word2Vec vec, String path): Writes the word vectors to the given path. |
public static Word2Vec loadGoogleModel(File modelFile, boolean binary) throws IOException
modelFile
- the path to the Google model
binary
- read from binary file format (if set to true) or from text file format
Throws: IOException
public static Word2Vec loadGoogleModel(File modelFile, boolean binary, boolean lineBreaks) throws IOException
modelFile
- the input file
binary
- read from binary or text file format
lineBreaks
- if true, the input file is expected to terminate each line with a line break. This
is typically the case for files created with recent versions of Word2Vec, but not
for the downloadable model files.
Returns: a Word2Vec object
Throws: IOException
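The role of lineBreaks is easier to see against the file layout. The following is a minimal sketch, not the library's implementation. It assumes the commonly described word2vec binary layout: a text header with the vocabulary size and vector length, then for each word the word bytes, a single space, and the vector as little-endian 4-byte floats, optionally followed by a line break. The names BinaryModelSketch, readToken, and read are hypothetical, and only the JDK is used.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class BinaryModelSketch {

    /** Reads bytes up to (not including) the delimiter and decodes them as UTF-8. */
    static String readToken(DataInputStream in, char delimiter) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != delimiter) {
            buf.write(b);
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Reads every (word, vector) entry from a stream in the assumed layout. */
    static Map<String, float[]> read(InputStream raw, boolean lineBreaks) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        // Assumed header: "<vocabulary size> <vector length>\n"
        int words = Integer.parseInt(readToken(in, ' '));
        int size = Integer.parseInt(readToken(in, '\n'));

        Map<String, float[]> vectors = new LinkedHashMap<>();
        byte[] chunk = new byte[4 * size];
        for (int i = 0; i < words; i++) {
            String word = readToken(in, ' ');
            in.readFully(chunk);
            float[] vector = new float[size];
            // Assumption: floats are stored little-endian.
            ByteBuffer.wrap(chunk).order(ByteOrder.LITTLE_ENDIAN).asFloatBuffer().get(vector);
            vectors.put(word, vector);
            if (lineBreaks) {
                in.read(); // skip the line break that terminates this entry
            }
        }
        return vectors;
    }
}
```

If lineBreaks is false but the file does terminate each entry with a line break, the extra byte ends up at the start of the next word in this sketch, which is why the flag has to match how the file was produced.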
public static float readFloat(InputStream is) throws IOException
is
- the input stream to read from
Throws: IOException
public static float getFloat(byte[] b)
b
- the bytes to read the float from
Throws: IOException
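As an illustration of what readFloat and getFloat do, here is a minimal sketch of decoding a float from four bytes. It assumes little-endian byte order (least significant byte first); the class name FloatBytesSketch is hypothetical, and the code is not copied from the library.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, for illustration only.
final class FloatBytesSketch {

    /** Assembles four little-endian bytes into an IEEE 754 float. */
    static float getFloat(byte[] b) {
        int bits = (b[0] & 0xff)
                | (b[1] & 0xff) << 8
                | (b[2] & 0xff) << 16
                | (b[3] & 0xff) << 24;
        return Float.intBitsToFloat(bits);
    }

    /** Reads exactly four bytes from the stream and decodes them. */
    static float readFloat(InputStream is) throws IOException {
        byte[] bytes = new byte[4];
        int off = 0;
        while (off < bytes.length) {
            int n = is.read(bytes, off, bytes.length - off);
            if (n < 0) {
                throw new EOFException("stream ended inside a float");
            }
            off += n;
        }
        return getFloat(bytes);
    }
}
```

The `& 0xff` mask matters because Java bytes are signed: without it, a byte such as 0x80 would sign-extend and corrupt the higher bits of the result.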
public static String readString(DataInputStream dis) throws IOException
dis
- the data input stream to read from
Throws: IOException
public static void writeWordVectors(InMemoryLookupTable lookupTable, InMemoryLookupCache cache, String path) throws IOException
lookupTable
- the lookup table holding the vectors to write
cache
- the vocab cache to use
path
- the path to write
Throws: IOException
public static void writeWordVectors(Word2Vec vec, String path) throws IOException
vec
- the Word2Vec model to write
path
- the path to write
Throws: IOException
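The text layout written by writeWordVectors is not specified on this page. A minimal sketch of one plausible layout follows, assuming one line per word with the word first and its vector components after it, separated by single spaces. The class TextVectorWriterSketch and its write method are hypothetical, and a plain Map stands in for the lookup table and vocab cache.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class TextVectorWriterSketch {

    /** Writes each entry as "word v1 v2 ... vn" on its own line (assumed layout). */
    static void write(Map<String, float[]> vectors, Writer out) throws IOException {
        for (Map.Entry<String, float[]> entry : vectors.entrySet()) {
            StringBuilder line = new StringBuilder(entry.getKey());
            for (float value : entry.getValue()) {
                line.append(' ').append(value);
            }
            out.write(line.append('\n').toString());
        }
    }
}
```

Under this layout a word that itself contains a space would not round-trip, so a real writer would need to escape or reject such words.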
public static WordVectors fromTableAndVocab(WeightLookupTable table, VocabCache vocab)
table
- the weights to use
vocab
- the vocab to use
public static WordVectors fromPair(Pair<InMemoryLookupTable,VocabCache> pair)
pair
- the given pair
public static WordVectors loadTxtVectors(File vectorsFile) throws FileNotFoundException
vectorsFile
- the path of the file to load
Throws: FileNotFoundException
- if the file does not exist
public static Pair<InMemoryLookupTable,VocabCache> loadTxt(File vectorsFile) throws FileNotFoundException
vectorsFile
- the path of the file to load
Throws: FileNotFoundException
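To illustrate the kind of parsing loadTxt and loadTxtVectors imply, the sketch below reads a text vectors file into memory. It assumes one whitespace-separated line per word, with the word first and the vector components after it; the actual file layout is not stated on this page. The class TextVectorReaderSketch is hypothetical, and a Map stands in for the Pair<InMemoryLookupTable,VocabCache> result: the keys play the role of the vocab and the arrays the role of the rows of syn0.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class TextVectorReaderSketch {

    /** Parses lines of the assumed form "word v1 v2 ... vn" into word-to-vector entries. */
    static Map<String, float[]> read(Reader in) throws IOException {
        Map<String, float[]> vectors = new LinkedHashMap<>();
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate blank lines
            }
            String[] parts = trimmed.split("\\s+");
            float[] vector = new float[parts.length - 1];
            for (int i = 1; i < parts.length; i++) {
                vector[i - 1] = Float.parseFloat(parts[i]);
            }
            vectors.put(parts[0], vector);
        }
        return vectors;
    }
}
```

A LinkedHashMap keeps the file order, so the position of a word in the map can serve as its row index if the vectors are later copied into a matrix.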
public static void writeTsneFormat(Glove vec, org.nd4j.linalg.api.ndarray.INDArray tsne, File csv) throws Exception
vec
- the word vectors to use for labeling
tsne
- the tsne array to write
csv
- the file to use
Throws: Exception
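The parameters above suggest a CSV in which each t-SNE row is paired with the word it represents. The following sketch shows one way to produce such a file. The column order (coordinates first, then the word label) is an assumption, the class TsneCsvSketch is hypothetical, and a double[][] stands in for the INDArray so that the example needs only the JDK.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.List;

// Hypothetical helper, for illustration only.
final class TsneCsvSketch {

    /** Writes one CSV row per word: the row's coordinates, then the word as a label. */
    static void write(List<String> words, double[][] coords, Writer out) throws IOException {
        if (words.size() != coords.length) {
            throw new IllegalArgumentException("need exactly one coordinate row per word");
        }
        for (int i = 0; i < coords.length; i++) {
            StringBuilder row = new StringBuilder();
            for (double c : coords[i]) {
                row.append(c).append(',');
            }
            row.append(words.get(i)).append('\n');
            out.write(row.toString());
        }
    }
}
```

Words containing commas would need quoting in a real CSV writer; the sketch leaves that out for brevity.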
Copyright © 2015. All Rights Reserved.