I used DeepLearning4j to train a word2vec model. Then I needed to save the dictionary (each word together with its vector) to a CSV file so I could run some clustering algorithms on it.
It sounded like a simple task, but it took a while. Here is the code that does it:
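// Package names below assume opencsv 3+ and a recent DL4J release; adjust them to your versions.
import com.opencsv.CSVWriter;
import org.deeplearning4j.models.word2vec.VocabWord;
import org.deeplearning4j.models.word2vec.Word2Vec;
import org.deeplearning4j.models.word2vec.wordstore.VocabCache;
import java.io.FileWriter;
import java.io.IOException;
import java.util.Collection;

private void writeIndexToCsv(String csvFileName, Word2Vec model) {
    // try-with-resources closes the writer even if a write fails, and
    // avoids the NullPointerException the old code hit when the file could not be opened.
    try (CSVWriter writer = new CSVWriter(new FileWriter(csvFileName))) {
        VocabCache<VocabWord> vocabCache = model.vocab();
        Collection<VocabWord> words = vocabCache.vocabWords();
        for (VocabWord vocabWord : words) {
            String word = vocabWord.getWord();
            System.out.println("Looking into the word: " + word);
            double[] wordVector = model.getWordVector(word);
            // One row per word: the word first, then its vector components.
            // Building the array directly (instead of joining and splitting on ",")
            // keeps words that contain a comma in a single field.
            String[] row = new String[wordVector.length + 1];
            row[0] = word;
            for (int i = 0; i < wordVector.length; i++) {
                row[i + 1] = String.valueOf(wordVector[i]);
            }
            writer.writeNext(row, false); // false: do not quote every field
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}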
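For the clustering step, the CSV has to be read back into memory at some point. Below is a minimal sketch of that, using only the standard library. It assumes every row has the plain form `word,v1,v2,...,vn` with no quoted fields and no commas inside the words; if your vocabulary can contain commas, a real CSV parser is the safer choice. The class name `WordVectorCsv` and the method `readVectors` are made up for this illustration and are not part of DeepLearning4j or opencsv.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

public class WordVectorCsv {

    /**
     * Parses rows of the form word,v1,v2,...,vn into a word -> vector map.
     * Assumption: no quoted fields and no commas inside the word itself.
     */
    public static Map<String, double[]> readVectors(Reader source) throws IOException {
        // LinkedHashMap keeps the words in file order.
        Map<String, double[]> vectors = new LinkedHashMap<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                String[] fields = line.split(",");
                // First field is the word, the rest are vector components.
                double[] vector = new double[fields.length - 1];
                for (int i = 1; i < fields.length; i++) {
                    vector[i - 1] = Double.parseDouble(fields[i]);
                }
                vectors.put(fields[0], vector);
            }
        }
        return vectors;
    }

    public static void main(String[] args) throws IOException {
        // Tiny in-memory sample; for a real file, pass a java.io.FileReader instead.
        String sample = "king,0.1,0.2,0.3\nqueen,0.4,0.5,0.6\n";
        Map<String, double[]> vectors = readVectors(new StringReader(sample));
        System.out.println(vectors.keySet());
    }
}
```

The resulting map can then be handed to whatever clustering code you use: the keys are the words and each value is the vector for that word.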